Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Page 1: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)

Low-rank tensor methods for analysis of high-dimensional data

Alexander Litvinenko and Mike Espig


http://sri-uq.kaust.edu.sa/

Extreme Computing Research Center, KAUST


Page 2: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


KAUST

I received very rich collaboration experience as a co-organizer of:
- 3 UQ workshops,
- 2 Scalable Hierarchical Algorithms for eXtreme Computing (SHAXC) workshops,
- 1 HPC conference (www.hpcsaudi.org, 2017).

Page 3: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


My previous work

After applying the stochastic Galerkin method, we obtain Ku = f, where all ingredients are represented in a tensor format. We compute max{u}, var(u), level sets of u, and sign(u).
[1] Efficient Analysis of High Dimensional Data in Tensor Formats, Espig, Hackbusch, A.L., Matthies and Zander, 2012.

We research which ingredients influence the tensor rank of K.
[2] Efficient low-rank approximation of the stochastic Galerkin matrix in tensor formats, Wahnert, Espig, Hackbusch, A.L., Matthies, 2013.

We approximate κ(x, ω) and the stochastic Galerkin operator K in the Tensor Train (TT) format, solve for u, and postprocess.
[3] Polynomial Chaos Expansion of random coefficients and the solution of stochastic partial differential equations in the Tensor Train format, Dolgov, Litvinenko, Khoromskij, Matthies, 2016.


Page 4: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Typical quantities of interest

Keeping all input and intermediate data in a tensor representation, one wants to perform different tasks:
- evaluation for specific parameters (ω_1, ..., ω_M),
- finding maxima and minima,
- finding ‘level sets’ (needed for the histogram and the probability density).
Example of a level set: all elements of a high-dimensional tensor from the interval [0.7, 0.8].


Page 5: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Canonical and Tucker tensor formats

Definition and Examples of tensors


Page 6: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Canonical and Tucker tensor formats

[Pictures are taken from the lecture course of B. Khoromskij and A. Auer.]

Storage: O(n^d) → O(dRn) (canonical) and O(R^d + dRn) (Tucker).
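As a quick, illustrative calculation (the values of n, d, R below are arbitrary example parameters, not from the talk), these storage counts can be compared in a few lines of Python:

```python
# Illustrative storage counts for a full tensor vs. the canonical and
# Tucker formats; n, d, R are arbitrary example values, not from the talk.
n, d, R = 100, 10, 5
print(f"full tensor : {n**d:.2e} entries")              # O(n^d)
print(f"canonical   : {d * R * n} entries")             # O(dRn)
print(f"Tucker      : {R**d + d * R * n:.2e} entries")  # O(R^d + dRn)
```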


Page 7: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Definition of tensor of order d

A tensor of order d is a multidimensional array over a d-tuple index set I = I_1 × ··· × I_d,

A = [a_{i_1...i_d} : i_ℓ ∈ I_ℓ] ∈ R^I,  I_ℓ = {1, ..., n_ℓ},  ℓ = 1, ..., d.

A is an element of the linear space

V_n = ⊗_{ℓ=1}^d V_ℓ,  V_ℓ = R^{I_ℓ},

equipped with the Euclidean scalar product ⟨·,·⟩ : V_n × V_n → R, defined as

⟨A, B⟩ := ∑_{(i_1...i_d) ∈ I} a_{i_1...i_d} b_{i_1...i_d}  for A, B ∈ V_n.

Let T := ⊗_{μ=1}^d R^{n_μ}. The set of tensors of representation rank R is

R_R(T) := { ∑_{i=1}^R ⊗_{μ=1}^d v_{iμ} ∈ T : v_{iμ} ∈ R^{n_μ} }.
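As an illustration (a minimal NumPy sketch, not the authors' code; all sizes are arbitrary example values), a tensor in R_R(T) is stored via its factor vectors v_{iμ}, and a single entry costs only O(dR) operations:

```python
# Minimal sketch: an order-d tensor in the canonical format R_R(T),
# stored as factor vectors v_{i,mu}; one entry costs O(dR) instead of
# touching all n^d values.
import numpy as np

d, n, R = 4, 10, 3
rng = np.random.default_rng(0)
factors = [[rng.standard_normal(n) for _ in range(d)] for _ in range(R)]

def cp_entry(factors, idx):
    # A_{i_1...i_d} = sum_{i=1}^R prod_{mu=1}^d (v_{i,mu})_{i_mu}
    return sum(np.prod([v[j] for v, j in zip(term, idx)]) for term in factors)

print(cp_entry(factors, (1, 2, 3, 4)))
```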


Page 8: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Examples of rank-1 and rank-2 tensors

Rank-1: f(x_1, ..., x_d) = exp(f_1(x_1) + ... + f_d(x_d)) = ∏_{j=1}^d exp(f_j(x_j)).

Rank-2: f(x_1, ..., x_d) = sin(∑_{j=1}^d x_j), since

2i · sin(∑_{j=1}^d x_j) = e^{i ∑_{j=1}^d x_j} − e^{−i ∑_{j=1}^d x_j}.

The rank-d function f(x_1, ..., x_d) = x_1 + x_2 + ... + x_d can be approximated by a rank-2 tensor with any prescribed accuracy:

f ≈ ( ∏_{j=1}^d (1 + εx_j) − ∏_{j=1}^d 1 ) / ε + O(ε), as ε → 0.
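A short numerical check of the last claim (a sketch with arbitrary test data, not part of the slides): the rank-2 expression reproduces x_1 + ... + x_d up to O(ε).

```python
# Check: ( prod_j (1 + eps*x_j) - prod_j 1 ) / eps = sum_j x_j + O(eps).
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=8)              # one evaluation point, d = 8
for eps in (1e-1, 1e-2, 1e-3):
    rank2 = (np.prod(1.0 + eps * x) - 1.0) / eps
    print(f"eps={eps:g}  error={abs(rank2 - x.sum()):.2e}")
```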


Page 9: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Tensors and matrices

Rank-1 tensor:

A = u_1 ⊗ u_2 ⊗ ... ⊗ u_d =: ⊗_{μ=1}^d u_μ, with entries A_{i_1,...,i_d} = (u_1)_{i_1} · ... · (u_d)_{i_d}.

A rank-1 tensor A = u ⊗ v corresponds to the matrix A = uv^T (or A = vu^T), u ∈ R^n, v ∈ R^m. A rank-k tensor A = ∑_{i=1}^k u_i ⊗ v_i corresponds to the matrix A = ∑_{i=1}^k u_i v_i^T.

The Kronecker product of n × n and m × m matrices is a new block matrix A ⊗ B ∈ R^{nm × nm}, whose ij-th block is [A_{ij} B].
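These correspondences can be checked directly in NumPy (a small sketch; np.outer, np.multiply.outer and np.kron are standard NumPy calls, and the matrices are arbitrary examples):

```python
# Rank-1 tensor u ⊗ v vs. the rank-1 matrix u v^T, and the Kronecker
# product as the block matrix [A_ij * B].
import numpy as np

u, v = np.arange(1.0, 4.0), np.arange(1.0, 3.0)   # u in R^3, v in R^2
assert np.array_equal(np.multiply.outer(u, v), np.outer(u, v))

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.eye(2)
K = np.kron(A, B)     # 4 x 4; the (i,j)-th 2x2 block equals A[i,j] * B
print(K)
```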


Page 10: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Computing QoI in low-rank tensor format

Now we consider how to find maxima in a high-dimensional tensor.

Page 11: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Maximum norm and corresponding index

Let u = ∑_{j=1}^r ⊗_{μ=1}^d u_{jμ} ∈ R_r, and compute

‖u‖_∞ := max_{i=(i_1,...,i_d) ∈ I} |u_i| = max_{i=(i_1,...,i_d) ∈ I} | ∑_{j=1}^r ∏_{μ=1}^d (u_{jμ})_{i_μ} |.

Computing ‖u‖_∞ is equivalent to the following eigenvalue problem. Let i* := (i*_1, ..., i*_d) ∈ I, #I = ∏_{μ=1}^d n_μ. Then

‖u‖_∞ = |u_{i*}| = | ∑_{j=1}^r ∏_{μ=1}^d (u_{jμ})_{i*_μ} |  and  e^{(i*)} := ⊗_{μ=1}^d e_{i*_μ},

where e_{i*_μ} ∈ R^{n_μ} is the i*_μ-th canonical unit vector in R^{n_μ} (μ ∈ N_{≤d}).


Page 12: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)

Then

u ⊙ e^{(i*)} = ( ∑_{j=1}^r ⊗_{μ=1}^d u_{jμ} ) ⊙ ( ⊗_{μ=1}^d e_{i*_μ} )
            = ∑_{j=1}^r ⊗_{μ=1}^d ( u_{jμ} ⊙ e_{i*_μ} )
            = ∑_{j=1}^r ⊗_{μ=1}^d [ (u_{jμ})_{i*_μ} e_{i*_μ} ]
            = ( ∑_{j=1}^r ∏_{μ=1}^d (u_{jμ})_{i*_μ} ) ⊗_{μ=1}^d e_{i*_μ}
            = u_{i*} e^{(i*)},

since the scalar factor ∑_{j=1}^r ∏_{μ=1}^d (u_{jμ})_{i*_μ} equals u_{i*}. Thus, we obtained an “eigenvalue problem”:

u ⊙ e^{(i*)} = u_{i*} e^{(i*)}.


Page 13: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Computing ‖u‖_∞, u ∈ R_r, by vector iteration

By defining the diagonal matrix

D(u) := ∑_{j=1}^r ⊗_{μ=1}^d diag( (u_{jμ})_{ℓ_μ} : ℓ_μ ∈ N_{≤n_μ} )    (1)

with representation rank r, we obtain D(u)v = u ⊙ v. Now apply the well-known vector iteration method (with rank truncation) to

D(u) e^{(i*)} = u_{i*} e^{(i*)}

and obtain ‖u‖_∞.
[Approximate iteration: Khoromskij, Hackbusch, Tyrtyshnikov, 2005] and [Espig, Hackbusch, 2010]
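A compact sketch of this vector iteration, written on a full NumPy array so the idea stays visible; in the actual CP/TT setting the Hadamard product D(u)v = u ⊙ v is formed within the format and followed by a rank truncation, which this toy version does not need:

```python
# Vector (power) iteration for ||u||_inf: D(u) is diagonal with entries
# u_i, so iterating v <- (u * v)/||u * v|| drives v toward the canonical
# vector e^{(i*)} and the norm toward |u_{i*}|. Full-array toy version;
# a low-rank code would truncate ranks after each Hadamard product.
import numpy as np

def max_abs_entry(u, iters=500):
    v = np.ones_like(u) / np.sqrt(u.size)   # start vector, ||v|| = 1
    lam = 0.0
    for _ in range(iters):
        w = u * v                    # apply D(u): the Hadamard product
        lam = np.linalg.norm(w)      # estimate of |u_{i*}|
        v = w / lam
    return lam

rng = np.random.default_rng(2)
u = rng.standard_normal((5, 6, 7))
print(max_abs_entry(u), np.abs(u).max())   # the two values should agree
```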


Page 14: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


How to compute the mean value in CP format

Let u = ∑_{j=1}^r ⊗_{μ=1}^d u_{jμ} ∈ R_r. Then the mean value ū can be computed as a scalar product:

ū = ⟨ ∑_{j=1}^r ⊗_{μ=1}^d u_{jμ}, ⊗_{μ=1}^d (1_μ / n_μ) ⟩ = ∑_{j=1}^r ∏_{μ=1}^d ⟨u_{jμ}, 1_μ⟩ / n_μ    (2)

  = ∑_{j=1}^r ∏_{μ=1}^d (1 / n_μ) ( ∑_{k=1}^{n_μ} (u_{jμ})_k ),    (3)

where 1_μ := (1, ..., 1)^T ∈ R^{n_μ}. The numerical cost is O( r · ∑_{μ=1}^d n_μ ).
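In code, formulas (2)-(3) amount to a few lines (a sketch with random factors; it is verified against the full tensor, which is only feasible for the tiny example sizes used here):

```python
# Mean of a CP tensor, formulas (2)-(3): cost O(r * sum_mu n_mu).
import numpy as np
from functools import reduce

rng = np.random.default_rng(3)
r, ns = 4, (5, 6, 7)                                  # rank, mode sizes
U = [[rng.standard_normal(n) for n in ns] for _ in range(r)]

mean_cp = sum(np.prod([u.sum() / n for u, n in zip(term, ns)]) for term in U)

full = sum(reduce(np.multiply.outer, term) for term in U)  # reference only
print(mean_cp, full.mean())                                # should agree
```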


Page 15: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


How to compute the variance in CP format

Let u ∈ R_r and

ũ := u − ū ⊗_{μ=1}^d (1_μ / n_μ) = ∑_{j=1}^{r+1} ⊗_{μ=1}^d ũ_{jμ} ∈ R_{r+1},    (4)

then the variance var(u) of u can be computed as follows:

var(u) = ⟨ũ, ũ⟩ / ∏_{μ=1}^d n_μ
       = ( 1 / ∏_{μ=1}^d n_μ ) ⟨ ∑_{i=1}^{r+1} ⊗_{μ=1}^d ũ_{iμ}, ∑_{j=1}^{r+1} ⊗_{ν=1}^d ũ_{jν} ⟩
       = ∑_{i=1}^{r+1} ∑_{j=1}^{r+1} ∏_{μ=1}^d (1 / n_μ) ⟨ũ_{iμ}, ũ_{jμ}⟩.

The numerical cost is O( (r + 1)² · ∑_{μ=1}^d n_μ ).
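A corresponding sketch for the variance (again checked against the full tensor; the extra rank-1 term below centers u as in (4)):

```python
# Variance of a CP tensor: center u by one extra rank-1 term (4), then
# evaluate the double sum of scaled inner products, O((r+1)^2 * sum n_mu).
import numpy as np
from functools import reduce

rng = np.random.default_rng(4)
r, ns = 4, (5, 6, 7)
U = [[rng.standard_normal(n) for n in ns] for _ in range(r)]

mean = sum(np.prod([u.sum() / n for u, n in zip(t, ns)]) for t in U)
center = [-mean * np.ones(ns[0]) / ns[0]] + [np.ones(n) / n for n in ns[1:]]
Uc = U + [center]                       # representation rank r + 1

var_cp = sum(np.prod([ui @ uj / n for ui, uj, n in zip(ti, tj, ns)])
             for ti in Uc for tj in Uc)

full = sum(reduce(np.multiply.outer, t) for t in Uc)   # centered tensor
print(var_cp, (full ** 2).mean())                      # should agree
```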

Page 16: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Computing QoI in low-rank tensor format

Now we consider how to find ‘level sets’: for instance, all entries of the tensor u from the interval [a, b].

Page 17: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Definitions of characteristic and sign functions

1. To compute level sets and frequencies we need the characteristic function.
2. To compute the characteristic function we need the sign function.

The characteristic χ_I(u) ∈ T of u ∈ T in I ⊂ R is, for every multi-index i ∈ I, pointwise defined as

(χ_I(u))_i := { 1, u_i ∈ I;  0, u_i ∉ I }.

Furthermore, sign(u) ∈ T is for all i ∈ I pointwise defined by

(sign(u))_i := { 1, u_i > 0;  −1, u_i < 0;  0, u_i = 0 }.


Page 18: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


sign(u) is needed for computing χI(u)

Lemma. Let u ∈ T, a, b ∈ R, and 1 = ⊗_{μ=1}^d 1_μ, where 1_μ := (1, ..., 1)^T ∈ R^{n_μ}.
(i) If I = R_{<b}, then χ_I(u) = ½ (1 + sign(b1 − u)).
(ii) If I = R_{>a}, then χ_I(u) = ½ (1 − sign(a1 − u)).
(iii) If I = (a, b), then χ_I(u) = ½ (sign(b1 − u) − sign(a1 − u)).

We compute sign(u), u ∈ R_r, via a hybrid Newton-Schulz iteration with rank truncation after each iteration, as sketched below.


Page 19: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Level Set, Frequency

Definition (Level Set, Frequency). Let I ⊂ R and u ∈ T. The level set L_I(u) ∈ T of u with respect to I is pointwise defined by

(L_I(u))_i := { u_i, u_i ∈ I;  0, u_i ∉ I },  for all i ∈ I.

The frequency F_I(u) ∈ N of u with respect to I is defined as

F_I(u) := # supp χ_I(u).


Page 20: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Computation of level sets and frequency

Proposition. Let I ⊂ R, u ∈ T, and let χ_I(u) be its characteristic. We have

L_I(u) = χ_I(u) ⊙ u

and rank(L_I(u)) ≤ rank(χ_I(u)) · rank(u). The frequency F_I(u) ∈ N of u with respect to I is

F_I(u) = ⟨χ_I(u), 1⟩,

where 1 = ⊗_{μ=1}^d 1_μ and 1_μ := (1, ..., 1)^T ∈ R^{n_μ}.
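Combining the Lemma and the Proposition gives a direct recipe; the following full-array sketch (arbitrary test data) computes χ_{(a,b)}(u), the level set, and the frequency:

```python
# Level set and frequency via the characteristic function, built from
# sign() as in item (iii) of the Lemma. Full-array toy version.
import numpy as np

rng = np.random.default_rng(5)
u = rng.uniform(0.0, 1.0, size=(4, 5, 6))
a, b = 0.7, 0.8

chi = 0.5 * (np.sign(b - u) - np.sign(a - u))   # chi_{(a,b)}(u)
level_set = chi * u                             # L_I(u) = chi_I(u) * u
frequency = int(chi.sum())                      # F_I(u) = <chi_I(u), 1>

print(frequency, int(((u > a) & (u < b)).sum()))  # must agree
```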


Page 21: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Numerical Experiments

2D L-shape domain, N = 557 dofs. The total stochastic dimension is M_u = M_k + M_f = 20; there are |J| = 231 PCE coefficients:

u = ∑_{j=1}^{231} u_{j,0} ⊗ ⊗_{μ=1}^{20} u_{jμ} ∈ R^{557} ⊗ ⊗_{μ=1}^{20} R^3.


Page 22: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Level sets

Now we compute the level sets

sign(b‖u‖_∞ 1 − u)  for b ∈ {0.2, 0.4, 0.6, 0.8}.

- The tensor u has 3^20 · 557 ≈ 2 · 10^12 entries, i.e., ≈ 16 TB of memory.
- The computing time for one level set was 10 minutes.
- The intermediate ranks of sign(b‖u‖_∞ 1 − u) and rank(u_k) were less than 24.


Page 23: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Example: Canonical rank d, whereas TT rank 2

The d-Laplacian over a uniform tensor grid is known to have the Kronecker rank-d representation

Δ_d = A ⊗ I_N ⊗ ... ⊗ I_N + I_N ⊗ A ⊗ ... ⊗ I_N + ... + I_N ⊗ I_N ⊗ ... ⊗ A ∈ R^{N^d × N^d},    (5)

with A = Δ_1 = tridiag{−1, 2, −1} ∈ R^{N×N}, and I_N being the N × N identity. Notice that the canonical rank is rank_C(Δ_d) = d, while the TT rank of Δ_d is equal to 2 for any dimension, due to the explicit representation

Δ_d = (Δ_1  I) × [ I  0 ; Δ_1  I ] × ... × [ I  0 ; Δ_1  I ] × [ I ; Δ_1 ],    (6)

where the rank-product operation “×” is defined as a regular matrix product of the two corresponding core matrices, their blocks being multiplied by means of the tensor product, and “;” separates block rows. A similar bound is true for the Tucker rank: rank_Tuck(Δ_d) = 2.
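Representation (5) is easy to verify for small N and d (a dense NumPy sketch with example sizes; a sparse Kronecker product would scale much further):

```python
# Kronecker rank-d representation (5) of the d-Laplacian: the sum of
# d terms I ⊗ ... ⊗ A ⊗ ... ⊗ I, assembled densely for small N, d.
import numpy as np

N, d = 4, 3
A = 2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)  # tridiag{-1, 2, -1}
I = np.eye(N)

delta_d = np.zeros((N**d, N**d))
for pos in range(d):                  # term with A in position `pos`
    M = np.array([[1.0]])
    for mu in range(d):
        M = np.kron(M, A if mu == pos else I)
    delta_d += M

print(delta_d.shape)                  # (N^d, N^d) = (64, 64)
```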

Page 24: Low-rank methods for analysis of high-dimensional data (SIAM CSE talk 2017)


Advantages and disadvantages

Denote by k the rank, by d the dimension, and by n the number of dofs in 1D:

1. CP: the approximation problem is ill-posed; storage O(dnk); approximations are hard to compute.
2. Tucker: reliable arithmetic based on the SVD; storage O(dnk + k^d).
3. Hierarchical Tucker: based on the SVD; storage O(dnk + dk^3), truncation O(dnk^2 + dk^4).
4. TT: based on the SVD; cost O(dnk^2) or O(dnk^3); stable.
5. Quantics-TT: O(n^d) → O(d log_q n).