Low-Rank Methods for Analysis of High-Dimensional Data (SIAM CSE Talk 2017)


Low-rank tensor methods for analysis of high-dimensional data

Alexander Litvinenko and Mike Espig

Center for Uncertainty Quantification

http://sri-uq.kaust.edu.sa/

Extreme Computing Research Center, KAUST


KAUST

I gained rich collaboration experience as a co-organizer of:

- 3 UQ workshops,
- 2 Scalable Hierarchical Algorithms for eXtreme Computing (SHAXC) workshops,
- 1 HPC conference (www.hpcsaudi.org, 2017).


My previous work

After applying the stochastic Galerkin method, we obtain $Ku = f$, where all ingredients are represented in a tensor format.

Compute $\max\{u\}$, $\mathrm{var}(u)$, level sets of $u$, $\mathrm{sign}(u)$:
[1] Efficient Analysis of High Dimensional Data in Tensor Formats, Espig, Hackbusch, A.L., Matthies and Zander, 2012.

Investigate which ingredients influence the tensor rank of $K$:
[2] Efficient low-rank approximation of the stochastic Galerkin matrix in tensor formats, Wahnert, Espig, Hackbusch, A.L., Matthies, 2013.

Approximate $\kappa(x, \omega)$ and the stochastic Galerkin operator $K$ in the Tensor Train (TT) format, solve for $u$, postprocess:
[3] Polynomial Chaos Expansion of random coefficients and the solution of stochastic partial differential equations in the Tensor Train format, Dolgov, Litvinenko, Khoromskij, Matthies, 2016.


Typical quantities of interest

Keeping all input and intermediate data in a tensor representation, one wants to perform different tasks:

- evaluation for specific parameters $(\omega_1, \ldots, \omega_M)$,
- finding maxima and minima,
- finding ‘level sets’ (needed for histograms and probability densities).

Example of a level set: all elements of a high-dimensional tensor from the interval $[0.7, 0.8]$.


Canonical and Tucker tensor formats

Definition and Examples of tensors


Canonical and Tucker tensor formats

[Pictures are taken from the lecture course of B. Khoromskij and A. Auer.]

Storage: $O(n^d) \to O(dRn)$ (canonical) and $O(R^d + dRn)$ (Tucker).
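To make these counts concrete, here is a minimal numpy sketch (illustrative; the sizes $d$, $n$, $R$ and the variable names are ours) that stores a tensor in canonical format and compares against full storage:

```python
import numpy as np

d, n, R = 4, 20, 3  # order, mode size, canonical rank

# Canonical (CP) format: d factor matrices of size n x R -> O(d*R*n) storage.
rng = np.random.default_rng(0)
factors = [rng.standard_normal((n, R)) for _ in range(d)]

# Materialize the full tensor sum_r v_{1r} x ... x v_{dr} -> O(n^d) storage.
full = np.zeros((n,) * d)
for r in range(R):
    rank1 = factors[0][:, r]
    for mu in range(1, d):
        rank1 = np.multiply.outer(rank1, factors[mu][:, r])
    full += rank1

print("full storage:", full.size)                    # n**d = 160000
print("CP storage:  ", sum(f.size for f in factors)) # d*n*R = 240
```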


Definition of tensor of order d

A tensor of order $d$ is a multidimensional array over a $d$-tuple index set $\mathcal{I} = I_1 \times \cdots \times I_d$,

$$\mathbf{A} = [a_{i_1 \ldots i_d} : i_\ell \in I_\ell] \in \mathbb{R}^{\mathcal{I}}, \qquad I_\ell = \{1, \ldots, n_\ell\}, \; \ell = 1, \ldots, d.$$

$\mathbf{A}$ is an element of the linear space

$$\mathcal{V}_n = \bigotimes_{\ell=1}^{d} V_\ell, \qquad V_\ell = \mathbb{R}^{I_\ell},$$

equipped with the Euclidean scalar product $\langle \cdot, \cdot \rangle : \mathcal{V}_n \times \mathcal{V}_n \to \mathbb{R}$, defined as

$$\langle \mathbf{A}, \mathbf{B} \rangle := \sum_{(i_1 \ldots i_d) \in \mathcal{I}} a_{i_1 \ldots i_d} b_{i_1 \ldots i_d} \qquad \text{for } \mathbf{A}, \mathbf{B} \in \mathcal{V}_n.$$

Let $\mathcal{T} := \bigotimes_{\mu=1}^{d} \mathbb{R}^{n_\mu}$. The set of tensors of canonical rank at most $R$ is

$$\mathcal{R}_R(\mathcal{T}) := \left\{ \sum_{i=1}^{R} \bigotimes_{\mu=1}^{d} v_{i\mu} \in \mathcal{T} : v_{i\mu} \in \mathbb{R}^{n_\mu} \right\}.$$
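The scalar product above is cheap in canonical format, since $\langle \bigotimes_\mu a_\mu, \bigotimes_\mu b_\mu \rangle = \prod_\mu \langle a_\mu, b_\mu \rangle$; the $n^d$ entries are never formed. A small numpy sketch (the helpers `cp_inner` and `cp_dense` are ours, for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, R = 3, 10, 2

# Canonical format: a list of R summands, each a list of d vectors.
A = [[rng.standard_normal(n) for _ in range(d)] for _ in range(R)]
B = [[rng.standard_normal(n) for _ in range(d)] for _ in range(R)]

def cp_inner(A, B):
    # <A,B> = sum_i sum_j prod_mu <a_{i,mu}, b_{j,mu}>.
    return sum(np.prod([ai @ bj for ai, bj in zip(Ai, Bj)])
               for Ai in A for Bj in B)

def cp_dense(T):
    # Materialize a canonical tensor (for verification only).
    out = np.zeros((n,) * d)
    for term in T:
        r1 = term[0]
        for v in term[1:]:
            r1 = np.multiply.outer(r1, v)
        out += r1
    return out

assert np.isclose(cp_inner(A, B), np.sum(cp_dense(A) * cp_dense(B)))
```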


Examples of rank-1 and rank-2 tensors

Rank-1: $f(x_1, \ldots, x_d) = \exp(f_1(x_1) + \cdots + f_d(x_d)) = \prod_{j=1}^{d} \exp(f_j(x_j))$.

Rank-2: $f(x_1, \ldots, x_d) = \sin\bigl(\sum_{j=1}^{d} x_j\bigr)$, since

$$2i \cdot \sin\Bigl(\sum_{j=1}^{d} x_j\Bigr) = e^{i \sum_{j=1}^{d} x_j} - e^{-i \sum_{j=1}^{d} x_j}.$$

The rank-$d$ function $f(x_1, \ldots, x_d) = x_1 + x_2 + \cdots + x_d$ can be approximated by a rank-2 tensor with any prescribed accuracy:

$$f \approx \frac{\prod_{j=1}^{d} (1 + \varepsilon x_j)}{\varepsilon} - \frac{\prod_{j=1}^{d} 1}{\varepsilon} + O(\varepsilon), \qquad \text{as } \varepsilon \to 0.$$
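The last approximation is easy to check numerically; a small sketch (the point $x$ and the values of $\varepsilon$ are arbitrary):

```python
import numpy as np

# Compare f(x) = x1+...+xd against the rank-2 surrogate
# (prod_j (1 + eps*x_j) - 1)/eps for shrinking eps.
rng = np.random.default_rng(2)
d = 8
x = rng.uniform(-1, 1, d)

exact = x.sum()
for eps in [1e-1, 1e-2, 1e-3, 1e-4]:
    approx = (np.prod(1 + eps * x) - 1) / eps
    print(f"eps={eps:.0e}  error={abs(approx - exact):.2e}")  # shrinks like O(eps)
```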


Tensors and Matrices

A rank-1 tensor:

$$\mathbf{A} = u_1 \otimes u_2 \otimes \cdots \otimes u_d =: \bigotimes_{\mu=1}^{d} u_\mu, \qquad \mathbf{A}_{i_1,\ldots,i_d} = (u_1)_{i_1} \cdots (u_d)_{i_d}.$$

For $d = 2$, a rank-1 tensor $\mathbf{A} = u \otimes v$ corresponds to the rank-1 matrix $A = u v^T$ ($A^T = v u^T$), $u \in \mathbb{R}^n$, $v \in \mathbb{R}^m$; a rank-$k$ tensor $\mathbf{A} = \sum_{i=1}^{k} u_i \otimes v_i$ corresponds to the matrix $A = \sum_{i=1}^{k} u_i v_i^T$.

The Kronecker product of $n \times n$ and $m \times m$ matrices is the block matrix $A \otimes B \in \mathbb{R}^{nm \times nm}$, whose $ij$-th block is $[A_{ij} B]$.


Computing QoI in low-rank tensor format

Now we consider how to find maxima in a high-dimensional tensor.


Maximum norm and corresponding index

Let $u = \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} u_{j\mu} \in \mathcal{R}_r$; compute

$$\|u\|_\infty := \max_{i:=(i_1,\ldots,i_d) \in \mathcal{I}} |u_i| = \max_{i:=(i_1,\ldots,i_d) \in \mathcal{I}} \left| \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i_\mu} \right|.$$

Computing $\|u\|_\infty$ is equivalent to the following eigenvalue problem. Let $i^* := (i_1^*, \ldots, i_d^*) \in \mathcal{I}$ and $\#\mathcal{I} = \prod_{\mu=1}^{d} n_\mu$. Then

$$\|u\|_\infty = |u_{i^*}| = \left| \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i_\mu^*} \right| \qquad \text{and} \qquad e^{(i^*)} := \bigotimes_{\mu=1}^{d} e_{i_\mu^*},$$

where $e_{i_\mu^*} \in \mathbb{R}^{n_\mu}$ is the $i_\mu^*$-th canonical unit vector in $\mathbb{R}^{n_\mu}$ ($\mu \in \mathbb{N}_{\le d}$).


Then

$$u \odot e^{(i^*)} = \left( \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} u_{j\mu} \right) \odot \left( \bigotimes_{\mu=1}^{d} e_{i_\mu^*} \right) = \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} \bigl( u_{j\mu} \odot e_{i_\mu^*} \bigr) = \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} \bigl[ (u_{j\mu})_{i_\mu^*} \, e_{i_\mu^*} \bigr] = \underbrace{\left( \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i_\mu^*} \right)}_{u_{i^*}} \bigotimes_{\mu=1}^{d} e_{i_\mu^*} = u_{i^*} \, e^{(i^*)},$$

where $\odot$ denotes the entrywise (Hadamard) product. Thus we obtain an "eigenvalue problem":

$$u \odot e^{(i^*)} = u_{i^*} \, e^{(i^*)}.$$


Computing $\|u\|_\infty$, $u \in \mathcal{R}_r$, by vector iteration

Defining the diagonal matrix

$$D(u) := \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} \mathrm{diag}\bigl( (u_{j\mu})_{\ell_\mu} \bigr)_{\ell_\mu \in \mathbb{N}_{\le n_\mu}} \tag{1}$$

with representation rank $r$, we obtain $D(u)\, v = u \odot v$. Now apply the well-known vector iteration method (with rank truncation) to

$$D(u)\, e^{(i^*)} = u_{i^*}\, e^{(i^*)}$$

to obtain $\|u\|_\infty$.

[Approximate iteration: Khoromskij, Hackbusch, Tyrtyshnikov, 2005; Espig, Hackbusch, 2010]
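A dense toy sketch of this vector iteration (illustrative only: here $u$ is materialized and no rank truncation is performed, whereas the actual method keeps all iterates in CP format):

```python
import numpy as np

rng = np.random.default_rng(4)
u = rng.standard_normal((5, 5, 5))   # dense stand-in for a CP-format tensor

# Vector iteration for D(u) e = u_{i*} e without forming D(u):
# multiplying by the diagonal matrix D(u) is the Hadamard product with u.
v = np.ones_like(u) / np.sqrt(u.size)
for _ in range(1000):
    w = u * v                        # D(u) v = u (Hadamard) v
    v = w / np.linalg.norm(w)

lam = np.sum(v * (u * v))            # Rayleigh quotient -> entry u_{i*}
print(abs(lam), np.abs(u).max())     # the two values agree
```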


How to compute the mean value in CP format

Let $u = \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} u_{j\mu} \in \mathcal{R}_r$. Then the mean value $\overline{u}$ can be computed as a scalar product:

$$\overline{u} = \left\langle \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} u_{j\mu},\; \bigotimes_{\mu=1}^{d} \frac{\mathbf{1}_\mu}{n_\mu} \right\rangle = \sum_{j=1}^{r} \prod_{\mu=1}^{d} \frac{\langle u_{j\mu}, \mathbf{1}_\mu \rangle}{n_\mu} \tag{2}$$

$$= \sum_{j=1}^{r} \prod_{\mu=1}^{d} \frac{1}{n_\mu} \left( \sum_{k=1}^{n_\mu} (u_{j\mu})_k \right), \tag{3}$$

where $\mathbf{1}_\mu := (1, \ldots, 1)^T \in \mathbb{R}^{n_\mu}$.

The numerical cost is $O\bigl( r \cdot \sum_{\mu=1}^{d} n_\mu \bigr)$.
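This formula translates directly into code; a numpy sketch (the list-of-lists CP representation and the names are ours):

```python
import numpy as np

rng = np.random.default_rng(5)
R = 3
ns = [6, 7, 8, 9]                                   # mode sizes n_mu
u = [[rng.standard_normal(n) for n in ns] for _ in range(R)]  # R CP summands

# Mean over all n_1*...*n_d entries, cost O(r * sum(n_mu)):
# f.mean() is (1/n_mu) * sum_k (u_{j,mu})_k, as in (3).
mean_cp = sum(np.prod([f.mean() for f in term]) for term in u)

# Dense check (for verification only).
dense = np.zeros(ns)
for term in u:
    r1 = term[0]
    for f in term[1:]:
        r1 = np.multiply.outer(r1, f)
    dense += r1
assert np.isclose(mean_cp, dense.mean())
```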


How to compute the variance in CP format

Let $u \in \mathcal{R}_r$ and

$$\tilde{u} := u - \overline{u}\, \mathbf{1} = \sum_{j=1}^{r+1} \bigotimes_{\mu=1}^{d} \tilde{u}_{j\mu} \in \mathcal{R}_{r+1}, \tag{4}$$

where $\mathbf{1} = \bigotimes_{\mu=1}^{d} \mathbf{1}_\mu$. Then the variance $\mathrm{var}(u)$ of $u$ can be computed as follows:

$$\mathrm{var}(u) = \frac{\langle \tilde{u}, \tilde{u} \rangle}{\prod_{\mu=1}^{d} n_\mu} = \frac{1}{\prod_{\mu=1}^{d} n_\mu} \left\langle \sum_{i=1}^{r+1} \bigotimes_{\mu=1}^{d} \tilde{u}_{i\mu},\; \sum_{j=1}^{r+1} \bigotimes_{\nu=1}^{d} \tilde{u}_{j\nu} \right\rangle = \sum_{i=1}^{r+1} \sum_{j=1}^{r+1} \prod_{\mu=1}^{d} \frac{\bigl\langle \tilde{u}_{i\mu}, \tilde{u}_{j\mu} \bigr\rangle}{n_\mu}.$$

The numerical cost is $O\bigl( (r+1)^2 \cdot \sum_{\mu=1}^{d} n_\mu \bigr)$.
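A numpy sketch of (4) and the variance formula, reusing the CP representation from the mean example (illustrative; the names are ours):

```python
import numpy as np

rng = np.random.default_rng(6)
R = 2
ns = [5, 6, 7]
u = [[rng.standard_normal(n) for n in ns] for _ in range(R)]

mean_cp = sum(np.prod([f.mean() for f in term]) for term in u)

# Centered tensor u~ = u - mean * (1 x ... x 1): one extra CP summand.
u_tilde = u + [[-mean_cp * np.ones(ns[0])] + [np.ones(n) for n in ns[1:]]]

# var(u) = <u~, u~> / prod(n_mu), cost O((r+1)^2 * sum(n_mu)).
N = np.prod(ns)
var_cp = sum(np.prod([ai @ bj for ai, bj in zip(Ai, Bj)])
             for Ai in u_tilde for Bj in u_tilde) / N

# Dense check (for verification only).
dense = np.zeros(ns)
for term in u:
    r1 = term[0]
    for f in term[1:]:
        r1 = np.multiply.outer(r1, f)
    dense += r1
assert np.isclose(var_cp, dense.var())
```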


Computing QoI in low-rank tensor format

Now we consider how to find ‘level sets’, for instance, all entries of the tensor $u$ from an interval $[a, b]$.


Definitions of characteristic and sign functions

1. To compute level sets and frequencies we need the characteristic function.
2. To compute the characteristic function we need the sign function.

The characteristic $\chi_I(u) \in \mathcal{T}$ of $u \in \mathcal{T}$ in $I \subset \mathbb{R}$ is defined pointwise, for every multi-index $i \in \mathcal{I}$, as

$$(\chi_I(u))_i := \begin{cases} 1, & u_i \in I, \\ 0, & u_i \notin I. \end{cases}$$

Furthermore, $\mathrm{sign}(u) \in \mathcal{T}$ is defined pointwise for all $i \in \mathcal{I}$ by

$$(\mathrm{sign}(u))_i := \begin{cases} 1, & u_i > 0, \\ -1, & u_i < 0, \\ 0, & u_i = 0. \end{cases}$$


$\mathrm{sign}(u)$ is needed for computing $\chi_I(u)$

Lemma. Let $u \in \mathcal{T}$, $a, b \in \mathbb{R}$, and $\mathbf{1} = \bigotimes_{\mu=1}^{d} \mathbf{1}_\mu$, where $\mathbf{1}_\mu := (1, \ldots, 1)^T \in \mathbb{R}^{n_\mu}$.

(i) If $I = \mathbb{R}_{<b}$, then $\chi_I(u) = \frac{1}{2}\bigl(\mathbf{1} + \mathrm{sign}(b\mathbf{1} - u)\bigr)$.
(ii) If $I = \mathbb{R}_{>a}$, then $\chi_I(u) = \frac{1}{2}\bigl(\mathbf{1} - \mathrm{sign}(a\mathbf{1} - u)\bigr)$.
(iii) If $I = (a, b)$, then $\chi_I(u) = \frac{1}{2}\bigl(\mathrm{sign}(b\mathbf{1} - u) - \mathrm{sign}(a\mathbf{1} - u)\bigr)$.

We compute $\mathrm{sign}(u)$, $u \in \mathcal{R}_r$, via a hybrid Newton-Schulz iteration with rank truncation after each iteration.
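A dense entrywise sketch of a Newton-Schulz-type sign iteration (illustrative; the scaling choice is ours, and the rank-truncation step of the actual hybrid method is omitted):

```python
import numpy as np

def sign_newton_schulz(u, iters=100):
    """Entrywise sign via the Newton-Schulz iteration x <- x*(3 - x**2)/2.

    Converges to sign(u) entrywise once the input is scaled to |x| <= 1.
    In the low-rank method each iterate would additionally be truncated
    back to a CP tensor of moderate rank (omitted in this dense sketch).
    """
    x = u / np.abs(u).max()      # scale into the convergence region
    for _ in range(iters):
        x = 0.5 * x * (3.0 - x * x)
    return x

rng = np.random.default_rng(7)
u = rng.standard_normal((4, 4, 4))
assert np.allclose(sign_newton_schulz(u), np.sign(u))
```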


Level Set, Frequency

Definition (Level set, frequency). Let $I \subset \mathbb{R}$ and $u \in \mathcal{T}$. The level set $L_I(u) \in \mathcal{T}$ of $u$ with respect to $I$ is defined pointwise by

$$(L_I(u))_i := \begin{cases} u_i, & u_i \in I, \\ 0, & u_i \notin I, \end{cases} \qquad \text{for all } i \in \mathcal{I}.$$

The frequency $F_I(u) \in \mathbb{N}$ of $u$ with respect to $I$ is defined as

$$F_I(u) := \#\operatorname{supp} \chi_I(u).$$


Computation of level sets and frequency

Proposition. Let $I \subset \mathbb{R}$, $u \in \mathcal{T}$, and let $\chi_I(u)$ be its characteristic. We have

$$L_I(u) = \chi_I(u) \odot u,$$

and $\operatorname{rank}(L_I(u)) \le \operatorname{rank}(\chi_I(u)) \cdot \operatorname{rank}(u)$. The frequency $F_I(u) \in \mathbb{N}$ of $u$ with respect to $I$ is

$$F_I(u) = \langle \chi_I(u), \mathbf{1} \rangle,$$

where $\mathbf{1} = \bigotimes_{\mu=1}^{d} \mathbf{1}_\mu$, $\mathbf{1}_\mu := (1, \ldots, 1)^T \in \mathbb{R}^{n_\mu}$.
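A dense numpy sketch combining the Lemma and the Proposition for $I = (a, b)$ (illustrative only; in the low-rank setting $\chi_I(u)$ stays in tensor format):

```python
import numpy as np

rng = np.random.default_rng(8)
u = rng.uniform(0, 1, (6, 6, 6))    # dense stand-in for a low-rank tensor
a, b = 0.7, 0.8

# chi_I(u) via Lemma (iii): 0.5*(sign(b*1 - u) - sign(a*1 - u)).
chi = 0.5 * (np.sign(b - u) - np.sign(a - u))

level_set = chi * u                  # L_I(u) = chi_I(u) (Hadamard) u
frequency = int(chi.sum())           # F_I(u) = <chi_I(u), 1>

assert frequency == np.count_nonzero((u > a) & (u < b))
assert np.allclose(level_set, np.where((u > a) & (u < b), u, 0.0))
```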


Numerical Experiments

2D L-shaped domain, $N = 557$ dofs. The total stochastic dimension is $M_u = M_k + M_f = 20$, and there are $|\mathcal{J}| = 231$ PCE coefficients:

$$u = \sum_{j=1}^{231} u_{j,0} \otimes \bigotimes_{\mu=1}^{20} u_{j\mu} \in \mathbb{R}^{557} \otimes \bigotimes_{\mu=1}^{20} \mathbb{R}^3.$$


Level sets

Now we compute the level sets

$$\mathrm{sign}(b\, \|u\|_\infty \mathbf{1} - u) \qquad \text{for } b \in \{0.2,\, 0.4,\, 0.6,\, 0.8\}.$$

- The tensor $u$ has $3^{20} \cdot 557 \approx 2 \cdot 10^{12}$ entries, i.e., about 16 TB of memory.
- The computing time for one level set was 10 minutes.
- The intermediate ranks of $\mathrm{sign}(b\|u\|_\infty \mathbf{1} - u)$ and $\operatorname{rank}(u_k)$ were less than 24.


Example: canonical rank $d$, whereas TT rank is 2

Consider the $d$-Laplacian over a uniform tensor grid. It is known to have the Kronecker rank-$d$ representation

$$\Delta_d = A \otimes I_N \otimes \cdots \otimes I_N + I_N \otimes A \otimes \cdots \otimes I_N + \cdots + I_N \otimes I_N \otimes \cdots \otimes A \in \mathbb{R}^{I^{\otimes d} \otimes I^{\otimes d}}, \tag{5}$$

with $A = \Delta_1 = \operatorname{tridiag}\{-1, 2, -1\} \in \mathbb{R}^{N \times N}$ and $I_N$ the $N \times N$ identity matrix. Notice that the canonical rank is $\operatorname{rank}_C(\Delta_d) = d$, while the TT-rank of $\Delta_d$ equals 2 for any dimension, due to the explicit representation

$$\Delta_d = \begin{pmatrix} \Delta_1 & I \end{pmatrix} \times \begin{pmatrix} I & 0 \\ \Delta_1 & I \end{pmatrix} \times \cdots \times \begin{pmatrix} I & 0 \\ \Delta_1 & I \end{pmatrix} \times \begin{pmatrix} I \\ \Delta_1 \end{pmatrix}, \tag{6}$$

where the rank-product operation "$\times$" is defined as a regular matrix product of the two corresponding core matrices, their blocks being multiplied by means of the tensor product. A similar bound holds for the Tucker rank: $\operatorname{rank}_{Tuck}(\Delta_d) = 2$.


Advantages and disadvantages

Denote by $k$ the rank, $d$ the dimension, and $n$ the number of dofs in 1D:

1. CP: ill-posed approximation algorithm, storage $O(dnk)$, approximations hard to compute.
2. Tucker: reliable arithmetic based on SVD, $O(dnk + k^d)$.
3. Hierarchical Tucker: based on SVD, storage $O(dnk + dk^3)$, truncation $O(dnk^2 + dk^4)$.
4. TT: based on SVD, $O(dnk^2)$ or $O(dnk^3)$, stable.
5. Quantics-TT: $O(n^d) \to O(d \log_q n)$.
