Data sparse approximation of the Karhunen-Loève expansion
Alexander Litvinenko, joint with B. Khoromskij (Leipzig) and H. Matthies (Braunschweig)
Institut für Wissenschaftliches Rechnen, Technische Universität Braunschweig, 0531-391-3008, [email protected]
March 5, 2008
Outline
Introduction
KLE
Numerical techniques: FFT, Hierarchical Matrices, Sparse tensor approximation
Application
Conclusion
Stochastic PDE
We consider
−div(κ(x, ω)∇u) = f(x, ω) in D,   u = 0 on ∂D,
with stochastic coefficient κ(x, ω), x ∈ D ⊆ R^d, where ω belongs to the space of random events Ω.
[Babuska, Ghanem, Matthies, Schwab, Vandewalle, ...].
Methods and techniques:
1. Response surface
2. Monte-Carlo
3. Perturbation
4. Stochastic Galerkin
Examples of covariance functions [Novak, (IWS), 04]
The random field requires specification of its spatial correlation structure:
cov_f(x, y) = E[(f(x, ·) − µ_f(x))(f(y, ·) − µ_f(y))],
where E is the expectation and µ_f(x) := E[f(x, ·)].
Let h = √( Σ_{i=1}^3 ( √(h_i²/ℓ_i² + d²) − d )² ), where h_i := x_i − y_i, i = 1, 2, 3, the ℓ_i are covariance lengths and d is a parameter.

Gaussian:     cov(h) = σ² · exp(−h²),
exponential:  cov(h) = σ² · exp(−h),
spherical:    cov(h) = σ² · (1 − (3/2)(h/h_r) + (1/2)(h³/h_r³)) for 0 ≤ h ≤ h_r,  and cov(h) = 0 for h > h_r.
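Below is a minimal Python sketch (not from the slides) that evaluates the three covariance models with the anisotropic distance h; the covariance lengths ℓ_i, the parameter d, the range h_r, and σ² are placeholder values.

```python
import numpy as np

# Minimal sketch: evaluate the three covariance models with the anisotropic
# distance h; lengths l, parameter d, range hr and sigma^2 are placeholders.
def dist(x, y, l=(0.5, 0.5, 0.5), d=1.0):
    h = np.asarray(x) - np.asarray(y)
    return np.sqrt(sum((np.sqrt(h[i]**2 / l[i]**2 + d**2) - d)**2 for i in range(3)))

def cov_gaussian(h, sigma2=1.0):
    return sigma2 * np.exp(-h**2)

def cov_exponential(h, sigma2=1.0):
    return sigma2 * np.exp(-h)

def cov_spherical(h, hr=1.0, sigma2=1.0):
    # standard spherical model, zero beyond the range hr
    return np.where(h <= hr, sigma2 * (1 - 1.5 * h / hr + 0.5 * (h / hr)**3), 0.0)

h = dist((0.1, 0.2, 0.3), (0.4, 0.2, 0.1))
print(cov_gaussian(h), cov_exponential(h), cov_spherical(h))
```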
Outline
Introduction
KLE
Numerical techniques: FFT, Hierarchical Matrices, Sparse tensor approximation
Application
Conclusion
KLE
The spectral representation of the covariance function is
C_κ(x, y) = Σ_{i=0}^∞ λ_i k_i(x) k_i(y),
where λ_i and k_i(x) are the eigenvalues and eigenfunctions.
The Karhunen-Loève expansion [Loève, 1977] is the series
κ(x, ω) = µ_κ(x) + Σ_{i=1}^∞ √λ_i k_i(x) ξ_i(ω),
where the ξ_i(ω) are uncorrelated random variables and the k_i are basis functions in L²(D).
The eigenpairs λ_i, k_i are the solution of
T k_i = λ_i k_i,   k_i ∈ L²(D),   i ∈ N,   where
T : L²(D) → L²(D),   (Tu)(x) := ∫_D cov_κ(x, y) u(y) dy.
Outline
Introduction
KLE
Numerical techniques: FFT, Hierarchical Matrices, Sparse tensor approximation
Application
Conclusion
Computation of eigenpairs by FFT
If the covariance function depends on (x − y), then on a uniform tensor grid the covariance matrix C is (block) Toeplitz. C can then be extended to a circulant matrix, and the decomposition
C = (1/n) F^H Λ F    (1)
may be computed as follows. Multiplying (1) by F gives
F C = Λ F,   F C_1 = Λ F_1,
where C_1 and F_1 denote the first columns of C and F. Since all entries of F_1 are unity, we obtain
λ = F C_1.
F C_1 may be computed very efficiently by the FFT [Cooley, 1965] in O(n log n) FLOPS. C_1 may be represented in a matrix or in a tensor format.
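A minimal sketch of this FFT-based eigenvalue computation, assuming a 1D periodic grid and an exponential covariance (both placeholders):

```python
import numpy as np

# Sketch: eigenvalues of a circulant covariance matrix from its first column C_1,
# lambda = F C_1, computed with one FFT in O(n log n).
n = 1024
x = np.arange(n) / n
c1 = np.exp(-np.minimum(x, 1.0 - x))        # first column of a periodic exponential covariance
lam = np.fft.fft(c1).real                   # eigenvalues of the circulant matrix

# verification against a dense eigendecomposition (only feasible for small n)
C = np.array([[c1[(i - j) % n] for j in range(n)] for i in range(n)])
print(np.allclose(np.sort(lam), np.sort(np.linalg.eigvalsh(C))))
```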
Multidimensional FFT
Lemma: The d-dimensional Fourier transform F^(d) can be represented as
F^(d) = (F_1^(1) ⊗ I ⊗ ... ⊗ I)(I ⊗ F_2^(1) ⊗ I ⊗ ... ⊗ I) ... (I ⊗ ... ⊗ I ⊗ F_d^(1)),    (2)
and the complexity of F^(d) is O(n^d log n), where n is the number of dofs in one direction.
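The factorization (2) simply says that the d-dimensional FFT can be applied one direction at a time; a small NumPy check (sizes are arbitrary):

```python
import numpy as np

# Sketch: the d-dimensional FFT as d one-dimensional FFTs, one per direction,
# mirroring the Kronecker factorization (2); total cost O(n^d log n).
n, d = 16, 3
A = np.random.rand(*([n] * d))
B = A.copy()
for axis in range(d):                 # apply F^(1) along direction 'axis'
    B = np.fft.fft(B, axis=axis)
print(np.allclose(B, np.fft.fftn(A)))
```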
Discrete eigenvalue problem
Let
W_ij := Σ_{k,m} ∫_D b_i(x) b_k(x) dx  C_km  ∫_D b_j(y) b_m(y) dy,
M_ij := ∫_D b_i(x) b_j(x) dx.
Then we solve
W f_ℓ^h = λ_ℓ M f_ℓ^h,   where W := M C M.
Approximate C in the low-rank format, the H-matrix format or the sparse tensor format,
and use the Lanczos method to compute the m largest eigenvalues.
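A minimal sketch of this eigenvalue computation with a Lanczos-type solver (SciPy's eigsh); the 1D grid, the dense C, and the lumped mass matrix are assumptions for illustration, not the H-matrix or tensor formats used in practice:

```python
import numpy as np
from scipy.sparse import identity
from scipy.sparse.linalg import LinearOperator, eigsh

# Sketch: solve W f = lambda M f with W := M C M via a Lanczos-type method,
# using only matrix-vector products with C.
n = 500
x = np.linspace(0.0, 1.0, n)
C = np.exp(-np.abs(x[:, None] - x[None, :]))       # exponential covariance on a 1D grid
M = identity(n, format="csc") / n                  # lumped mass matrix (assumption)

W = LinearOperator((n, n), matvec=lambda v: M @ (C @ (M @ v)))
lam, vecs = eigsh(W, k=10, M=M, which="LM")        # 10 largest eigenpairs
print(np.sort(lam)[::-1])
```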
Examples of H-matrix approximations of cov(x, y) = e^{−2|x−y|}
[Hackbusch et al. 99]
Figure: H-matrix approximations ∈ R^{n×n}, n = 32², with standard (left) and weak (right) admissibility block partitionings. The biggest blocks are dense (dark); max. rank k = 4 (left) and k = 13 (right).
H-Matrices
Computational complexity is O(kn log n) and storage is O(kn log n).
To assemble the low-rank blocks, use ACA [Bebendorf, Tyrtyshnikov]; a sketch follows.
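A minimal sketch of ACA with partial pivoting for one admissible (low-rank) block of cov(x, y) = e^{−2|x−y|}; the cluster geometry, tolerance, and helper names are assumptions, not the code behind the tables below:

```python
import numpy as np

# Sketch: ACA with partial pivoting builds A ~ U V from individual rows/columns only.
def aca(get_row, get_col, n_rows, tol=1e-8, max_rank=30):
    U, V = [], []
    i = 0
    for _ in range(max_rank):
        row = get_row(i) - sum(u[i] * v for u, v in zip(U, V))    # residual row
        j = int(np.argmax(np.abs(row)))
        if abs(row[j]) < 1e-14:
            break
        col = (get_col(j) - sum(v[j] * u for u, v in zip(U, V))) / row[j]
        U.append(col)
        V.append(row)
        if np.linalg.norm(col) * np.linalg.norm(row) < tol:
            break
        col_masked = np.abs(col).copy()
        col_masked[i] = 0.0
        i = int(np.argmax(col_masked))                            # next pivot row
    return np.array(U).T, np.array(V)

# admissible block between two well-separated clusters of points
xs, ys = np.linspace(0.0, 1.0, 200), np.linspace(3.0, 4.0, 200)
entry = lambda i, j: np.exp(-2.0 * np.abs(xs[i] - ys[j]))
U, V = aca(lambda i: entry(i, np.arange(200)), lambda j: entry(np.arange(200), j), 200)
A = entry(np.arange(200)[:, None], np.arange(200)[None, :])
print(U.shape[1], np.linalg.norm(A - U @ V) / np.linalg.norm(A))
```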
Dependence of the computational time and storage requirements of C_H on the rank k, n = 32²:

k     time (sec.)   memory (MB)   ‖C − C_H‖₂ / ‖C‖₂
2     0.04          2e+6          3.5e−5
6     0.1           4e+6          1.4e−5
9     0.14          5.4e+6        1.4e−5
12    0.17          6.8e+6        3.1e−7
17    0.23          9.3e+6        6.3e−8

The time for the dense matrix C is 3.3 sec. and the storage 1.4e+8 MB.
H-Matrices
Let h = √( Σ_{i=1}^2 ( √(h_i²/ℓ_i² + d²) − d )² ), where h_i := x_i − y_i, i = 1, 2, the ℓ_i are covariance lengths and d = 1.
Exponential covariance cov(h) = σ² · exp(−h); the covariance matrix C ∈ R^{n×n}, n = 65².

ℓ₁      ℓ₂      ‖C − C_H‖₂ / ‖C‖₂
0.01    0.02    3e−2
0.1     0.2     8e−3
1       2       2.8e−6
10      20      3.7e−9
Exponential singular value decay [see also Schwab et al.]
Figure: grid 48 × 64 × 40, (left) ℓ = 1, ℓ = 2, ℓ = 1 and (right) ℓ = 5,
Sparse tensor decompositions of kernels cov(x, y) = cov(x − y)
We want to approximate C ∈ R^{N×N}, N = n^d, by
C_r = Σ_{k=1}^r V_k^1 ⊗ ... ⊗ V_k^d   such that ‖C − C_r‖ ≤ ε.
The storage of C is O(N²) = O(n^{2d}), whereas the storage of C_r is O(r d n²).
To define the V_k^i one can use, e.g., the SVD.
Approximating all the V_k^i in the H-matrix format yields the HKT format.
See the basic arithmetic in [Hackbusch, Khoromskij, Tyrtyshnikov].
Assume f(x, y) with x = (x1, x2), y = (y1, y2); then the equivalent approximation problem is
f(x1, x2; y1, y2) ≈ Σ_{k=1}^r Φ_k(x1, y1) Ψ_k(x2, y2).
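A minimal sketch for d = 2 (grid size, kernel, and helper name are assumptions): the Kronecker-rank-r approximation C_r = Σ_k V_k^1 ⊗ V_k^2 obtained from the SVD of the rearranged matrix, as described above:

```python
import numpy as np

# Sketch: Kronecker-rank-r approximation C ~ sum_k V1_k (x) V2_k via SVD of the
# rearranged matrix (rows indexed by (i1, j1), columns by (i2, j2)).
def kron_rank_approx(C, n1, n2, r):
    R = C.reshape(n1, n2, n1, n2).transpose(0, 2, 1, 3).reshape(n1 * n1, n2 * n2)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    V1 = [np.sqrt(s[k]) * U[:, k].reshape(n1, n1) for k in range(r)]
    V2 = [np.sqrt(s[k]) * Vt[k, :].reshape(n2, n2) for k in range(r)]
    return V1, V2

# exponential kernel cov(x, y) = exp(-|x - y|) on a 2D tensor grid
n1 = n2 = 16
g = np.linspace(0.0, 1.0, n1)
pts = np.stack(np.meshgrid(g, g, indexing="ij"), axis=-1).reshape(-1, 2)
C = np.exp(-np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1))

V1, V2 = kron_rank_approx(C, n1, n2, r=5)
Cr = sum(np.kron(a, b) for a, b in zip(V1, V2))
print(np.linalg.norm(C - Cr) / np.linalg.norm(C))
```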
Numerical examples of tensor approximations
The Gaussian kernel exp(−|x − y|²) has Kronecker rank 1.
The exponential kernel exp(−|x − y|) can be approximated by a tensor with low Kronecker rank:

r                     1      2      3     4      5       6       10
‖C − C_r‖∞ / ‖C‖∞     11.5   1.7    0.4   0.14   0.035   0.007   2.8e−8
‖C − C_r‖₂ / ‖C‖₂     6.7    0.52   0.1   0.03   0.008   0.001   5.3e−9
Outline
Introduction
KLE
Numerical techniques: FFT, Hierarchical Matrices, Sparse tensor approximation
Application
Conclusion
Application: covariance of the solution
For an SPDE with stochastic RHS, the eigenvalue problem and the spectral decomposition read
C_f f_ℓ = λ_ℓ f_ℓ,   C_f = Φ_f Λ_f Φ_f^T.
If we only want the covariance
C_u = (K ⊗ K)^{-1} C_f = (K^{-1} ⊗ K^{-1}) C_f = K^{-1} C_f K^{-T},
one may use the KLE of C_f = Φ_f Λ_f Φ_f^T to reduce this to
C_u = K^{-1} C_f K^{-T} = K^{-1} Φ_f Λ_f Φ_f^T K^{-T}.
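A minimal sketch of this reduction (the stiffness matrix K, the 1D grid, and the number of KLE modes m are placeholders): solve K Y = Φ_f Λ_f^{1/2} once and form C_u = Y Y^T:

```python
import numpy as np

# Sketch: C_u = K^{-1} Phi_f Lambda_f Phi_f^T K^{-T}, using only m KLE modes of C_f.
n, m = 200, 20
x = np.linspace(0.0, 1.0, n)
Cf = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2)          # covariance of the RHS
lam, Phi = np.linalg.eigh(Cf)
lam, Phi = lam[::-1][:m], Phi[:, ::-1][:, :m]                # m largest eigenpairs

# toy deterministic operator K (1D Laplacian stiffness matrix, an assumption)
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
Y = np.linalg.solve(K, Phi * np.sqrt(lam))                   # Y = K^{-1} Phi_f Lambda_f^{1/2}
Cu = Y @ Y.T                                                 # covariance of the solution
print(Cu.shape)
```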
Application: higher order moments
Let the operator K be deterministic and
K u(θ) = Σ_{α∈J} K u^(α) H_α(θ) = f(θ) = Σ_{α∈J} f^(α) H_α(θ),   with u^(α) = [u_1^(α), ..., u_N^(α)]^T.
Projecting onto each H_α, we obtain
K u^(α) = f^(α).
The KLE of f(θ) is
f(θ) = f̄ + Σ_ℓ √λ_ℓ φ_ℓ(θ) f_ℓ = Σ_ℓ Σ_α √λ_ℓ φ_ℓ^(α) H_α(θ) f_ℓ = Σ_α H_α(θ) f^(α),
where f^(α) = Σ_ℓ √λ_ℓ φ_ℓ^(α) f_ℓ.
Application: higher order moments
The 3rd moment of u is
M_u^(3) = E[ Σ_{α,β,γ} u^(α) ⊗ u^(β) ⊗ u^(γ) H_α H_β H_γ ] = Σ_{α,β,γ} u^(α) ⊗ u^(β) ⊗ u^(γ) c_{α,β,γ},
where c_{α,β,γ} := E(H_α(θ) H_β(θ) H_γ(θ)) = c_{α,β} · γ!, and the c_{α,β} are constants from the Hermite algebra.
Using u^(α) = K^{-1} f^(α) = Σ_ℓ √λ_ℓ φ_ℓ^(α) K^{-1} f_ℓ and u_ℓ := K^{-1} f_ℓ, we obtain
M_u^(3) = Σ_{p,q,r} t_{p,q,r} u_p ⊗ u_q ⊗ u_r,   where
t_{p,q,r} := √(λ_p λ_q λ_r) Σ_{α,β,γ} φ_p^(α) φ_q^(β) φ_r^(γ) c_{α,β,γ}.
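A minimal sketch of assembling t_{p,q,r} and the third moment with einsum; all dimensions and input arrays are random placeholders, only the contraction pattern follows the formulas above:

```python
import numpy as np

# Sketch: t_{pqr} = sqrt(lam_p lam_q lam_r) * sum_{a,b,g} phi_p^(a) phi_q^(b) phi_r^(g) c_{abg}
# and M3 = sum_{pqr} t_{pqr} u_p (x) u_q (x) u_r.
m, P, N = 4, 6, 50                            # KLE terms, PCE indices, spatial dofs
lam = np.sort(np.random.rand(m))[::-1]        # KLE eigenvalues (placeholder)
phi = np.random.randn(m, P)                   # phi[l, alpha] (placeholder PCE coefficients)
c = np.random.randn(P, P, P)                  # c[alpha, beta, gamma] = E(H_a H_b H_g) (placeholder)
u = np.random.randn(m, N)                     # u_l = K^{-1} f_l (placeholder)

s = np.sqrt(lam)
t = np.einsum("p,q,r,pa,qb,rg,abg->pqr", s, s, s, phi, phi, phi, c)
M3 = np.einsum("pqr,pi,qj,rk->ijk", t, u, u, u)
print(M3.shape)   # (N, N, N)
```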
Outline
Introduction
KLE
Numerical techniques: FFT, Hierarchical Matrices, Sparse tensor approximation
Application
Conclusion
Conclusion
Covariance matrices allow data-sparse low-rank approximations.
The application of H-matrices extends the class of covariance functions we can work with and allows non-regular discretisations of the covariance function on large spatial grids.
The application of sparse tensor products allows the computation of k-th moments.
Plans for the Future
1. Convergence of the Lanczos method with H-matrices
2. Implement sparse tensor vector product for the Lanczos method
3. HKT idea for d ≥ 3 dimensions
Thank you for your attention!
Questions?