Data sparse approximation of the Karhunen-Loève expansion
Alexander Litvinenko, joint with B. Khoromskij (Leipzig) and H. Matthies (Braunschweig)
Institut für Wissenschaftliches Rechnen, Technische Universität Braunschweig, 0531-391-3008, [email protected]
March 5, 2008
Outline
Introduction
KLE
Numerical techniques: FFT, Hierarchical Matrices, Sparse tensor approximation
Application
Conclusion
Stochastic PDE
We consider
−div(κ(x, ω)∇u) = f(x, ω) in D,   u = 0 on ∂D,
with stochastic coefficient κ(x, ω), x ∈ D ⊆ R^d, where ω belongs to the space of random events Ω.
[Babuska, Ghanem, Matthies, Schwab, Vandewalle, ...].
Methods and techniques:
1. Response surface
2. Monte-Carlo
3. Perturbation
4. Stochastic Galerkin
Examples of covariance functions [Novak, (IWS), 04]
The random field requires specification of its spatial correlation structure:
cov_f(x, y) = E[(f(x, ·) − µ_f(x))(f(y, ·) − µ_f(y))],
where E is the expectation and µ_f(x) := E[f(x, ·)].
Let h = √( Σ_{i=1}^3 ( √(h_i²/ℓ_i² + d²) − d )² ), where h_i := x_i − y_i, i = 1, 2, 3, the ℓ_i are covariance lengths and d is a parameter.

Gaussian:     cov(h) = σ² · exp(−h²),
exponential:  cov(h) = σ² · exp(−h),
spherical:    cov(h) = σ² · (1 − (3/2)(h/h_r) + (1/2)(h³/h_r³)) for 0 ≤ h ≤ h_r,  and cov(h) = 0 for h > h_r.
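Below is a minimal Python sketch (not from the slides) that evaluates the three covariance models with the anisotropic distance h; the covariance lengths ℓ_i, the parameter d, the range h_r, and σ² are placeholder values.

```python
import numpy as np

# Minimal sketch: evaluate the three covariance models with the anisotropic
# distance h; lengths l, parameter d, range hr and sigma^2 are placeholders.
def dist(x, y, l=(0.5, 0.5, 0.5), d=1.0):
    h = np.asarray(x) - np.asarray(y)
    return np.sqrt(sum((np.sqrt(h[i]**2 / l[i]**2 + d**2) - d)**2 for i in range(3)))

def cov_gaussian(h, sigma2=1.0):
    return sigma2 * np.exp(-h**2)

def cov_exponential(h, sigma2=1.0):
    return sigma2 * np.exp(-h)

def cov_spherical(h, hr=1.0, sigma2=1.0):
    # standard spherical model, zero beyond the range hr
    return np.where(h <= hr, sigma2 * (1 - 1.5 * h / hr + 0.5 * (h / hr)**3), 0.0)

h = dist((0.1, 0.2, 0.3), (0.4, 0.2, 0.1))
print(cov_gaussian(h), cov_exponential(h), cov_spherical(h))
```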
Outline
Introduction
KLE
Numerical techniques: FFT, Hierarchical Matrices, Sparse tensor approximation
Application
Conclusion
KLE
The spectral representation of the covariance function is
C_κ(x, y) = Σ_{i=0}^∞ λ_i k_i(x) k_i(y),
where λ_i and k_i(x) are the eigenvalues and eigenfunctions.
The Karhunen-Loève expansion [Loève, 1977] is the series
κ(x, ω) = µ_κ(x) + Σ_{i=1}^∞ √λ_i k_i(x) ξ_i(ω),
where the ξ_i(ω) are uncorrelated random variables and the k_i are basis functions in L²(D).
The eigenpairs λ_i, k_i are the solution of
T k_i = λ_i k_i,   k_i ∈ L²(D),   i ∈ N,   where
T : L²(D) → L²(D),   (Tu)(x) := ∫_D cov_κ(x, y) u(y) dy.
Outline
Introduction
KLE
Numerical techniques: FFT, Hierarchical Matrices, Sparse tensor approximation
Application
Conclusion
Computation of eigenpairs by FFT
If the covariance function depends on (x − y), then on a uniform tensor grid the covariance matrix C is (block) Toeplitz. C can then be extended to a circulant matrix, and the decomposition
C = (1/n) F^H Λ F    (1)
may be computed as follows. Multiplying (1) by F gives
F C = Λ F,   F C_1 = Λ F_1,
where C_1 and F_1 denote the first columns of C and F. Since all entries of F_1 are unity, we obtain
λ = F C_1.
F C_1 may be computed very efficiently by the FFT [Cooley, 1965] in O(n log n) FLOPS. C_1 may be represented in a matrix or in a tensor format.
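A minimal sketch of this FFT-based eigenvalue computation, assuming a 1D periodic grid and an exponential covariance (both placeholders):

```python
import numpy as np

# Sketch: eigenvalues of a circulant covariance matrix from its first column C_1,
# lambda = F C_1, computed with one FFT in O(n log n).
n = 1024
x = np.arange(n) / n
c1 = np.exp(-np.minimum(x, 1.0 - x))        # first column of a periodic exponential covariance
lam = np.fft.fft(c1).real                   # eigenvalues of the circulant matrix

# verification against a dense eigendecomposition (only feasible for small n)
C = np.array([[c1[(i - j) % n] for j in range(n)] for i in range(n)])
print(np.allclose(np.sort(lam), np.sort(np.linalg.eigvalsh(C))))
```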
Multidimensional FFT
Lemma: The d-dimensional Fourier transform F^(d) can be represented as
F^(d) = (F_1^(1) ⊗ I ⊗ ... ⊗ I)(I ⊗ F_2^(1) ⊗ I ⊗ ... ⊗ I) ... (I ⊗ ... ⊗ I ⊗ F_d^(1)),    (2)
and the complexity of F^(d) is O(n^d log n), where n is the number of dofs in one direction.
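The factorization (2) simply says that the d-dimensional FFT can be applied one direction at a time; a small NumPy check (sizes are arbitrary):

```python
import numpy as np

# Sketch: the d-dimensional FFT as d one-dimensional FFTs, one per direction,
# mirroring the Kronecker factorization (2); total cost O(n^d log n).
n, d = 16, 3
A = np.random.rand(*([n] * d))
B = A.copy()
for axis in range(d):                 # apply F^(1) along direction 'axis'
    B = np.fft.fft(B, axis=axis)
print(np.allclose(B, np.fft.fftn(A)))
```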
Discrete eigenvalue problem
Let
W_ij := Σ_{k,m} ∫_D b_i(x) b_k(x) dx  C_km  ∫_D b_j(y) b_m(y) dy,
M_ij := ∫_D b_i(x) b_j(x) dx.
Then we solve
W f_ℓ^h = λ_ℓ M f_ℓ^h,   where W := M C M.
Approximate C in the low-rank format, the H-matrix format or the sparse tensor format,
and use the Lanczos method to compute the m largest eigenvalues.
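A minimal sketch of this eigenvalue computation with a Lanczos-type solver (SciPy's eigsh); the 1D grid, the dense C, and the lumped mass matrix are assumptions for illustration, not the H-matrix or tensor formats used in practice:

```python
import numpy as np
from scipy.sparse import identity
from scipy.sparse.linalg import LinearOperator, eigsh

# Sketch: solve W f = lambda M f with W := M C M via a Lanczos-type method,
# using only matrix-vector products with C.
n = 500
x = np.linspace(0.0, 1.0, n)
C = np.exp(-np.abs(x[:, None] - x[None, :]))       # exponential covariance on a 1D grid
M = identity(n, format="csc") / n                  # lumped mass matrix (assumption)

W = LinearOperator((n, n), matvec=lambda v: M @ (C @ (M @ v)))
lam, vecs = eigsh(W, k=10, M=M, which="LM")        # 10 largest eigenpairs
print(np.sort(lam)[::-1])
```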
Examples of H-matrix approximations of cov(x, y) = e^{−2|x−y|}
[Hackbusch et al. 99]
Figure: H-matrix approximations ∈ R^{n×n}, n = 32², with standard (left) and weak (right) admissibility block partitionings. The biggest blocks are dense (dark); max. rank k = 4 (left) and k = 13 (right).
H-Matrices
Computational complexity is O(kn log n) and storage is O(kn log n).
To assemble the low-rank blocks, use ACA [Bebendorf, Tyrtyshnikov]; a sketch follows.
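A minimal sketch of ACA with partial pivoting for one admissible (low-rank) block of cov(x, y) = e^{−2|x−y|}; the cluster geometry, tolerance, and helper names are assumptions, not the code behind the tables below:

```python
import numpy as np

# Sketch: ACA with partial pivoting builds A ~ U V from individual rows/columns only.
def aca(get_row, get_col, n_rows, tol=1e-8, max_rank=30):
    U, V = [], []
    i = 0
    for _ in range(max_rank):
        row = get_row(i) - sum(u[i] * v for u, v in zip(U, V))    # residual row
        j = int(np.argmax(np.abs(row)))
        if abs(row[j]) < 1e-14:
            break
        col = (get_col(j) - sum(v[j] * u for u, v in zip(U, V))) / row[j]
        U.append(col)
        V.append(row)
        if np.linalg.norm(col) * np.linalg.norm(row) < tol:
            break
        col_masked = np.abs(col).copy()
        col_masked[i] = 0.0
        i = int(np.argmax(col_masked))                            # next pivot row
    return np.array(U).T, np.array(V)

# admissible block between two well-separated clusters of points
xs, ys = np.linspace(0.0, 1.0, 200), np.linspace(3.0, 4.0, 200)
entry = lambda i, j: np.exp(-2.0 * np.abs(xs[i] - ys[j]))
U, V = aca(lambda i: entry(i, np.arange(200)), lambda j: entry(np.arange(200), j), 200)
A = entry(np.arange(200)[:, None], np.arange(200)[None, :])
print(U.shape[1], np.linalg.norm(A - U @ V) / np.linalg.norm(A))
```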
Dependence of the computational time and storage requirements of C_H on the rank k, n = 32²:

k     time (sec.)   memory (MB)   ‖C − C_H‖₂ / ‖C‖₂
2     0.04          2e+6          3.5e−5
6     0.1           4e+6          1.4e−5
9     0.14          5.4e+6        1.4e−5
12    0.17          6.8e+6        3.1e−7
17    0.23          9.3e+6        6.3e−8

The time for the dense matrix C is 3.3 sec. and the storage 1.4e+8 MB.
H-Matrices
Let h = √( Σ_{i=1}^2 ( √(h_i²/ℓ_i² + d²) − d )² ), where h_i := x_i − y_i, i = 1, 2, the ℓ_i are covariance lengths and d = 1.
Exponential covariance cov(h) = σ² · exp(−h); the covariance matrix C ∈ R^{n×n}, n = 65².

ℓ₁      ℓ₂      ‖C − C_H‖₂ / ‖C‖₂
0.01    0.02    3e−2
0.1     0.2     8e−3
1       2       2.8e−6
10      20      3.7e−9
Exponential singular value decay [see also Schwab et al.]
Figure: grid 48 × 64 × 40, (left) ℓ = 1, ℓ = 2, ℓ = 1 and (right) ℓ = 5,
Sparse tensor decompositions of kernels cov(x, y) = cov(x − y)
We want to approximate C ∈ R^{N×N}, N = n^d, by
C_r = Σ_{k=1}^r V_k^1 ⊗ ... ⊗ V_k^d   such that ‖C − C_r‖ ≤ ε.
The storage of C is O(N²) = O(n^{2d}), whereas the storage of C_r is O(r d n²).
To define the V_k^i one can use, e.g., the SVD.
Approximating all the V_k^i in the H-matrix format yields the HKT format.
See the basic arithmetic in [Hackbusch, Khoromskij, Tyrtyshnikov].
Assume f(x, y) with x = (x1, x2), y = (y1, y2); then the equivalent approximation problem is
f(x1, x2; y1, y2) ≈ Σ_{k=1}^r Φ_k(x1, y1) Ψ_k(x2, y2).
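A minimal sketch for d = 2 (grid size, kernel, and helper name are assumptions): the Kronecker-rank-r approximation C_r = Σ_k V_k^1 ⊗ V_k^2 obtained from the SVD of the rearranged matrix, as described above:

```python
import numpy as np

# Sketch: Kronecker-rank-r approximation C ~ sum_k V1_k (x) V2_k via SVD of the
# rearranged matrix (rows indexed by (i1, j1), columns by (i2, j2)).
def kron_rank_approx(C, n1, n2, r):
    R = C.reshape(n1, n2, n1, n2).transpose(0, 2, 1, 3).reshape(n1 * n1, n2 * n2)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    V1 = [np.sqrt(s[k]) * U[:, k].reshape(n1, n1) for k in range(r)]
    V2 = [np.sqrt(s[k]) * Vt[k, :].reshape(n2, n2) for k in range(r)]
    return V1, V2

# exponential kernel cov(x, y) = exp(-|x - y|) on a 2D tensor grid
n1 = n2 = 16
g = np.linspace(0.0, 1.0, n1)
pts = np.stack(np.meshgrid(g, g, indexing="ij"), axis=-1).reshape(-1, 2)
C = np.exp(-np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1))

V1, V2 = kron_rank_approx(C, n1, n2, r=5)
Cr = sum(np.kron(a, b) for a, b in zip(V1, V2))
print(np.linalg.norm(C - Cr) / np.linalg.norm(C))
```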
Numerical examples of tensor approximations
The Gaussian kernel exp(−|x − y|²) has Kronecker rank 1.
The exponential kernel exp(−|x − y|) can be approximated by a tensor with low Kronecker rank:

r                     1      2      3     4      5       6       10
‖C − C_r‖∞ / ‖C‖∞     11.5   1.7    0.4   0.14   0.035   0.007   2.8e−8
‖C − C_r‖₂ / ‖C‖₂     6.7    0.52   0.1   0.03   0.008   0.001   5.3e−9
Outline
Introduction
KLE
Numerical techniques: FFT, Hierarchical Matrices, Sparse tensor approximation
Application
Conclusion
Application: covariance of the solution
For an SPDE with stochastic RHS, the eigenvalue problem and the spectral decomposition read
C_f f_ℓ = λ_ℓ f_ℓ,   C_f = Φ_f Λ_f Φ_f^T.
If we only want the covariance
C_u = (K ⊗ K)^{-1} C_f = (K^{-1} ⊗ K^{-1}) C_f = K^{-1} C_f K^{-T},
one may use the KLE of C_f = Φ_f Λ_f Φ_f^T to reduce this to
C_u = K^{-1} C_f K^{-T} = K^{-1} Φ_f Λ_f Φ_f^T K^{-T}.
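A minimal sketch of this reduction (the stiffness matrix K, the 1D grid, and the number of KLE modes m are placeholders): solve K Y = Φ_f Λ_f^{1/2} once and form C_u = Y Y^T:

```python
import numpy as np

# Sketch: C_u = K^{-1} Phi_f Lambda_f Phi_f^T K^{-T}, using only m KLE modes of C_f.
n, m = 200, 20
x = np.linspace(0.0, 1.0, n)
Cf = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2)          # covariance of the RHS
lam, Phi = np.linalg.eigh(Cf)
lam, Phi = lam[::-1][:m], Phi[:, ::-1][:, :m]                # m largest eigenpairs

# toy deterministic operator K (1D Laplacian stiffness matrix, an assumption)
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
Y = np.linalg.solve(K, Phi * np.sqrt(lam))                   # Y = K^{-1} Phi_f Lambda_f^{1/2}
Cu = Y @ Y.T                                                 # covariance of the solution
print(Cu.shape)
```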
Application: higher order moments
Let the operator K be deterministic and
K u(θ) = Σ_{α∈J} K u^(α) H_α(θ) = f(θ) = Σ_{α∈J} f^(α) H_α(θ),   with u^(α) = [u_1^(α), ..., u_N^(α)]^T.
Projecting onto each H_α, we obtain
K u^(α) = f^(α).
The KLE of f(θ) is
f(θ) = f̄ + Σ_ℓ √λ_ℓ φ_ℓ(θ) f_ℓ = Σ_ℓ Σ_α √λ_ℓ φ_ℓ^(α) H_α(θ) f_ℓ = Σ_α H_α(θ) f^(α),
where f^(α) = Σ_ℓ √λ_ℓ φ_ℓ^(α) f_ℓ.
Application: higher order moments
The 3rd moment of u is
M_u^(3) = E[ Σ_{α,β,γ} u^(α) ⊗ u^(β) ⊗ u^(γ) H_α H_β H_γ ] = Σ_{α,β,γ} u^(α) ⊗ u^(β) ⊗ u^(γ) c_{α,β,γ},
where c_{α,β,γ} := E(H_α(θ) H_β(θ) H_γ(θ)) = c_{α,β} · γ!, and the c_{α,β} are constants from the Hermite algebra.
Using u^(α) = K^{-1} f^(α) = Σ_ℓ √λ_ℓ φ_ℓ^(α) K^{-1} f_ℓ and u_ℓ := K^{-1} f_ℓ, we obtain
M_u^(3) = Σ_{p,q,r} t_{p,q,r} u_p ⊗ u_q ⊗ u_r,   where
t_{p,q,r} := √(λ_p λ_q λ_r) Σ_{α,β,γ} φ_p^(α) φ_q^(β) φ_r^(γ) c_{α,β,γ}.
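A minimal sketch of assembling t_{p,q,r} and the third moment with einsum; all dimensions and input arrays are random placeholders, only the contraction pattern follows the formulas above:

```python
import numpy as np

# Sketch: t_{pqr} = sqrt(lam_p lam_q lam_r) * sum_{a,b,g} phi_p^(a) phi_q^(b) phi_r^(g) c_{abg}
# and M3 = sum_{pqr} t_{pqr} u_p (x) u_q (x) u_r.
m, P, N = 4, 6, 50                            # KLE terms, PCE indices, spatial dofs
lam = np.sort(np.random.rand(m))[::-1]        # KLE eigenvalues (placeholder)
phi = np.random.randn(m, P)                   # phi[l, alpha] (placeholder PCE coefficients)
c = np.random.randn(P, P, P)                  # c[alpha, beta, gamma] = E(H_a H_b H_g) (placeholder)
u = np.random.randn(m, N)                     # u_l = K^{-1} f_l (placeholder)

s = np.sqrt(lam)
t = np.einsum("p,q,r,pa,qb,rg,abg->pqr", s, s, s, phi, phi, phi, c)
M3 = np.einsum("pqr,pi,qj,rk->ijk", t, u, u, u)
print(M3.shape)   # (N, N, N)
```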
Outline
Introduction
KLE
Numerical techniques: FFT, Hierarchical Matrices, Sparse tensor approximation
Application
Conclusion
Conclusion
Covariance matrices allow data-sparse low-rank approximations.
The application of H-matrices extends the class of covariance functions we can work with and allows non-regular discretisations of the covariance function on large spatial grids.
The application of sparse tensor products allows the computation of k-th moments.
Plans for the Future
1. Convergence of the Lanczos method with H-matrices
2. Implement sparse tensor vector product for the Lanczos method
3. HKT idea for d ≥ 3 dimensions
Thank you for your attention!
Questions?