Hierarchical matrix techniques for maximum likelihood covariance estimation

Alexander Litvinenko, Extreme Computing Research Center and Uncertainty Quantification Center, KAUST (joint work with M. Genton, Y. Sun and D. Keyes)
Center for Uncertainty Quantification, http://sri-uq.kaust.edu.sa/

Uploaded by alexander-litvinenko on 23-Jan-2018.

Page 1: Hierarchical matrix techniques for maximum likelihood covariance estimation

Hierarchical matrix techniques for maximum likelihood covariance estimation

Alexander Litvinenko, Extreme Computing Research Center and Uncertainty Quantification Center, KAUST (joint work with M. Genton, Y. Sun and D. Keyes)

Center for Uncertainty Quantification, http://sri-uq.kaust.edu.sa/

Page 2: Hierarchical matrix techniques for maximum likelihood covariance estimation


The structure of the talk

1. Motivation

2. Hierarchical matrices [Hackbusch 1999]

3. Matérn covariance function

4. Uncertain parameters of the covariance function:
   4.1 Uncertain covariance length
   4.2 Uncertain smoothness parameter

5. Identification of these parameters via maximizing the log-likelihood.


Page 3: Hierarchical matrix techniques for maximum likelihood covariance estimation


Motivation, problem 1

Task: predict temperature, velocity, and salinity; estimate the parameters of the covariance.

Grid: ~50 Mi locations on 50 levels, 4·(X·Y·Z) + X·Y = 4·500·500·50 + 500·500 ≈ 50 Mi.

High-resolution time-dependent data about the Red Sea: zonal velocity and temperature.

Page 4: Hierarchical matrix techniques for maximum likelihood covariance estimation


Motivation, problem 2

Task: predict moisture, compute the covariance, and estimate its parameters.

Grid: 1830 × 1329 = 2,432,070 locations, with 2,153,888 observations and 278,182 missing values.

[Figure: Soil moisture (0.15-0.50) over longitude (−120 to −70) and latitude (25 to 50). High-resolution daily soil moisture data at the top layer of the Mississippi basin, U.S.A., 01.01.2014 (Chaney et al., in review).]

Important for agriculture and defense. Moisture is very heterogeneous.

Page 5: Hierarchical matrix techniques for maximum likelihood covariance estimation


Motivation, estimation of uncertain parameters

[Figure: Box plots of the estimated covariance length (0.02-0.06) for the true value ℓ = 0.0334 (domain [0, 1]²) vs. H-matrix ranks k = {3, 7, 9}.]

Which H-matrix rank is sufficient for identifying the parameters of a particular type of covariance matrix?


Page 6: Hierarchical matrix techniques for maximum likelihood covariance estimation


Motivation for H-matrices

A general dense matrix requires O(n²) storage and O(n³) factorization time, which is very expensive.

If the covariance matrix is structured (diagonal, Toeplitz, circulant), we can apply e.g. the FFT with O(n log n) cost. But what if it is not?
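To illustrate the structured case: a circulant matrix is diagonalized by the DFT, so its matrix-vector product costs O(n log n) via the FFT. A minimal sketch (the periodic kernel below is a made-up example, not from the talk):

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix with first column c by x in O(n log n).

    Circulant matrices are diagonalized by the DFT, so C x is the inverse
    FFT of the elementwise product FFT(c) * FFT(x)."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

# Sanity check against the dense O(n^2) product
n = 8
k = np.arange(n)
c = np.exp(-np.minimum(k, n - k) / 2.0)   # a periodic exponential kernel
C = np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])
x = np.random.default_rng(0).standard_normal(n)
assert np.allclose(C @ x, circulant_matvec(c, x))
```

For unstructured covariance matrices no such transform applies, which is where H-matrices come in.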


Page 7: Hierarchical matrix techniques for maximum likelihood covariance estimation


H-matrix storage and complexity (p processors, shared memory)

Operation            Sequential complexity     Parallel complexity (R. Kriemann 2005)
building(M)          N = O(n log n)            N/p + O(|V(T)\L(T)|)
storage(M)           N = O(k n log n)          N
M·x                  N = O(k n log n)          N/p + n/√p
αM′ ⊕ βM′′           N = O(k² n log n)         N/p
αM′ ⊙ M′′ ⊕ βM       N = O(k² n log² n)        N/p + O(C_sp(T) |V(T)|)
M⁻¹                  N = O(k² n log² n)        N/p + O(n n²_min)
LU                   N = O(k² n log² n)        N
H-LU                 N = O(k² n log² n)        N/p + O(k² n log² n / n^(1/d))


Page 8: Hierarchical matrix techniques for maximum likelihood covariance estimation


Hierarchical (H)-matrices

Introduction to the hierarchical (H)-matrix technique


Page 9: Hierarchical matrix techniques for maximum likelihood covariance estimation


Examples of H-matrix approximations

[Figure: Two examples of H-matrix approximations ∈ R^{n×n}, n = 2¹⁰, of the discretised covariance function cov(x, y) = e^{−r}, ℓ₁ = 0.15, ℓ₂ = 0.2, x, y ∈ [0, 1]². The biggest dense (dark) blocks lie in R^{32×32}; the maximal rank is k = 4 on the left and k = 13 on the right. The right block structure is simpler, but the rank is higher.]


Page 10: Hierarchical matrix techniques for maximum likelihood covariance estimation


Low-rank (rank-k) matrices

How do we compute the green (low-rank) blocks?
Let M ∈ R^{n×m}, Ũ ≈ U ∈ R^{n×k}, Ṽ ≈ V ∈ R^{m×k}, k ≪ min(n, m). The storage of M̃ = Ũ Σ̃ Ṽ^T is k(n + m) instead of n · m for M represented in the full matrix format.

[Figure: Reduced SVD M = U Σ V^T truncated to M̃ = Ũ Σ̃ Ṽ^T; only the k biggest singular values are kept.]


Page 11: Hierarchical matrix techniques for maximum likelihood covariance estimation


H-matrices (Hackbusch ’99), main steps

1. Build the cluster tree T_I and the block cluster tree T_{I×I}.

[Figure: The index set I is split recursively, I → I₁, I₂ → I₁₁, I₁₂, I₂₁, I₂₂, defining the cluster tree T_I.]


Page 12: Hierarchical matrix techniques for maximum likelihood covariance estimation


H-matrices

Let h = √(Σ_{i=1}^{2} h_i²/ℓ_i²), where h_i := x_i − y_i, ℓ_i are the covariance lengths, and d = 1.

Exponential covariance cov(h) = σ² · exp(−h); the covariance matrix C ∈ R^{n×n}, n = 65².

ℓ₁      ℓ₂      ‖C − C^H‖₂/‖C‖₂
0.01    0.02    3e−2
0.1     0.2     8e−3
1       2       2.8e−6
10      20      3.7e−9


Page 13: Hierarchical matrix techniques for maximum likelihood covariance estimation


Matérn covariance functions

C_θ(r) = (2σ²/Γ(ν)) · (r/(2ℓ))^ν · K_ν(r/ℓ),   θ = (σ², ν, ℓ),

where K_ν is the modified Bessel function of the second kind.
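This parametrization can be evaluated with scipy.special.kv (note it differs from the more common Matérn normalization with √(2ν) factors); a sketch, with the sanity check that ν = 1/2 reduces to the exponential covariance:

```python
import numpy as np
from scipy.special import gamma, kv

def matern_cov(r, sigma2=1.0, nu=0.5, ell=1.0):
    """Matern covariance in the slide's parametrization:
    C(r) = 2*sigma2/Gamma(nu) * (r/(2*ell))**nu * K_nu(r/ell),
    where K_nu is the modified Bessel function of the second kind."""
    r = np.asarray(r, dtype=float)
    return 2.0 * sigma2 / gamma(nu) * (r / (2.0 * ell)) ** nu * kv(nu, r / ell)

# Sanity check: nu = 1/2 gives K_{1/2}(r) = sqrt(pi/(2r)) e^{-r}, hence C(r) = exp(-r)
r = np.linspace(0.01, 3.0, 50)
print(np.max(np.abs(matern_cov(r, nu=0.5) - np.exp(-r))))
```

One can also check that the r → 0 limit of this formula is σ², so the normalization is consistent with a variance parameter.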


Page 14: Hierarchical matrix techniques for maximum likelihood covariance estimation


Examples of Matérn covariance matrices

C_{ν=3/2}(r) = (1 + √3 r/ℓ) · exp(−√3 r/ℓ)   (1)

C_{ν=5/2}(r) = (1 + √5 r/ℓ + 5r²/(3ℓ²)) · exp(−√5 r/ℓ)   (2)

For ν = 1/2 one obtains the exponential covariance function C_{ν=1/2}(r) = exp(−r); for ν → ∞ the Gaussian covariance function C_{ν=∞}(r) = exp(−r²).


Page 15: Hierarchical matrix techniques for maximum likelihood covariance estimation


Matérn function for different parameters

[Figure: Matérn covariance (ν = 1) for σ = 0.5 and ℓ = {0.5, 0.3, 0.2, 0.1} (left), and for ν = {0.15, 0.3, 0.5, 1, 2, 30} (right).]

Computed in sglib, E. Zander.

Page 16: Hierarchical matrix techniques for maximum likelihood covariance estimation

Realisations of Matérn random fields for different parameters

To generate a realization κ(x, θ*) of a random field κ(x, θ), one needs to: 1) factorize C = LL^T; 2) generate a realization ξ(θ*) of the random vector ξ(θ); 3) compute the matrix-vector product L · ξ(θ*).
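These three steps, sketched with a dense Cholesky factor on an illustrative 1-D grid (an H-matrix version would replace np.linalg.cholesky by H-Cholesky):

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0.0, 1.0, 100)                       # 1-D grid (illustrative)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2)   # exponential covariance, ell = 0.2
C += 1e-10 * np.eye(len(x))                          # tiny nugget for numerical stability

L = np.linalg.cholesky(C)         # step 1: C = L L^T
xi = rng.standard_normal(len(x))  # step 2: realization of xi ~ N(0, I)
kappa = L @ xi                    # step 3: one realization of the field on the grid
```

The resulting vector kappa has covariance E[κ κ^T] = L L^T = C by construction.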


Page 17: Hierarchical matrix techniques for maximum likelihood covariance estimation


Numerical experiments with H-matrices

H-matrix approximations of covariance matrices, computing time and storage


Page 18: Hierarchical matrix techniques for maximum likelihood covariance estimation


Memory and computational times

n          rank k   size C (MB)   size C̃ (MB)   t(C), s   t(C̃), s   ε         ε₂
4.0·10³    10       48            3             0.8       0.08       7·10⁻³    2.0·10⁻⁴
1.05·10⁴   18       439           19            7.0       0.4        7·10⁻⁴    1.0·10⁻⁴
2.1·10⁴    25       2054          64            45.0      1.4        1·10⁻⁵    4.4·10⁻⁶

Table: Accuracy of the H-matrix approximation (weak admissibility) of the exponential covariance function, ℓ₁ = ℓ₃ = 0.1, ℓ₂ = 0.5, L-shaped 3D domain [Khoromskij et al. '09].


Page 19: Hierarchical matrix techniques for maximum likelihood covariance estimation


Identifying uncertain parameters


Page 20: Hierarchical matrix techniques for maximum likelihood covariance estimation


Identifying uncertain parameters

Given: a vector of measurements z = (z₁, ..., z_n)^T with covariance matrix C(θ*) = C(σ², ν, ℓ), where

C_θ(r) = (2σ²/Γ(ν)) · (r/(2ℓ))^ν · K_ν(r/ℓ),   θ = (σ², ν, ℓ).

To identify: the uncertain parameters (σ², ν, ℓ).
Plan: maximize the log-likelihood function

L(θ) = −(1/2) (N log 2π + log det{C(θ)} + z^T C(θ)⁻¹ z).

On each iteration i we obtain a new matrix C(θᵢ).


Page 21: Hierarchical matrix techniques for maximum likelihood covariance estimation


Other works

1. S. Ambikasaran et al., Fast direct methods for Gaussian processes and the analysis of NASA Kepler mission, arXiv:1403.6015, 2014.

2. S. Ambikasaran, J. Y. Li, P. K. Kitanidis, and E. Darve, Large-scale stochastic linear inversion using hierarchical matrices, Computational Geosciences, 2013.

3. J. Ballani and D. Kressner, Sparse inverse covariance estimation with hierarchical matrices, 2015.

4. M. Bebendorf, Why approximate LU decompositions of finite element discretizations of elliptic operators can be computed with almost linear complexity, 2007.

5. S. Boerm and J. Garcke, Approximating Gaussian processes with H²-matrices, 2007.

6. J. E. Castrillon, M. G. Genton, and R. Yokota, Multi-level restricted maximum likelihood covariance estimation and kriging for large non-gridded spatial datasets, 2015.

7. J. Doelz, H. Harbrecht, and C. Schwab, Covariance regularity and H-matrix approximation for rough random fields, ETH Zuerich, 2014.

8. H. Harbrecht et al., Efficient approximation of random fields for numerical applications, Numerical Linear Algebra with Applications, 2015.

9. C.-J. Hsieh et al., Big QUIC: sparse inverse covariance estimation for a million variables, 2013.

10. J. Quinonero-Candela et al., A unifying view of sparse approximate Gaussian process regression, 2005.

11. A. Saibaba, S. Ambikasaran, J. Yue Li, P. Kitanidis, and E. Darve, Application of hierarchical matrices to linear inverse problems in geostatistics, Oil & Gas Science, 2012.


Page 22: Hierarchical matrix techniques for maximum likelihood covariance estimation


Convergence of the optimization method


Page 23: Hierarchical matrix techniques for maximum likelihood covariance estimation


Details of the identification

To maximize the log-likelihood function we use Brent's method [Brent '73], which combines the bisection method, the secant method, and inverse quadratic interpolation.

1. C(θ) ≈ C^H(θ, k).
2. H-Cholesky: C^H(θ, k) = LL^T.
3. z^T C⁻¹ z = z^T (LL^T)⁻¹ z = v^T · v, where v is the solution of L(θ, k) v(θ) = z(θ*).
4. Let λᵢ be the diagonal elements of the H-Cholesky factor L; then

log det{C} = log det{LL^T} = log ∏_{i=1}^{n} λᵢ² = 2 Σ_{i=1}^{n} log λᵢ,

L(θ, k) = −(N/2) log(2π) − Σ_{i=1}^{N} log L_ii(θ, k) − (1/2) v(θ)^T · v(θ).   (3)
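Steps 1-4 can be sketched with dense linear algebra (the H-matrix rank k plays no role in this dense stand-in; scipy's bounded minimizer is used in place of the talk's Brent implementation, and the grid and true ℓ below are illustrative):

```python
import numpy as np
from scipy.linalg import solve_triangular
from scipy.optimize import minimize_scalar

def neg_log_likelihood(C, z):
    """Negative Gaussian log-likelihood via a (dense) Cholesky factor,
    mirroring steps 2-4: log det from the diagonal of L, quadratic
    form from one triangular solve L v = z."""
    n = len(z)
    L = np.linalg.cholesky(C)                  # step 2 (dense stand-in for H-Cholesky)
    v = solve_triangular(L, z, lower=True)     # step 3: v = L^{-1} z
    logdet = 2.0 * np.sum(np.log(np.diag(L)))  # step 4: log det C = 2 sum_i log L_ii
    return 0.5 * (n * np.log(2.0 * np.pi) + logdet + v @ v)

# Illustrative 1-D identification of the covariance length ell
x = np.linspace(0.0, 1.0, 80)
def C_of_ell(ell):
    return np.exp(-np.abs(x[:, None] - x[None, :]) / ell) + 1e-8 * np.eye(len(x))

rng = np.random.default_rng(1)
z = np.linalg.cholesky(C_of_ell(0.3)) @ rng.standard_normal(len(x))  # data, true ell = 0.3
res = minimize_scalar(lambda ell: neg_log_likelihood(C_of_ell(ell), z),
                      bounds=(0.05, 1.0), method='bounded')
print(res.x)  # estimated covariance length
```

The Cholesky route avoids ever forming C⁻¹ or det(C) explicitly, which is exactly what makes the H-matrix version cheap.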


Page 24: Hierarchical matrix techniques for maximum likelihood covariance estimation

[Figure: Shape of the log-likelihood(θ): curves log(det(C)), z^T C⁻¹ z, and the log-likelihood over the parameter θ, with truth θ* = 12. The minimum of the negative log-likelihood (black) is at θ = (·, ·, ℓ) ≈ 12 (σ² and ν are fixed).]


Page 25: Hierarchical matrix techniques for maximum likelihood covariance estimation


What will change?

Approximate C by C^H:

1. How do the eigenvalues of C and C^H differ?
2. How does det(C) differ from det(C^H)?
3. How does L differ from L^H?
4. How does C⁻¹ differ from (C^H)⁻¹?
5. How does L(θ, k) differ from L(θ)?
6. What is the optimal H-matrix rank?
7. How does θ^H differ from θ?

For theory and estimates of the rank and accuracy, see the works of Bebendorf, Grasedyck, Le Borne, Hackbusch, ...


Page 26: Hierarchical matrix techniques for maximum likelihood covariance estimation


Remark

For a small H-matrix rank k, the H-Cholesky factorization of C^H can become unstable (cf. the talk of Ralf Zimmermann) when eigenvalues of C come very close to zero. A remedy is to increase the rank k; in our example with n = 65² we increased k from 7 to 9.

To avoid this instability, we can also modify C^H_m = C^H + δ²I. If λᵢ are the eigenvalues of C^H, then the eigenvalues of C^H_m are λᵢ + δ², and

log det(C^H_m) = log ∏_{i=1}^{n} (λᵢ + δ²) = Σ_{i=1}^{n} log(λᵢ + δ²).   (4)
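A numerical check of Eq. (4) on a made-up SPD matrix (the matrix and δ² are illustrative, not from the talk):

```python
import numpy as np

# Shifting by delta^2 * I shifts every eigenvalue by delta^2, so
# log det(C + delta^2 I) = sum_i log(lambda_i + delta^2), as in Eq. (4).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
C = A @ A.T / 50.0 + 1e-3 * np.eye(50)   # SPD matrix with some small eigenvalues
delta2 = 1e-2

lam = np.linalg.eigvalsh(C)
lhs = np.linalg.slogdet(C + delta2 * np.eye(50))[1]
rhs = np.sum(np.log(lam + delta2))
print(abs(lhs - rhs))  # agrees up to rounding
```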


Page 27: Hierarchical matrix techniques for maximum likelihood covariance estimation


Error analysis

Theorem (Existence of the H-matrix inverse; [Bebendorf '11, Ballani & Kressner '14])
Under certain conditions an H-matrix inverse exists with

‖C_H⁻¹ − C⁻¹‖ ≤ ε ‖C⁻¹‖,   (5)

and theoretical estimates for the rank k_inv of C_H⁻¹ are given.

Theorem (Error in log det)
Let E := C − C^H, so that (C^H)⁻¹E = (C^H)⁻¹C − I, and assume the spectral radius satisfies

ρ((C^H)⁻¹E) = ρ((C^H)⁻¹C − I) ≤ ε < 1.   (6)

Then |log det(C) − log det(C^H)| ≤ −n log(1 − ε).

Proof: see [Ballani & Kressner '14], [Ipsen '05].


Page 28: Hierarchical matrix techniques for maximum likelihood covariance estimation


How sensitive is the covariance matrix to the H-matrix rank?

Hardly at all: the H-matrix approximation changes the function L(θ, k) and the estimate of θ only marginally.

θ         0.05   1.05    2.04    3.04  4.03  5.03  6.02  7.02  8.01  9      10
L(exact)  1628   -2354   -1450   27    1744  3594  5529  7522  9559  11628  13727
L(7)      1625   -2354   -1450   27    1745  3595  5530  7524  9560  11630  13726
L(20)     1625   -2354   -1450   27    1745  3595  5530  7524  9561  11630  13725

Comparison of three likelihood functions computed with different H-matrix ranks: exact, H-rank 7, and H-rank 20. Exponential covariance function with covariance length ℓ = 0.9, domain G = [0, 1]².


Page 29: Hierarchical matrix techniques for maximum likelihood covariance estimation


How sensitive is the covariance matrix to the H-matrix rank?

[Figure: Three negative log-likelihood functions over θ: exact, and computed with H-matrix ranks 7 and 17. Even with rank 7 one achieves very accurate results.]


Page 30: Hierarchical matrix techniques for maximum likelihood covariance estimation


Decrease of error bars with the number of measurements

Error bars (mean ± st. dev.) computed for different n.
The error bars decrease as the number of measurements (dimension) increases, n = {17², 33², 65²}. The mean and median are obtained by averaging over 200 simulations.


Page 31: Hierarchical matrix techniques for maximum likelihood covariance estimation


Convergence of H-matrix approximations

[Figure: log(rel. error) vs. H-matrix rank k, in the spectral and Frobenius norms, for ν = 1 (left) and ν = 0.5 (right) and covariance lengths ℓ = {0.1, 0.2, 0.5}.]


Page 32: Hierarchical matrix techniques for maximum likelihood covariance estimation


Convergence of H-matrix approximations

[Figure: log(rel. error) vs. H-matrix rank k in the spectral norm for ν = {1.5, 1, 0.5}, with ℓ = 0.1 (left) and ℓ = 0.5 (right).]


Page 33: Hierarchical matrix techniques for maximum likelihood covariance estimation


Difference between the two distributions computed with C and C^H

The Kullback-Leibler divergence (KLD) D_KL(P‖Q) is a measure of the information lost when distribution Q is used to approximate P:

D_KL(P‖Q) = Σᵢ P(i) ln(P(i)/Q(i)),   D_KL(P‖Q) = ∫_{−∞}^{∞} p(x) ln(p(x)/q(x)) dx,

where p, q are the densities of P and Q. For multivariate normal distributions N₀(μ₀, Σ₀) and N₁(μ₁, Σ₁):

2 D_KL(N₀‖N₁) = tr(Σ₁⁻¹Σ₀) + (μ₁ − μ₀)^T Σ₁⁻¹ (μ₁ − μ₀) − k − ln(det Σ₀ / det Σ₁).
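The closed form for Gaussians can be sketched directly (the covariance pair below is an illustrative stand-in for C and its H-matrix approximation):

```python
import numpy as np

def kl_gaussians(mu0, S0, mu1, S1):
    """D_KL(N(mu0, S0) || N(mu1, S1)) via the closed form above (k = dimension)."""
    k = len(mu0)
    S1_inv = np.linalg.inv(S1)
    d = mu1 - mu0
    ld0 = np.linalg.slogdet(S0)[1]
    ld1 = np.linalg.slogdet(S1)[1]
    return 0.5 * (np.trace(S1_inv @ S0) + d @ S1_inv @ d - k - (ld0 - ld1))

# Exact covariance vs. a slightly perturbed stand-in for its H-matrix version
x = np.linspace(0.0, 1.0, 40)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.25)
Ch = C + 1e-6 * np.eye(40)
mu = np.zeros(40)
print(kl_gaussians(mu, C, mu, Ch))  # tiny for a good approximation
```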


Page 34: Hierarchical matrix techniques for maximum likelihood covariance estimation

k     KLD                   ‖C − C^H‖₂            ‖C(C^H)⁻¹ − I‖₂
      ℓ=0.25    ℓ=0.75      ℓ=0.25    ℓ=0.75      ℓ=0.25    ℓ=0.75
5     0.51      2.3         4.0e-2    0.1         4.8       63
6     0.34      1.6         9.4e-3    0.02        3.4       22
8     5.3e-2    0.4         1.9e-3    0.003       1.2       8
10    2.6e-3    0.2         7.7e-4    7.0e-4      6.0e-2    3.1
12    5.0e-4    2e-2        9.7e-5    5.6e-5      1.6e-2    0.5
15    1.0e-5    9e-4        2.0e-5    1.1e-5      8.0e-4    0.02
20    4.5e-7    4.8e-5      6.5e-7    2.8e-7      2.1e-5    1.2e-3
50    3.4e-13   5e-12       2.0e-13   2.4e-13     4e-11     2.7e-9

Table: Dependence of the KLD on the H-matrix approximation rank k; Matérn covariance with ℓ = {0.25, 0.75} and ν = 0.5, domain G = [0, 1]², ‖C(ℓ=0.25, 0.75)‖₂ = {212, 568}.


Page 35: Hierarchical matrix techniques for maximum likelihood covariance estimation

Figure: Dependence of the H-matrix approximation error on the smoothness parameter ν. Relative error ‖C − C^H‖₂/‖C^H‖₂ vs. ν; H-matrix rank k = 8, n = 16641, Matérn covariance matrix.


Page 36: Hierarchical matrix techniques for maximum likelihood covariance estimation

Figure: Dependence of the H-matrix approximation error on the covariance length ℓ. Relative error ‖C − C^H‖₂/‖C^H‖₂ vs. ℓ; H-matrix rank k = 8, n = 16641, Matérn covariance matrix.


Page 37: Hierarchical matrix techniques for maximum likelihood covariance estimation


Conclusion

- Covariance matrices can be approximated in the H-matrix format.
- The influence of the H-matrix approximation error on the estimated parameters is small.
- The application of H-matrices
  - extends the class of covariance functions we can work with, and
  - allows non-regular discretizations of the covariance function on large spatial grids.
- With the maximization algorithm we can identify both parameters: the covariance length ℓ and the smoothness ν.


Page 38: Hierarchical matrix techniques for maximum likelihood covariance estimation


Future plans

- ECRC, center of D. Keyes: parallel H-Cholesky on different architectures → very large covariance matrices on complicated grids.
- Apply H-matrices for
  1. the kriging estimate ŝ := C_sy C_yy⁻¹ y,
  2. the estimation of the variance σ, i.e. the diagonal of the conditional covariance matrix C_ss|y = diag(C_ss − C_sy C_yy⁻¹ C_ys),
  3. geostatistical optimal design: φ_A := n⁻¹ trace(C_ss|y), φ_C := c^T (C_ss − C_sy C_yy⁻¹ C_ys) c.
- Identify all three parameters (σ², ℓ, ν) simultaneously.
- Compare with the Bayesian update (H. Matthies, H. Najm, K. Law, A. Stuart et al.).
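The kriging formulas above, sketched densely on a synthetic 1-D example (an H-matrix implementation would replace the dense solves; the kernel, grid, and observation sites are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
xs = np.linspace(0.0, 1.0, 60)                 # prediction locations s
idx = rng.choice(60, size=20, replace=False)   # indices of observed locations y

def cov(a, b, ell=0.2):
    """Exponential covariance between two point sets (illustrative kernel)."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / ell)

Css = cov(xs, xs)
Csy = Css[:, idx]                                        # C_sy
Cyy = Css[np.ix_(idx, idx)] + 1e-8 * np.eye(len(idx))    # C_yy (tiny nugget)
y = np.linalg.cholesky(Cyy) @ rng.standard_normal(len(idx))  # synthetic data

s_hat = Csy @ np.linalg.solve(Cyy, y)                    # kriging estimate C_sy C_yy^{-1} y
var_cond = np.diag(Css - Csy @ np.linalg.solve(Cyy, Csy.T))  # diag of C_ss|y
phi_A = var_cond.mean()                                  # A-criterion: n^{-1} trace(C_ss|y)
```

At the observed sites the kriging estimate interpolates the data and the conditional variance drops to (almost) zero, which is a useful sanity check.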


Page 39: Hierarchical matrix techniques for maximum likelihood covariance estimation


Literature

1. PCE of random coefficients and the solution of stochastic partial differential equations in the Tensor Train format, S. Dolgov, B. N. Khoromskij, A. Litvinenko, H. G. Matthies, 2015, arXiv:1503.03210.

2. Efficient analysis of high dimensional data in tensor formats, M. Espig, W. Hackbusch, A. Litvinenko, H. G. Matthies, E. Zander, Sparse Grids and Applications, 31-56, 2013.

3. Application of hierarchical matrices for computing the Karhunen-Loeve expansion, B. N. Khoromskij, A. Litvinenko, H. G. Matthies, Computing 84 (1-2), 49-67, 2009.

4. Efficient low-rank approximation of the stochastic Galerkin matrix in tensor formats, M. Espig, W. Hackbusch, A. Litvinenko, H. G. Matthies, P. Waehnert, Computers & Mathematics with Applications 67 (4), 818-829, 2012.

5. Numerical Methods for Uncertainty Quantification and Bayesian Update in Aerodynamics, A. Litvinenko, H. G. Matthies, in "Management and Minimisation of Uncertainties and Errors in Numerical Aerodynamics", pp. 265-282, 2013.


Page 40: Hierarchical matrix techniques for maximum likelihood covariance estimation


Acknowledgement

1. Lars Grasedyck (RWTH Aachen) and Steffen Boerm (Uni Kiel) for HLIB (www.hlib.org).

2. KAUST Research Computing group and KAUST Supercomputing Lab (KSL).

3. The Stochastic Galerkin library (sglib by E. Zander). Type in your terminal:
git clone git://github.com/ezander/sglib.git

To initialize all variables, run startup.m. You will find: generalised PCE, sparse grids, (Q)MC, stochastic Galerkin, linear solvers, KLE, covariance matrices, statistics, quadratures (multivariate Chebyshev, Laguerre, Lagrange, Hermite), etc.

There are many examples, many tests, and rich demos.
