Fréchet Derivatives of Matrix Functions and Applications


DESCRIPTION

I discuss some recent ideas using the Fréchet derivative of matrix functions to analyze the mixed condition number, solve the nuclear activation sensitivity problem, and analyze the distribution of the algebraic error in the finite element method. Originally presented at the 4th IMA Conference on Numerical Linear Algebra and Optimization, Birmingham, UK, 4th September 2014. Joint work with Nicholas J. Higham, Wayne Arter, Zdenek Strakos, and Jan Papez.

TRANSCRIPT

Page 1: Frechet Derivatives of Matrix Functions and Applications

Fréchet Derivatives of Matrix Functions and Applications

Samuel Relton, [email protected], @sdrelton

samrelton.com blog.samrelton.com

Joint work with Nicholas J. Higham, [email protected], @nhigham

www.maths.man.ac.uk/~higham, nickhigham.wordpress.com

University of Manchester, UK

September 4, 2014

Sam Relton (UoM) Derivatives of matrix functions September 4, 2014 1 / 23

Page 2: Frechet Derivatives of Matrix Functions and Applications

Outline

• Matrix Functions, their Derivatives, and the Condition Number

• Elementwise Sensitivity

• Physics: Nuclear Activation Sensitivity Problem

• Differential Equations: Predicting Algebraic Error in the FEM


Page 3: Frechet Derivatives of Matrix Functions and Applications

Matrix Functions

We are interested in functions f : C^{n×n} → C^{n×n}, e.g.

Matrix exponential: e^A = Σ_{k=0}^∞ A^k / k!

Matrix cosine: cos(A) = Σ_{k=0}^∞ (−1)^k A^{2k} / (2k)!

• Define f(A) by its Taylor series when f is analytic

• If A = XDX^{−1} then f(A) = X f(D) X^{−1}

• Differential equations: du/dt = Au(t), u(t) = e^{tA} u(0)

• Use cos(A) and sin(A) for second-order ODEs

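These properties are easy to check numerically. The sketch below (using NumPy/SciPy, which the slides do not mention, so treat the library choice as an assumption) computes e^A with scipy.linalg.expm and confirms f(A) = X f(D) X^{−1} for a diagonalizable A:

```python
import numpy as np
from scipy.linalg import expm

# A small diagonalizable matrix with eigendecomposition A = X D X^{-1}.
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])

# e^A via SciPy's scaling-and-squaring implementation.
F = expm(A)

# f(A) = X f(D) X^{-1}: apply exp to the eigenvalues only.
d, X = np.linalg.eig(A)
F_diag = X @ np.diag(np.exp(d)) @ np.linalg.inv(X)

print(np.allclose(F, F_diag))  # -> True
```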


Page 5: Frechet Derivatives of Matrix Functions and Applications

Fréchet Derivatives

Let f : C^{n×n} → C^{n×n} be a matrix function.

Definition (Fréchet derivative)

The Fréchet derivative of f at A is the unique linear function L_f(A, ·) : C^{n×n} → C^{n×n} such that, for all E,

f(A + E) − f(A) − L_f(A, E) = o(‖E‖).

• Applications include manifold optimization, Markov models, bladder cancer, image processing, and network analysis

• Higher order derivatives recently analyzed (Higham & R., 2014)

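The defining o(‖E‖) property can be verified with SciPy's built-in Fréchet derivative of the exponential, scipy.linalg.expm_frechet (a sketch of mine, not part of the talk): for a small step tE the residual of the first-order expansion is O(t²), so the normalized residual shrinks with t.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
E = rng.standard_normal((4, 4))

# SciPy returns both e^A and L_exp(A, E) in one call.
expA, L = expm_frechet(A, E)

# Definition: f(A + tE) - f(A) - L_f(A, tE) = o(t) as t -> 0.
t = 1e-6
residual = expm(A + t * E) - expA - t * L
print(np.linalg.norm(residual) / (t * np.linalg.norm(E)))  # ratio shrinks with t
```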


Page 7: Frechet Derivatives of Matrix Functions and Applications

Sensitivity of Matrix Functions

[Diagram: f maps a small ball S_A around A to a small set f(S_A), but maps a small ball S_X around X to a much larger set f(S_X).]

The function f is well conditioned at A and ill conditioned at X.


Page 9: Frechet Derivatives of Matrix Functions and Applications

The Norm-wise Condition Number

The two condition numbers of a matrix function are:

cond_abs(f, A) = max_{‖E‖=1} ‖L_f(A, E)‖,

cond_rel(f, A) = max_{‖E‖=1} ‖L_f(A, E)‖ ‖A‖ / ‖f(A)‖.

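As an illustration (my choice of tools, not the speaker's): SciPy exposes the relative condition number of exp in the Frobenius norm as scipy.linalg.expm_cond, and any sampled direction E gives a lower bound on the maximum in the definition of cond_rel.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet, expm_cond

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))

# cond_rel(exp, A) in the Frobenius norm, computed by SciPy.
kappa = expm_cond(A)

def sampled_ratio(E):
    # ||L_f(A, E)|| ||A|| / ||f(A)|| with ||E||_F = 1: a lower bound on kappa.
    E = E / np.linalg.norm(E)
    L = expm_frechet(A, E, compute_expm=False)
    return np.linalg.norm(L) * np.linalg.norm(A) / np.linalg.norm(expm(A))

lower = max(sampled_ratio(rng.standard_normal((5, 5))) for _ in range(20))
print(lower <= kappa)  # sampling can only under-estimate the maximum
```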

Page 10: Frechet Derivatives of Matrix Functions and Applications

Elementwise Sensitivity

If we change just one element Aij , how is f (A) affected?

Let E_ij be the matrix with a 1 in entry (i, j) and zeros elsewhere. Then the difference between f(A) and f(A + εE_ij) satisfies

‖f(A) − f(A + εE_ij)‖ ≈ ε ‖L_f(A, E_ij)‖.

• ‖L_f(A, E_ij)‖ gives the sensitivity of f(A) to the (i, j) entry of A

• Sometimes we only want the t most sensitive elements, for t = 5 to 20, say


Page 11: Frechet Derivatives of Matrix Functions and Applications

A simple algorithm

To compute the most sensitive t entries of A:

for i = 1:n
  for j = 1:n
    if A_ij ≠ 0
      compute and store ‖L_f(A, E_ij)‖
    end if
  end for
end for
Take the largest t values of ‖L_f(A, E_ij)‖

Cost: up to O(n^5) flops, since computing one L_f(A, E) costs O(n^3) flops

• Trivially parallel but still very expensive when A is large

• Speed this up using block norm estimation (work in progress)

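The algorithm above can be sketched for f = exp as follows; the function name and the toy matrix are illustrative only.

```python
import numpy as np
from scipy.linalg import expm_frechet

def most_sensitive_entries(A, t=5):
    """The slide's algorithm for f = exp: score each nonzero A_ij by
    ||L_f(A, E_ij)||_F and return the indices of the t largest scores."""
    n = A.shape[0]
    scores = {}
    for i in range(n):
        for j in range(n):
            if A[i, j] != 0:
                E = np.zeros_like(A)
                E[i, j] = 1.0
                L = expm_frechet(A, E, compute_expm=False)
                scores[(i, j)] = np.linalg.norm(L)
    return sorted(scores, key=scores.get, reverse=True)[:t]

# For a diagonal matrix, L_exp(A, E_ii) = e^{A_ii} E_ii, so the largest
# diagonal entry is the most sensitive one.
A = np.diag([1.0, 2.0, 10.0])
print(most_sensitive_entries(A, t=1))  # -> [(2, 2)]
```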


Page 13: Frechet Derivatives of Matrix Functions and Applications

The Nuclear Activation Sensitivity Problem

• Chemical reactions: u′(t) = Au(t)

• u(t) = e^{tA} u(0) tells us the concentration of each element at time t

• q^T u(t) is the dosage at time t

• A_ij represents the reaction between elements i and j (so ignore A_ij = 0)

• A_ij is subject to measurement error. What happens to q^T u(t) when it changes?

Implications for safety in radiation exposure models etc.


Page 14: Frechet Derivatives of Matrix Functions and Applications

Nuclear Activation Solution - 1

If A_ij is perturbed, this introduces a relative error in q^T u(t) of

|q^T (e^{tA + εE_ij} − e^{tA}) u(0)| / |q^T e^{tA} u(0)| ≈ ε |q^T L_exp(tA, E_ij) u(0)| / |q^T e^{tA} u(0)|

We note that:

• The denominator is the same for all perturbations

• This requires computing a derivative in all directions Aij 6= 0

• Can we improve upon this?



Page 16: Frechet Derivatives of Matrix Functions and Applications

Nuclear Activation Solution - 2

Using vec(AXB) = (B^T ⊗ A) vec(X), the sensitivity in direction E_ij is

|q^T L_exp(tA, E_ij) u(0)| = |(u(0)^T ⊗ q^T) K_exp(tA) vec(E_ij)|.

Therefore the sensitivities in all n^2 directions are the entries of

|[(u(0)^T ⊗ q^T) K_exp(tA)]^T| = |vec(L_exp(tA^T, unvec(u(0) ⊗ q)))|, where unvec(u(0) ⊗ q) = q u(0)^T.

• Only 1 derivative needed for all sensitivities

• Found 2 bugs in existing commercial software!

• Extends to time-dependent coefficients A = A(t)

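A minimal numerical check of this identity for f = exp, assuming real A so that K_f(A)^T = K_f(A^T): one Fréchet derivative in the direction q u(0)^T recovers every directional sensitivity q^T L_exp(tA, E_ij) u(0) at once.

```python
import numpy as np
from scipy.linalg import expm_frechet

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n))
q = rng.standard_normal(n)    # dosage weights
u0 = rng.standard_normal(n)   # initial concentrations
t = 0.7

# All n^2 sensitivities from ONE derivative:
# S[i, j] = q^T L_exp(tA, E_ij) u(0), where S = L_exp(t A^T, q u(0)^T).
S = expm_frechet(t * A.T, np.outer(q, u0), compute_expm=False)

# Spot-check one direction against the n^2-derivative approach.
i, j = 1, 4
E = np.zeros((n, n))
E[i, j] = 1.0
direct = q @ expm_frechet(t * A, E, compute_expm=False) @ u0
print(np.isclose(S[i, j], direct))  # -> True
```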

Page 17: Frechet Derivatives of Matrix Functions and Applications

Predicting Algebraic Error in an ODE

Let’s solve the model ODE

−u′′ = f(x), x ∈ (0, 1), u(0) = u(1) = 0

with the finite element method using piecewise linear basis functions φi .

• Exact solution u(x) = e^{−5(x−0.5)^2} − e^{−5/4} (which vanishes at x = 0, 1) determines f(x)

• Generate a grid of n = 19 equally spaced points xi

• Generate the system Ax = b, where A_ij = ∫₀¹ φ′_i(x) φ′_j(x) dx and b_i ≈ f(x_i).

A = tridiag(−1, 2, −1), up to a scaling by 1/h, in this case

• Solve with CG iteration

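The setup above can be reproduced approximately; the crude quadrature b_i ≈ h f(x_i) and the use of scipy.sparse.linalg.cg are my assumptions about details the slide leaves out.

```python
import numpy as np
from scipy.sparse.linalg import cg

n = 19
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)                     # interior nodes

# Stiffness matrix: tridiag(-1, 2, -1) scaled by 1/h.
A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h

# Exact solution and the matching right-hand side f = -u''.
u_exact = np.exp(-5 * (x - 0.5) ** 2) - np.exp(-5 / 4)
f = (10 - 100 * (x - 0.5) ** 2) * np.exp(-5 * (x - 0.5) ** 2)
b = h * f                                        # crude quadrature for b_i

u_h, info = cg(A, b)                             # CG converges: A is SPD
print(info, np.max(np.abs(u_h - u_exact)))       # info 0, error of size O(h^2)
```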

Page 18: Frechet Derivatives of Matrix Functions and Applications

Algebraic and discretization errors

• Let V_h be our finite element space (dimension 19)

• Let u_h ∈ V_h be the best solution possible from V_h

• Let u_est^k be our numerical solution after k iterations of CG

• The discretization error is u − u_h

• The algebraic error is u_h − u_est^k

• The total error is u − u_est^k = alg. err. + disc. err.

• Sometimes the algebraic error dominates the total error; how do we detect this?



Page 20: Frechet Derivatives of Matrix Functions and Applications

Discretization error

[Figure: the discretization error u − u_h over (0, 1); vertical scale up to 3.5 × 10^−3.]

Page 21: Frechet Derivatives of Matrix Functions and Applications

Algebraic Error - 8 CG iterations

[Figure: algebraic and total error after k = 8 CG iterations; vertical scale ±0.015. Nodes 9–11 highlighted.]

Page 22: Frechet Derivatives of Matrix Functions and Applications

Algebraic Error - 9 CG iterations

[Figure: algebraic and total error after k = 9 CG iterations; vertical scale ±5 × 10^−3. Nodes 9–11 highlighted.]

Page 23: Frechet Derivatives of Matrix Functions and Applications

Elementwise sensitivity analysis

• Taking f (A) = A−1 we can calculate the sensitivity of each element

• Lf (A,E ) = −A−1EA−1 so easily computed

• Ignore A_ij = 0, since the corresponding basis functions don't overlap

• Results plotted on the following heat map

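A quick check of the closed form L_f(A, E) = −A^{−1} E A^{−1} against a central finite difference; the shift by 5I is only there to keep the random test matrix well conditioned.

```python
import numpy as np

def inv_frechet(A, E):
    """L_f(A, E) = -A^{-1} E A^{-1} for f(A) = A^{-1}."""
    Ainv = np.linalg.inv(A)
    return -Ainv @ E @ Ainv

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 5 * np.eye(4)  # shift keeps A well conditioned
E = rng.standard_normal((4, 4))

# Central finite difference approximates the derivative to O(t^2).
t = 1e-5
fd = (np.linalg.inv(A + t * E) - np.linalg.inv(A - t * E)) / (2 * t)
print(np.allclose(fd, inv_frechet(A, E), atol=1e-6))  # -> True
```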

Page 24: Frechet Derivatives of Matrix Functions and Applications

Elementwise sensitivity analysis

[Heat map: most sensitive elements of A when computing A^−1 in the 1-norm; color scale 0–0.6. Rows/columns 9–11, in the middle, dominate.]

Page 25: Frechet Derivatives of Matrix Functions and Applications

2D Peak Problem

[Surface plot: the 2D peak problem on the unit square; peak height ≈ 0.03.]

Page 26: Frechet Derivatives of Matrix Functions and Applications

Algebraic Error Estimation

[Two surface plots.] Left: true algebraic error using 7 CG iterations (scale ×10^−4). Right: error in the estimated algebraic error using the 1st Fréchet derivative (scale ×10^−7).

Page 27: Frechet Derivatives of Matrix Functions and Applications

Higher Order Derivatives to Estimate Alg. Err.

[Figure: componentwise error using kth order derivatives, k = 1, 3, 5; errors fall from around 10^−6 to 10^−16.]

Page 28: Frechet Derivatives of Matrix Functions and Applications

Possible extensions

• Can this be used to modify the discretization mesh to obtain better accuracy? (See Papez, Liesen, and Strakos 2014)

• Currently too expensive: can we estimate the sensitivities?

• Can this be extended to f (A) = eA (exponential integrators)?


Page 29: Frechet Derivatives of Matrix Functions and Applications

Conclusions

• Explained elementwise sensitivity of matrix functions

• New applications in nuclear physics and FEM analysis

• The former is essentially solved; the latter needs to be cheaper

Future work:

• Estimate sensitivities more efficiently (block norm estimation)

• Further comparison of the nuclear physics solution to the commercial alternative

• Further analysis of ODE problem


Page 30: Frechet Derivatives of Matrix Functions and Applications

Higher Order Fréchet Derivatives

Higher order derivatives can be defined recursively:

L^(k)_f(A + E_{k+1}, E_1, …, E_k) − L^(k)_f(A, E_1, …, E_k) = L^(k+1)_f(A, E_1, …, E_k, E_{k+1}) + o(‖E_{k+1}‖)

Also have a simple method to compute them. For example:

f( [ A  E1  E2  0
     0  A   0   E2
     0  0   A   E1
     0  0   0   A ] )
=
[ f(A)  L_f(A,E1)  L_f(A,E2)  L^(2)_f(A,E1,E2)
  0     f(A)       0          L_f(A,E2)
  0     0          f(A)       L_f(A,E1)
  0     0          0          f(A) ]

More info in Higham & Relton, SIMAX 35(4), 2014.

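The block formula can be verified numerically for f = exp: the (1, 2) block of f applied to the 4×4 block matrix should match SciPy's expm_frechet, and the top-right block is then the second derivative. A sketch under those assumptions:

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(4)
n = 3
A, E1, E2 = (rng.standard_normal((n, n)) for _ in range(3))
Z = np.zeros((n, n))

# Applying f to this block matrix yields f(A), both first derivatives,
# and L^(2)_f(A, E1, E2) in the top-right block.
B = np.block([[A, E1, E2, Z],
              [Z, A,  Z,  E2],
              [Z, Z,  A,  E1],
              [Z, Z,  Z,  A]])
F = expm(B)

# The (1, 2) block is L_exp(A, E1); check it against SciPy directly.
L1 = expm_frechet(A, E1, compute_expm=False)
print(np.allclose(F[:n, n:2 * n], L1))  # -> True

L2 = F[:n, 3 * n:]  # second Frechet derivative L^(2)_exp(A, E1, E2)
```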