Frechet Derivatives of Matrix Functions and Applications


DESCRIPTION

I discuss some recent ideas using the Frechet derivative of matrix functions to analyze the mixed condition number, solve the nuclear activation sensitivity problem, and analyze the distribution of the algebraic error in the finite element method. Originally presented at the 4th IMA Conference on Numerical Linear Algebra and Optimization, Birmingham, UK. 4th September 2014. Joint work with Nicholas J. Higham, Wayne Arter, Zdenek Strakos, and Jan Papez.

TRANSCRIPT

Frechet Derivatives of Matrix Functions and Applications

Samuel Relton
samuel.relton@maths.man.ac.uk, @sdrelton
samrelton.com, blog.samrelton.com

Joint work with Nicholas J. Higham
higham@maths.man.ac.uk, @nhigham
www.maths.man.ac.uk/~higham, nickhigham.wordpress.com

University of Manchester, UK

September 4, 2014


Outline

• Matrix Functions, their Derivatives, and the Condition Number

• Elementwise Sensitivity

• Physics: Nuclear Activation Sensitivity Problem

• Differential Equations: Predicting Algebraic Error in the FEM


Matrix Functions

We are interested in functions $f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$, e.g. the matrix exponential and matrix cosine:

$$e^A = \sum_{k=0}^{\infty} \frac{A^k}{k!}, \qquad \cos(A) = \sum_{k=0}^{\infty} \frac{(-1)^k A^{2k}}{(2k)!}.$$

• Define $f(A)$ by its Taylor series when $f$ is analytic

• If $A = XDX^{-1}$ then $f(A) = X f(D) X^{-1}$

• Differential equations: $du/dt = Au(t)$, with solution $u(t) = e^{tA}u(0)$

• Use $\cos(A)$ and $\sin(A)$ for second-order ODEs

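These definitions are easy to explore numerically. Below is a minimal sketch (my addition, not from the slides), assuming SciPy's expm and cosm; the identity $f(A) = X f(D) X^{-1}$ is checked for $f = \exp$ on a random (almost surely diagonalizable) matrix.

```python
# A minimal sketch: evaluate e^A and cos(A) with SciPy and check the
# diagonalization identity f(A) = X f(D) X^{-1} for f = exp.
import numpy as np
from scipy.linalg import expm, cosm

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))   # random A is diagonalizable almost surely

# Apply f = exp to the eigendecomposition: f(A) = X f(D) X^{-1}.
d, X = np.linalg.eig(A)
fA = X @ np.diag(np.exp(d)) @ np.linalg.inv(X)

print(np.allclose(fA, expm(A)))   # True, up to roundoff
# cos(A) agrees with its exponential representation (e^{iA} + e^{-iA})/2.
print(np.allclose(cosm(A), (expm(1j * A) + expm(-1j * A)).real / 2))
```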

Frechet Derivatives

Let $f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ be a matrix function.

Definition (Frechet derivative)

The Frechet derivative of $f$ at $A$ is the unique linear function $L_f(A, \cdot) : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ such that, for all $E$,

$$f(A + E) - f(A) - L_f(A, E) = o(\|E\|).$$

• Applications include manifold optimization, Markov models, bladder cancer, image processing, and network analysis

• Higher-order derivatives were recently analyzed (Higham & Relton, 2014)

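The defining property above can be checked numerically. A minimal sketch, assuming $f = \exp$ and SciPy's expm_frechet for $L_f(A, E)$: the remainder should shrink faster than $\|E\|$.

```python
# Check that f(A+E) - f(A) - L_f(A, E) is o(||E||) for f = exp.
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(42)
n = 6
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, n))

for eps in [1e-2, 1e-3, 1e-4]:
    fA, L = expm_frechet(A, eps * E)   # returns exp(A) and L_exp(A, eps*E)
    rem = np.linalg.norm(expm(A + eps * E) - fA - L)
    print(eps, rem / (eps * np.linalg.norm(E)))   # ratio shrinks like O(eps)
```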

Sensitivity of Matrix Functions

[Diagram: balls $S_A$ around $A$ and $S_X$ around $X$ are mapped by $f$ to images $f(S_A)$ and $f(S_X)$; the image of $S_A$ stays small while the image of $S_X$ is stretched. The function $f$ is well conditioned at $A$ and ill conditioned at $X$.]


The Norm-wise Condition Number

The two condition numbers for a matrix function are:

$$\mathrm{cond}_{\mathrm{abs}}(f, A) = \max_{\|E\|=1} \|L_f(A, E)\|, \qquad \mathrm{cond}_{\mathrm{rel}}(f, A) = \max_{\|E\|=1} \frac{\|L_f(A, E)\|\,\|A\|}{\|f(A)\|}.$$

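For $f = \exp$ these quantities are computable in practice. A minimal sketch, assuming the Frobenius norm: scipy.linalg.expm_cond returns $\mathrm{cond}_{\mathrm{rel}}(\exp, A)$, and a crude sampled lower bound is shown alongside for comparison.

```python
# Compare scipy.linalg.expm_cond with a sampled lower bound on
# cond_rel(exp, A) = max_{||E||=1} ||L_exp(A, E)|| ||A|| / ||exp(A)||.
import numpy as np
from scipy.linalg import expm, expm_cond, expm_frechet

rng = np.random.default_rng(0)
n = 10
A = rng.standard_normal((n, n))

lower = 0.0
for _ in range(100):                      # sample random unit directions E
    E = rng.standard_normal((n, n))
    E /= np.linalg.norm(E)
    L = expm_frechet(A, E, compute_expm=False)
    lower = max(lower, np.linalg.norm(L))

est = lower * np.linalg.norm(A) / np.linalg.norm(expm(A))
print(est, expm_cond(A))                  # sampled lower bound <= exact value
```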

Elementwise Sensitivity

If we change just one element $A_{ij}$, how is $f(A)$ affected?

Let $E_{ij}$ be the matrix with a one in position $(i, j)$ and zeros elsewhere. Then the difference between $f(A)$ and $f(A + \varepsilon E_{ij})$ is

$$\|f(A) - f(A + \varepsilon E_{ij})\| \approx \varepsilon \|L_f(A, E_{ij})\|.$$

• $\|L_f(A, E_{ij})\|$ gives the sensitivity of the $(i, j)$ component

• Sometimes we only want the $t$ most sensitive elements, for $t = 5$ to $20$, say


A simple algorithm

To compute the most sensitive t entries of A:

for i = 1:n
    for j = 1:n
        if Aij ≠ 0
            compute and store ‖Lf(A, Eij)‖
        end
    end
end
Take the largest t values of ‖Lf(A, Eij)‖

Cost: up to $O(n^5)$ flops, since each computation of $L_f(A, E)$ costs $O(n^3)$ flops.

• Trivially parallel but still very expensive when A is large

• Speed this up using block norm estimation (work in progress)

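A minimal sketch of this scan for $f = \exp$, using SciPy's expm_frechet; the choice of the Frobenius norm here is mine, not fixed by the slides.

```python
# O(n^2) Frechet derivative evaluations at O(n^3) flops each: O(n^5) total.
import numpy as np
from scipy.linalg import expm_frechet

def most_sensitive_entries(A, t=5):
    """Indices (i, j) of the t largest values of ||L_exp(A, E_ij)||_F."""
    n = A.shape[0]
    sens = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if A[i, j] != 0:
                E = np.zeros((n, n))
                E[i, j] = 1.0
                L = expm_frechet(A, E, compute_expm=False)
                sens[i, j] = np.linalg.norm(L)
    order = np.argsort(sens, axis=None)[::-1][:t]
    return [np.unravel_index(k, sens.shape) for k in order]
```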

The Nuclear Activation Sensitivity Problem

• Chemical reactions: $u'(t) = Au(t)$

• $u(t) = e^{tA}u(0)$ tells us the concentration of each element at time $t$

• $q^T u(t)$ is the dosage at time $t$

• $A_{ij}$ represents the reaction between elements $i$ and $j$ (so ignore $A_{ij} = 0$)

• $A_{ij}$ is subject to measurement error. What happens to $q^T u(t)$ when it changes?

Implications for safety in radiation exposure models etc.


Nuclear Activation Solution - 1

If $A_{ij}$ is perturbed, this introduces a relative error in $q^T u(t)$ of

$$\frac{|q^T (e^{tA + \varepsilon E_{ij}} - e^{tA})\, u(0)|}{|q^T e^{tA} u(0)|} \approx \varepsilon\, \frac{|q^T L_{\exp}(tA, E_{ij})\, u(0)|}{|q^T e^{tA} u(0)|}.$$

We note that:

• The denominator is the same for all perturbations

• This requires computing a derivative in every direction $E_{ij}$ with $A_{ij} \neq 0$

• Can we improve upon this?


Nuclear Activation Solution - 2

Using $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$ we see the sensitivity in direction $E_{ij}$ is

$$|q^T L_{\exp}(tA, E_{ij})\, u(0)| = |(u(0)^T \otimes q^T)\, K_{\exp}(tA)\, \mathrm{vec}(E_{ij})|,$$

where $K_{\exp}(tA)$ is the $n^2 \times n^2$ Kronecker form of the Frechet derivative. Therefore the sensitivities in all $n^2$ directions are the entries of

$$\bigl|[(u(0)^T \otimes q^T)\, K_{\exp}(tA)]^T\bigr| = \bigl|\mathrm{vec}\bigl(L_{\exp}(tA^T,\, q\, u(0)^T)\bigr)\bigr|.$$

• Only one Frechet derivative is needed for all $n^2$ sensitivities

• Found 2 bugs in existing commercial software!

• Extends to time-dependent coefficients $A = A(t)$

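A minimal sketch of this one-derivative trick for $f = \exp$, using the identity above in its elementwise form $|q^T L_{\exp}(tA, E_{ij})\, u(0)| = |L_{\exp}(tA^T, q\,u(0)^T)|_{ij}$; the data here are random stand-ins, not real reaction coefficients.

```python
# All n^2 sensitivities of q^T e^{tA} u(0) from one Frechet derivative.
import numpy as np
from scipy.linalg import expm_frechet

rng = np.random.default_rng(7)
n, t = 8, 0.5
A = rng.standard_normal((n, n))
q = rng.standard_normal(n)
u0 = rng.standard_normal(n)

# Single evaluation: S[i, j] = |q^T L_exp(tA, E_ij) u0|.
S = np.abs(expm_frechet(t * A.T, np.outer(q, u0), compute_expm=False))

# Check one entry against the direct definition.
i, j = 2, 5
E = np.zeros((n, n)); E[i, j] = 1.0
direct = abs(q @ expm_frechet(t * A, E, compute_expm=False) @ u0)
print(np.isclose(S[i, j], direct))   # True
```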

Predicting Algebraic Error in an ODE

Let’s solve the model ODE

$$-u'' = f(x), \quad x \in (0, 1), \quad u(0) = u(1) = 0$$

with the finite element method using piecewise linear basis functions $\phi_i$.

• Exact solution $u(x) = e^{-5(x-0.5)^2} - e^{-5/4}$ determines $f(x)$

• Generate a grid of $n = 19$ equally spaced points $x_i$

• Generate the system $Ax = b$, where $A_{ij} = \int_0^1 \phi_i' \phi_j'$ and $b_i = f(x_i)$; here $A = \mathrm{tridiag}(-1, 2, -1)$

• Solve with CG iteration

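A minimal reconstruction of this setup (my reading of the slide: mesh width $h = 1/20$, $b_i = f(x_i)$ as stated rather than a fully assembled load vector, and CG truncated after $k$ iterations via maxiter):

```python
# Model problem -u'' = f on (0,1), u(0) = u(1) = 0, with n = 19 interior
# nodes and a truncated CG solve.
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

n = 19
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)                           # interior nodes x_i

A = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))   # tridiag(-1, 2, -1)

u_exact = np.exp(-5 * (x - 0.5) ** 2) - np.exp(-5 / 4)
f = (10 - 100 * (x - 0.5) ** 2) * np.exp(-5 * (x - 0.5) ** 2)   # f = -u''
b = f                                                    # b_i = f(x_i)

k = 8
u_k, _ = cg(A, b, maxiter=k)    # truncated CG leaves an algebraic error
```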

Algebraic and discretization errors

• Let Vh be our finite element space (dimension 19)

• Let $u_h \in V_h$ be the best solution available from $V_h$

• Let $u^k_{\mathrm{est}}$ be our numerical solution after $k$ iterations of CG

• The discretization error is $u - u_h$

• The algebraic error is $u_h - u^k_{\mathrm{est}}$

• The total error is $u - u^k_{\mathrm{est}}$ = algebraic error + discretization error

• Sometimes the algebraic error dominates the total error; how do we detect this?


Discretization error

[Plot: discretization error $u - u_h$ on $(0, 1)$; values on the order of $10^{-3}$.]


Algebraic Error - 8 CG iterations

[Plot: algebraic error and total error for $k = 8$ CG iterations; values on the order of $10^{-2}$, with nodes 9–11 highlighted.]


Algebraic Error - 9 CG iterations

[Plot: algebraic error and total error for $k = 9$ CG iterations; values on the order of $10^{-3}$, with nodes 9–11 highlighted.]


Elementwise sensitivity analysis

• Taking $f(A) = A^{-1}$ we can calculate the sensitivity of each element

• $L_f(A, E) = -A^{-1} E A^{-1}$, so each derivative is easily computed

• Ignore $A_{ij} = 0$, since the corresponding basis functions do not overlap

• Results are plotted in the following heat map

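A minimal sketch of this computation, assuming the 1-norm used in the heat map; the rank-one simplification noted in the comments follows directly from the closed form.

```python
# Elementwise sensitivity of f(A) = A^{-1}: L_f(A, E) = -A^{-1} E A^{-1}.
# For E_ij = e_i e_j^T this is the rank-one matrix -(A^{-1} e_i)(e_j^T A^{-1}),
# so each norm evaluation is cheap once A^{-1} is formed.
import numpy as np

def inverse_sensitivity(A):
    """sens[i, j] = ||L_inv(A, E_ij)||_1, skipping entries with A_ij = 0."""
    Ainv = np.linalg.inv(A)
    n = A.shape[0]
    sens = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if A[i, j] != 0:
                L = -np.outer(Ainv[:, i], Ainv[j, :])
                sens[i, j] = np.linalg.norm(L, 1)   # matrix 1-norm
    return sens
```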

Elementwise sensitivity analysis

[Heat map: most sensitive elements of $A$ when computing $A^{-1}$, in the 1-norm; values range from 0 to about 0.6, with rows/columns 9–11 in the middle standing out.]


2D Peak Problem

[Surface plot: the 2D peak problem on the unit square; peak height about 0.03.]


Algebraic Error Estimation

[Two surface plots over the unit square. Left: true algebraic error using 7 CG iterations (order $10^{-4}$). Right: error in the estimated algebraic error using the first Frechet derivative (order $10^{-7}$).]


Higher Order Derivatives to Estimate Alg. Err.

[Semilog plot: componentwise error in the estimate using $k$th-order derivatives, $k = 1, 3, 5$; errors between about $10^{-6}$ and $10^{-16}$.]


Possible extensions

• Can this be used to modify the discretization mesh to obtain better accuracy? (See Papez, Liesen, and Strakos 2014)

• Currently too expensive: can we estimate the sensitivities?

• Can this be extended to $f(A) = e^A$ (exponential integrators)?


Conclusions

• Explained elementwise sensitivity of matrix functions

• New applications in nuclear physics and FEM analysis

• The former is essentially solved; the latter needs to be made cheaper

Future work:

• Estimate sensitivities more efficiently (block norm estimation)

• Further comparison of the nuclear physics solution to the commercial alternative

• Further analysis of ODE problem


Higher Order Frechet Derivatives

Higher-order derivatives can be defined recursively:

$$L_f^{(k)}(A + E_{k+1}, E_1, \ldots, E_k) - L_f^{(k)}(A, E_1, \ldots, E_k) = L_f^{(k+1)}(A, E_1, \ldots, E_k, E_{k+1}) + o(\|E_{k+1}\|).$$

We also have a simple method to compute them. For example:

$$f\!\left(\begin{bmatrix} A & E_1 & E_2 & 0 \\ 0 & A & 0 & E_2 \\ 0 & 0 & A & E_1 \\ 0 & 0 & 0 & A \end{bmatrix}\right) = \begin{bmatrix} f(A) & L_f(A, E_1) & L_f(A, E_2) & L_f^{(2)}(A, E_1, E_2) \\ 0 & f(A) & 0 & L_f(A, E_2) \\ 0 & 0 & f(A) & L_f(A, E_1) \\ 0 & 0 & 0 & f(A) \end{bmatrix}.$$

More info in Higham & Relton, SIMAX 35(4), 2014.

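A minimal numerical check of this block formula, assuming $f = \exp$ and SciPy's expm; the first-order blocks are compared against expm_frechet.

```python
# Applying exp to the 4n x 4n block matrix recovers the first- and
# second-order Frechet derivatives in the off-diagonal blocks.
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(3)
n = 4
A, E1, E2 = (rng.standard_normal((n, n)) for _ in range(3))
Z = np.zeros((n, n))

M = np.block([[A, E1, E2, Z],
              [Z, A,  Z,  E2],
              [Z, Z,  A,  E1],
              [Z, Z,  Z,  A]])
F = expm(M)

L_E1 = F[:n, n:2*n]       # L_exp(A, E1)
L_E2 = F[:n, 2*n:3*n]     # L_exp(A, E2)
L2   = F[:n, 3*n:]        # L_exp^(2)(A, E1, E2)

print(np.allclose(L_E1, expm_frechet(A, E1, compute_expm=False)))  # True
```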
