Frechet Derivatives of Matrix Functions and Applications
Samuel Relton, samuel.relton@maths.man.ac.uk, @sdrelton
samrelton.com, blog.samrelton.com
Joint work with Nicholas J. Higham, higham@maths.man.ac.uk, @nhigham
www.maths.man.ac.uk/~higham, nickhigham.wordpress.com
University of Manchester, UK
September 4, 2014
Outline
• Matrix Functions, their Derivatives, and the Condition Number
• Elementwise Sensitivity
• Physics: Nuclear Activation Sensitivity Problem
• Differential Equations: Predicting Algebraic Error in the FEM
Matrix Functions
We are interested in functions f : C^{n×n} → C^{n×n}, e.g.
Matrix Exponential: e^A = Σ_{k=0}^∞ A^k / k!
Matrix Cosine: cos(A) = Σ_{k=0}^∞ (−1)^k A^{2k} / (2k)!
• Define f(A) by its Taylor series when f is analytic
• If A = XDX^{−1} then f(A) = X f(D) X^{−1} (see the sketch below)
• Differential equations: du/dt = Au(t) has solution u(t) = e^{tA} u(0)
• Use cos(A) and sin(A) for second order ODEs
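A minimal Python sketch of the two characterisations above for the matrix exponential, assuming SciPy's expm (the matrix A below is an arbitrary example):

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4))      # a random A is diagonalizable with probability 1

    # e^A via a library routine (scaling and squaring, not the raw Taylor series).
    F = expm(A)

    # e^A via the eigendecomposition A = X D X^{-1}, so f(A) = X f(D) X^{-1}.
    d, X = np.linalg.eig(A)
    F_eig = (X @ np.diag(np.exp(d)) @ np.linalg.inv(X)).real

    print(np.linalg.norm(F - F_eig) / np.linalg.norm(F))   # small unless X is ill conditioned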
Frechet Derivatives
Let f : C^{n×n} → C^{n×n} be a matrix function.
Definition (Frechet derivative)
The Frechet derivative of f at A is the unique linear function L_f(A, ·) : C^{n×n} → C^{n×n} such that for all E
f(A + E) − f(A) − L_f(A, E) = o(‖E‖).
• Applications include manifold optimization, Markov models, bladder cancer, image processing, and network analysis
• Higher order derivatives recently analyzed (Higham & R., 2014)
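A minimal Python sketch checking this definition numerically for f = exp, assuming SciPy's expm and expm_frechet (A and E below are arbitrary examples):

    import numpy as np
    from scipy.linalg import expm, expm_frechet

    rng = np.random.default_rng(1)
    A = rng.standard_normal((5, 5))
    E = rng.standard_normal((5, 5))

    # Frechet derivative of the matrix exponential at A in direction E.
    expA, L = expm_frechet(A, E)

    # Defining property: f(A + tE) - f(A) - L_f(A, tE) = o(t) as t -> 0.
    for t in [1e-2, 1e-3, 1e-4]:
        resid = np.linalg.norm(expm(A + t * E) - expA - t * L)
        print(t, resid / t)   # this ratio should shrink roughly linearly with t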
Sensitivity of Matrix Functions
[Figure: f maps a small ball S_A of matrices around A to a small set f(S_A), but maps a similar ball S_X around X to a much larger set f(S_X).]
The function f is well conditioned at A and ill conditioned at X.
The Norm-wise Condition Number
The two condition numbers for a matrix function are:
cond_abs(f, A) = max_{‖E‖=1} ‖L_f(A, E)‖,
cond_rel(f, A) = max_{‖E‖=1} ‖L_f(A, E)‖ · ‖A‖ / ‖f(A)‖.
Elementwise Sensitivity
If we change just one element Aij , how is f (A) affected?
Let E_ij be the matrix with 1 in entry (i, j) and zeros elsewhere. Then the difference between f(A) and f(A + εE_ij) satisfies
‖f(A) − f(A + εE_ij)‖ ≈ ε‖L_f(A, E_ij)‖.
• ‖L_f(A, E_ij)‖ gives the sensitivity of f(A) to the (i, j) entry of A
• Sometimes we want the t most sensitive elements for, say, t = 5 to 20
A simple algorithm
To compute the most sensitive t entries of A:
for i = 1:n
  for j = 1:n
    if A_ij ≠ 0
      compute and store ‖L_f(A, E_ij)‖
    end if
  end for
end for
Take the largest t values of ‖L_f(A, E_ij)‖
Cost: up to O(n^5) flops, since computing L_f(A, E) costs O(n^3) flops.
• Trivially parallel but still very expensive when A is large
• Speed this up using block norm estimation (work in progress)
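A direct Python sketch of this algorithm for f = exp, assuming SciPy's expm_frechet (the choice of the 1-norm below is an assumption, not specified on the slide):

    import numpy as np
    from scipy.linalg import expm_frechet

    def most_sensitive_entries(A, t=10):
        # One Frechet derivative per nonzero entry: O(n^3) each, O(n^5) overall.
        n = A.shape[0]
        sens = {}
        for i in range(n):
            for j in range(n):
                if A[i, j] != 0:
                    E = np.zeros_like(A)
                    E[i, j] = 1.0
                    L = expm_frechet(A, E, compute_expm=False)
                    sens[(i, j)] = np.linalg.norm(L, 1)
        # The t largest sensitivities identify the most sensitive entries of A.
        return sorted(sens, key=sens.get, reverse=True)[:t]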
The Nuclear Activation Sensitivity Problem
• Chemical reactions: u′(t) = Au(t)
• u(t) = e^{At} u(0) tells us the concentration of each element at time t
• q^T u(t) is the dosage at time t
• A_ij represents the reaction between elements i and j (so ignore A_ij = 0)
• A_ij is subject to measurement error. What happens to q^T u(t) when it changes?
Implications for safety in radiation exposure models etc.
Nuclear Activation Solution - 1
If A_ij is perturbed, this introduces a relative error in q^T u(t) of
|q^T (e^{tA + εE_ij} − e^{tA}) u(0)| / |q^T e^{tA} u(0)| ≈ ε |q^T L_exp(tA, E_ij) u(0)| / |q^T e^{tA} u(0)|.
We note that:
• The denominator is the same for all perturbations
• This requires computing a derivative in all directions Aij 6= 0
• Can we improve upon this?
Nuclear Activation Solution - 2
Using vec(AXB) = (B^T ⊗ A) vec(X) we see the sensitivity in direction E_ij is
|q^T L_exp(tA, E_ij) u(0)| = |(u(0)^T ⊗ q^T) K_exp(tA) vec(E_ij)|.
Therefore the sensitivities in ALL n^2 directions are
|[(u(0)^T ⊗ q^T) K_exp(tA)]^T| = |vec(L_exp(tA, unvec(u(0) ⊗ q)^T)^T)|.
• Only 1 derivative needed for all sensitivities
• Found 2 bugs in existing commercial software!
• Extend for time dependent coefficients A = A(t)
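Reshaped, the matrix of all n^2 sensitivities is |L_exp(tA, u(0) q^T)^T|, up to the common denominator. A Python sketch of this one-derivative computation, assuming SciPy's expm_frechet (A, q and u(0) below are made-up data, not from the application):

    import numpy as np
    from scipy.linalg import expm, expm_frechet

    rng = np.random.default_rng(2)
    n = 6
    A = 0.1 * rng.standard_normal((n, n))   # made-up reaction matrix
    u0 = rng.random(n)                      # made-up initial concentrations
    q = rng.random(n)                       # made-up dosage weights
    t = 1.0

    # All n^2 sensitivities |q^T L_exp(tA, E_ij) u(0)| from ONE Frechet derivative.
    S = np.abs(expm_frechet(t * A, np.outer(u0, q), compute_expm=False).T)

    # Spot-check one entry against the one-derivative-per-entry formula.
    i, j = 2, 4
    E = np.zeros((n, n)); E[i, j] = 1.0
    direct = abs(q @ expm_frechet(t * A, E, compute_expm=False) @ u0)
    print(S[i, j], direct)                  # should agree to rounding error

    # Relative sensitivities share the common denominator |q^T e^{tA} u(0)|.
    S_rel = S / abs(q @ expm(t * A) @ u0)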
Predicting Algebraic Error in an ODE
Let’s solve the model ODE
−u′′ = f (x), x ∈ (0, 1), u(0) = u(1) = 0
with the finite element method using piecewise linear basis functions φi .
• Exact solution u(x) = e^{−5(x−0.5)^2} − e^{−5/4} determines f(x)
• Generate a grid of n = 19 equally spaced interior points x_i
• Generate the system Ax = b where A_ij = ∫_0^1 φ_i′ φ_j′ and b_i = f(x_i); A is tridiag(−1, 2, −1) in this case
• Solve with CG iteration (a sketch of this setup follows below)
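A rough Python sketch of this setup, assuming SciPy; the scalings of A and b below follow the standard piecewise linear FEM convention and may differ from the slide's by a factor of the mesh width h:

    import numpy as np
    from scipy.sparse import diags
    from scipy.sparse.linalg import cg

    n = 19                                   # interior grid points
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1 - h, n)

    # f = -u'' for the exact solution u(x) = exp(-5(x-0.5)^2) - exp(-5/4).
    f = lambda x: (10 - 100 * (x - 0.5) ** 2) * np.exp(-5 * (x - 0.5) ** 2)

    # Stiffness matrix for piecewise linear elements: (1/h) * tridiag(-1, 2, -1).
    A = (1.0 / h) * diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n))
    b = h * f(x)                             # simple (lumped) load vector

    u_k, info = cg(A, b, maxiter=8)          # algebraic solution after k = 8 CG iterations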
Algebraic and discretization errors
• Let Vh be our finite element space (dimension 19)
• Let uh ∈ Vh be the best solution possible from Vh
• Let u_est^k be our numerical solution after k iterations of CG
• The discretization error is u − u_h
• The algebraic error is u_h − u_est^k
• The total error is u − u_est^k = alg. err. + disc. err.
• Sometimes the algebraic error dominates the total error; how do we detect this?
Discretization error
[Figure: discretization error u − u_h on (0, 1); values range between about −1.5 × 10^{−3} and 3.5 × 10^{−3}.]
Algebraic Error - 8 CG iterations
[Figure: algebraic error and total error after k = 8 CG iterations; values up to about ±0.015; nodes 9–11 highlighted.]
Algebraic Error - 9 CG iterations
[Figure: algebraic error and total error after k = 9 CG iterations; values up to about ±5 × 10^{−3}; nodes 9–11 highlighted.]
Elementwise sensitivity analysis
• Taking f(A) = A^{−1} we can calculate the sensitivity of each element
• L_f(A, E) = −A^{−1} E A^{−1}, so it is easily computed (see the sketch below)
• Ignore A_ij = 0, since then the two basis functions do not overlap
• Results are plotted on the following heat map
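A minimal Python sketch of this elementwise sensitivity map for f(A) = A^{−1}, using the 1-norm (the norm used in the heat map below):

    import numpy as np

    def inverse_sensitivity_map(A):
        # S[i, j] = ||L_f(A, E_ij)||_1 for f(A) = A^{-1}, for every nonzero A_ij.
        n = A.shape[0]
        Ainv = np.linalg.inv(A)
        S = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                if A[i, j] != 0:             # skip non-overlapping basis functions
                    # L_f(A, E_ij) = -A^{-1} E_ij A^{-1} = -outer(Ainv[:, i], Ainv[j, :])
                    S[i, j] = np.linalg.norm(np.outer(Ainv[:, i], Ainv[j, :]), 1)
        return S

Plotting S for the tridiagonal A above (e.g. with matplotlib's imshow) gives a heat map like the one described next.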
Elementwise sensitivity analysis
[Figure: heat map of the most sensitive elements of A when computing A^{−1} in the 1-norm; sensitivities range from 0 to about 0.6 and are concentrated on rows/columns 9–11, in the middle of the matrix.]
2D Peak Problem
[Figure: surface plot of the 2D peak problem solution on the unit square; values range up to about 0.03.]
Algebraic Error Estimation
[Figure: two surface plots over the unit square. Left: true algebraic error using 7 CG iterations (order 10^{−4}). Right: error in the estimated algebraic error using the 1st Frechet derivative (order 10^{−7}).]
Higher Order Derivatives to Estimate Alg. Err.
[Figure: componentwise error using kth order derivatives, k = 1, 3, 5, plotted on a log scale against component index; the errors range from about 10^{−6} down to 10^{−16}.]
Possible extensions
• Can this be used to modify the discretization mesh to obtain better accuracy? (See Papez, Liesen, and Strakos, 2014)
• Currently too expensive: can we estimate the sensitivities?
• Can this be extended to f (A) = eA (exponential integrators)?
Conclusions
• Explained elementwise sensitivity of matrix functions
• New applications in nuclear physics and FEM analysis
• The former is essentially solved; the latter needs to be made cheaper
Future work:
• Estimate sensitivities more efficiently (block norm estimation)
• Further comparison of the nuclear physics solution to the commercial alternative
• Further analysis of ODE problem
Higher Order Frechet Derivatives
Higher order derivatives can be defined recursively:
L^(k)_f(A + E_{k+1}, E_1, ..., E_k) − L^(k)_f(A, E_1, ..., E_k) = L^(k+1)_f(A, E_1, ..., E_k, E_{k+1}) + o(‖E_{k+1}‖)
Also have a simple method to compute them. For example:
f( [ A  E_1  E_2  0
     0  A    0    E_2
     0  0    A    E_1
     0  0    0    A ] )
  =
  [ f(A)  L_f(A, E_1)  L_f(A, E_2)  L^(2)_f(A, E_1, E_2)
    0     f(A)         0            L_f(A, E_2)
    0     0            f(A)         L_f(A, E_1)
    0     0            0            f(A) ]
More info in Higham & Relton, SIMAX 35(4), 2014.