Taylor’s Theorem for Matrix Functions and Pseudospectral Bounds on the Condition Number
Samuel Relton
[email protected] @sdrelton
samrelton.com blog.samrelton.com
Joint work with Edvin [email protected]
University of Strathclyde
June 23rd, 2015
Sam Relton (UoM) Taylor’s Theorem for f (A) June 23rd, 2015 1 / 21
Outline
• Taylor’s Theorem for Scalar Functions
• Matrix Functions, their Derivatives, and the Condition Number
• Taylor’s Theorem for Matrix Functions
• Pseudospectral Bounds on the Condition Number
• Numerical Experiments
Taylor’s Theorem - 1
Theorem (Taylor’s Theorem)
When $f : \mathbb{R} \to \mathbb{R}$ is $k$ times continuously differentiable at $a \in \mathbb{R}$ there exists $R_k : \mathbb{R} \to \mathbb{R}$ such that
$$f(x) = \sum_{j=0}^{k} \frac{f^{(j)}(a)}{j!}(x-a)^j + R_k(x).$$
Different expressions for the remainder term Rk(x) include
• the Lagrange form.
• the Cauchy form.
• the contour integral form.
Taylor’s Theorem - 2
We can extend this to complex analytic functions.
If $f(z)$ is complex analytic in an open set $D \subset \mathbb{C}$ then for any $a \in D$
$$f(z) = \sum_{j=0}^{k} \frac{f^{(j)}(a)}{j!}(z-a)^j + R_k(z),$$
where
$$R_k(z) = \frac{(z-a)^{k+1}}{2\pi i}\int_\Gamma \frac{f(w)\,dw}{(w-a)^{k+1}(w-z)},$$
and $\Gamma$ is a closed curve in $D$ enclosing $a$ and $z$.
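The contour-integral remainder can be checked numerically. The following is a minimal sketch, assuming f = exp, a circular contour centred at a, and the trapezoidal rule for the integral; the function names, the radius and the number of nodes are illustrative choices, not part of the talk.

```python
import numpy as np
from math import factorial

def taylor_poly_exp(z, a, k):
    """Degree-k Taylor polynomial of exp about a (every derivative of exp is exp)."""
    return sum(np.exp(a) * (z - a) ** j / factorial(j) for j in range(k + 1))

def contour_remainder(f, z, a, k, radius, num_nodes=256):
    """Approximate R_k(z) by the trapezoidal rule on the circle |w - a| = radius.

    The circle must enclose both a and z, and f must be analytic on and inside it.
    """
    theta = 2 * np.pi * np.arange(num_nodes) / num_nodes
    w = a + radius * np.exp(1j * theta)
    dw = 1j * radius * np.exp(1j * theta) * (2 * np.pi / num_nodes)
    integrand = f(w) / ((w - a) ** (k + 1) * (w - z))
    return (z - a) ** (k + 1) / (2j * np.pi) * np.sum(integrand * dw)

a, z, k = 0.5, 1.2 + 0.3j, 4
exact_remainder = np.exp(z) - taylor_poly_exp(z, a, k)
quadrature_remainder = contour_remainder(np.exp, z, a, k, radius=2.0)
remainder_gap = abs(exact_remainder - quadrature_remainder)
print(remainder_gap)
```

The two values of the remainder should agree to roughly machine precision, since the trapezoidal rule converges very quickly for periodic analytic integrands.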
Matrix Functions
We are interested in extending this to matrix functions $f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$.
For example:
• the matrix exponential
$$e^A = \sum_{j=0}^{\infty} \frac{A^j}{j!}.$$
• the matrix cosine
$$\cos(A) = \sum_{j=0}^{\infty} \frac{(-1)^j A^{2j}}{(2j)!}.$$
Applications include:
• Differential equations: $\frac{du}{dt} = Au(t)$, $u(t) = e^{tA}u(0)$.
• Second order ODEs with sine and cosine.
• Ranking importance of nodes in a graph, etc.
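The two series above can be summed directly for a small matrix and compared with SciPy's `expm` and `cosm`. This is a sketch for illustration only: the test matrix and the truncation at 30 terms are arbitrary choices, and truncated power series are not how these functions are computed in practice.

```python
import numpy as np
from scipy.linalg import expm, cosm

def exp_series(A, terms=30):
    """Partial sum of sum_j A^j / j!."""
    total = np.zeros_like(A, dtype=float)
    term = np.eye(A.shape[0])          # A^0 / 0!
    for j in range(terms):
        total += term
        term = term @ A / (j + 1)      # A^(j+1) / (j+1)!
    return total

def cos_series(A, terms=30):
    """Partial sum of sum_j (-1)^j A^(2j) / (2j)!."""
    total = np.zeros_like(A, dtype=float)
    term = np.eye(A.shape[0])
    A2 = A @ A
    for j in range(terms):
        total += term
        term = -term @ A2 / ((2 * j + 1) * (2 * j + 2))
    return total

A = np.array([[0.0, 1.0], [-2.0, -0.3]])
exp_gap = np.linalg.norm(exp_series(A) - expm(A))
cos_gap = np.linalg.norm(cos_series(A) - cosm(A))
print(exp_gap, cos_gap)
```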
Fréchet derivatives
Let $f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ be a matrix function.
Definition (Fréchet derivative)
The Fréchet derivative of $f$ at $A$ is the unique linear function $L_f(A,\cdot) : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ such that for all $E$
$$f(A+E) - f(A) - L_f(A,E) = o(\|E\|).$$
• $L_f(A,E)$ is just a linear approximation to $f(A+E) - f(A)$.
• Higher order derivatives are defined recursively (Higham & R., 2014).
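The defining o(‖E‖) property can be observed numerically. A minimal sketch, assuming f = exp and using SciPy's `expm_frechet` for $L_f(A,E)$; the matrices and step sizes are arbitrary. If the residual is second order in the perturbation, the ratio residual / h should shrink roughly in proportion to h.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

A = np.array([[0.0, 1.0], [-2.0, -0.3]])
E = np.array([[0.5, -1.0], [1.0, 0.2]])
E /= np.linalg.norm(E, 2)              # unit-norm direction

# With the default compute_expm=True this returns (exp(A), L_exp(A, E)).
expA, L = expm_frechet(A, E)

ratios = []
for h in (1e-2, 1e-3, 1e-4):
    residual = expm(A + h * E) - expA - h * L
    ratios.append(np.linalg.norm(residual, 2) / h)   # should tend to 0 with h

print(ratios)
```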
Condition Numbers
A condition number describes the sensitivity of $f$ at $A$ to small perturbations arising from rounding error etc.
The absolute condition number is given by
$$\mathrm{cond}_{\mathrm{abs}}(f,A) := \lim_{\varepsilon\to 0}\sup_{\|E\|\le\varepsilon}\frac{\|f(A+E)-f(A)\|}{\varepsilon} = \max_{\|E\|=1}\|L_f(A,E)\|,$$
whilst the relative condition number is
$$\mathrm{cond}_{\mathrm{rel}}(f,A) := \mathrm{cond}_{\mathrm{abs}}(f,A)\,\frac{\|A\|}{\|f(A)\|}.$$
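For small matrices the maximum over unit-norm E can be computed exactly in the Frobenius norm by assembling the matrix of the linear map E ↦ L_f(A, E) and taking its 2-norm. The sketch below assumes f = exp and the Frobenius norm (the definitions above hold for any norm); `cond_abs_expm_fro` is a hypothetical helper name, and the result is compared with SciPy's `expm_cond`, which returns the relative condition number in the Frobenius norm.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet, expm_cond

def cond_abs_expm_fro(A):
    """Absolute condition number of exp at A in the Frobenius norm.

    Column `col` of K holds vec(L_exp(A, E)) for the basis matrix E with a
    single 1 in position `col`, so K represents E -> L_exp(A, E) and
    ||K||_2 = max over ||E||_F = 1 of ||L_exp(A, E)||_F.
    """
    n = A.shape[0]
    K = np.zeros((n * n, n * n))
    for col in range(n * n):
        E = np.zeros(n * n)
        E[col] = 1.0
        L = expm_frechet(A, E.reshape(n, n), compute_expm=False)
        K[:, col] = L.ravel()
    return np.linalg.norm(K, 2)

A = np.array([[0.0, 1.0], [-2.0, -0.3]])
cond_abs = cond_abs_expm_fro(A)
cond_rel = cond_abs * np.linalg.norm(A, "fro") / np.linalg.norm(expm(A), "fro")
reference = expm_cond(A)
print(cond_abs, cond_rel, reference)
```

This costs n² Fréchet derivative evaluations, which is why cheaper bounds and estimates are of interest for larger n.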
Matrix Functions and Taylor’s Theorem - 1
Previous results combining these two ideas include:
• an expansion around αI
$$f(A) = \sum_{j=0}^{\infty}\frac{f^{(j)}(\alpha)}{j!}(A-\alpha I)^j.$$
• an expansion in terms of derivatives
$$f(A+E) = \sum_{j=0}^{\infty}\frac{1}{j!}\,\frac{d^j}{dt^j}\bigg|_{t=0} f(A+tE).$$
Note that:
• neither expansion has an explicit remainder term.
• $\frac{d^j}{dt^j}\big|_{t=0} f(A+tE) = L_f(A,E)$ when $j=1$.
Matrix Functions and Taylor’s Theorem - 2
Let us take $D^{[j]}_f(A,E) := \frac{d^j}{dt^j}\big|_{t=0} f(A+tE)$; then we have the following.
Theorem (Taylor’s Theorem for Matrix Functions)
Let $f : \mathbb{C}^{n\times n}\to\mathbb{C}^{n\times n}$ be analytic in an open set $D\subset\mathbb{C}$ with $A$, $E$ satisfying $\Lambda(A), \Lambda(A+E)\subset D$. Then
$$f(A+E) = T_k(A,E) + R_k(A,E),$$
where
$$T_k(A,E) = \sum_{j=0}^{k}\frac{1}{j!}D^{[j]}_f(A,E),$$
and
$$R_k(A,E) = \frac{1}{2\pi i}\int_\Gamma f(z)(zI-A-E)^{-1}\left[E(zI-A)^{-1}\right]^{k+1}dz,$$
where $\Gamma$ is a closed contour enclosing $\Lambda(A)$ and $\Lambda(A+E)$.
Matrix Functions and Taylor’s Theorem - 3
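As an example take $f(z) = z^{-1}$.
$$D^{[1]}_{z^{-1}}(A,E) = -A^{-1}EA^{-1},$$
$$D^{[2]}_{z^{-1}}(A,E) = 2A^{-1}EA^{-1}EA^{-1}.$$
Therefore we have
$$(A+E)^{-1} = \frac{1}{0!}A^{-1} - \frac{1}{1!}A^{-1}EA^{-1} + \frac{2}{2!}A^{-1}EA^{-1}EA^{-1} + \frac{1}{2\pi i}\int_\Gamma \frac{1}{z}(zI-A-E)^{-1}\left[E(zI-A)^{-1}\right]^{3}dz.$$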
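The expansion of the inverse, including its contour-integral remainder with the factor 1/(2πi) from the theorem, can be verified numerically. A minimal sketch under assumptions not in the talk: A is a small perturbation of 3I so that a circle of radius 1.5 about 3 encloses Λ(A) and Λ(A + E) but excludes the pole of 1/z at 0, and the integral is approximated by the trapezoidal rule. The helper name `remainder_by_quadrature` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
G = rng.standard_normal((n, n))
H = rng.standard_normal((n, n))
A = 3.0 * np.eye(n) + 0.3 * G / np.linalg.norm(G, 2)   # eigenvalues within 0.3 of 3
E = 0.05 * H / np.linalg.norm(H, 2)                    # ||E||_2 = 0.05

Ainv = np.linalg.inv(A)
T2 = Ainv - Ainv @ E @ Ainv + Ainv @ E @ Ainv @ E @ Ainv

def remainder_by_quadrature(A, E, k, center, radius, num_nodes=200):
    """Trapezoidal-rule approximation of R_k(A, E) for f(z) = 1/z on a circle."""
    I = np.eye(A.shape[0])
    R = np.zeros(A.shape, dtype=complex)
    for theta in 2 * np.pi * np.arange(num_nodes) / num_nodes:
        z = center + radius * np.exp(1j * theta)
        res_A = np.linalg.inv(z * I - A)
        res_AE = np.linalg.inv(z * I - A - E)
        integrand = (1.0 / z) * res_AE @ np.linalg.matrix_power(E @ res_A, k + 1)
        # dz = i * radius * e^{i theta} dtheta; the i cancels against 1/(2 pi i)
        R += integrand * (radius * np.exp(1j * theta))
    return R / num_nodes

R2 = remainder_by_quadrature(A, E, k=2, center=3.0, radius=1.5)
taylor_gap = np.linalg.norm(np.linalg.inv(A + E) - (T2 + R2.real), 2)
imag_part = np.linalg.norm(R2.imag, 2)
print(taylor_gap, imag_part)
```

For real A and E the remainder is real, so the imaginary part of the quadrature result should be at rounding-error level.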
Applying Pseudospectral Theory - 1
Recall that the ε-pseudospectrum of X is the set
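$$\Lambda_\varepsilon(X) = \{z\in\mathbb{C} : \|(zI-X)^{-1}\|\ge\varepsilon^{-1}\}.$$
The ε-pseudospectral radius is $\rho_\varepsilon = \max |z|$ for $z\in\Lambda_\varepsilon(X)$.
Using this, and noting that the integrand of $R_k(A,E)$ contains $k+2$ resolvents, each of norm at most $\varepsilon^{-1}$ on $\Gamma_\varepsilon$, we can bound the remainder term by
$$\|R_k(A,E)\| \le \frac{\|E\|^{k+1}L_\varepsilon}{2\pi\varepsilon^{k+2}}\max_{z\in\Gamma_\varepsilon}|f(z)|,$$
where
• $\Gamma_\varepsilon$ is a contour enclosing $\Lambda_\varepsilon(A)$ and $\Lambda_\varepsilon(A+E)$.
• $L_\varepsilon$ is the length of the contour $\Gamma_\varepsilon$.
• $\varepsilon$ is a parameter to be chosen.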
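The pseudospectrum definition above can be explored on a grid. A crude sketch, assuming the 2-norm, in which ‖(zI − X)⁻¹‖ = 1/σ_min(zI − X), so that z lies in Λ_ε(X) exactly when σ_min(zI − X) ≤ ε. The function name and grid parameters are illustrative; dedicated algorithms for the pseudospectral radius are far more efficient and accurate than a grid search.

```python
import numpy as np

def pseudospectral_radius_grid(X, eps, half_width, num_points=161):
    """Grid estimate of the eps-pseudospectral radius of X in the 2-norm.

    Evaluates sigma_min(zI - X) on a square grid of side 2 * half_width centred
    at 0 and returns the largest |z| among grid points with sigma_min <= eps.
    Assumes at least one grid point lies in the pseudospectrum.
    """
    n = X.shape[0]
    xs = np.linspace(-half_width, half_width, num_points)
    Z = xs[None, :] + 1j * xs[:, None]                    # grid of complex points
    shifted = Z[:, :, None, None] * np.eye(n) - X         # zI - X at every point
    sigma_min = np.linalg.svd(shifted, compute_uv=False)[..., -1]
    inside = sigma_min <= eps
    return np.abs(Z[inside]).max()

# For a normal matrix the 2-norm pseudospectrum is a union of closed discs of
# radius eps about the eigenvalues, so the exact radius here is 2 + 0.5 = 2.5.
X = np.diag([1.0, -2.0])
rho_estimate = pseudospectral_radius_grid(X, eps=0.5, half_width=4.0)
print(rho_estimate)
```

The estimate can only undershoot the true radius, by an amount comparable to the grid spacing.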
Applying Pseudospectral Theory - 2
Applying this to R0(A, E ) gives a bound on the condition number.
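$$\mathrm{cond}_{\mathrm{abs}}(f,A) \le \frac{L_\varepsilon}{2\pi\varepsilon^2}\max_{z\in\Gamma_\varepsilon}|f(z)|,$$
where $\Gamma_\varepsilon$ encloses $\Lambda_\varepsilon(A)$ and has length $L_\varepsilon$.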
Interesting because:
• Usually only lower bounds on condition number are known.
• Computing (or estimating) this efficiently could be of considerable interest in practice or for algorithm design.
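One way to see how the remainder with k = 0 leads to the condition number bound is sketched below. It assumes that $T_0(A,E) = f(A)$, that $\Gamma_\varepsilon$ also encloses $\Lambda_\varepsilon(A+E)$ for all sufficiently small $E$, and that both resolvents therefore have norm at most $\varepsilon^{-1}$ on $\Gamma_\varepsilon$. The symbol $\delta$ is introduced here for the limit in the definition of the condition number, to avoid a clash with the pseudospectral parameter $\varepsilon$.

```latex
\begin{align*}
f(A+E) - f(A) &= R_0(A,E)
  = \frac{1}{2\pi i}\int_{\Gamma_\varepsilon} f(z)\,(zI-A-E)^{-1}\,E\,(zI-A)^{-1}\,dz, \\
\|f(A+E) - f(A)\| &\le \frac{L_\varepsilon}{2\pi}\,
  \max_{z\in\Gamma_\varepsilon}|f(z)| \cdot \frac{1}{\varepsilon}\,\|E\|\,\frac{1}{\varepsilon}
  = \frac{\|E\|\,L_\varepsilon}{2\pi\varepsilon^{2}}\max_{z\in\Gamma_\varepsilon}|f(z)|, \\
\mathrm{cond}_{\mathrm{abs}}(f,A)
  &= \lim_{\delta\to 0}\,\sup_{\|E\|\le\delta}\frac{\|f(A+E)-f(A)\|}{\delta}
  \le \frac{L_\varepsilon}{2\pi\varepsilon^{2}}\max_{z\in\Gamma_\varepsilon}|f(z)|.
\end{align*}
```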
The Condition Number of $A^t$ - 1
This upper bound is extremely efficient to compute for the matrix function given by $f(x) = x^t$ for $t\in(0,1)$.
Our experiments will
• determine how tight the upper bound is as ε changes.
• see how fast evaluating the upper bound is in comparison to computing the condition number exactly.
Other methods for this problem are:
• “CN Exact” – computes condition number exactly.
• “CN Normest” – lower bound using norm estimator.
The Condition Number of $A^t$ - 2
This function has a branch cut along the negative real line, meaning we need to choose a keyhole contour. Overall:
$$\mathrm{cond}_{\mathrm{abs}}(x^t, A) \le \frac{2(\pi+1)\rho_\varepsilon^{1+t}\sqrt{n}}{2\pi\varepsilon^2},$$
where $\rho_\varepsilon$ is the ε-pseudospectral radius, computed using code by Guglielmi and Overton.
Note: There is an upper limit for ε where the pseudospectrum intersects the branch cut. We need to take ε smaller than this value.
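A rough numerical illustration of the bound above, under assumptions that are not in the slides: A is taken to be diagonal (hence normal) with positive eigenvalues, so that in the 2-norm ρ_ε = ρ(A) + ε, and for such A the Fréchet derivative acts entrywise through divided differences of x^t, which makes the exact Frobenius-norm condition number the largest divided difference over pairs of eigenvalues. The function names are hypothetical, and ε is chosen below the smallest eigenvalue so that the pseudospectrum stays away from the branch cut.

```python
import numpy as np

def keyhole_bound(rho_eps, eps, t, n):
    """Evaluate 2 (pi + 1) rho_eps^(1 + t) sqrt(n) / (2 pi eps^2)."""
    return 2 * (np.pi + 1) * rho_eps ** (1 + t) * np.sqrt(n) / (2 * np.pi * eps ** 2)

def divided_difference(x, y, t):
    """First divided difference of x^t, with the derivative on the diagonal."""
    if x == y:
        return t * x ** (t - 1)
    return (x ** t - y ** t) / (x - y)

eigenvalues = np.array([1.0, 2.0, 3.0])     # A = diag(1, 2, 3), an assumed test case
t, eps = 0.5, 0.5                           # eps < smallest eigenvalue

# Normal matrix: the pseudospectrum is a union of discs of radius eps.
rho_eps = np.abs(eigenvalues).max() + eps
bound = keyhole_bound(rho_eps, eps, t, n=len(eigenvalues))

# Exact Frobenius-norm condition number for diagonal A.
exact = max(abs(divided_difference(x, y, t))
            for x in eigenvalues for y in eigenvalues)

print(bound, exact)
```

Repeating the evaluation over a range of ε values below the limit gives a feel for how the tightness of the bound depends on ε in this simple case.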
Test matrix - Grcar matrix
[Figure: plot for the Grcar test matrix (image not recovered).]
CN Bound as ε varies
[Figure: condition number bound as ε varies, logarithmic axes. Legend: CN Pseudo, CN Exact, CN Normest.]
Test matrix - Almost neg. eigenvalues
[Figure: plot for the test matrix with almost negative eigenvalues (image not recovered).]
CN Bound as ε varies
[Figure: condition number bound as ε varies, logarithmic axes. Legend: CN Pseudo, CN Exact, CN Normest.]
Runtime Comparison - Timings
[Figure: runtime (s) against n for n up to 200, logarithmic runtime axis. Legend: CN Normest and CN Pseudo, each for t = 1/5, t = 1/10 and t = 1/15.]
Runtime Comparison - Speedup
[Figure: speedup against n for n up to 200, speedup axis from 0 to 1000. Legend: t = 1/5, t = 1/10, t = 1/15.]
Conclusions
• Extended Taylor’s theorem to matrix functions.
• Applied pseudospectral theory to bound remainder term.
• Bounds are very efficient to compute for $A^t$.
• If the bound is unsatisfactorily large, we can revert to a more precise method.
Future work:
• Apply to algorithm design.
• Find other classes of functions for which this is efficient.