taylor's theorem for matrix functions and pseudospectral bounds on the condition number

Taylor’s Theorem for Matrix Functions and Pseudospectral Bounds on the Condition Number

Samuel Relton

[email protected] @sdrelton

samrelton.com blog.samrelton.com

Joint work with Edvin [email protected]

University of Strathclyde

June 23rd, 2015

Sam Relton (UoM) Taylor’s Theorem for f (A) June 23rd, 2015 1 / 21


Outline

• Taylor’s Theorem for Scalar Functions

• Matrix Functions, their Derivatives, and the Condition Number

• Taylor’s Theorem for Matrix Functions

• Pseudospectral Bounds on the Condition Number

• Numerical Experiments


Taylor’s Theorem - 1

Theorem (Taylor’s Theorem)

When $f : \mathbb{R} \to \mathbb{R}$ is $k$ times continuously differentiable at $a \in \mathbb{R}$ there exists $R_k : \mathbb{R} \to \mathbb{R}$ such that

$$f(x) = \sum_{j=0}^{k} \frac{f^{(j)}(a)}{j!}(x - a)^j + R_k(x).$$

Different expressions for the remainder term $R_k(x)$ include

• the Lagrange form.

• the Cauchy form.

• the contour integral form.


Taylor’s Theorem - 2

We can extend this to complex analytic functions.

If $f(z)$ is complex analytic in an open set $D \subset \mathbb{C}$ then for any $a \in D$

$$f(z) = \sum_{j=0}^{k} \frac{f^{(j)}(a)}{j!}(z - a)^j + R_k(z),$$

where

$$R_k(z) = \frac{(z - a)^{k+1}}{2\pi i} \int_{\Gamma} \frac{f(w)\,dw}{(w - a)^{k+1}(w - z)},$$

and $\Gamma$ is a closed curve in $D$ enclosing $a$ and $z$.


Matrix Functions

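We are interested in extending this to matrix functions $f : \mathbb{C}^{n \times n} \to \mathbb{C}^{n \times n}$.

For example:

• the matrix exponential

$$e^A = \sum_{j=0}^{\infty} \frac{A^j}{j!}.$$

• the matrix cosine

$$\cos(A) = \sum_{j=0}^{\infty} \frac{(-1)^j A^{2j}}{(2j)!}.$$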

Applications include:

• Differential equations: $\frac{du}{dt} = Au(t)$ has solution $u(t) = e^{tA}u(0)$.

• Second order ODEs with sine and cosine.

• Ranking the importance of nodes in a graph, etc.
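
The two series and the differential equation example can be checked numerically. The following is a minimal sketch, assuming NumPy and SciPy are available; the helper names `expm_series` and `cosm_series` and the 2-by-2 test matrix are illustrative choices, not part of the talk. Truncating a power series is used here only to mirror the definitions and is not how production codes evaluate matrix functions.

```python
import numpy as np
from scipy.linalg import expm, cosm


def expm_series(A, terms=30):
    """Truncated series sum_{j=0}^{terms-1} A^j / j!."""
    term = np.eye(A.shape[0])  # j = 0 term
    total = term.copy()
    for j in range(1, terms):
        term = term @ A / j  # A^j / j! built from A^(j-1) / (j-1)!
        total += term
    return total


def cosm_series(A, terms=30):
    """Truncated series sum_{j=0}^{terms-1} (-1)^j A^(2j) / (2j)!."""
    A2 = A @ A
    term = np.eye(A.shape[0])  # j = 0 term
    total = term.copy()
    for j in range(1, terms):
        # Multiply the previous term by -A^2 / ((2j - 1)(2j)).
        term = -(term @ A2) / ((2 * j - 1) * (2 * j))
        total += term
    return total


# Illustrative test problem: du/dt = A u with a rotation generator A.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
u0 = np.array([1.0, 0.0])
t = 0.7
u_t = expm(t * A) @ u0  # solution u(t) = e^{tA} u(0)

print(np.allclose(expm_series(A), expm(A)))
print(np.allclose(cosm_series(A), cosm(A)))
print(u_t, [np.cos(t), -np.sin(t)])
```

For this particular $A$ the exact solution is $u(t) = (\cos t, -\sin t)$, which gives a simple independent check on $e^{tA}u(0)$.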


Fréchet Derivatives

Let $f : \mathbb{C}^{n \times n} \to \mathbb{C}^{n \times n}$ be a matrix function.

Definition (Fréchet derivative)

The Fréchet derivative of $f$ at $A$ is the unique linear function $L_f(A, \cdot) : \mathbb{C}^{n \times n} \to \mathbb{C}^{n \times n}$ such that for all $E$

$$f(A + E) - f(A) - L_f(A, E) = o(\|E\|).$$

• $L_f(A, E)$ is just a linear approximation to $f(A + E) - f(A)$.

• Higher order derivatives are defined recursively (Higham & R., 2014).
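
The defining property can be observed numerically for $f = \exp$. The sketch below assumes SciPy's `scipy.linalg.expm_frechet`, which returns both $e^A$ and $L_{\exp}(A, E)$; the random test matrices and step sizes are arbitrary illustrative choices. If the definition holds, the linearization error should shrink faster than the step size $h$ (roughly like $h^2$ for a smooth function).

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
E = rng.standard_normal((4, 4))
E /= np.linalg.norm(E)  # scale the direction to unit Frobenius norm

exp_A, L = expm_frechet(A, E)  # exp(A) and the Frechet derivative L_exp(A, E)


def linearization_error(h):
    """||f(A + hE) - f(A) - L_f(A, hE)||_F, using linearity: L_f(A, hE) = h L_f(A, E)."""
    return np.linalg.norm(expm(A + h * E) - exp_A - h * L)


steps = [1e-2, 1e-3, 1e-4]
errors = [linearization_error(h) for h in steps]
for h, err in zip(steps, errors):
    print(f"h = {h:.0e}, error = {err:.3e}")
```

Each tenfold reduction in $h$ should reduce the error by roughly a factor of one hundred, consistent with the $o(\|E\|)$ requirement.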


Condition Numbers

A condition number describes the sensitivity of $f$ at $A$ to small perturbations arising from rounding error etc.

The absolute condition number is given by

$$\mathrm{cond}_{\mathrm{abs}}(f, A) := \lim_{\varepsilon \to 0} \sup_{\|E\| \le \varepsilon} \frac{\|f(A + E) - f(A)\|}{\varepsilon} = \max_{\|E\| = 1} \|L_f(A, E)\|,$$

whilst the relative condition number is

$$\mathrm{cond}_{\mathrm{rel}}(f, A) := \mathrm{cond}_{\mathrm{abs}}(f, A) \frac{\|A\|}{\|f(A)\|}.$$
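
For small matrices the maximum over $\|E\| = 1$ can be computed exactly in the Frobenius norm. The sketch below is one way to do this for $f = \exp$, under the assumptions that $A$ is real and that the Frobenius norm is used: because $L_f(A, \cdot)$ is linear, it can be written as an $n^2 \times n^2$ matrix $K$ acting on the entries of $E$, and the absolute condition number is then $\|K\|_2$. The function name `expm_cond_abs` is hypothetical, and the cost grows quickly with $n$ since $n^2$ Fréchet derivatives are needed.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet


def expm_cond_abs(A):
    """Absolute condition number of exp at a real matrix A in the Frobenius norm."""
    n = A.shape[0]
    K = np.empty((n * n, n * n))
    for col in range(n * n):
        # E is the col-th unit matrix (row-major ordering of entries).
        E = np.zeros(n * n)
        E[col] = 1.0
        E = E.reshape(n, n)
        # Column of K holds the entries of L_exp(A, E) in the same ordering.
        K[:, col] = expm_frechet(A, E, compute_expm=False).ravel()
    return np.linalg.norm(K, 2)  # largest singular value of K


rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
cond_abs = expm_cond_abs(A)
cond_rel = cond_abs * np.linalg.norm(A) / np.linalg.norm(expm(A))
print(cond_abs, cond_rel)
```

A quick sanity check is $A = aI$: every $E$ commutes with $A$, so $L_{\exp}(A, E) = e^{a}E$ and the absolute condition number is $e^{a}$.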


Matrix Functions and Taylor’s Theorem - 1

Previous results combining these two ideas include:

• an expansion around $\alpha I$

$$f(A) = \sum_{j=0}^{\infty} \frac{f^{(j)}(\alpha)}{j!}(A - \alpha I)^j.$$

• an expansion in terms of derivatives

$$f(A + E) = \sum_{j=0}^{\infty} \frac{1}{j!} \frac{d^j}{dt^j}\bigg|_{t=0} f(A + tE).$$

Note that:

• neither expansion has an explicit remainder term.

• $\frac{d^j}{dt^j}\big|_{t=0} f(A + tE) = L_f(A, E)$ when $j = 1$.


Matrix Functions and Taylor’s Theorem - 2

Let us take $D_f^{[j]}(A, E) := \frac{d^j}{dt^j}\big|_{t=0} f(A + tE)$; then we have the following.

Theorem (Taylor’s Theorem for Matrix Functions)

Let $f : \mathbb{C}^{n \times n} \to \mathbb{C}^{n \times n}$ be analytic in an open set $D \subset \mathbb{C}$ with $A$, $E$ satisfying $\Lambda(A), \Lambda(A + E) \subset D$. Then

$$f(A + E) = T_k(A, E) + R_k(A, E),$$

where

$$T_k(A, E) = \sum_{j=0}^{k} \frac{1}{j!} D_f^{[j]}(A, E),$$

and

$$R_k(A, E) = \frac{1}{2\pi i} \int_{\Gamma} f(z)(zI - A - E)^{-1}\left[E(zI - A)^{-1}\right]^{k+1} dz,$$

where Γ is a closed contour enclosing Λ(A) and Λ(A + E ).


Matrix Functions and Taylor’s Theorem - 3

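As an example take $f(z) = z^{-1}$.

$$D_{z^{-1}}^{[1]}(A, E) = -A^{-1}EA^{-1},$$

$$D_{z^{-1}}^{[2]}(A, E) = 2A^{-1}EA^{-1}EA^{-1}.$$

Therefore we have

$$(A + E)^{-1} = \frac{1}{0!}A^{-1} - \frac{1}{1!}A^{-1}EA^{-1} + \frac{2}{2!}A^{-1}EA^{-1}EA^{-1} + \frac{1}{2\pi i}\int_{\Gamma} \frac{1}{z}(zI - A - E)^{-1}\left[E(zI - A)^{-1}\right]^{3} dz.$$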
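
This expansion of $(A + E)^{-1}$ can be verified numerically. The sketch below approximates the remainder integral, including the $\frac{1}{2\pi i}$ factor from the theorem, with the trapezoidal rule on a circle. The circle, the test matrices, and the helper name `inverse_remainder` are illustrative assumptions: the matrices are built so that the spectra of $A$ and $A + E$ lie within distance 0.7 of 3, and the circle of radius 2 about 3 encloses them while excluding the pole of $1/z$ at the origin.

```python
import numpy as np


def inverse_remainder(A, E, k, center, radius, num_nodes=200):
    """Trapezoidal-rule approximation of R_k(A, E) for f(z) = 1/z on a circle."""
    n = A.shape[0]
    I = np.eye(n)
    total = np.zeros((n, n), dtype=complex)
    for theta in 2 * np.pi * np.arange(num_nodes) / num_nodes:
        z = center + radius * np.exp(1j * theta)
        dz_dtheta = 1j * radius * np.exp(1j * theta)
        resolvent_A = np.linalg.inv(z * I - A)
        resolvent_AE = np.linalg.inv(z * I - A - E)
        integrand = (1 / z) * resolvent_AE @ np.linalg.matrix_power(E @ resolvent_A, k + 1)
        total += integrand * dz_dtheta
    # Quadrature weight 2*pi/num_nodes, then the 1/(2*pi*i) prefactor.
    return total * (2 * np.pi / num_nodes) / (2j * np.pi)


rng = np.random.default_rng(1)
n = 4
G = rng.standard_normal((n, n))
H = rng.standard_normal((n, n))
A = 3 * np.eye(n) + 0.5 * G / np.linalg.norm(G, 2)  # spectrum within 0.5 of 3
E = 0.2 * H / np.linalg.norm(H, 2)                  # ||E||_2 = 0.2

A_inv = np.linalg.inv(A)
T2 = A_inv - A_inv @ E @ A_inv + A_inv @ E @ A_inv @ E @ A_inv
R2 = inverse_remainder(A, E, k=2, center=3.0, radius=2.0)

print(np.allclose(T2 + R2, np.linalg.inv(A + E)))
```

For this function the remainder also has the closed form $R_2(A, E) = -(A + E)^{-1}(EA^{-1})^3$, which follows from the Neumann series and provides a second check on the quadrature.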


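Applying Pseudospectral Theory - 1

Recall that the $\varepsilon$-pseudospectrum of $X$ is the set

$$\Lambda_\varepsilon(X) = \{z \in \mathbb{C} : \|(zI - X)^{-1}\| \ge \varepsilon^{-1}\}.$$

The $\varepsilon$-pseudospectral radius is $\rho_\varepsilon = \max\{|z| : z \in \Lambda_\varepsilon(X)\}$.

[Figure: pseudospectra plotted in the complex plane; horizontal axis from -1 to 3, vertical axis from -3 to 3, colour bar from -2.5 to -1.]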



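Using this we can bound the remainder term. On a contour $\Gamma_\varepsilon$ enclosing both pseudospectra, each of the $k + 2$ resolvent factors in $R_k(A, E)$ has norm at most $\varepsilon^{-1}$, so

$$\|R_k(A, E)\| \le \frac{\|E\|^{k+1} L_\varepsilon}{2\pi \varepsilon^{k+2}} \max_{z \in \Gamma_\varepsilon} |f(z)|,$$

where

• $\Gamma_\varepsilon$ is a contour enclosing $\Lambda_\varepsilon(A)$ and $\Lambda_\varepsilon(A + E)$.

• $L_\varepsilon$ is the length of the contour $\Gamma_\varepsilon$.

• $\varepsilon$ is a parameter to be chosen.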
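
The pseudospectrum and pseudospectral radius entering this bound can be approximated on a grid. The sketch below assumes the 2-norm, in which $\|(zI - X)^{-1}\|_2 = 1/\sigma_{\min}(zI - X)$, so that $z \in \Lambda_\varepsilon(X)$ exactly when $\sigma_{\min}(zI - X) \le \varepsilon$. The function names and the grid are illustrative, and a grid search is a crude stand-in for a dedicated pseudospectral radius algorithm: it can only underestimate $\rho_\varepsilon$, by an amount that depends on the grid spacing.

```python
import numpy as np


def min_singular_values(X, re, im):
    """sigma_min(zI - X) for every z = x + iy with x in re and y in im."""
    n = X.shape[0]
    Z = re[None, :] + 1j * im[:, None]
    shifted = Z[..., None, None] * np.eye(n) - X  # shape (len(im), len(re), n, n)
    # Batched SVD; singular values are sorted in descending order.
    return np.linalg.svd(shifted, compute_uv=False)[..., -1]


def pseudospectral_radius_on_grid(X, eps, re, im):
    """Largest |z| over grid points z with ||(zI - X)^{-1}||_2 >= 1/eps."""
    Z = re[None, :] + 1j * im[:, None]
    inside = min_singular_values(X, re, im) <= eps
    return np.abs(Z[inside]).max()


re = np.linspace(-4.0, 4.0, 161)
im = np.linspace(-4.0, 4.0, 161)
X = np.diag([1.0, 2.0])  # normal matrix: the pseudospectrum is a union of discs
rho = pseudospectral_radius_on_grid(X, 0.5, re, im)
print(rho)
```

For a normal matrix the $\varepsilon$-pseudospectrum in the 2-norm is the union of discs of radius $\varepsilon$ around the eigenvalues, so here the true value is $\rho_\varepsilon = 2 + 0.5$ and the grid estimate should agree to within one grid spacing. Non-normal matrices can have much larger pseudospectra.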


Applying Pseudospectral Theory - 2

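Since $R_0(A, E) = f(A + E) - f(A)$, applying this with $k = 0$ gives a bound on the condition number.

$$\mathrm{cond}_{\mathrm{abs}}(f, A) \le \frac{L_\varepsilon}{2\pi \varepsilon^{2}} \max_{z \in \Gamma_\varepsilon} |f(z)|,$$

where $\Gamma_\varepsilon$ encloses $\Lambda_\varepsilon(A)$ and has length $L_\varepsilon$.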

Interesting because:

• Usually only lower bounds on the condition number are known.

• Computing (or estimating) this efficiently could be of considerable interest in practice or for algorithm design.


The Condition Number of $A^t$ - 1

This upper bound is extremely efficient to compute for the matrix function given by $f(x) = x^t$ for $t \in (0, 1)$.

Our experiments will

• determine how tight the upper bound is as $\varepsilon$ changes.

• see how fast evaluating the upper bound is in comparison to computing the condition number exactly.

Other methods for this problem are:

• “CN Exact” – computes condition number exactly.

• “CN Normest” – lower bound using norm estimator.


The Condition Number of $A^t$ - 2

This function has a branch cut along the negative real line, meaning we need to choose a keyhole contour. Overall:

$$\mathrm{cond}_{\mathrm{abs}}(x^t, A) \le \frac{2(\pi + 1)\rho_\varepsilon^{1+t}\sqrt{n}}{2\pi \varepsilon^{2}},$$

where $\rho_\varepsilon$ is the $\varepsilon$-pseudospectral radius, computed using code by Guglielmi and Overton.

Note: There is an upper limit for $\varepsilon$ where the pseudospectrum intersects the branch cut. We need to take $\varepsilon$ smaller than this value.
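
Once $\rho_\varepsilon$ is known, evaluating the bound is a single formula. The sketch below makes several assumptions that are not in the talk: the test matrix is symmetric positive definite, so that in the 2-norm $\rho_\varepsilon = \lambda_{\max} + \varepsilon$ and the limit on $\varepsilon$ from the branch cut is $\varepsilon < \lambda_{\min}$; the comparison is made in the Frobenius norm; and the function names `power_cond_bound` and `spd_power` are hypothetical. It does not use the Guglielmi and Overton code. The bound is compared with a finite-difference estimate of $\|L_f(A, E)\|$ for a single direction, which can only underestimate the condition number.

```python
import numpy as np


def power_cond_bound(rho_eps, eps, t, n):
    """Evaluate 2(pi + 1) rho_eps^(1 + t) sqrt(n) / (2 pi eps^2)."""
    return 2 * (np.pi + 1) * rho_eps ** (1 + t) * np.sqrt(n) / (2 * np.pi * eps ** 2)


def spd_power(S, t):
    """S^t for a symmetric positive definite S via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * w ** t) @ V.T


rng = np.random.default_rng(2)
n, t = 5, 0.2
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigenvalues = np.linspace(1.0, 3.0, n)
A = (Q * eigenvalues) @ Q.T  # symmetric positive definite test matrix

# Assumption: A is normal, so rho_eps = max eigenvalue + eps in the 2-norm, and
# eps must stay below the smallest eigenvalue (1.0) to avoid the branch cut.
eps = 0.9
rho_eps = eigenvalues.max() + eps
bound = power_cond_bound(rho_eps, eps, t, n)

# Finite-difference estimate of ||L_f(A, E)||_F for one symmetric unit direction E.
E = rng.standard_normal((n, n))
E = (E + E.T) / 2
E /= np.linalg.norm(E)
h = 1e-6
lower_estimate = np.linalg.norm(spd_power(A + h * E, t) - spd_power(A, t)) / h

print(lower_estimate, bound)
```

The gap between the two numbers depends on $\varepsilon$, which is the trade-off the experiments in the following slides explore.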


Test matrix - Grcar matrix

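[Figure: plot for the Grcar test matrix in the complex plane; horizontal axis from -1 to 3, vertical axis from -3 to 3, colour bar from -2.5 to -1.]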


CN Bound as ε varies

[Figure: log-log plot of the condition number bound as $\varepsilon$ varies; horizontal axis from $10^{-8}$ to $10^{1}$, vertical axis from $10^{-15}$ to $10^{0}$; legend: CN Pseudo, CN Exact, CN Normest.]


Test matrix - Almost neg. eigenvalues

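[Figure: plot for the test matrix with almost negative eigenvalues in the complex plane; horizontal axis from -1 to 0.5, vertical axis from -0.3 to 0.2, colour bar from -5 to -3.5.]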


CN Bound as ε varies

[Figure: log-log plot of the condition number bound as $\varepsilon$ varies; horizontal axis from $10^{-8}$ to $10^{1}$, vertical axis from $10^{-15}$ to $10^{0}$; legend: CN Pseudo, CN Exact, CN Normest.]


Runtime Comparison - Timings

[Figure: runtime (s) against $n$; horizontal axis $n$ from 0 to 200, vertical axis (log scale) from $10^{-4}$ to $10^{4}$; legend: CN Normest and CN Pseudo, each for t = 1/5, t = 1/10 and t = 1/15.]


Runtime Comparison - Speedup

[Figure: speedup against $n$; horizontal axis $n$ from 0 to 200, vertical axis from 0 to 1000; legend: t = 1/5, t = 1/10, t = 1/15.]


Conclusions

• Extended Taylor’s theorem to matrix functions.

• Applied pseudospectral theory to bound remainder term.

• Bounds are very efficient to compute for $A^t$.

• If the bound is unsatisfactorily large, we can revert to a more precise method.

Future work:

• Apply to algorithm design.

• Find other classes of functions for which this is efficient.

