
On admissible eigenvalue approximations from

Krylov subspace methods for non-normal matrices

Jurjen Duintjer Tebbens

joint work with

Gérard Meurant

Crouzeix's conjecture workshop, AIM, San Jose, US, August 3, 2017.

Outline


1 Introduction

2 Ritz values

3 Harmonic Ritz values, GMRES and FOM polynomials


Introduction

This talk is about some iterative methods designed for large, sparse matrices to find eigenvalues, eigenvectors and solutions of linear systems.

More precisely, it is about Krylov subspace methods generating during the kth iteration the kth Krylov subspace

K_k(A, v) ≡ span{v, Av, . . . , A^{k−1}v}

for a given nonsingular matrix and nonzero vector

A ∈ C^{n×n}, v ∈ C^n (w.l.o.g. ‖v‖ = 1).

Note that

all vectors w ∈ K_k(A, v) = span{v, Av, . . . , A^{k−1}v} are of the form

w = α_0 v + α_1 Av + · · · + α_{k−1} A^{k−1} v = p_{k−1}(A) v

for a polynomial p_{k−1} of degree k − 1.

we always consider both the matrix A and a starting vector v, and their interplay can be crucial (also if we are interested in properties like the field of values or matrix polynomials, which depend on the matrix only).

We will restrict ourselves to non-normal matrices (for which Crouzeix's conjecture has not yet been proved).
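The polynomial characterization of Krylov vectors can be checked directly in code; the following is a minimal numpy sketch (an illustration added to this transcript, not part of the talk):

```python
import numpy as np

# Sketch: every w in K_k(A, v) equals p_{k-1}(A) v for a polynomial
# p_{k-1} of degree k - 1; the data below is arbitrary random input.
rng = np.random.default_rng(0)
n, k = 6, 4
A = rng.standard_normal((n, n))
v = rng.standard_normal(n)
v /= np.linalg.norm(v)                 # w.l.o.g. ||v|| = 1

alpha = rng.standard_normal(k)         # coefficients alpha_0, ..., alpha_{k-1}
# w = alpha_0 v + alpha_1 A v + ... + alpha_{k-1} A^{k-1} v
w = sum(alpha[j] * np.linalg.matrix_power(A, j) @ v for j in range(k))

# The same vector via Horner's rule, avoiding explicit matrix powers
u = alpha[-1] * v
for c in alpha[-2::-1]:
    u = A @ u + c * v

assert np.allclose(w, u)
```

The Horner variant is how p_{k−1}(A)v is evaluated in practice: k − 1 matrix-vector products instead of explicit powers of A.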

Introduction

With non-normal matrices, we can very roughly divide the available methods into those using short recurrences (lower computational costs) and those using long recurrences (enhanced stability properties).

We focus on the latter class, which is based on the Arnoldi orthogonalization process [Arnoldi - 1951] for computing orthonormal bases of the Krylov subspaces.

In the kth iteration of the process (without breakdown) it computes the decomposition

A V_k = V_{k+1} H̄_k,

where the columns of V_k = [v_1, . . . , v_k] (the Arnoldi vectors) contain an orthonormal basis for the kth Krylov subspace K_k(A, v) and H̄_k is rectangular ((k + 1) × k) upper Hessenberg.

By deleting the last row we get the square matrix

H_k = V_k^* A V_k ∈ C^{k×k};

H_k is the orthogonal restriction of A onto K_k(A, v) and A is a dilation of H_k.
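The Arnoldi process can be sketched in a few lines; this is a modified Gram-Schmidt variant (the talk does not fix an implementation, so this particular realization is an assumption), checking the decomposition A V_k = V_{k+1} H̄_k:

```python
import numpy as np

# Minimal Arnoldi sketch (modified Gram-Schmidt), assuming no breakdown:
# returns V (n x (k+1), orthonormal columns) and the rectangular
# Hessenberg Hbar ((k+1) x k) with A V[:, :k] = V Hbar.
def arnoldi(A, v, k):
    n = len(v)
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):          # orthogonalize against previous vectors
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(1)
n, k = 8, 5
A = rng.standard_normal((n, n))
v = rng.standard_normal(n)
V, Hbar = arnoldi(A, v, k)

assert np.allclose(A @ V[:, :k], V @ Hbar)     # Arnoldi decomposition
assert np.allclose(V.T @ V, np.eye(k + 1))     # orthonormal basis
```

Real arithmetic is used here for brevity; the same recurrence works verbatim for complex A and v with conjugated inner products.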

Introduction

Essentially,

- for eigenpair approximations of A, the Arnoldi method [Arnoldi - 1951], [Saad - 1980] uses the eigenvalues and eigenvectors of H_k and the first k Arnoldi vectors. The eigenvalue approximations are called Ritz values, the eigenvector approximations Ritz vectors.

- for approximate solutions to linear systems Ax = b, the GMRES method [Saad, Schultz - 1986] solves the least squares problems

min_z ‖ ‖b‖ e_1 − H̄_k z ‖

and uses the first k Arnoldi vectors.

Both the GMRES and the Arnoldi method are very popular methods that are successful for a large variety of problem classes.

Nevertheless, the convergence behavior of the two methods is not fully understood; analysis is particularly challenging with highly non-normal input matrices.
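Both uses of the Hessenberg matrix can be illustrated with one sketch; `arnoldi` is repeated from the sketch above to keep the snippet self-contained, and the test data is arbitrary:

```python
import numpy as np

def arnoldi(A, v, k):
    n = len(v)
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(2)
n, k = 8, 4
A = rng.standard_normal((n, n)) + 4 * np.eye(n)   # shifted to keep A nonsingular
b = rng.standard_normal(n)
V, Hbar = arnoldi(A, b, k)

# Arnoldi eigenvalue approximations: Ritz values = eigenvalues of square H_k
ritz = np.linalg.eigvals(Hbar[:k, :k])

# GMRES iterate: solve min_z || ||b|| e1 - Hbar z ||, then x_k = V_k z
e1 = np.zeros(k + 1)
e1[0] = np.linalg.norm(b)
z, *_ = np.linalg.lstsq(Hbar, e1, rcond=None)
x = V[:, :k] @ z

# The true GMRES residual norm equals the small least-squares residual norm
assert np.isclose(np.linalg.norm(b - A @ x), np.linalg.norm(e1 - Hbar @ z))
```

The final assertion is the point of the construction: GMRES reduces the n-dimensional residual minimization to a (k + 1) × k least squares problem.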

Introduction

Often one tries to use the tools that are successful for the analysis of the hermitian counterparts of GMRES and Arnoldi, like the Conjugate Gradient and the Lanczos methods.

For example, the basic tool for explaining Krylov subspace methods for hermitian linear systems is the eigenvalue distribution.

However, for GMRES it has been known for some time that if GMRES generates a certain residual norm history, the same history can be generated with any nonzero spectrum [Greenbaum, Strakoš - 1994].

Complemented with the fact that GMRES can generate arbitrary non-increasing residual norms, this gives the result that any non-increasing convergence curve is possible with any nonzero spectrum [Greenbaum, Pták, Strakoš - 1996].

A complete description of the class of matrices and right-hand sides with prescribed convergence and eigenvalues was given in [Arioli, Pták, Strakoš - 1998].

Convergence analysis for Krylov subspace methods

Theorem [Greenbaum, Pták & Strakoš, 1996]. Let

1 = ‖b‖_2 = f_0 ≥ f_1 ≥ f_2 ≥ · · · ≥ f_{n−1} > 0

be any non-increasing sequence of real positive values and let

λ_1, . . . , λ_n

be any set of nonzero complex numbers. Then there exists a class of matrices A ∈ C^{n×n} and right-hand sides b ∈ C^n such that the residual vectors r_k generated by the GMRES method applied to A and b satisfy

‖r_k‖_2 = f_k, 0 ≤ k ≤ n − 1, and eig(A) = {λ_1, . . . , λ_n}.

For other popular methods like Bi-CG and QMR, it was proved as well that convergence behavior can be arbitrarily poor, independent of the eigenvalue distribution [D.T. & Meurant, 2016].

Introduction

Other objects have been successful in explaining GMRES for particular problems, including:

the pseudospectrum, see e.g. [Trefethen, Embree - 2005],

the field of values, see e.g. [Eiermann - 1993],

the polynomial numerical hull, see e.g. [Greenbaum - 2002],

the Ritz values, i.e. the eigenvalues of the Hessenberg matrices generated by the underlying Arnoldi process, see e.g. [van der Vorst, Vuik - 1993].

Although in practice eigenvalues do often influence the convergence of GMRES, they cannot be used as a universal tool for explaining GMRES, and such a tool is unlikely to exist.

Introduction

An important tool for hermitian eigenproblems solved with Krylov subspace methods is the following interlacing property:

Consider a tridiagonal Jacobi matrix T_m and its leading principal submatrix T_k for some k < m. If the ordered eigenvalues of T_k are

ρ_1^(k) < ρ_2^(k) < . . . < ρ_k^(k),

then in every open interval between two subsequent eigenvalues

(ρ_{i−1}^(k), ρ_i^(k)), i = 2, . . . , k,

there lies at least one eigenvalue of T_m.

This interlacing property enables one, among other things, to prove the persistence theorem (see [Paige - 1971, 1976, 1980] or [Meurant, Strakoš - 2006]), which is crucial for controlling the convergence of Ritz values in the Lanczos method.

Introduction

There are generalizations of the interlacing property to the non-hermitian but normal case [Fan, Pall - 1957], [Thompson - 1966], [Ericsson - 1990], [Malamud - 2005], [Carden - 2013].

There is no interlacing property for the principal submatrices of general non-normal matrices [de Oliveira - 1969], [Shomron, Parlett - 2009].

This makes convergence analysis of the Arnoldi method for non-normal input matrices rather delicate, just as it is for the GMRES method.

The GMRES and Arnoldi methods being closely related through the Arnoldi process, can we show that arbitrary convergence behavior of Arnoldi is possible?

By arbitrary behavior we mean arbitrary Ritz values for all iterations (we do not consider eigenvectors). Note that this involves many more conditions than prescribing one residual norm per GMRES iteration.

2. Prescribed convergence for Arnoldi's method

Notation: Let the kth Hessenberg matrix H_k generated in Arnoldi's method have the eigenvalue ρ and eigenvector y,

H_k y = ρ y.

Then

ρ is a Ritz value for A,

V_k y is a Ritz vector for A.

With the Arnoldi decomposition A V_k = V_{k+1} H̄_k, we obtain for the Ritz pair {ρ, V_k y} the residual norm

‖A(V_k y) − ρ(V_k y)‖ = ‖A(V_k y) − V_k H_k y‖ = ‖V_{k+1} H̄_k y − V_k H_k y‖ = h_{k+1,k} |e_k^T y|.

Often, for small h_{k+1,k} |e_k^T y|, the Arnoldi method takes {ρ, V_k y} as an approximate eigenvalue-eigenvector pair of A. Note that a small value h_{k+1,k} |e_k^T y| need not imply that ρ is close to a true eigenvalue of A, see e.g. [Chatelin - 1993], [Godet-Thobie - 1993]; convergence analysis cannot be based on this value but focuses instead on the quality of approximate invariant subspaces [Beattie, Embree, Sorensen - 2005].
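The residual identity ‖A(V_k y) − ρ(V_k y)‖ = h_{k+1,k} |e_k^T y| can be verified for every Ritz pair at once; `arnoldi` is the same sketch as above, repeated so the snippet runs on its own:

```python
import numpy as np

def arnoldi(A, v, k):
    n = len(v)
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(4)
n, k = 7, 4
A = rng.standard_normal((n, n))
v = rng.standard_normal(n)
V, Hbar = arnoldi(A, v, k)

# Ritz values rho and (possibly complex) eigenvectors y of H_k
rhos, Y = np.linalg.eig(Hbar[:k, :k])

# Both sides of the identity for every Ritz pair {rho, V_k y}
lhs = [np.linalg.norm(A @ (V[:, :k] @ y) - rho * (V[:, :k] @ y))
       for rho, y in zip(rhos, Y.T)]
rhs = [Hbar[k, k - 1] * abs(y[-1]) for y in Y.T]
assert np.allclose(lhs, rhs)
```

Note the identity is homogeneous in y, so no normalization of the eigenvectors is required.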

2. Prescribed convergence for Arnoldi's method

Theorem 1 [DT, Meurant - 2012]. Let the set

R = { ρ_1^(1),
      (ρ_1^(2), ρ_2^(2)),
      ...
      (ρ_1^(n−1), . . . , ρ_{n−1}^(n−1)),
      (λ_1, . . . . . . . . . , λ_n) }

represent any choice of n(n + 1)/2 complex Ritz values and denote by C(k) the companion matrix of the polynomial with roots ρ_1^(k), . . . , ρ_k^(k), i.e.

C(k) = [ 0  . . .  0  −α_0
         1  0 . . . 0  −α_1
            . .        ...
               1       −α_{k−1} ],     ∏_{j=1}^{k} (z − ρ_j^(k)) = z^k + ∑_{j=0}^{k−1} α_j z^j.
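The companion matrix C(k) can be formed directly from the prescribed roots; a sketch (added here for illustration) using numpy's `poly` to obtain the monic coefficients, with the unit-subdiagonal layout of the display above:

```python
import numpy as np

# Companion matrix C(k) of the monic polynomial whose roots are the
# prescribed Ritz values rho_1^(k), ..., rho_k^(k).
def companion(roots):
    alpha = np.poly(roots)          # monic coefficients, highest degree first
    k = len(roots)
    C = np.zeros((k, k))
    C[1:, :-1] = np.eye(k - 1)      # unit subdiagonal
    C[:, -1] = -alpha[:0:-1]        # last column: -alpha_0, ..., -alpha_{k-1}
    return C

# (z - 2)(z - 3)(z - 4) = z^3 - 9 z^2 + 26 z - 24
C = companion([2.0, 3.0, 4.0])
assert np.allclose(np.sort(np.linalg.eigvals(C).real), [2, 3, 4])
```

The eigenvalues of C(k) are by construction exactly the roots handed in, which is what the theorem exploits.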

2. Prescribed convergence for Arnoldi's method

If we define the unit upper triangular matrix U(S) through

U(S) = I_n − [ 0  C(1)e_1  C(2)e_2  . . .  C(n−1)e_{n−1} ],

where the vector C(k)e_k (the last column of C(k)) fills the first k entries of column k + 1 and the remaining entries are zero, then the upper Hessenberg matrix

H(R) = U(S)^{−1} C(n) U(S)

has the spectrum λ_1, . . . , λ_n and its kth leading principal submatrix has spectrum

ρ_1^(k), . . . , ρ_k^(k), k = 1, . . . , n − 1.

It has unit subdiagonal.
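Theorem 1's recipe can be executed directly; the prescribed Ritz values below are illustrative (not from the talk), and `companion` is the helper sketched earlier:

```python
import numpy as np

def companion(roots):
    alpha = np.poly(roots)               # monic coefficients, highest degree first
    k = len(roots)
    C = np.zeros((k, k))
    C[1:, :-1] = np.eye(k - 1)
    C[:, -1] = -alpha[:0:-1]             # last column: -alpha_0, ..., -alpha_{k-1}
    return C

# Prescribed Ritz values for iterations 1..n-1, then the spectrum (n = 4)
R = [[2.0], [1.0, 3.0], [0.0, 2.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
n = len(R)

# U(S): column k+1 carries -C(k) e_k in its first k entries
U = np.eye(n)
for k in range(1, n):
    U[:k, k] = -companion(R[k - 1])[:, -1]

# H(R) = U(S)^{-1} C(n) U(S)
H = np.linalg.solve(U, companion(R[-1]) @ U)

assert np.allclose(np.diag(H, -1), 1.0)          # unit subdiagonal
for k in range(1, n + 1):                        # prescribed spectra at every step
    eigs = np.sort(np.linalg.eigvals(H[:k, :k]).real)
    assert np.allclose(eigs, np.sort(R[k - 1]))
```

The loop at the end checks exactly the statement of the theorem: each leading principal submatrix of H(R) carries the prescribed Ritz values.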

2. Prescribed convergence for Arnoldi's method

Proof: The k × k leading principal submatrix of H(R) is

[I_k, 0] H(R) [I_k; 0] = [I_k, 0] U(S)^{−1} C(n) U(S) [I_k; 0]
                       = [U_k^{−1}, u_{k+1}, . . . , u_n] [0; U_k; 0]
                       = [U_k^{−1}, u_{k+1}] [0; U_k],

where U_k denotes the k × k leading principal submatrix of U(S), u_j denotes the vector of the first k entries of the jth column of U(S)^{−1} for j > k, and [0; U_k] stands for U_k with a zero row appended on top.

Its spectrum is also the spectrum of the matrix

U_k [U_k^{−1}, u_{k+1}] [0; U_k] U_k^{−1} = [I_k, U_k u_{k+1}] [0; I_k],

which is a companion matrix with last column U_k u_{k+1}.

2. Prescribed convergence for Arnoldi's method

From

e_{k+1} = U_{k+1} U_{k+1}^{−1} e_{k+1} = [ U_k  −C(k)e_k ;  0  1 ] [ u_{k+1} ; 1 ] = [ U_k u_{k+1} − C(k)e_k ; 1 ]

we obtain U_k u_{k+1} = C(k)e_k. □

Remark: The matrix H(R) = U(S)^{−1} C(n) U(S) is the unique upper Hessenberg matrix with the prescribed spectrum and Ritz values and the entry one along the subdiagonal (see also [Parlett, Strang - 2008], where H(R) is constructed in a different way).

Note that U(S) transforms the matrix C(n), with all Ritz values zero, to the matrix H(R) with prescribed Ritz values. It is composed of (columns of) companion matrices and we will call U(S) the Ritz value companion transform.

2. Prescribed convergence for Arnoldi's method

Thus the Ritz values generated in the Arnoldi method can exhibit any convergence behavior: it suffices to apply the Arnoldi process with the initial Arnoldi vector e_1 and the matrix H(R) with arbitrarily prescribed Ritz values. Then the method generates the Hessenberg matrix H(R) itself.

Question: Can the same prescribed Ritz values be generated with positive entries other than one on the subdiagonal?

For σ_1, σ_2, . . . , σ_{n−1} > 0 consider the diagonal similarity transformation

H ≡ diag(1, σ_1, σ_1σ_2, . . . , ∏_{j=1}^{n−1} σ_j) H(R) ( diag(1, σ_1, σ_1σ_2, . . . , ∏_{j=1}^{n−1} σ_j) )^{−1}.

Then the subdiagonal of H has the entries σ_1, σ_2, . . . , σ_{n−1} and all leading principal submatrices of H are similar to the corresponding leading principal submatrices of H(R).
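A quick numerical check of this scaling argument (a sketch with an arbitrary unit-subdiagonal Hessenberg matrix, not a matrix from the talk):

```python
import numpy as np

# D H D^{-1} with D = diag(1, s1, s1 s2, ...) changes the subdiagonal to
# sigma_1, ..., sigma_{n-1} but leaves all leading principal submatrices
# similar, hence every Ritz value unchanged.
rng = np.random.default_rng(3)
n = 5
H = np.triu(rng.standard_normal((n, n)))
H += np.diag(np.ones(n - 1), -1)                 # unit subdiagonal

sigma = np.array([0.5, 2.0, 0.1, 3.0])           # arbitrary positive sigmas
d = np.concatenate(([1.0], np.cumprod(sigma)))   # diag(1, s1, s1 s2, ...)
Hs = (d[:, None] * H) / d[None, :]               # D H D^{-1}

assert np.allclose(np.diag(Hs, -1), sigma)       # new subdiagonal entries
for k in range(1, n + 1):                        # same characteristic polynomials
    assert np.allclose(np.poly(Hs[:k, :k]), np.poly(H[:k, :k]))
```

Comparing characteristic polynomial coefficients (via `np.poly`) avoids having to match possibly complex eigenvalue orderings.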

2. Prescribed convergence for Arnoldi's method

This immediately leads to a parametrization of the matrices and initial Arnoldi vectors that generate a given set of Ritz values R:

Theorem 2 [DT, Meurant - 2012]. Assume we are given a set of tuples

R = { ρ_1^(1),
      (ρ_1^(2), ρ_2^(2)),
      ...
      (ρ_1^(n−1), . . . , ρ_{n−1}^(n−1)),
      (λ_1, . . . . . . . . . , λ_n) }

of complex numbers and n − 1 positive real numbers

σ_1, . . . , σ_{n−1}.

If A is a matrix of order n and v a unit n-dimensional vector, then the following assertions are equivalent:

2. Prescribed convergence for Arnoldi's method

1. The Hessenberg matrix generated by the Arnoldi method applied to A and initial Arnoldi vector v has eigenvalues λ_1, . . . , λ_n, subdiagonal entries σ_1, . . . , σ_{n−1}, and ρ_1^(k), . . . , ρ_k^(k) are the eigenvalues of its kth leading principal submatrix for all k = 1, . . . , n − 1.

2. The matrix A and initial vector v are of the form

A = V D_σ U(S)^{−1} C(n) U(S) D_σ^{−1} V^*, v = V e_1,

where V is unitary, U(S) is the Ritz value companion transform,

D_σ = diag(1, σ_1, σ_1σ_2, . . . , ∏_{j=1}^{n−1} σ_j),

and C(n) is the companion matrix of the polynomial with roots λ_1, . . . , λ_n.

This also shows how little the value

‖A(V_k y) − ρ(V_k y)‖ = h_{k+1,k} |e_k^T y|

says about the quality of the Ritz value ρ: any distance from ρ to the spectrum of A is possible with any value of h_{k+1,k}!
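The direction 2 ⇒ 1 of the theorem can be exercised numerically: build A from the parametrization and check that Arnoldi reproduces the prescribed Hessenberg matrix. This sketch uses a random real orthogonal V (a special case of unitary) and illustrative Ritz values; `companion` and `arnoldi` are the helpers sketched earlier:

```python
import numpy as np

def companion(roots):
    alpha = np.poly(roots)
    k = len(roots)
    C = np.zeros((k, k))
    C[1:, :-1] = np.eye(k - 1)
    C[:, -1] = -alpha[:0:-1]
    return C

def arnoldi(A, v, k):
    n = len(v)
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

R = [[2.0], [1.0, 3.0], [0.0, 2.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
sigma = np.array([0.5, 2.0, 0.25])
n = len(R)

U = np.eye(n)                                       # Ritz value companion transform
for k in range(1, n):
    U[:k, k] = -companion(R[k - 1])[:, -1]
D = np.diag(np.concatenate(([1.0], np.cumprod(sigma))))
H = D @ np.linalg.solve(U, companion(R[-1]) @ U) @ np.linalg.inv(D)

rng = np.random.default_rng(5)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))    # random orthogonal V
A = Q @ H @ Q.T                                     # A = V D U^{-1} C(n) U D^{-1} V*
v = Q[:, 0]                                         # v = V e_1

_, Hbar = arnoldi(A, v, n - 1)
assert np.allclose(Hbar, H[:, :n - 1])              # Arnoldi recovers H exactly
assert np.allclose(np.diag(Hbar, -1), sigma)        # prescribed subdiagonal
for k in range(1, n):                               # prescribed Ritz values
    eigs = np.sort(np.linalg.eigvals(Hbar[:k, :k]).real)
    assert np.allclose(eigs, np.sort(R[k - 1]))
```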

2. Prescribed convergence for Arnoldi's method

Counterintuitive example 1: Convergence of interior Ritz values only:

R = { 3, (3, 3), (2, 3, 4), (3, 3, 3, 3), (1, 2, 3, 4, 5) }.

This gives the unit upper Hessenberg matrix

H(R) = U(S)^{−1} C(5) U(S) =
    [ 3  0  0  0  0
      1  3  1  0  1
      0  1  3 −1  0
      0  0  1  3  5
      0  0  0  1  3 ].
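This matrix can be checked against the prescription; note that the defective 4 × 4 submatrix (eigenvalue 3 of multiplicity four) limits the attainable eigenvalue accuracy in floating point, hence the loose tolerance in the sketch below:

```python
import numpy as np

# Verifying the slide's matrix: leading principal submatrices carry the
# prescribed Ritz values {3}, {3,3}, {2,3,4}, {3,3,3,3}, {1,2,3,4,5}.
H = np.array([[3., 0, 0, 0, 0],
              [1., 3, 1, 0, 1],
              [0., 1, 3, -1, 0],
              [0., 0, 1, 3, 5],
              [0., 0, 0, 1, 3]])
R = [[3], [3, 3], [2, 3, 4], [3, 3, 3, 3], [1, 2, 3, 4, 5]]

for k, ritz in enumerate(R, start=1):
    eigs = np.sort(np.linalg.eigvals(H[:k, :k]).real)
    # loose atol: numerical eigenvalues of the defective 4x4 block can
    # deviate from 3 by roughly eps^(1/4)
    assert np.allclose(eigs, ritz, atol=1e-2)
```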

2. Prescribed convergence for Arnoldi's method

Thus these Ritz values are generated by the Arnoldi method applied to

A = V diag(1, σ_1, . . . , ∏_{j=1}^{n−1} σ_j)
      [ 3  0  0  0  0
        1  3  1  0  1
        0  1  3 −1  0
        0  0  1  3  5
        0  0  0  1  3 ]
    diag(1, σ_1, . . . , ∏_{j=1}^{n−1} σ_j)^{−1} V^*

with initial vector v = V e_1, for any unitary V and positive values σ_1, . . . , σ_{n−1}.

This is not a highly non-normal example; for instance, with σ_i ≡ 1:

‖A‖ ‖A^{−1}‖ = 9.7137,

and the eigenvector basis W of A has condition number

‖W‖ ‖W^{−1}‖ = 4.8003.

2. Prescribed convergence for Arnoldi's method

Counterintuitive example 2: We can prescribe the "diverging" Ritz values

R = { 1, (0, 2), (−1, 1, 3), (−2, 0, 2, 4), (1, 1, 1, 1, 1) },

with corresponding unit upper Hessenberg matrix

H(R) = U(S)^{−1} C(5) U(S) =
    [ 1  1  0 −3   0
      1  1  3  0 −31
      0  1  1  6   0
      0  0  1  1 −10
      0  0  0  1   1 ].

These Ritz values are generated by Arnoldi applied to

A = V H(R) V^*, v = V e_1

for unitary V.

2. Prescribed convergence for Arnoldi's method

The same "diverging" Ritz values are generated with the exponentially decreasing values 2^{−1}, 2^{−2}, 2^{−3} and 2^{−4} on the subdiagonal of the Hessenberg matrix:

A = V [ 1     2      0      −192       0
        0.5   1     12         0  −15872
        0     0.25   1        48       0
        0     0      0.125     1    −160
        0     0      0         0.0625  1 ] V^*, v = V e_1.

Then the rounded residual norms ‖A(V_k y) − ρ(V_k y)‖ = h_{k+1,k} |e_k^T y| seem to indicate convergence:

{ 1/2, (0.1118, 0.1118), (0.011, 0.0052, 0.011), (0.0006, 0.0001, 0.0001, 0.0006) }.
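The punchline of this example can be reproduced numerically: at step 4 every Ritz-pair residual is bounded by h_{5,4} = 0.0625, while every Ritz value stays at distance at least 1 from the spectrum {1}. A sketch:

```python
import numpy as np

# The scaled Hessenberg matrix of the "diverging" example
H = np.array([[1.,     2,     0,   -192,      0],
              [0.5,    1,    12,      0, -15872],
              [0,   0.25,     1,     48,      0],
              [0,      0, 0.125,      1,   -160],
              [0,      0,     0, 0.0625,      1]])

k = 4
rhos, Y = np.linalg.eig(H[:k, :k])       # Ritz values and eigenvectors at step 4
res = H[k, k - 1] * np.abs(Y[-1, :])     # h_{5,4} |e_4^T y| for each Ritz pair

assert np.allclose(np.sort(rhos.real), [-2, 0, 2, 4])   # prescribed Ritz values
assert np.all(res < 0.063)               # residuals bounded by h_{5,4} = 0.0625
assert np.min(np.abs(rhos - 1)) > 0.9    # yet no Ritz value is near the spectrum {1}
```

Since numpy returns unit-norm eigenvectors, each residual is at most h_{5,4} itself, small regardless of how far the Ritz values sit from the only eigenvalue 1.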

2. Prescribed convergence for Arnoldi's method

[Figures: Ritz values against eigenvalues in the complex plane, axes from −2 to 2. Blue dots: eigenvalues of A. Red dots: Ritz values in the first, second, third, fourth, and next-to-last iterations.]


2. Prescribed convergence for Arnoldi’s method

This negative result shows that no convergence result can be proved for the very popular Arnoldi method without special assumptions (see also [Embree - 2009] for the Arnoldi method with exact shifts).

Obviously, any Ritz value ρ lies in the field of values of A:

Hk y = ρ y  ⇒  y* Hk y = ρ  ⇒  y* Vk* A Vk y = ρ   (‖y‖ = 1).

The result therefore also shows how to construct matrices with a field of values containing n(n + 1)/2 prescribed complex points.

Thus in Crouzeix’s conjecture

‖p(A)‖ ≤ 2 max_{z ∈ W(A)} |p(z)|,

the right-hand side is at least as large as

2 max_{ρ is a Ritz value} |p(ρ)|.

The following tries to gain insight into the conjecture for particular polynomials arising in Krylov subspace methods.

29
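The inclusion ρ ∈ W(A) can be checked numerically through the support function of the field of values: for every angle θ, Re(e^{−iθ}ρ) is at most the largest eigenvalue of the Hermitian part of e^{−iθ}A. A small sketch, assuming A = H(R) from the earlier example (i.e. V = I):

```python
import numpy as np

# H(R) from the earlier counterintuitive example.
H = np.array([
    [1, 1, 0, -3,   0],
    [1, 1, 3,  0,  -3],
    [0, 1, 1,  6,   0],
    [0, 0, 1,  1, -10],
    [0, 0, 0,  1,   1],
], dtype=complex)

def in_field_of_values(A, z, n_angles=360, tol=1e-10):
    """Grid test: Re(e^{-i theta} z) <= lambda_max(Herm(e^{-i theta} A))."""
    for theta in np.linspace(0, 2 * np.pi, n_angles, endpoint=False):
        B = np.exp(-1j * theta) * A
        lam_max = np.linalg.eigvalsh((B + B.conj().T) / 2).max()
        if (np.exp(-1j * theta) * z).real > lam_max + tol:
            return False
    return True

# All prescribed Ritz values must lie in W(A).
ritz_values = [1, 0, 2, -1, 3, -2, 4]
assert all(in_field_of_values(H, z) for z in ritz_values)
print("all Ritz values lie in W(A)")
```

The grid test is only a necessary condition for membership, but points of W(A) pass it for every θ, which is all that is needed here.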


3. Prescribed convergence for Arnoldi and GMRES

First, let us consider the GMRES polynomial. To solve a linear system

Ax = b, ‖b‖ = 1,

starting with the initial guess x0 = 0, the GMRES iterates xk minimize the norm of the residual vector rk = b − Axk:

‖rk‖ = ‖b − Axk‖ = min ‖b − As‖ over all s ∈ Kk(A, b).

The kth residual norm can be written as

‖rk‖ = ‖p_k^G(A) b‖ = min_{π ∈ Π_k^0} ‖π(A) b‖,

where Π_k^0 is the set of polynomials of degree at most k with value one at the origin. The minimizing polynomial p_k^G is called the kth GMRES polynomial.

For the kth GMRES polynomial, Crouzeix’s conjecture reads

‖p_k^G(A)‖ ≤ 2 max_{z ∈ W(A)} |p_k^G(z)|.

30


3. Prescribed convergence for Arnoldi and GMRES

Obviously,

‖rk‖ = ‖p_k^G(A) b‖ ≤ ‖p_k^G(A)‖ ≤ 2 max_{z ∈ W(A)} |p_k^G(z)|.

Can we say anything about the relation between GMRES residual norms and |p_k^G(z)| on the field of values?

We can say something about the relation between GMRES residual norms and at least the Ritz values: in general, there need not be any relation; they can be independent of each other.

Let us try to explain why: writing xk in the Arnoldi basis,

xk = Vk yk ∈ Kk(A, r0),

and using the Arnoldi decomposition A Vk = V_{k+1} H_k (with H_k of size (k + 1) × k), we see that

‖b − Axk‖ = ‖b − A Vk yk‖ = ‖V_{k+1} e1 − A Vk yk‖ = ‖V_{k+1}(e1 − H_k yk)‖ = min_{y ∈ C^k} ‖e1 − H_k y‖.

31
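The chain of equalities can be verified numerically. A minimal sketch with a random matrix, using modified Gram–Schmidt Arnoldi and `lstsq` for the small (k + 1) × k least squares problem:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
b /= np.linalg.norm(b)

# Arnoldi: A V_k = V_{k+1} H_k, with H_k of size (k+1) x k, stored in Hbar.
V = np.zeros((n, n + 1))
Hbar = np.zeros((n + 1, n))
V[:, 0] = b
for k in range(n):
    w = A @ V[:, k]
    for i in range(k + 1):                 # modified Gram-Schmidt
        Hbar[i, k] = V[:, i] @ w
        w -= Hbar[i, k] * V[:, i]
    Hbar[k + 1, k] = np.linalg.norm(w)
    if Hbar[k + 1, k] < 1e-12:
        break
    V[:, k + 1] = w / Hbar[k + 1, k]

e1 = np.zeros(n + 1)
e1[0] = 1.0
for k in range(1, 6):
    # Small least squares problem min || e1 - H_k y || ...
    y, *_ = np.linalg.lstsq(Hbar[:k + 1, :k], e1[:k + 1], rcond=None)
    # ... matches the full-size residual norm of x_k = V_k y.
    xk = V[:, :k] @ y
    assert np.isclose(np.linalg.norm(b - A @ xk),
                      np.linalg.norm(e1[:k + 1] - Hbar[:k + 1, :k] @ y))
print("Hessenberg least squares reproduces GMRES residual norms")
```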


3. Prescribed convergence for Arnoldi and GMRES

Thus the residual norms generated by the GMRES method are fully determined by the Hessenberg matrix Hk.

We have seen that the subdiagonal entries of Hk can be chosen arbitrarily, for any prescribed Ritz values in the kth iteration.

Hence there is a chance we can modify the behavior of GMRES while maintaining the prescribed Ritz values.

Example from earlier: Consider the prescribed ’diverging’ Ritz values

R = { 1, (0, 2), (−1, 1, 3), (−2, 0, 2, 4), (1, 1, 1, 1, 1) },

and the prescribed subdiagonal entries of the generated Hessenberg matrix

σ1 = 2^{−1}, σ2 = 2^{−2}, σ3 = 2^{−3}, σ4 = 2^{−4}.

32

3. Prescribed convergence for Arnoldi and GMRES

The corresponding GMRES convergence curve is

‖r(0)‖ = 1, ‖r(1)‖ = 1/√5, ‖r(2)‖ = 1/√5, ‖r(3)‖ = 0.0052, ‖r(4)‖ = 0.0052.

Question: Can we force any GMRES convergence speed with arbitrary Ritz values by modifying the subdiagonal entries?

Not any, because there is a relation between GMRES stagnation and zero Ritz values: a singular Hessenberg matrix corresponds to stagnation in the parallel GMRES process, see [Brown - 1991]. In our example we have

ρ_1^{(1)} = 1,                                            ‖r(1)‖ = 1/√5,
(ρ_1^{(2)}, ρ_2^{(2)}) = (0, 2),                          ‖r(2)‖ = 1/√5,
(ρ_1^{(3)}, ρ_2^{(3)}, ρ_3^{(3)}) = (−1, 1, 3),           ‖r(3)‖ = 0.0052,
(ρ_1^{(4)}, ρ_2^{(4)}, ρ_3^{(4)}, ρ_4^{(4)}) = (−2, 0, 2, 4),   ‖r(4)‖ = 0.0052.

33
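A sketch of this computation, assuming V = I so that GMRES applied to (A, b) = (H, e1) is governed directly by the displayed Hessenberg matrix; stagnation occurs exactly at the steps whose Ritz tuple contains zero:

```python
import numpy as np

# Hessenberg matrix with subdiagonals 2^-1, ..., 2^-4 and the 'diverging'
# Ritz values 1, (0,2), (-1,1,3), (-2,0,2,4), (1,1,1,1,1).
H = np.array([
    [1.0,  2.0,   0.0,    -192.0,      0.0],
    [0.5,  1.0,  12.0,       0.0, -15872.0],
    [0.0,  0.25,  1.0,      48.0,      0.0],
    [0.0,  0.0,   0.125,     1.0,   -160.0],
    [0.0,  0.0,   0.0,       0.0625,   1.0],
])
# Arnoldi applied to (H, e1) reproduces H itself, so GMRES residual norms
# follow from the small problems min || e1 - H(1:k+1, 1:k) y ||.
n = 5
e1 = np.zeros(n)
e1[0] = 1.0
res = [1.0]
for k in range(1, n):
    y, *_ = np.linalg.lstsq(H[:k + 1, :k], e1[:k + 1], rcond=None)
    res.append(np.linalg.norm(e1[:k + 1] - H[:k + 1, :k] @ y))
print(np.round(res, 4))
# Stagnation exactly where a leading block is singular (zero Ritz value):
assert np.isclose(res[1], res[2]) and np.isclose(res[1], 1 / np.sqrt(5))
assert np.isclose(res[3], res[4])
```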

3. Prescribed convergence for Arnoldi and GMRES

However, this is the only restriction Ritz values put on GMRES residual norms:

Theorem 3 [DT, Meurant - 2012]. Consider a set of tuples of complex numbers

R = { ρ_1^{(1)}, (ρ_1^{(2)}, ρ_2^{(2)}), . . . , (ρ_1^{(n−1)}, . . . , ρ_{n−1}^{(n−1)}), (λ1, . . . , λn) },

such that (λ1, . . . , λn) contains no zero number, and n positive numbers

f(0) = 1 ≥ f(1) ≥ · · · ≥ f(n − 1) > 0,

such that the k-tuple (ρ_1^{(k)}, . . . , ρ_k^{(k)}) contains a zero number if and only if f(k − 1) = f(k).

34

3. Prescribed convergence for Arnoldi and GMRES

Let A be a square matrix of size n and let b be a nonzero n-dimensional vector. The following assertions are equivalent:

1. The GMRES method applied to A and right-hand side b with zero initial guess yields residuals r(k), k = 0, . . . , n − 1, such that

‖r(k)‖ = f(k), k = 0, . . . , n − 1,

A has eigenvalues λ1, . . . , λn, and ρ_1^{(k)}, . . . , ρ_k^{(k)} are the Ritz values generated at the kth iteration for k = 1, . . . , n − 1.

35


3. Prescribed convergence for Arnoldi and GMRES

2. The matrix A and right-hand side b are of the form

A = V U^{−1} C(n) U V*, b = V e1,

where V is a unitary matrix, U is the upper triangular matrix

U = [ g^T  ]
    [ 0  T ],

the first row g^T of U is given by

g1 = 1/f(0),   g_k = sqrt( 1/f(k − 1)^2 − 1/f(k − 2)^2 ),   k = 2, . . . , n,

and the remaining submatrix T of U has entries satisfying

∏_{i=1}^{k} (λ − ρ_i^{(k)}) = (1/t_{k,k}) ( g_{k+1} + Σ_{i=1}^{k} t_{i,k} λ^i ).

Note that we have exhausted all freedom modulo unitary transformations.

36
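The first-row part of the theorem is easy to exercise numerically. A minimal sketch assuming V = I and the trivial completion T = I (this fixes particular Ritz values but, per the theorem, the residual norms depend only on g); the "naive GMRES" below minimizes over an explicitly formed Krylov basis, which is fine at this size:

```python
import numpy as np

f = np.array([1.0, 0.9, 0.8, 0.7, 0.6])     # prescribed residual norms
eigs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # prescribed eigenvalues
n = len(f)

# First row g of U: g_1 = 1/f(0), g_k = sqrt(1/f(k-1)^2 - 1/f(k-2)^2).
g = np.empty(n)
g[0] = 1 / f[0]
g[1:] = np.sqrt(1 / f[1:] ** 2 - 1 / f[:-1] ** 2)

U = np.eye(n)                     # trivial completion T = I
U[0, :] = g

coeffs = np.poly(eigs)            # monic characteristic polynomial
C = np.zeros((n, n))              # companion matrix, ones on the subdiagonal
C[1:, :-1] = np.eye(n - 1)
C[:, -1] = -coeffs[1:][::-1]

A = np.linalg.solve(U, C @ U)     # A = U^{-1} C(n) U  (V = I)
b = np.zeros(n)
b[0] = 1.0

# Naive GMRES: minimize || b - A K_k y || over an explicit Krylov basis.
K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(n)])
res = [1.0]
for k in range(1, n):
    M = A @ K[:, :k]
    y, *_ = np.linalg.lstsq(M, b, rcond=None)
    res.append(np.linalg.norm(b - M @ y))
assert np.allclose(res, f, atol=1e-8)
print(np.round(res, 4))
```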


3. Prescribed convergence for Arnoldi and GMRES

Example: Standardly converging Ritz values and ’nearly stagnating’ GMRES:

R = { 5, (1, 5), (1, 4, 5), (1, 3, 4, 5), (1, 2, 3, 4, 5) },

‖r(0)‖ = 1, ‖r(1)‖ = 0.9, ‖r(2)‖ = 0.8, ‖r(3)‖ = 0.7, ‖r(4)‖ = 0.6, ‖r(5)‖ = 0 gives

        [  5        0       0      0       0 ]
        [ 10.3237   1       0      0       0 ]
A = V   [  0        0.8458  4      0       0 ]   V*,   b = V e1.
        [  0        0       3.312  3       0 ]
        [  0        0       0      2.4169  2 ]

37
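A quick numerical check of this example, assuming V = I and using the rounded entries exactly as printed (so the match is only to about three digits):

```python
import numpy as np

# Rounded Hessenberg matrix of the example; lower bidiagonal, so its leading
# blocks are triangular and the Ritz values are the diagonal entries.
H = np.array([
    [ 5.0,     0.0,    0.0,   0.0,    0.0],
    [10.3237,  1.0,    0.0,   0.0,    0.0],
    [ 0.0,     0.8458, 4.0,   0.0,    0.0],
    [ 0.0,     0.0,    3.312, 3.0,    0.0],
    [ 0.0,     0.0,    0.0,   2.4169, 2.0],
])
n = 5
e1 = np.zeros(n + 1)
e1[0] = 1.0
Hbar = np.vstack([H, np.zeros(n)])      # h_{n+1,n} = 0: GMRES finishes at step n
res = [1.0]
for k in range(1, n + 1):
    y, *_ = np.linalg.lstsq(Hbar[:k + 1, :k], e1[:k + 1], rcond=None)
    res.append(np.linalg.norm(e1[:k + 1] - Hbar[:k + 1, :k] @ y))
print(np.round(res, 3))
assert np.allclose(res, [1, 0.9, 0.8, 0.7, 0.6, 0], atol=5e-3)
```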

3. Prescribed convergence for Arnoldi and GMRES

Again, this is not a highly non-normal example:

‖A‖ ‖A^{−1}‖ = 28.9498,

and the eigenvector basis W of A has condition number

‖W‖ ‖W^{−1}‖ = 57.735.

The residual norms ‖A(Vk y) − ρ(Vk y)‖ = h_{k+1,k} |e_k^T y| for the Ritz pairs are

10.3237, (0.8458, 0.7886), (0.8987, 3.312, 2.0509), (0.9906, 2.4169, 2.3137, 1.7303),

respectively, i.e., they give misleading information.

38


3. Prescribed convergence for Arnoldi and GMRES

Summarizing, any GMRES residual norms are possible with any Ritz values in all iterations.

We have proved two more analogous results:

Any GMRES residual norms are possible with any harmonic Ritz values in all iterations. This is perhaps even more surprising, because the harmonic Ritz values are the roots of the GMRES polynomials. However, harmonic Ritz values do not in general lie in the field of values of A, so this has no implications for Crouzeix’s conjecture for the GMRES polynomial.

Any FOM residual norms are possible with any Ritz values in all iterations.

The FOM method differs from the GMRES method in that the residual norm is not minimized; instead, the kth FOM residual vector is characterized through

r_k^F ⊥ Kk(A, b).

39


3. Prescribed convergence for Arnoldi and GMRES

The corresponding residual norms are related through the formula

1/‖r_k^F‖^2 = 1/‖r_k^G‖^2 − 1/‖r_{k−1}^G‖^2.

Note that FOM residual norms need not be non-increasing and are not defined if the corresponding GMRES iterate stagnates.

What seems interesting to me in the context of Crouzeix’s conjecture is that the Ritz values are the roots of the FOM polynomials:

FOM polynomials might lead to a way to test whether the conjecture can be disproved. For the kth FOM polynomial p_k^F we have

0 = 2 max_{ρ is a Ritz value} |p_k^F(ρ)| < ‖r_k^F‖ = ‖p_k^F(A) b‖ ≤ ‖p_k^F(A)‖.

Note the residual norms can be chosen arbitrarily large...

The construction to prescribe Ritz values and FOM residual norms is the following:

40
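This relation is easy to confirm numerically. A sketch with a random shifted matrix (the shift keeps the field of values in the right half-plane, so FOM is defined and GMRES does not stagnate), computing FOM iterates from the Galerkin condition and GMRES iterates by least squares over an orthonormalized Krylov basis:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 7
A = rng.standard_normal((n, n)) + 5 * np.eye(n)   # safely non-stagnating
b = rng.standard_normal(n)
b /= np.linalg.norm(b)

# Explicit Krylov basis (Arnoldi would be used in practice).
K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(n)])

def basis(k):
    Q, _ = np.linalg.qr(K[:, :k])     # orthonormal basis of K_k(A, b)
    return Q

def gmres_res(k):
    if k == 0:
        return np.linalg.norm(b)
    Q = basis(k)
    y, *_ = np.linalg.lstsq(A @ Q, b, rcond=None)
    return np.linalg.norm(b - A @ Q @ y)

def fom_res(k):
    # Galerkin condition: the residual is orthogonal to K_k(A, b).
    Q = basis(k)
    y = np.linalg.solve(Q.T @ A @ Q, Q.T @ b)
    return np.linalg.norm(b - A @ Q @ y)

for k in range(1, n):
    lhs = 1 / fom_res(k) ** 2
    rhs = 1 / gmres_res(k) ** 2 - 1 / gmres_res(k - 1) ** 2
    assert np.isclose(lhs, rhs, rtol=1e-5), k
print("FOM/GMRES residual relation confirmed")
```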

3. Prescribed convergence for Arnoldi and GMRES

The matrix A and right-hand side b are of the form

A = V U^{−1} C(n) U V*, b = V e1,

where V is a unitary matrix and U is the upper triangular matrix

U = [ g^T  ]
    [ 0  T ],

where, to force FOM residual norms f(0), . . . , f(n − 1), f(i) > 0, the first row g^T of U can be chosen as

g_k = 1/f(k − 1),   k = 1, . . . , n,

and the remaining submatrix T of U has entries satisfying

∏_{i=1}^{k} (λ − ρ_i^{(k)}) = (1/t_{k,k}) ( g_{k+1} + Σ_{i=1}^{k} t_{i,k} λ^i ).

41
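The same check works for the FOM construction, again assuming V = I and the trivial completion T = I. Note that the prescribed FOM residual norms below oscillate, which GMRES norms never could:

```python
import numpy as np

# FOM residual norms need not decrease: prescribe an oscillating sequence.
f = np.array([1.0, 2.0, 0.5, 3.0, 0.25])    # f(0) = ||b|| = 1
eigs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # prescribed eigenvalues
n = len(f)

g = 1.0 / f                       # g_k = 1/f(k-1), k = 1, ..., n
U = np.eye(n)                     # trivial completion T = I
U[0, :] = g

coeffs = np.poly(eigs)            # monic characteristic polynomial
C = np.zeros((n, n))              # companion matrix, ones on the subdiagonal
C[1:, :-1] = np.eye(n - 1)
C[:, -1] = -coeffs[1:][::-1]

A = np.linalg.solve(U, C @ U)     # A = U^{-1} C(n) U  (V = I)
b = np.zeros(n)
b[0] = 1.0

K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(n)])
res = [np.linalg.norm(b)]
for k in range(1, n):
    Kk = K[:, :k]
    y = np.linalg.solve(Kk.T @ (A @ Kk), Kk.T @ b)   # Galerkin (FOM) condition
    res.append(np.linalg.norm(b - A @ Kk @ y))
assert np.allclose(res, f, atol=1e-8)
print(np.round(res, 4))
```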

Thank you for your attention.

42

Related publications

A. Greenbaum, V. Pták and Z. Strakoš, Any nonincreasing convergence curve is possible for GMRES, SIAM J. Matrix Anal. Appl., 17 (1996), pp. 465–469.

M. Arioli, V. Pták and Z. Strakoš, Krylov sequences of maximal length and convergence of GMRES, BIT Numer. Math., 38 (1998), pp. 636–643.

J. Duintjer Tebbens and G. Meurant, Any Ritz value behavior is possible for Arnoldi and for GMRES, SIAM J. Matrix Anal. Appl., 33 (2012), pp. 958–978.

J. Duintjer Tebbens and G. Meurant, Prescribing the behavior of early terminating GMRES and Arnoldi iterations, Numer. Algorithms, 65 (2014), pp. 69–90.

J. Duintjer Tebbens, G. Meurant, H. Sadok and Z. Strakoš, On investigating GMRES convergence using unitary matrices, Linear Algebra Appl., 450 (2014), pp. 83–107.

G. Meurant and J. Duintjer Tebbens, The role eigenvalues play in forming GMRES residual norms with non-normal matrices, Numer. Algorithms, 68 (2015), pp. 143–165.

J. Duintjer Tebbens and G. Meurant, On the convergence of QOR and QMR Krylov methods for solving nonsymmetric linear systems, BIT Numer. Math., 56 (2016), pp. 77–97.

43