analisis libro 1
TRANSCRIPT
-
7/27/2019 Analisis Libro 1
1/99
Script for the lectures on: Numerical Linear Algebra
Einführung in die numerische Mathematik (Introduction to Numerical Mathematics)
Prof. Dr. P.E. Kloeden
Institut für Mathematik
Johann Wolfgang Goethe Universität
Zimmer 101, Robert-Mayer-Straße 10
Telefon: (069) 798 28622 Sekretariat (069) 798 22422
email: [email protected]
February 9, 2009
Contents
1 Computer Arithmetic 3
2 Vector and matrix norms 7
2.1 Matrix norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Error estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.1 Condition number of a matrix . . . . . . . . . . . . . . . . 12
2.2.2 Fixed point theorem and successive iterations . . . . . . . 13
3 Linear systems of equations 15
3.1 Gaussian elimination . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.1.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.1.2 Row interchange . . . . . . . . . . . . . . . . . . . . . . . 18
3.2 Formulation as matrix multiplication . . . . . . . . . . . . . . . . 18
4 The LU decomposition 21
4.1 Row interchange . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.2 Pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.3 Post iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.4 The LU decomposition of the transposed matrix . . . . . . . . . 28
5 Matrices with a special structure 30
5.1 Symmetric matrices . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.1.1 Gaussian elimination without row interchange . . . . . . . 30
5.1.2 The LDLT and Cholesky decompositions . . . . . . . . . 31
5.2 Positive definite symmetric matrices . . . . . . . . . . . . . . 33
5.3 Diagonally dominant matrices . . . . . . . . . . . . . . . . . . 36
5.4 Band matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.5 Tridiagonal matrices . . . . . . . . . . . . . . . . . . . . . . . . . 41
6 The QR decomposition 44
6.1 Householder matrices . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.2 Construction of the QR factors . . . . . . . . . . . . . . . . . . . 49
6.2.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
7 Iterative methods 53
7.1 Relaxation methods . . . . . . . . . . . . . . . . . . . . . . . 58
7.2 The SOR method . . . . . . . . . . . . . . . . . . . . . . . . 63
8 Krylov space methods 65
8.1 Krylov spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
8.1.1 Properties of Krylov spaces . . . . . . . . . . . . . . . . . 66
8.2 The OR-approach for symmetric, positive definite matrices . . . 67
8.2.1 Existence, uniqueness and minimality . . . . . . . . . . . 67
8.2.2 The OR approach for an A-conjugate basis . . . . . . . . 68
8.3 The CG method for positive definite matrices . . . . . . . . . . . 70
8.3.1 Computing A-conjugate search directions in Kn(A, b) . . 70
8.3.2 The Algorithm for the CG method . . . . . . . . . . . . . 72
8.3.3 The CG method for the normal equations . . . . . . . . . 73
8.4 The GMRES method and Arnoldi process . . . . . . . . . . . . . 73
8.4.1 The Arnoldi process . . . . . . . . . . . . . . . . . . . . . 74
8.4.2 A matrix version of the Arnoldi process . . . . . . . . . . 76
9 Calculating eigenvalues 78
9.1 The location of eigenvalues . . . . . . . . . . . . . . . . . . . . . 78
9.1.1 Gerschgorin's theorem . . . . . . . . . . . . . . . . . . . . 81
9.2 The power method . . . . . . . . . . . . . . . . . . . . . . . . . . 85
9.3 The QR algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 90
9.3.1 The QR transformation of Hessenberg matrices . . . . . 92
9.3.2 Convergence in the simplest case . . . . . . . . . . . . . . 96
Chapter 1
Computer Arithmetic
Literature Oevel, Kap. 1.2.3
The number field of a computer is only finite. Hence, in general, we can only
calculate numbers approximately with a computer.
Computers use a floating point representation for numbers, i.e. the numbers have the form

x = \pm\, \underbrace{0.x_1 x_2 \ldots x_d}_{\text{mantissa}} \cdot 10^{t},

with base 10 and exponent t. The length of the mantissa (here: d) determines the accuracy of the computer representation of a number. For a base b (here b = 10; for computers b = 2 is typical) we have x_i \in \{0, 1, \ldots, b-1\} and x_1 \neq 0. The exponent is also bounded: -N + 1 \le t \le N.

Two immediate consequences of the boundedness of the exponent are overflow and underflow:

overflow: if |x| > 10^N, the calculation stops!

underflow: if |x| < 10^{-N}, the calculation continues with x = 0 (possibly with difficulties later).
Often we can avoid difficulties with a clever reformulation, e.g.

a = 10^{N-1} \;\Rightarrow\; a^2 = 10^{2N-2} > 10^N, \quad \text{overflow!}

but

\sqrt{a^2 + 1} = |a| \sqrt{1 + a^{-2}} = 10^{N-1} \sqrt{1 + 10^{2-2N}} \qquad (\text{underflow: } 10^{2-2N} \to 0)

= 10^{N-1} \sqrt{1 + 0} = 10^{N-1}.
Underflow can also be dangerous. Let a = 10^{N-1} and consider

\frac{1}{\sqrt{a^2 + 1} - a} \, .

Computing a^2 + 1 directly gives an overflow! Writing instead \sqrt{a^2+1} = a \sqrt{1 + a^{-2}}, the term a^{-2} = 10^{2-2N} underflows to 0, and we get

\frac{1}{a\sqrt{1+0} - a} = \frac{1}{0} \; !

But with a little bit of algebra:

\frac{1}{\sqrt{a^2+1} - a} = \frac{1}{\sqrt{a^2+1} - a} \cdot \frac{\sqrt{a^2+1} + a}{\sqrt{a^2+1} + a} = \sqrt{a^2+1} + a = a\sqrt{1 + a^{-2}} + a \approx 2 \cdot 10^{N-1} \quad \text{after underflow.}
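The same cancellation appears in IEEE double precision (binary floating point instead of the decimal model above). The following Python sketch compares the two algebraically equivalent formulas; the value a = 1e8 is an illustrative choice, large enough that a^2 + 1 is indistinguishable from a^2 in double precision.

```python
import math

a = 1.0e8  # large enough that sqrt(a*a + 1.0) rounds to a in double precision

# Direct evaluation: sqrt(a^2 + 1) - a suffers catastrophic cancellation.
direct = math.sqrt(a * a + 1.0) - a

# Cancellation-free reformulation: 1 / (sqrt(a^2 + 1) + a).
stable = 1.0 / (math.sqrt(a * a + 1.0) + a)

# The true value is approximately 1 / (2a) = 5e-9; `direct` loses all digits.
```

Here `stable` agrees with 1/(2a) to full precision, while `direct` evaluates to 0.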
Overflow and underflow are extreme cases. There is a more general complication: although each x \in [1/10, 1) has a unique decimal representation

x = 0.x_1 \ldots x_d x_{d+1} \ldots = \sum_{i=1}^{\infty} x_i \, 10^{-i} \qquad (x_1 \neq 0),

many x have a representation with more than d digits, e.g.

x = \frac{1}{3} = 0.333\ldots3\ldots

In a computer such numbers are replaced by numbers with d digits.
Truncation: x = 0.x_1 x_2 \ldots x_d x_{d+1} \ldots \;\mapsto\; \mathrm{Tr}(x) = 0.x_1 x_2 \ldots x_d

Rounding: x = 0.x_1 x_2 \ldots x_d x_{d+1} \ldots \;\mapsto\; R(x) = 0.x_1 \ldots x_{d-1} \tilde{x}_d, where

\tilde{x}_d = \begin{cases} x_d & \text{if } x_{d+1} < 5 \\ x_d + 1 & \text{if } x_{d+1} \ge 5 \end{cases}

Example

0.12344 \mapsto 0.1234 (truncation or rounding), but 0.12345 \mapsto 0.1234 (truncation), 0.12345 \mapsto 0.1235 (rounding).

The representation error by rounding is often smaller than by truncation:

|x - \mathrm{Tr}(x)| \le 10 \cdot 10^{-d-1}, \qquad |x - R(x)| \le 5 \cdot 10^{-d-1}, \qquad x \in [1/10, 1).

A similar situation holds for more general floating point numbers

x = 0.x_1 \ldots x_d x_{d+1} \ldots \cdot 10^t, \qquad x_1 \neq 0,

with

\mathrm{Tr}(x) = 0.x_1 \ldots x_d \cdot 10^t, \qquad R(x) = 0.x_1 \ldots \tilde{x}_d \cdot 10^t.

Here the relative error is more appropriate, e.g.

\left| \frac{x - \mathrm{Tr}(x)}{x} \right| \le 10 \cdot 10^{-d}, \qquad \text{since } |x - \mathrm{Tr}(x)| \le 10^{t-d} \text{ and } |x| \ge 10^{t-1}.
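The two operators Tr and R can be sketched with Python's decimal module, which works in base 10 like the model above; ROUND_DOWN corresponds to truncation and ROUND_HALF_UP to the rounding rule for the digit x_{d+1}.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

def truncate(x: Decimal, d: int) -> Decimal:
    # Tr(x): keep the first d digits after the decimal point, drop the rest.
    return x.quantize(Decimal(1).scaleb(-d), rounding=ROUND_DOWN)

def round_d(x: Decimal, d: int) -> Decimal:
    # R(x): round to d digits, rounding up when digit d+1 is >= 5.
    return x.quantize(Decimal(1).scaleb(-d), rounding=ROUND_HALF_UP)

x1 = Decimal("0.12344")   # both operators give 0.1234
x2 = Decimal("0.12345")   # truncation 0.1234, rounding 0.1235
```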
Remark: Rounding is preferred, but one should nevertheless be careful, e.g. with 5-digit arithmetic:
Evaluating 37654 + 25.874 - 37679 from left to right:

37654 + 25.874 = 37679.874 \approx 37680 \; (\text{rounding}), \qquad 37680 - 37679 = 1,

while the exact value is 37679.874 - 37679 = 0.874: not a single digit is accurate!

But

(37654 - 37679) + 25.874 = -25 + 25.874 = 0.874 \; !

i.e. the order in which numbers are handled can be important; it is better to combine numbers of similar magnitudes first.
In addition: rounding errors can accumulate. How this happens depends on the type of calculation, as we shall see later!
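The 5-digit example can be simulated in Python with a small helper fl that rounds every intermediate result to 5 significant decimal digits, an idealisation of a 5-digit decimal computer:

```python
import math

def fl(x: float, digits: int = 5) -> float:
    # Round x to `digits` significant decimal digits, simulating a
    # decimal computer with a short mantissa.
    if x == 0.0:
        return 0.0
    e = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - e)

# (37654 + 25.874) - 37679 evaluated left to right in 5-digit arithmetic:
bad = fl(fl(37654.0 + 25.874) - 37679.0)   # 37679.874 -> 37680, then - 37679

# (37654 - 37679) + 25.874: combining numbers of similar magnitude first:
good = fl(fl(37654.0 - 37679.0) + 25.874)  # -25, then 0.874 exactly
```

The left-to-right order returns 1, the reordered sum returns the exact value 0.874.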
Chapter 2
Vector and matrix norms
Literatur Oevel, Kap. 5.13; Stummel/Hainer, Kap. 5
Problems:

(1) Many applications in practice involve linear systems of equations Ax = b
\Rightarrow error estimates, convergence ??

\|Ax - Ay\| \le K \|x - y\|

(2) e.g., line of best fit problems lead, for measurement errors, to minimisation problems of the form

\|Ax - b\| \to \min_x

Starting point: the vector space \mathbb{R}^n, or more generally K^n over a field K = \mathbb{C} or K = \mathbb{R}.

Example: x = (x_1, x_2)^T \in \mathbb{R}^2.

[Figure: the vector x in the plane, drawn from the origin 0.]
\|x\|_2 = \sqrt{x_1^2 + x_2^2} \quad \text{(length of } x\text{)}
Definition: A normed space (E, \|\cdot\|) consists of a vector space E over a field K and a norm \|\cdot\| : E \to \mathbb{R}, where

(i) \|x\| \ge 0 for all x
(ii) \|x\| = 0 \iff x = 0
(iii) \|\alpha x\| = |\alpha| \, \|x\| for all \alpha \in K
(iv) \|x + y\| \le \|x\| + \|y\| (triangle inequality)
Examples: For x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n

\|x\|_2 = \sqrt{|x_1|^2 + \ldots + |x_n|^2} \quad \text{(euclidean norm)}
\|x\|_\infty = \max\{|x_1|, \ldots, |x_n|\} \quad \text{(maximum norm)}
\|x\|_1 = |x_1| + \ldots + |x_n| \quad \text{(summation norm, or Manhattan norm)}

are norms on \mathbb{R}^n, and the inequalities

\|x\|_\infty \le \|x\|_2 \le \|x\|_1

hold for all x \in \mathbb{R}^n. In addition, there are the p-norms:

\|x\|_p := \sqrt[p]{|x_1|^p + \ldots + |x_n|^p}, \qquad p = 1, 2, 3, \ldots .
Geometrical visualisation

d(x, y) = \|x - y\| is the distance between two points x and y.

Unit sphere S_1^{(p)} = \{x \in \mathbb{R}^n : \|x\|_p = 1\}, p = 1, 2, 3, \ldots .

[Figure: unit balls in \mathbb{R}^2 for p = 1, 2 and \infty.]
theorem All norms on \mathbb{R}^n are equivalent, i.e. for two norms \|\cdot\|_a and \|\cdot\|_b there exist numbers c, C > 0 with

c \|x\|_a \le \|x\|_b \le C \|x\|_a, \qquad x \in \mathbb{R}^n.
proof: We will show that an arbitrary norm \|\cdot\| on \mathbb{R}^n is equivalent to the summation norm \|\cdot\|_1.

Every vector x \in \mathbb{R}^n has a unique coordinate representation x = \sum_{i=1}^n x_i e_i, where e_i is the ith unit vector, i.e., with e_{i,j} = \delta_{i,j} (Kronecker delta). Thus we have

\|x\| \le \sum_{i=1}^n |x_i| \, \|e_i\| \le C_1 \sum_{i=1}^n |x_i| = C_1 \|x\|_1,

where C_1 = \max_{i=1,\ldots,n} \|e_i\|.

The mapping x \mapsto \|x\| is Lipschitz continuous w.r.t. (i.e. with respect to) the norm \|\cdot\|_1:

\big| \, \|x\| - \|y\| \, \big| \le \|x - y\| \le C_1 \|x - y\|_1,

and S_1 = \{x \in \mathbb{R}^n : \|x\|_1 = 1\} is compact w.r.t. the norm \|\cdot\|_1. Therefore there exists a constant

C_0 := \min_{x \in S_1} \|x\| > 0,

where C_0 > 0 because \|x\| = 0 if and only if x = 0. For an arbitrary x \in \mathbb{R}^n \setminus \{0\} we have x/\|x\|_1 \in S_1 and

\left\| \frac{x}{\|x\|_1} \right\| = \frac{1}{\|x\|_1} \|x\| \ge C_0 \quad\Rightarrow\quad \|x\| \ge C_0 \|x\|_1.
Example: \|x\|_\infty \le \|x\|_2 \le \sqrt{n} \, \|x\|_\infty.
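The three norms and the inequalities between them can be checked with a few lines of Python (the vector x below is an illustrative choice):

```python
import math

def norm_1(x):    # summation (Manhattan) norm
    return sum(abs(xi) for xi in x)

def norm_2(x):    # euclidean norm
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))

def norm_inf(x):  # maximum norm
    return max(abs(xi) for xi in x)

x = [3.0, -4.0, 1.0]
n = len(x)
```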
Definition: A sequence \{x^{(k)}\}_k \subset \mathbb{R}^n is said to converge to x \in \mathbb{R}^n when

x_i^{(k)} \to x_i \quad (k \to \infty) \quad \text{for } i = 1, \ldots, n,

i.e., component-wise convergence.

theorem:

x^{(k)} \to x \; (k \to \infty) \iff \|x^{(k)} - x\| \to 0 \; (k \to \infty).
proof: Consider w.l.o.g. (i.e., without loss of generality) \|x\| = \|x\|_\infty = \max\{|x_1|, \ldots, |x_n|\}.
2.1 Matrix norms
The space

K^{m \times n} := \{A : A = [a_{i,j}] \text{ matrix with } m \text{ rows and } n \text{ columns}, \; a_{i,j} \in K\}

is a vector space over the field K.

Consider matrices A, B \in K^{m \times n} and a scalar \alpha \in K:

A + B \in K^{m \times n}, \qquad \alpha A \in K^{m \times n}.

Task: Define a norm \|A\| on K^{m \times n} !
1st Approach: vectorize A = [a_{i,j}], i.e., reformulate the m \times n-matrix as an mn-dimensional vector through

a_\nu = a_{i,j}, \qquad \nu = i + m(j-1), \quad i = 1, \ldots, m, \; j = 1, \ldots, n.

Maximum norm: \|A\|_{\max} = \max_i \max_j |a_{i,j}|,

Frobenius norm: \|A\|_F = \sqrt{ \sum_{i=1}^m \sum_{j=1}^n |a_{i,j}|^2 }.
2nd Approach: the induced (or natural) matrix norm

\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{\|x\|=1} \|Ax\|,

which is often called the operator norm.
Definition: A matrix norm \|\cdot\|_M is said to be consistent with the vector norm \|\cdot\|_V if

\|Ax\|_V \le \|A\|_M \|x\|_V, \qquad A \in K^{n \times n}, \; x \in K^n.

Remark: After matrix multiplication a vector x becomes at most \|A\| times bigger.

Remark: The induced matrix norm is the smallest of all consistent matrix norms.

Examples: A \in \mathbb{R}^{n \times n}, x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n.
(1) Maximum norm \|x\|_\infty = \max\{|x_1|, \ldots, |x_n|\}:

\|Ax\|_\infty = \max_i \Big| \sum_{j=1}^n a_{i,j} x_j \Big| \le \underbrace{\Big( \max_i \sum_{j=1}^n |a_{i,j}| \Big)}_{\|A\|_\infty} \cdot \underbrace{\max_j |x_j|}_{\|x\|_\infty}

\Rightarrow row summation norm

\|A\|_\infty = \max_i \sum_{j=1}^n |a_{i,j}|.
(2) Summation norm \|x\|_1 = |x_1| + \ldots + |x_n|:

\Rightarrow column summation norm

\|A\|_1 = \max_j \sum_{i=1}^n |a_{i,j}|.

(3) \|A\|_2 = \sqrt{\text{largest eigenvalue of } A^T A}.
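A short Python sketch of the row and column summation norms, together with a consistency check for the maximum norm (the 2x2 matrix and vector are illustrative choices):

```python
def row_sum_norm(A):
    # ||A||_inf: induced by the maximum norm, the largest row sum.
    return max(sum(abs(a) for a in row) for row in A)

def col_sum_norm(A):
    # ||A||_1: induced by the summation norm, the largest column sum.
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def mat_vec(A, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

A = [[1.0, -2.0], [3.0, 4.0]]
x = [1.0, -1.0]
```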
theorem All matrix norms on K^{n \times n} are equivalent.

theorem A^{(k)} converges to A component-wise \iff \|A^{(k)} - A\| \to 0 for k \to \infty.

proof: As for vector norms.

theorem The induced matrix norms on K^{n \times n} are submultiplicative, i.e.

\|AB\| \le \|A\| \, \|B\|, \qquad A, B \in K^{n \times n}.

proof: For x with Bx \neq 0,

\frac{\|ABx\|}{\|x\|} = \frac{\|ABx\|}{\|Bx\|} \cdot \frac{\|Bx\|}{\|x\|} \overset{y = Bx}{=} \frac{\|Ay\|}{\|y\|} \cdot \frac{\|Bx\|}{\|x\|} \le \|A\| \, \|B\|.
Definition: Let A be a square matrix. Then

\rho(A) = \max\{ |\lambda| : \lambda \text{ eigenvalue of } A \}

is called the spectral radius of A.
theorem The spectral radius \rho(A) is a lower bound for all submultiplicative matrix norms on K^{n \times n}, i.e.

\|A\| \ge \rho(A) \quad \text{for all such matrix norms } \|\cdot\|.

proof: See Oevel, page 158.

Remark: \rho(A) = \|A\|_2 for symmetric matrices A.
2.2 Error estimates
Let us consider the effects of computational errors in solving a system of linear
equations Ax = b
a-posteriori error: Let x_{num} be the numerical solution of Ax = b. Define the defect d = A x_{num} - b. The error r = x - x_{num} then satisfies Ar = -d.

Let \|x\| be a vector norm and \|A\| the induced matrix norm. Then

(1) Ar = -d \;\Rightarrow\; \|d\| \le \|A\| \, \|r\| \;\Rightarrow\; \|r\| \ge \|d\| / \|A\|,

(2) r = -A^{-1} d \;\Rightarrow\; \|r\| \le \|A^{-1}\| \, \|d\|.
2.2.1 Condition number of a matrix
We have

\frac{\|d\|}{\|A\|} \le \|x - x_{num}\| \le \|A^{-1}\| \, \|d\|,

but it is better to use the relative error

\frac{1}{K} \frac{\|d\|}{\|b\|} \le \frac{\|x - x_{num}\|}{\|x\|} \le K \frac{\|d\|}{\|b\|},

where K = \|A\| \, \|A^{-1}\| = \mathrm{cond}(A) (divide here by \|x\| = \|A^{-1} b\|).

Definition: \mathrm{cond}(A) = \|A\| \, \|A^{-1}\| is called the condition number of A.

Remarks:

(1) 1 = \|A A^{-1}\| \le \|A\| \, \|A^{-1}\| = \mathrm{cond}(A).
(2) \mathrm{cond}(A) is a measure of the degree of invertibility of A, i.e. a measure of the quality (conditioning) of the problem.

(3) An orthogonal A always has, w.r.t. \|\cdot\|_2, the minimal condition number 1.

(4) \frac{1}{\|A^{-1}\|} = \Big( \max_{y \neq 0} \frac{\|A^{-1} y\|}{\|y\|} \Big)^{-1} \overset{y = Ax}{=} \min_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \min_{\|x\|=1} \|Ax\|

\Rightarrow \quad \mathrm{cond}(A) = \frac{ \max_{\|x\|=1} \|Ax\| }{ \min_{\|x\|=1} \|Ax\| } \, .
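For a small matrix with an explicitly known inverse the condition number can be computed directly; the sketch below uses the maximum norm, and the 2x2 example matrix is an illustrative choice, not taken from the text.

```python
def row_sum_norm(A):
    # Matrix norm induced by the maximum vector norm (row summation norm).
    return max(sum(abs(a) for a in row) for row in A)

A     = [[1.0, 2.0], [3.0, 4.0]]
A_inv = [[-2.0, 1.0], [1.5, -0.5]]   # exact inverse, since det(A) = -2

# Verify the inverse: A * A_inv should be the identity.
I2 = [[sum(A[i][k] * A_inv[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]

cond_inf = row_sum_norm(A) * row_sum_norm(A_inv)   # ||A|| * ||A^{-1}|| = 7 * 3
```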
2.2.2 Fixed point theorem and successive iterations
A contraction is a mapping f : \mathbb{R}^d \to \mathbb{R}^d with

\|f(x) - f(y)\| \le K \|x - y\|, \qquad x, y \in \mathbb{R}^d,

for a constant K < 1.

theorem Let T be a d \times d-matrix and c \in \mathbb{R}^d. The linear mapping \Phi(x) = Tx + c is a contraction if and only if \|T\| < 1.

proof

\|T\| = \max_{z \neq 0} \frac{\|Tz\|}{\|z\|} = \max_{x \neq y} \frac{\|Tx - Ty\|}{\|x - y\|} = \max_{x \neq y} \frac{\|\Phi(x) - \Phi(y)\|}{\|x - y\|} \, .
theorem The successive iterations x^{(i+1)} = T x^{(i)} + c converge for all x^{(0)} if and only if \rho(T) < 1.

A fixed point theorem ensures that a mapping \Phi(x) has a unique fixed point, i.e. a point \bar{x} \in \mathbb{R}^d with \bar{x} = \Phi(\bar{x}), and that the successive iterations x^{(i+1)} = \Phi(x^{(i)}) converge to this fixed point. In the fixed point theorem of Banach the mapping \Phi is a contraction.

We use the following estimates for the iterative solution of a system of linear equations Ax = b, which we rewrite as

x = Tx + c,

e.g. write x = (I - A)x + b, so T = I - A and c = b; but there are other and better possibilities (later!).

a priori error (what we know at the start):

\|x^{(i)} - \bar{x}\| \le \frac{\|T\|^i}{1 - \|T\|} \, \|x^{(1)} - x^{(0)}\|
a posteriori error (what we know in the course of a calculation):

\|x^{(i)} - \bar{x}\| \le \frac{\|T\|}{1 - \|T\|} \, \|x^{(i)} - x^{(i-1)}\|.

We use these estimates in particular for a stop command: let \varepsilon > 0 be the desired precision and i = i(\varepsilon) the first whole number for which the right hand side above is smaller than or equal to \varepsilon. If we stop after i(\varepsilon) iterations, then we have (at least) the precision \varepsilon.
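The successive iterations together with the a-posteriori stop command can be sketched in Python; the 2x2 matrix T is an illustrative choice with \|T\|_\infty = 0.6 < 1, so \Phi(x) = Tx + c is a contraction.

```python
def norm_inf(v):
    return max(abs(vi) for vi in v)

def successive_iterations(T, c, eps):
    # x^(i+1) = T x^(i) + c, stopped with the a-posteriori estimate
    #   ||x^(i) - xbar|| <= ||T|| / (1 - ||T||) * ||x^(i) - x^(i-1)|| <= eps
    n = len(c)
    norm_T = max(sum(abs(t) for t in row) for row in T)  # row summation norm
    assert norm_T < 1.0, "Phi(x) = Tx + c must be a contraction"
    x = [0.0] * n
    while True:
        x_new = [sum(T[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]
        diff = norm_inf([a - b for a, b in zip(x_new, x)])
        if norm_T / (1.0 - norm_T) * diff <= eps:
            return x_new
        x = x_new

T = [[0.5, 0.1], [0.2, 0.3]]   # ||T||_inf = 0.6 < 1
c = [1.0, 1.0]
x = successive_iterations(T, c, eps=1e-10)

# Exact fixed point of x = Tx + c, i.e. solution of (I - T) x = c:
x_bar = [0.8 / 0.33, 0.7 / 0.33]
```

The returned iterate is guaranteed (up to rounding) to be within eps of the fixed point.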
Chapter 3
Linear systems of equations
Literatur Oevel, Kap. 5; Schwarz Kap.1; Stummel/Hainer, Kap. 6
Consider a linear system of equations with n equations and n unknowns

a_{1,1} x_1 + \ldots + a_{1,n} x_n = b_1
\vdots
a_{n,1} x_1 + \ldots + a_{n,n} x_n = b_n

or in matrix-vector form

Ax = b,

where A = [a_{i,j}] is an n \times n invertible matrix and

x = (x_1, \ldots, x_n)^T, \qquad b = (b_1, \ldots, b_n)^T

are n-dimensional vectors. This system has a unique solution

x = A^{-1} b,

which we can represent explicitly using Cramer's rule, i.e. with

A^{-1} = \frac{1}{\det(A)} \, \big( \text{adjugate matrix of } A \big)^T \qquad (n^2 \text{ determinants}).

The Cramer formula is not very practical as a solution method and is often almost impossible to use:

\det(A) = \sum_p \mathrm{sign}(p) \, a_{1,p_1} \cdots a_{n,p_n},
where the summation is over all permutations p = (p_1, \ldots, p_n) of \{1, 2, \ldots, n\}. There are n! permutations, so we need O(n!) arithmetic operations, e.g.

100! \approx 9 \cdot 10^{157}.

Such a summation is often numerically unstable due to possible cancellation errors.
3.1 Gaussian elimination
A practical alternative is the Gaussian elimination method, by means of which
we convert the original system of equations to an easily solved triangular system using successive linear transformations
(1)

a_{1,1} x_1 + a_{1,2} x_2 + \ldots + a_{1,n} x_n = b_1
a_{2,1} x_1 + a_{2,2} x_2 + \ldots + a_{2,n} x_n = b_2
\vdots
a_{n,1} x_1 + a_{n,2} x_2 + \ldots + a_{n,n} x_n = b_n

Let a_{1,1} \neq 0. Then we eliminate x_1 from the last n-1 equations with the linear transformation

a_{i,j} \mapsto a'_{i,j} = a_{i,j} - \frac{a_{i,1}}{a_{1,1}} a_{1,j}, \qquad j = 1, \ldots, n,

b_i \mapsto b'_i = b_i - \frac{a_{i,1}}{a_{1,1}} b_1,

for i = 2, \ldots, n, i.e. we subtract \frac{a_{i,1}}{a_{1,1}} times the first equation from the equations for i = 2, \ldots, n.
Then we obtain the equivalent system of equations (i.e. with the same
solution):
(2)

a_{1,1} x_1 + a_{1,2} x_2 + \ldots + a_{1,n} x_n = b_1
\phantom{a_{1,1} x_1 +{}} a'_{2,2} x_2 + \ldots + a'_{2,n} x_n = b'_2
\vdots
\phantom{a_{1,1} x_1 +{}} a'_{n,2} x_2 + \ldots + a'_{n,n} x_n = b'_n

We repeat the procedure for the last n-1 equations with the n-1 unknowns x_2, \ldots, x_n (under the assumption that a'_{2,2} \neq 0), and so on for the last n - \ell equations with the n - \ell unknowns x_{\ell+1}, \ldots, x_n for \ell = 1, \ldots, n-1.
After the final step we obtain an equivalent triangular system of the form
a^{(n)}_{1,1} x_1 + a^{(n)}_{1,2} x_2 + a^{(n)}_{1,3} x_3 + \ldots + a^{(n)}_{1,n} x_n = b^{(n)}_1
\phantom{a^{(n)}_{1,1} x_1 +{}} a^{(n)}_{2,2} x_2 + a^{(n)}_{2,3} x_3 + \ldots + a^{(n)}_{2,n} x_n = b^{(n)}_2
\phantom{a^{(n)}_{1,1} x_1 + a^{(n)}_{2,2} x_2 +{}} a^{(n)}_{3,3} x_3 + \ldots + a^{(n)}_{3,n} x_n = b^{(n)}_3
\vdots
\phantom{a^{(n)}_{1,1} x_1 + \ldots +{}} a^{(n)}_{n,n} x_n = b^{(n)}_n

which we can solve through backwards substitution:

x_n = b^{(n)}_n / a^{(n)}_{n,n},

x_i = \Big( b^{(n)}_i - \sum_{j=i+1}^n a^{(n)}_{i,j} x_j \Big) \Big/ a^{(n)}_{i,i}, \qquad i = n-1, \ldots, 1.
For this we need

\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 1 = \sum_{i=1}^{n-1} (n-i) = \frac{n(n-1)}{2}

additions/subtractions and

1 + \sum_{i=1}^{n-1} \Big( 1 + \sum_{j=i+1}^{n} 1 \Big) = 1 + \sum_{i=1}^{n-1} (n-i+1) = \frac{n(n+1)}{2}

multiplications/divisions, in total \approx n^2 operations.
The Gaussian elimination method needs

\sum_{j=1}^{n-1} \sum_{i=j+1}^{n} \Big( 1 + \sum_{k=j+1}^{n+1} 1 \Big) = \frac{1}{6} (n-1) n (2n+5) = O(n^3)

multiplications/divisions and a similar number of additions/subtractions.

Thus for a large n Gaussian elimination with backwards substitution needs O(n^3) arithmetic operations. Compare with O(n!) for Cramer's formula!
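The elimination and backwards substitution formulas above translate directly into Python; this is a minimal sketch without row interchange, assuming all pivots are nonzero, and the 3x3 system is an illustrative example with exact solution x = (1, 1, 1)^T.

```python
def gauss_solve(A, b):
    # Gaussian elimination without row interchange, followed by
    # backwards substitution.  Assumes all pivots A[k][k] are nonzero.
    n = len(b)
    A = [row[:] for row in A]   # work on copies
    b = b[:]
    for k in range(n - 1):                      # elimination steps
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):              # backwards substitution
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

A = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
b = [4.0, 10.0, 24.0]
x = gauss_solve(A, b)
```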
3.1.1 Summary
Write Ax = b as A^{(1)} x = b^{(1)}, i.e. with a^{(1)}_{i,j} \equiv a_{i,j} and b^{(1)}_i \equiv b_i. We have replaced A^{(1)} x = b^{(1)} by successively simplified systems:
A^{(1)} x = b^{(1)} \;\Rightarrow\; A^{(2)} x = b^{(2)} \;\Rightarrow\; A^{(3)} x = b^{(3)} \;\Rightarrow\; \ldots \;\Rightarrow\; A^{(n)} x = b^{(n)},

using the following linear transformations:

a^{(\ell+1)}_{i,j} \equiv a^{(\ell)}_{i,j}, \qquad i = 1, \ldots, \ell, \quad j = 1, \ldots, n+1,

and

a^{(\ell+1)}_{i,j} = a^{(\ell)}_{i,j} - \frac{a^{(\ell)}_{i,\ell}}{a^{(\ell)}_{\ell,\ell}} \, a^{(\ell)}_{\ell,j}, \qquad i = \ell+1, \ldots, n, \quad j = 1, \ldots, n+1,

where we have written a^{(\ell)}_{i,n+1} = b^{(\ell)}_i.

In fact we only have to calculate the a^{(\ell+1)}_{i,j} with i \ge \ell + 1 and j \ge \ell + 1, because the other components are all equal to 0 or will become equal to 0.
3.1.2 Row interchange
Above we have assumed that a^{(\ell)}_{\ell,\ell} \neq 0 for \ell = 1, \ldots, n. This does not always hold, but (due to the assumption that \det(A) \neq 0) we can always exchange the \ell th row, for which a^{(\ell)}_{\ell,\ell} = 0, with a row under it with a^{(\ell)}_{j,\ell} \neq 0, where j > \ell.

For simplicity we will assume for now that a row interchange is not necessary.
3.2 Formulation as matrix multiplication
We can represent the elimination procedure by matrix multiplication

A^{(k+1)} = F^{(k)} A^{(k)},
where

F^{(k)} = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & -\ell_{k+1,k} & 1 & \\ & & \vdots & & \ddots \\ & & -\ell_{n,k} & & & 1 \end{pmatrix}

(the other components are all equal to 0) with

\ell_{i,j} = \frac{a^{(j)}_{i,j}}{a^{(j)}_{j,j}}, \qquad i = j+1, \ldots, n, \quad j = 1, \ldots, n-1.

Thus we have

A^{(n)} = F^{(n-1)} \cdots F^{(1)} A^{(1)},

or

A^{(1)} = \big(F^{(1)}\big)^{-1} \cdots \big(F^{(n-1)}\big)^{-1} A^{(n)} \qquad \big(\text{invertible since } \det(F^{(k)}) = 1\big).
But

\big(F^{(k)}\big)^{-1} = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & \ell_{k+1,k} & 1 & \\ & & \vdots & & \ddots \\ & & \ell_{n,k} & & & 1 \end{pmatrix},

where the other components are all equal to 0, and

L := \big(F^{(1)}\big)^{-1} \cdots \big(F^{(n-1)}\big)^{-1} = \begin{pmatrix} 1 & & & & \\ \ell_{2,1} & 1 & & & \\ \ell_{3,1} & \ell_{3,2} & 1 & & \\ \vdots & \vdots & & \ddots & \\ \ell_{n,1} & \ell_{n,2} & \ldots & \ell_{n,n-1} & 1 \end{pmatrix},

a lower triangular matrix, i.e. A = LU, where
R = U = A^{(n)} = \begin{pmatrix} a^{(n)}_{1,1} & a^{(n)}_{1,2} & \ldots & a^{(n)}_{1,n} \\ & a^{(n)}_{2,2} & \ldots & a^{(n)}_{2,n} \\ & & \ddots & \vdots \\ & & & a^{(n)}_{n,n} \end{pmatrix},

an upper triangular matrix.

(German: Links-/Rechtsdreiecksmatrizen, with A = LR.)

This LU decomposition has many advantages:
This LU decomposition has many advantages
(1) \det(A) = \det(L) \det(U) = 1 \cdot a^{(n)}_{1,1} a^{(n)}_{2,2} \cdots a^{(n)}_{n,n}.

(2) We can quickly solve Ax = b for all b without having to repeat the elimination procedure:

b = Ax = LUx = Ly \quad \text{with} \quad Ux = y,

solve (1) Ly = b (forwards substitution),
then (2) Ux = y (backwards substitution).

Here forwards substitution for Ly = b means

y_1 = b_1, \qquad y_i = b_i - \sum_{j=1}^{i-1} \ell_{i,j} y_j, \quad i = 2, \ldots, n.

Forwards and backwards substitution both need O(n^2) arithmetic operations.

(3) To calculate the inverse matrix A^{-1} we solve the n systems of equations

A x^{(j)} = e^{(j)} \; (j\text{th unit vector}), \qquad j = 1, \ldots, n,

i.e. L y^{(j)} = e^{(j)}, then U x^{(j)} = y^{(j)}, and

A^{-1} = [x^{(1)} | \ldots | x^{(n)}].

This method needs O(n^3) arithmetic operations.
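Advantage (2) can be sketched in Python: factor once, then solve for several right hand sides by forwards and backwards substitution. This is a Doolittle-style sketch without row interchange, and the 3x3 matrix is an illustrative example.

```python
def lu_decompose(A):
    # A = L U with unit lower triangular L; assumes nonzero pivots.
    n = len(A)
    U = [row[:] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U

def lu_solve(L, U, b):
    n = len(b)
    y = [0.0] * n
    for i in range(n):                      # forwards substitution  L y = b
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # backwards substitution U x = y
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

A = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
L, U = lu_decompose(A)                      # factor once ...
x1 = lu_solve(L, U, [4.0, 10.0, 24.0])      # ... then solve for several b
x2 = lu_solve(L, U, [2.0, 4.0, 8.0])
```

Each additional right hand side costs only the O(n^2) substitutions.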
Chapter 4
The LU decomposition
Literatur Oevel, Kap. 5.5; Schwarz Kap.1.1; Stummel/Hainer, Kap. 6.1
Consider an n \times n invertible matrix A = [a_{i,j}]. If row interchange is not necessary, the Gaussian elimination procedure

A = A^{(1)} \to A^{(2)} \to \ldots \to A^{(n)}

leads to the following LU decomposition:

A = LU = \begin{pmatrix} 1 & & & & \\ \ell_{2,1} & 1 & & & \\ \ell_{3,1} & \ell_{3,2} & \ddots & & \\ \vdots & \vdots & \ddots & 1 & \\ \ell_{n,1} & \ell_{n,2} & \ldots & \ell_{n,n-1} & 1 \end{pmatrix} \begin{pmatrix} a^{(n)}_{1,1} & \ldots & a^{(n)}_{1,n} \\ & a^{(n)}_{2,2} & \ldots & a^{(n)}_{2,n} \\ & & \ddots & \vdots \\ & & & a^{(n)}_{n,n} \end{pmatrix},

where the \ell_{i,j} are defined by

\ell_{i,j} = \frac{a^{(j)}_{i,j}}{a^{(j)}_{j,j}}, \qquad i = j+1, \ldots, n, \quad j = 1, \ldots, n-1.
theorem The LU decomposition is unique.

proof: Let L_1, U_1 and L_2, U_2 be two LU decompositions of the matrix A. Then

A = L_1 U_1 \quad \text{and} \quad A = L_2 U_2,

and therefore

L_1 U_1 = L_2 U_2 \qquad \text{or} \qquad L_2^{-1} L_1 = U_2 U_1^{-1}.

But L_1, L_2 and therefore L_2^{-1} and L_2^{-1} L_1 are lower triangular matrices. Similarly, U_1, U_2 and therefore U_1^{-1} and U_2 U_1^{-1} are upper triangular matrices. These products can only be equal when they are diagonal matrices, i.e.

L_2^{-1} L_1 = U_2 U_1^{-1} = D = \mathrm{diag}(d_{1,1}, d_{2,2}, \ldots, d_{n,n}).

But L_1, L_2 and therefore L_2^{-1} and L_2^{-1} L_1 all have 1 as their diagonal components. Thus D = I, the n \times n identity matrix, i.e.

L_2^{-1} L_1 = I = U_2 U_1^{-1},

or L_1 = L_2 and U_1 = U_2. Hence the LU decomposition is unique.
In the kth step A^{(k)} \to A^{(k+1)} of the elimination process we apply the linear transformations

a^{(k+1)}_{i,j} \equiv a^{(k)}_{i,j}, \qquad i = 1, \ldots, k, \quad j = 1, \ldots, n,

and

a^{(k+1)}_{i,j} = a^{(k)}_{i,j} - \ell_{i,k} \, a^{(k)}_{k,j}, \qquad i = k+1, \ldots, n, \quad j = 1, \ldots, n.

But the new a^{(k+1)}_{i,j} are all equal to 0 for j = 1, \ldots, k when i = k+1, \ldots, n. Thus we can replace these components immediately by 0 without doing a calculation (which may have a round off error). Then the corresponding components are always equal to 0 in the following steps.

Thus it is not necessary to store these 0 components of the matrix. Instead we can store the \ell_{i,j} components in these free places. (We do not have to store the diagonal components of L, since we know that they are equal to 1.) In the following steps of the elimination procedure these \ell_{i,j} components are not transformed.

Consequently, we need to store only one n \times n matrix instead of two.

[Diagram: in A^{(2)} the first column below the diagonal holds \ell_{2,1}, \ldots, \ell_{n,1}; in A^{(3)} the second column below the diagonal additionally holds \ell_{3,2}, \ldots, \ell_{n,2}; and so on.]
4.1 Row interchange
An interchange of the \ell th and jth rows with j > \ell corresponds to multiplication by the permutation matrix

P_{\ell j} = \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 0 & \ldots & 1 & \\ & & \vdots & \ddots & \vdots & \\ & & 1 & \ldots & 0 & \\ & & & & & \ddots \end{pmatrix} \begin{matrix} \\ \leftarrow (\ell) \\ \\ \leftarrow (j) \\ \\ \end{matrix}

where \det(P_{\ell j}) = -1.

If a^{(k)}_{k,k} = 0, then we can find such a permutation matrix P^{(k)}. We replace A^{(k)} by P^{(k)} A^{(k)} and continue with the elimination procedure. If a^{(k)}_{k,k} \neq 0, then we take P^{(k)} = I. At the end we obtain an LU decomposition

LU = PA,

where P = P^{(n-1)} \cdots P^{(2)} P^{(1)}, with

\det P = (-1)^{\sigma_A}, \qquad \sigma_A = \#\{k : P^{(k)} \neq I\},

\det(A) = (-1)^{\sigma_A} \, a^{(n)}_{1,1} a^{(n)}_{2,2} \cdots a^{(n)}_{n,n}.

Thus in the derivation of the LU decomposition we have to exchange the whole row

[\ell_{i,1}, \ldots, \ell_{i,k-1}, a^{(k)}_{i,k}, \ldots, a^{(k)}_{i,n}]
if a row interchange is necessary. In this case we should introduce a permutation vector

p^{(k)} = \big( p^{(k)}_1, \ldots, p^{(k)}_n \big)^T

in order to retain a list of the corresponding row ordering, i.e. elimination with row interchange gives:
[Diagram: A^{(1)} = A \to A^{(2)} \to \ldots \to A^{(n)} = R = U after interchange and elimination; the \ell_{i,j} are stored below the diagonal, and alongside each A^{(k)} the permutation vector p^{(k)} records the current row ordering.]

with the permutation vectors

p^{(1)} = (1, 2, \ldots, n)^T, \quad p^{(2)} = \big( p^{(2)}_1, p^{(2)}_2, \ldots, p^{(2)}_n \big)^T, \quad \ldots, \quad p^{(n)} = \big( p^{(n)}_1, p^{(n)}_2, \ldots, p^{(n)}_n \big)^T.
At the end we define a permutation matrix P = [p_{i,j}] by

p_{i,j} = \begin{cases} 1 & \text{if } j = p^{(n)}_i \\ 0 & \text{otherwise} \end{cases}

(This permutation matrix and the permutation matrix P = P^{(n-1)} \cdots P^{(2)} P^{(1)} defined above are the same.)

This way we obtain

PA = LU

with

L = \begin{pmatrix} 1 & & & \\ \ell_{2,1} & 1 & & \\ \vdots & & \ddots & \\ \ell_{n,1} & \ell_{n,2} & \ldots & 1 \end{pmatrix} \quad \text{and} \quad U = \begin{pmatrix} a^{(n)}_{1,1} & \ldots & a^{(n)}_{1,n} \\ & a^{(n)}_{2,2} & \ldots & a^{(n)}_{2,n} \\ & & \ddots & \vdots \\ & & & a^{(n)}_{n,n} \end{pmatrix}.
Example (Oevel, page 115, Example 5.6):

A = \begin{pmatrix} 0 & 0 & 1 & 1 \\ 2 & 2 & 2 & 2 \\ 1 & 2 & 2 & 2 \\ 1 & 2 & 3 & 6 \end{pmatrix}, \qquad P = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},

PA = \begin{pmatrix} 2 & 2 & 2 & 2 \\ 1 & 2 & 2 & 2 \\ 0 & 0 & 1 & 1 \\ 1 & 2 & 3 & 6 \end{pmatrix} = LR = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1/2 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1/2 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 2 & 2 & 2 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 3 \end{pmatrix}.
proof (elimination with row interchange, storing the \ell_{i,j} below the diagonal and the permutation vector p in the first column):

(p \,|\, A) = \begin{pmatrix} 1 & 0 & 0 & 1 & 1 \\ 2 & 2 & 2 & 2 & 2 \\ 3 & 1 & 2 & 2 & 2 \\ 4 & 1 & 2 & 3 & 6 \end{pmatrix} \overset{P}{\to} \begin{pmatrix} 2 & 2 & 2 & 2 & 2 \\ 1 & 0 & 0 & 1 & 1 \\ 3 & 1 & 2 & 2 & 2 \\ 4 & 1 & 2 & 3 & 6 \end{pmatrix} \overset{E}{\to} \begin{pmatrix} 2 & 2 & 2 & 2 & 2 \\ 1 & 0 & 0 & 1 & 1 \\ 3 & 1/2 & 1 & 1 & 1 \\ 4 & 1/2 & 1 & 2 & 5 \end{pmatrix}

\overset{P}{\to} \begin{pmatrix} 2 & 2 & 2 & 2 & 2 \\ 3 & 1/2 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 & 1 \\ 4 & 1/2 & 1 & 2 & 5 \end{pmatrix} \overset{E}{\to} \begin{pmatrix} 2 & 2 & 2 & 2 & 2 \\ 3 & 1/2 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 & 1 \\ 4 & 1/2 & 1 & 1 & 4 \end{pmatrix} \overset{E}{\to} \begin{pmatrix} 2 & 2 & 2 & 2 & 2 \\ 3 & 1/2 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 & 1 \\ 4 & 1/2 & 1 & 1 & 3 \end{pmatrix}

i.e., with the combined permutation

p^{(1)} = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix} \quad\to\quad p^{(n)} = \begin{pmatrix} 2 \\ 3 \\ 1 \\ 4 \end{pmatrix},

and hence

P = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad LU = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1/2 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1/2 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 2 & 2 & 2 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 3 \end{pmatrix},

PA = \begin{pmatrix} 2 & 2 & 2 & 2 \\ 1 & 2 & 2 & 2 \\ 0 & 0 & 1 & 1 \\ 1 & 2 & 3 & 6 \end{pmatrix} = LU.
Here \det P = 1 (the permutation (2, 3, 1, 4) is even), so

\det A = \det P \cdot \det A = \det(PA) = \det L \cdot \det U = 1 \cdot (2 \cdot 1 \cdot 1 \cdot 3) = 6.

(This provides a quick and accurate way to calculate the determinant of the matrix A.)
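The example can be verified with a small Python LU routine with row interchange. The sketch below uses the common largest-pivot rule (swap in the row with the largest pivot candidate, not only when the pivot is zero, with ties broken by the first occurrence); for this matrix it happens to reproduce the permutation of the Oevel example.

```python
def lu_pivot(A):
    # PA = LU with row interchange: returns p, L, U, where the
    # permutation vector p means (PA)[i] = A[p[i]].
    n = len(A)
    U = [row[:] for row in A]
    L = [[0.0] * n for _ in range(n)]
    p = list(range(n))
    for k in range(n - 1):
        m = max(range(k, n), key=lambda i: abs(U[i][k]))  # pivot row
        if m != k:                      # swap rows of U, L and p together
            U[k], U[m] = U[m], U[k]
            L[k], L[m] = L[m], L[k]
            p[k], p[m] = p[m], p[k]
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    for i in range(n):
        L[i][i] = 1.0
    return p, L, U

A = [[0.0, 0.0, 1.0, 1.0],
     [2.0, 2.0, 2.0, 2.0],
     [1.0, 2.0, 2.0, 2.0],
     [1.0, 2.0, 3.0, 6.0]]
p, L, U = lu_pivot(A)

PA = [A[i][:] for i in p]
LU_prod = [[sum(L[i][k] * U[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
det_A = U[0][0] * U[1][1] * U[2][2] * U[3][3]   # times det(P) = +1 here
```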
4.2 Pivoting
Consider the elimination step A^{(k)} \to A^{(k+1)}. If a^{(k)}_{k,k} \neq 0, then we use a^{(k)}_{k,k} as the pivot element in the elimination process.

If a^{(k)}_{k,k} = 0, then we swap the kth row with the jth row for some j > k. Then we use the new component a^{(k)}_{k,k} (in fact the old component a^{(k)}_{j,k} \neq 0) as the pivot element.

How should we choose j ?

In addition, we can also encounter difficulties due to round off error even when a^{(k)}_{k,k} \neq 0, especially when it is very small or very large. In this situation it is also useful to look for a new pivot element.

There are various pivoting strategies, which may involve column as well as row interchanges, for choosing an appropriate pivot element a^{(k)}_{p,q} with p, q \ge k. Such strategies are usually very expensive to use. (See text books for some examples.)
4.3 Post iteration
Literatur Schwarz: Seite 27-29, Stummel/Hainer: Seite 118-119
Consider the system of equations

Ax = b, \qquad A \; n \times n \text{ and invertible.}

Let \bar{x} be the exact solution, i.e. A\bar{x} = b, and let \tilde{x} be the numerical solution with the defect (or residual)

d = A\tilde{x} - b \; (\neq 0).

Then the error r = \bar{x} - \tilde{x} satisfies the system of equations

Ar = -d,
i.e. with the same matrix A.
We can solve this system quickly because we have already determined the
LU decomposition of the matrix A.
Method of post iteration

Compute:

(i) the LU decomposition of A,
(ii) a numerical solution \tilde{x},
(iii) the defect d = A\tilde{x} - b,
(iv) a numerical solution \tilde{r} of Ar = -d with a higher precision.

Then \tilde{x} + \tilde{r} should be a better approximation of the exact solution \bar{x} than \tilde{x}.
Example: See Example 1.7 in Schwarz.
We can use the error estimate

\frac{\|b\|}{\|A\|} \le \|x\| \le \|A^{-1}\| \, \|b\|

from Chapter 2 for the system Ax = b, where the vector and matrix norms are consistent.

Now consider the system of equations Ar = -d. Then we have the following estimate of the absolute error

\frac{\|d\|}{\|A\|} \le \|\bar{x} - \tilde{x}\| \le \|A^{-1}\| \, \|d\|

and of the relative error

\frac{1}{K} \frac{\|d\|}{\|b\|} \le \frac{\|\bar{x} - \tilde{x}\|}{\|\bar{x}\|} \le K \frac{\|d\|}{\|b\|},

where K = \|A\| \, \|A^{-1}\| is the condition number of the matrix A. The term \|\bar{x}\| is unknown, but \|\tilde{x}\| is known and is often an adequate approximation for it in this relative error expression.
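One step of post iteration can be sketched in Python. The plain elimination solver below stands in for reusing the stored LU factors of A; the 2x2 matrix and the deliberately perturbed "numerical" solution are illustrative choices.

```python
def solve(A, b):
    # Gaussian elimination + backwards substitution (a stand-in for
    # reusing the already-computed LU decomposition of A).
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def refine(A, b, x_tilde):
    # One step of post iteration: d = A x_tilde - b, solve A r = -d,
    # return the corrected approximation x_tilde + r.
    n = len(b)
    d = [sum(A[i][j] * x_tilde[j] for j in range(n)) - b[i] for i in range(n)]
    r = solve(A, [-di for di in d])
    return [xi + ri for xi, ri in zip(x_tilde, r)]

A = [[4.0, 1.0], [1.0, 3.0]]
b = [5.0, 4.0]                  # exact solution (1, 1)
x_tilde = [0.99, 1.02]          # deliberately perturbed "numerical" solution
x_better = refine(A, b, x_tilde)
```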
4.4 The LU decomposition of the transposed matrix
Consider an n \times n invertible matrix A = [a_{i,j}] and suppose that we can apply Gaussian elimination without row interchange, i.e.

A^{(1)} \to \ldots \to A^{(k)} \to A^{(k+1)} \to \ldots \to A^{(n)},

with the linear transformations

a^{(k+1)}_{i,j} \equiv a^{(k)}_{i,j}, \qquad i = 1, \ldots, k, \quad j = 1, \ldots, n,

and

a^{(k+1)}_{i,j} = a^{(k)}_{i,j} - \ell_{i,k} \, a^{(k)}_{k,j}, \qquad i = k+1, \ldots, n, \quad j = 1, \ldots, n,

where

\ell_{i,j} = \frac{a^{(j)}_{i,j}}{a^{(j)}_{j,j}}, \qquad j = 1, \ldots, n-1, \quad i = j+1, \ldots, n.

(No row interchange means that all a^{(j)}_{j,j} \neq 0.)

Then we obtain

A = LU = \begin{pmatrix} 1 & & & \\ \ell_{2,1} & 1 & & \\ \vdots & & \ddots & \\ \ell_{n,1} & \ell_{n,2} & \ldots & 1 \end{pmatrix} \begin{pmatrix} a^{(n)}_{1,1} & \ldots & \ldots & a^{(n)}_{1,n} \\ & a^{(n)}_{2,2} & \ldots & a^{(n)}_{2,n} \\ & & \ddots & \vdots \\ & & & a^{(n)}_{n,n} \end{pmatrix}.
Moreover, this LU decomposition is unique!

Thus we have

A^T = (LU)^T = U^T L^T = \begin{pmatrix} a^{(n)}_{1,1} & & & \\ a^{(n)}_{1,2} & a^{(n)}_{2,2} & & \\ \vdots & \vdots & \ddots & \\ a^{(n)}_{1,n} & a^{(n)}_{2,n} & \ldots & a^{(n)}_{n,n} \end{pmatrix} \begin{pmatrix} 1 & \ell_{2,1} & \ldots & \ell_{n,1} \\ & 1 & \ldots & \ell_{n,2} \\ & & \ddots & \ell_{n,n-1} \\ & & & 1 \end{pmatrix}.

Define
D = \mathrm{diag}\big( a^{(n)}_{1,1}, a^{(n)}_{2,2}, \ldots, a^{(n)}_{n,n} \big).

All a^{(n)}_{j,j} \neq 0, so D^{-1} exists with

D^{-1} = \mathrm{diag}\big( 1/a^{(n)}_{1,1}, 1/a^{(n)}_{2,2}, \ldots, 1/a^{(n)}_{n,n} \big).

Thus we obtain

A^T = U^T L^T = (U^T D^{-1})(D L^T) = (D^{-1} U)^T (LD)^T = \tilde{L} \tilde{U},

with \tilde{L} = (D^{-1} U)^T, a lower triangular matrix with 1 along the diagonal, and \tilde{U} = (LD)^T = D L^T, an upper triangular matrix, i.e. the LU decomposition of A^T is

A^T = \tilde{L} \tilde{U}.
Chapter 5
Matrices with a special structure
Literatur Oevel, Kap. 5; Schwarz Kap.1; Stummel/Hainer, Kap. 5
The Gaussian elimination method is often a lot easier and more efficient
when the matrices have a special structure, e.g.
- symmetric
- positive definite
- band
- diagonally dominant
Such matrices often arise in various applications, e.g. in numerical methods
for splines or partial differential equations.
5.1 Symmetric matrices
A symmetric matrix satisfies A^T = A, i.e. a_{i,j} = a_{j,i}, i, j = 1, \ldots, n.
5.1.1 Gaussian elimination without row interchange
The computational cost for Gaussian elimination, i.e. the LU decomposition,
for symmetric matrices is reduced by roughly a half.
This is easy to see when row interchange is not needed. Consider
a_{1,1} x_1 + a_{1,2} x_2 + \ldots + a_{1,n} x_n = b_1
a_{2,1} x_1 + a_{2,2} x_2 + \ldots + a_{2,n} x_n = b_2
\vdots
a_{n,1} x_1 + a_{n,2} x_2 + \ldots + a_{n,n} x_n = b_n

Using the linear transformations

a'_{i,j} = a_{i,j} - \frac{a_{i,1}}{a_{1,1}} a_{1,j}, \qquad i = 2, \ldots, n, \quad j = 1, \ldots, n+1 \quad (b_i \equiv a_{i,n+1}),

we obtain the reduced system of equations

a_{1,1} x_1 + a_{1,2} x_2 + \ldots + a_{1,n} x_n = b_1
\phantom{a_{1,1} x_1 +{}} a'_{2,2} x_2 + \ldots + a'_{2,n} x_n = b'_2
\vdots
\phantom{a_{1,1} x_1 +{}} a'_{n,2} x_2 + \ldots + a'_{n,n} x_n = b'_n

The last (n-1) \times (n-1) block here is also symmetric, because

a'_{j,i} = a_{j,i} - \frac{a_{j,1}}{a_{1,1}} a_{1,i} = a_{i,j} - \frac{a_{1,j}}{a_{1,1}} a_{i,1} = a'_{i,j}

for i, j = 2, \ldots, n.

This means that we need to evaluate the components a'_{i,j} only for i = 2, \ldots, n and j = i, \ldots, n; the other values then follow immediately by symmetry a'_{j,i} = a'_{i,j}.
If a row interchange is necessary, then at the same time we should also exchange the corresponding columns in order to retain the symmetry of the reduced matrix.

In this case we have

LU = PAP^T

(instead of LU = PA). See Oevel: Example 5.7, page 120.
5.1.2 The LDLT and Cholesky decompositions
Let A be symmetric, i.e. A^T = A or a_{i,j} = a_{j,i} for all i, j = 1, \ldots, n, with LU decomposition A = LU.

Let D be the diagonal matrix D = \mathrm{diag}\big( a^{(n)}_{1,1}, \ldots, a^{(n)}_{n,n} \big) formed from the diagonal of U. Then

A^T = (LU)^T = U^T L^T = U^T D^{-1} D L^T = (D^{-1} U)^T (D L^T) = \tilde{L} \tilde{U},

where \tilde{L} = (D^{-1} U)^T is a lower triangular matrix with 1 along the diagonal and \tilde{U} = D L^T is an upper triangular matrix. The LU decomposition of A^T = A is unique, so

\tilde{L} = L, \;\text{i.e.}\; (D^{-1} U)^T = L, \qquad \text{and} \qquad \tilde{U} = U, \;\text{i.e.}\; D L^T = U.

Thus the LU decomposition of A is

A = LU = L D L^T.

This is the LDL^T decomposition; for this we assume that A is invertible and symmetric, and that Gaussian elimination goes through without requiring any row interchange.
Suppose in addition that all a^{(n)}_{j,j} > 0 and define

\sqrt{D} = \mathrm{diag}\Big( \sqrt{a^{(n)}_{1,1}}, \sqrt{a^{(n)}_{2,2}}, \ldots, \sqrt{a^{(n)}_{n,n}} \Big)

and

\bar{L} = L \sqrt{D} = \begin{pmatrix} \bar\ell_{1,1} & & & \\ \bar\ell_{2,1} & \bar\ell_{2,2} & & \\ \vdots & \vdots & \ddots & \\ \bar\ell_{n,1} & \bar\ell_{n,2} & \ldots & \bar\ell_{n,n} \end{pmatrix},

i.e. with

\bar\ell_{j,j} = \sqrt{a^{(n)}_{j,j}}, \qquad j = 1, \ldots, n,

\bar\ell_{i,j} = \ell_{i,j} \sqrt{a^{(n)}_{j,j}}, \qquad j = 1, \ldots, n-1, \quad i = j+1, \ldots, n.

Then we have

A = L D L^T = (L\sqrt{D})(\sqrt{D} L^T) = (L\sqrt{D})(L\sqrt{D})^T = \bar{L} \bar{L}^T,
i.e. A = \bar{L} \bar{L}^T; this is called the Cholesky decomposition of A. Here A is

- invertible,
- symmetric,
- such that Gaussian elimination goes through without requiring any row interchange, with all a^{(n)}_{j,j} > 0.

When do these properties all hold?
5.2 Positive definite symmetric matrices
A symmetric matrix A = [a_{i,j}] \in \mathbb{R}^{n \times n} is said to be positive definite if the quadratic form

Q(x) = x^T A x = \sum_{i,j=1}^{n} a_{i,j} x_i x_j, \qquad x = (x_1, \ldots, x_n)^T,

is positive, i.e. if

(1) Q(x) \ge 0 for all x \in \mathbb{R}^n, and
(2) Q(x) = 0 if and only if x = 0.
Remark: Let A = [a_{i,j}] be positive definite. Then

(i) A is invertible,
(ii) all a_{j,j} > 0, j = 1, \ldots, n.

proof:

(i) If Ax = 0 with x \neq 0, then Q(x) = x^T A x = 0 with x \neq 0: contradiction!

(ii) Q(e^{(j)}) = a_{j,j} > 0, where e^{(j)} = (0, \ldots, 0, 1, 0, \ldots, 0)^T with the 1 in the jth position.
We will now assume that the matrix A = [a_{i,j}] is positive definite. From Linear Algebra we know that the eigenvalues of A are real and positive.
However, eigenvalues are difficult to calculate. We want to be able to carry out Gaussian elimination without row interchange.
theorem (See Schwarz, Theorem 1.8, page 40)

A symmetric matrix A = [a_{i,j}] \in \mathbb{R}^{n \times n} with a_{1,1} > 0 is positive definite if and only if the reduced matrix [a'_{i,j}] \in \mathbb{R}^{(n-1) \times (n-1)} with

a'_{i,j} = a_{i,j} - \frac{a_{i,1}}{a_{1,1}} a_{1,j}, \qquad i, j = 2, \ldots, n,

is positive definite.
proof: By assumption we have a1,1 > 0 and ai,j = aj,i, i, j = 1, . . . , n
Q(x) = ni,j=1 ai,jxixj= a1,1x
21 + 2
ni=2 ai,1x1xi +
ni,j=2 ai,jxixj
= a1,1
x1 +
ni=2
ai,1a1,1
xi
2+n
i,j=2
ai,j ai,1
a1,1a1,j
xixj
= a1,1 x1 +ni=2
ai,1a1,1
xi2
+ni,j=2 a
i,jxixj
(a) Necessary Suppose that A = [ai,j ] is positive definite. Consider a vectorx = (x2, . . . , xn) = (0, . . . , 0) in Rn1 and setx1 =
ni=2
ai,1a1,1
xi.
Then for x = (x1|x) := (x1, x2, . . . , xn) = (0, 0, . . . , 0) in Rn we have0 < Q(x) = a1,1 02 +
n
i,j=2ai,jxixj
i.e. 0 0)
(b) Sufficiency: Suppose that [ã_{i,j}] is positive definite and let Q(x) = 0 for some x ∈ R^n. In the decomposition above, both the reduced form (i) Σ_{i,j=2}^n ã_{i,j} x_i x_j and the square term (ii) a_{1,1}(x_1 + Σ_{i=2}^n (a_{i,1}/a_{1,1}) x_i)² are nonnegative, so both must vanish.
It follows from (i) that x_2 = x_3 = … = x_n = 0.
Then (ii) gives
a_{1,1} x_1² = 0,
but a_{1,1} > 0, so x_1 = 0, i.e. x = (0, …, 0) ∈ R^n.
Hence A is positive definite.
From this theorem and the above remarks we can use Gaussian elimination without row interchange when the matrix A = [a_{i,j}] is positive definite.
Moreover,
a^{(j)}_{j,j} = a^{(n)}_{j,j} > 0
holds for each j = 1, 2, …, n.
Thus the matrix A = [a_{i,j}] has a Cholesky decomposition
A = LL^T.
Instead of computing the Cholesky factor L as the matrix product L√D, we can calculate its components ℓ_{i,j} directly with the following transformations:
Algorithm for the Cholesky decomposition
Define a^{(1)}_{i,j} := a_{i,j}, i, j = 1, 2, …, n.
Compute for k = 1, 2, …, n−1:
(1) ℓ_{k,k} = √(a^{(k)}_{k,k})
(2) ℓ_{i,k} = a^{(k)}_{i,k} / ℓ_{k,k}, i = k+1, …, n
(3) a^{(k+1)}_{i,j} = a^{(k)}_{i,j} − ℓ_{i,k} ℓ_{j,k}, i, j = k+1, …, n
and finally ℓ_{n,n} = √(a^{(n)}_{n,n}).
N.B. The a^{(k)}_{i,j} components for k > 1 here are not the same as those in the corresponding step of the Gaussian elimination procedure.
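As a minimal sketch (not from the original script), steps (1)-(3) can be written in Python; the function name and the list-of-lists matrix format are my own conventions:

```python
import math

def cholesky(A):
    """Cholesky factor L with A = L L^T, following steps (1)-(3):
    l_kk = sqrt(a_kk), l_ik = a_ik / l_kk, then reduce the trailing block."""
    n = len(A)
    a = [row[:] for row in A]          # working copy a^{(k)}; A is untouched
    L = [[0.0] * n for _ in range(n)]
    for k in range(n):
        L[k][k] = math.sqrt(a[k][k])                 # step (1)
        for i in range(k + 1, n):
            L[i][k] = a[i][k] / L[k][k]              # step (2)
        for i in range(k + 1, n):                    # step (3)
            for j in range(k + 1, n):
                a[i][j] -= L[i][k] * L[j][k]
    return L
```

For example, A = [[4, 2], [2, 3]] gives L = [[2, 0], [1, √2]].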
5.3 Diagonally dominant matrices
The Gaussian elimination procedure is possible without row interchange for diagonally dominant matrices.
A matrix A = [a_{i,j}] ∈ R^{n×n} is called diagonally dominant if
Σ_{j=1, j≠i}^n |a_{i,j}| ≤ |a_{i,i}|, i = 1, …, n,
holds.
Example: the 4×4 matrix
A =
[  2 −1  0  0 ]
[ −1  2 −1  0 ]
[  0 −1  2 −1 ]
[  0  0 −1  2 ]
theorem (See Stummel/Hainer, Theorem 12, page 122)
If A = [a_{i,j}] ∈ R^{n×n} is invertible and diagonally dominant, then the reduced matrix [ã_{i,j}] ∈ R^{(n−1)×(n−1)} with
ã_{i,j} = a_{i,j} − (a_{i,1}/a_{1,1}) a_{1,j}, i, j = 2, …, n,
is invertible and diagonally dominant with no diagonal component equal to 0.
proof: If a diagonal element a_{i,i} of A is equal to zero, then by diagonal dominance the i-th row of A consists entirely of zeros, which is not possible because A is invertible.
In particular, we have a_{1,1} ≠ 0.
Then for i = 2, …, n we have
Σ_{j=2, j≠i}^n |ã_{i,j}| = Σ_{j=2, j≠i}^n | a_{i,j} − (a_{i,1}/a_{1,1}) a_{1,j} |
≤ Σ_{j=2, j≠i}^n |a_{i,j}| + (|a_{i,1}|/|a_{1,1}|) Σ_{j=2, j≠i}^n |a_{1,j}|
≤ ( |a_{i,i}| − |a_{i,1}| ) + (|a_{i,1}|/|a_{1,1}|) ( |a_{1,1}| − |a_{1,i}| )
= |a_{i,i}| − (|a_{i,1}|/|a_{1,1}|) |a_{1,i}|
≤ | a_{i,i} − (a_{i,1}/a_{1,1}) a_{1,i} | = |ã_{i,i}|.
⟹ [ã_{i,j}] is diagonally dominant. This matrix is also invertible, because
det A = a_{1,1} det[ã_{i,j}] ≠ 0
and a_{1,1} ≠ 0.
Then, as above, we must have all ã_{i,i} ≠ 0.
If the matrix A is diagonally dominant and symmetric, then A has an LDL^T decomposition,
A = LDL^T,
because all d_{i,i} = a^{(i)}_{i,i} ≠ 0.
5.4 Band matrices
Literature: Oevel, Kap. 5.7; Schwarz, Kap. 1.3.2; Stummel/Hainer, Kap. 6.2.3
Let n ≥ 1 and 0 ≤ p, q ≤ n−1.
An n×n matrix A = [a_{i,j}] is called a band matrix of band type (p, q) if
a_{i,j} = 0 for j < i − p and for j > i + q, i = 1, 2, …, n.
Schematically, the band consists of the q upper minor diagonals (components denoted by +), the main diagonal, and the p lower minor diagonals (components denoted by −); all components outside this band are zero.
The a_{i,j} in the band are arbitrary (0 is allowed), but the other a_{i,j}, which are not in the band, are all equal to 0.
The number 1 + p + q is called the band width of a band matrix of band type (p, q).
Examples
(1) diagonal matrix: band type (0, 0), D = diag(d_{1,1}, d_{2,2}, …, d_{n,n})
(2) lower triangular matrix: band type (n−1, 0), L = [ℓ_{i,j}] with ℓ_{i,j} = 0 for j > i
(3) upper triangular matrix: band type (0, n−1), U = [r_{i,j}] with r_{i,j} = 0 for j < i
(4) upper Hessenberg matrix: band type (1, n−1)
(5) tridiagonal matrix: band type (1, 1)
If such a band matrix is also positive definite and symmetric or diagonally
dominant, then we can use Gaussian elimination without row interchange. The
reduced matrices are band matrices of the same type, etc.
theorem (See Stummel/Hainer, Theorem 26, page 127)
Let A = [a_{i,j}] be a band matrix of band type (p, q) for which the elimination procedure goes through without row interchange. Then all of the reduced matrices are band matrices of the same type (p, q), and the factors L, U of the LU decomposition of A are band triangular matrices of band types (p, 0) and (0, q).
proof: Consider the first step of the elimination process, A = [a_{i,j}] → [ã_{i,j}], with the linear transformations
ã_{1,j} = a_{1,j}, j = 1, …, n,
and
ã_{i,j} = a_{i,j} − (a_{i,1}/a_{1,1}) a_{1,j}, i, j = 2, …, n.
Let i ≥ 2 and j > i + q:
a_{i,j} = 0 and a_{1,j} = 0 (since j > i + q > 1 + q) ⟹ ã_{i,j} = 0.
Let i ≥ 2 and 2 ≤ j < i − p:
a_{i,j} = 0 and a_{i,1} = 0 ⟹ ã_{i,j} = 0 for 1 < j < i − p.
Together: the reduced (n−1)×(n−1) matrix [ã_{i,j}] with i, j = 2, …, n is a band matrix of band type (p, q).
The other elimination steps follow successively by induction: A^{(k−1)} → A^{(k)} with the first k−1 rows unchanged,
a^{(k)}_{i,j} = a^{(k−1)}_{i,j} = a^{(i)}_{i,j}, i = 1, …, k−1, j = i, …, n,
and a^{(k)}_{i,j} = 0 for j > i + q and for k ≤ j < i − p when i ≥ k.
But U = A^{(n)} = [a^{(n)}_{i,j}] with
a^{(n)}_{i,j} = a^{(i)}_{i,j}, i = 1, …, n, j = i, …, n,
a^{(n)}_{i,j} = 0 for j > i + q, i = 1, …, n,
⟹ U is a band matrix of type (0, q).
In addition,
ℓ_{i,k} = a^{(k)}_{i,k} / a^{(k)}_{k,k}
for i = k+1, …, n and k = 1, …, n−1.
But a^{(k)}_{i,k} = 0 for j = k < i − p with i ≥ k, i.e. for i > k + p and k = 1, …, n−1,
⟹ ℓ_{i,k} = 0 for k < i − p and k = 1, …, n−1, i.e.
L =
[ 1                          ]
[ ℓ_{2,1} 1                  ]
[ ⋮              ⋱           ]
[ ℓ_{n,1} ℓ_{n,2} … ℓ_{n,n−1} 1 ]
is a band matrix of type (p, 0).
The elimination process needs
Σ_{j=1}^{n−1} Σ_{i=j+1}^{min(j+p, n)} ( 1 + Σ_{k=j+1}^{min(j+q, n)} 1 )
multiplications/divisions and a similar number of additions/subtractions. From this expression we see that there are roughly
npq − (p/6)(p² + 3q² + 3(p − q) + 2) multiplications/divisions if p ≤ q,
npq − (q/6)(q² + 3p² + 3(q − p) + 2) multiplications/divisions if q ≤ p.
For p, q ≪ n this is O(npq).
For fixed p and q this order estimate depends only linearly on n.
Compare with O(n³) for a general matrix, i.e. with p = q = n−1.
5.5 Tridiagonal matrices
Literature: Schwarz, Kap. 1.3.3; Stummel/Hainer, Kap. 6.2.4
A tridiagonal matrix is a band matrix of band type (1, 1).
T =
[ t_{1,1} t_{1,2}                 ]
[ t_{2,1} t_{2,2} t_{2,3}         ]
[         t_{3,2} t_{3,3} ⋱       ]
[                 ⋱       ⋱       ]
If Gaussian elimination is possible without row interchange, then L is a band matrix of type (1, 0) and U is a band matrix of type (0, 1). We can replace the entire elimination process by a simple recursion.
Write:
T =
[ a_1 b_1                          ]
[ c_1 a_2 b_2                      ]
[     c_2 a_3 b_3                  ]
[         ⋱   ⋱   ⋱                ]
[         c_{n−2} a_{n−1} b_{n−1}  ]
[                 c_{n−1} a_n      ]
Define d_1 = a_1 and then compute, for j = 2, …, n,
p_j = c_{j−1} / d_{j−1},
d_j = a_j − p_j b_{j−1}.
The LU decomposition of T is given by
L =
[ 1              ]
[ p_2 1          ]
[     p_3 1      ]
[         ⋱  ⋱   ]
[         p_n 1  ]
U =
[ d_1 b_1                         ]
[     d_2 b_2                     ]
[         ⋱   ⋱                   ]
[         d_{n−1} b_{n−1}         ]
[                 d_n             ]
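The recursion is cheap to implement. A Python sketch (function name and list conventions are mine), assuming the elimination goes through without row interchange:

```python
def tridiag_lu(a, b, c):
    """LU factors of the tridiagonal matrix with diagonal a (length n),
    superdiagonal b and subdiagonal c (length n-1 each).
    Returns (p, d): p[j] = c_{j-1}/d_{j-1} (subdiagonal of L; p[0] unused),
    d = diagonal of U, via d_j = a_j - p_j * b_{j-1}."""
    n = len(a)
    d = [0.0] * n
    p = [0.0] * n
    d[0] = a[0]
    for j in range(1, n):
        p[j] = c[j - 1] / d[j - 1]
        d[j] = a[j] - p[j] * b[j - 1]
    return p, d
```

For a_j ≡ 2 and b_j ≡ c_j ≡ −1 this reproduces d_j = (j+1)/j, as in the example below.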
If T is also a symmetric matrix, then T has an LDL^T decomposition,
T = LDL^T, D = diag(d_1, d_2, …, d_n).
Example:
T =
[  2 −1            ]
[ −1  2 −1         ]
[    −1  2 −1      ]
[        ⋱  ⋱  ⋱   ]
i.e. with a_j ≡ 2 for j = 1, …, n and b_j ≡ c_j ≡ −1 for j = 1, …, n−1.
T is symmetric, T^T = T, and diagonally dominant:
|−1| ≤ |2|, |−1| + |−1| ≤ |2|.
Define d_1 = a_1 = 2 and compute, for j = 2, …, n,
p_j = c_{j−1}/d_{j−1} = −1/d_{j−1},
d_j = a_j − p_j b_{j−1} = 2 − 1/d_{j−1}.
By induction this gives
d_j = (j+1)/j, j = 1, …, n,
and
p_j = −1/d_{j−1} = −1/( j/(j−1) ) = −(j−1)/j, j = 2, …, n.
Then we have
L =
[ 1                        ]
[ −1/2  1                  ]
[      −2/3  1             ]
[            ⋱        ⋱    ]
[            −(n−1)/n  1   ]
and
U =
[ 2 −1                          ]
[    3/2 −1                     ]
[        4/3 −1                 ]
[            ⋱        ⋱         ]
[            n/(n−1) −1         ]
[                    (n+1)/n    ]
or, for the LDL^T decomposition, L as above and
D = diag(2, 3/2, 4/3, …, (n+1)/n).
A Cholesky decomposition T = L̃L̃^T with L̃ = L D^{1/2} is also possible, because the matrix T is positive definite.
Chapter 6
The QR decomposition
Literature: Oevel, Kap. 5.9
Consider an invertible n×n matrix A = [a_{i,j}]. If A has an LU decomposition A = LU, then we can solve the system of equations
Ax = b
quickly by applying forward/backward substitution to the simpler triangular systems
Ly = b, Ux = y.
There are other systems which are quickly solvable, e.g. systems with an orthogonal coefficient matrix Q, i.e., for which Q^T Q = I, so Q^{−1} = Q^T. Then we have
Qx = b ⟺ x = Q^T b: one matrix-vector multiplication.
The QR decomposition of an invertible matrix A is
A = QR,
where Q is an orthogonal matrix and R is an upper triangular matrix.
Ax = b ⟺ QRx = b ⟺ Rx = Q^T b,
i.e. one matrix-vector multiplication, then backwards substitution.
Moreover, this method is numerically more stable than the corresponding method with the LU decomposition, in the sense that the upper (right) triangular system is no worse conditioned than the original system.
theorem Let A = QR. Then
cond₂(R) = cond₂(A).
proof: We have
‖A‖₂ = max_{x≠0} ‖Ax‖₂ / ‖x‖₂
with the Euclidean norm ‖x‖₂ = √(x^T x). Thus
‖R‖₂ = ‖Q^T A‖₂ (because R = Q^{−1}A = Q^T A)
= max_{x≠0} ‖Q^T Ax‖₂ / ‖x‖₂
= max_{x≠0} √( x^T (Q^T A)^T (Q^T A) x ) / ‖x‖₂
= max_{x≠0} √( x^T A^T Q Q^T A x ) / ‖x‖₂
= max_{x≠0} √( x^T A^T A x ) / ‖x‖₂ = ‖A‖₂, because QQ^T = I,
as well as
‖R^{−1}‖₂ = ‖A^{−1}Q‖₂
= max_{x≠0} ‖A^{−1}Qx‖₂ / ‖x‖₂
= max_{y=Qx, x≠0} ‖A^{−1}y‖₂ / ‖y‖₂ (because y^T y = x^T Q^T Q x = x^T x)
= max_{y≠0} ‖A^{−1}y‖₂ / ‖y‖₂ = ‖A^{−1}‖₂.
From this we obtain
cond₂(R) = ‖R‖₂ ‖R^{−1}‖₂ = ‖A‖₂ ‖A^{−1}‖₂ = cond₂(A).
Remarks:
(1) cond₂(Q) = 1 for an orthogonal matrix Q;
(2) for an LU decomposition we often have
cond₂(U) ≫ cond₂(A);
(3) the QR decomposition is unique up to a multiplication Q̃ = QD, R̃ = DR by a diagonal matrix
D = diag(±1, ±1, …, ±1).
(See Oevel, Theorem 5.12, page 132.)
We will construct the factors Q and R with the help of Householder matrices.
6.1 Householder matrices
The Householder matrix H = H(v) of a vector
v =
v1...
vn
Rn \ {0}is the n n matrix defined by
H = H(v) = I 2vTv
vvT
wher
I = n n identity matrix
vTv =n
i=1 v2i = |v|22 scalar
vvT = n n matrix [vivj ] .Example:
v = 12 R2 H(v) = 1 00 1 25 1 1 1 22 1 2 2 =
1 2/5 4/54/5 1 8/5
=
3/5 4/5
4/5 3/5
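This construction is a short Python one-liner (a sketch; the helper name is mine):

```python
def householder(v):
    """H(v) = I - (2 / v^T v) * v v^T, returned as a list of lists."""
    n = len(v)
    vtv = float(sum(x * x for x in v))
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vtv
             for j in range(n)] for i in range(n)]
```

Calling householder([1, 2]) reproduces the 2×2 example above.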
theorem A Householder matrix is symmetric and orthogonal.
proof: Write H = H(v) = I − (2/|v|²) vv^T = [h_{i,j}], with |v| instead of |v|₂. Then we have
h_{i,j} = δ_{i,j} − (2/|v|²) v_i v_j = h_{j,i},
where δ_{i,j} is the Kronecker delta symbol, i.e.
δ_{i,j} = 1 if i = j, 0 otherwise,
i.e. H^T = H: H is symmetric.
We also have
HH = ( I − (2/|v|²) vv^T )( I − (2/|v|²) vv^T ) = I − (4/|v|²) vv^T + (4/|v|⁴) (vv^T)(vv^T).
But
(vv^T)(vv^T) = v (v^T v) v^T = |v|² vv^T,
since v^T v is a scalar. Thus we obtain
HH = I − (4/|v|²) vv^T + (4/|v|²) vv^T = I,
i.e. HH = I ⟹ H^{−1} = H.
But H = H^T ⟹ H^{−1} = H^T, or H^T H = HH^T = I, i.e. H is also orthogonal.
Consider a given vector
a = (a_1, a_2, …, a_n)^T ∈ R^n \ {0}
and a given index j ∈ {1, …, n} such that a_j ≠ 0.
Define
1) c = c^{(j)}(a) := sgn(a_j) √( a_j² + a_{j+1}² + … + a_n² ),
2) v = v^{(j)}(a) := (0, …, 0, c + a_j, a_{j+1}, …, a_n)^T
(where c + a_j ≠ 0 due to the choice of sign in 1)),
3) H = H^{(j)}(a) := I − (2/(v^T v)) vv^T if v = v^{(j)}(a) ≠ 0, and H = I otherwise.
I otherwise
theorem H = H^{(j)}(a) is a Householder matrix with the following properties:
(a) H^{(j)}(a) a = (a_1, …, a_{j−1}, −c, 0, …, 0)^T, with −c in the j-th position;
(b) H^{(j)}(a) b = b for each vector b = (b_1, b_2, …, b_{j−1}, 0, …, 0)^T.
proof
2 v^T a = 2(c + a_j) a_j + 2( a_{j+1}² + … + a_n² )
= c² + 2ca_j + a_j² + a_{j+1}² + … + a_n² (since c² = a_j² + … + a_n²)
= (c + a_j)² + a_{j+1}² + … + a_n² = v^T v.
H^{(j)}(a) a = ( I − (2/(v^T v)) vv^T ) a = Ia − (2v^T a/(v^T v)) v
= a − v = (a_1, …, a_{j−1}, −c, 0, …, 0)^T, with −c in the j-th position.
It is clear that v^T b = 0, so
H^{(j)}(a) b = ( I − (2/(v^T v)) vv^T ) b = Ib − (2/(v^T v)) v (v^T b) = b.
6.2 Construction of the QR factors
Write A = A^{(1)} = ( a^{(1)}_1 | a^{(1)}_2 | … | a^{(1)}_n ), i.e. with the column vectors of A.
Define H^{(1)} = H^{(1)}(a^{(1)}_1) and A^{(2)} = H^{(1)}A^{(1)}.
Then we have
A^{(2)} = H^{(1)} ( a^{(1)}_1 | a^{(1)}_2 | … | a^{(1)}_n ) = ( H^{(1)}a^{(1)}_1 | H^{(1)}a^{(1)}_2 | … | H^{(1)}a^{(1)}_n ),
where the first column H^{(1)}a^{(1)}_1 = (−c^{(1)}, 0, …, 0)^T has zeros below its first entry,
i.e. a^{(2)}_k = H^{(1)}a^{(1)}_k for k = 2, …, n.
Define now
H^{(2)} = H^{(2)}(a^{(2)}_2)
and
A^{(3)} = H^{(2)}A^{(2)} = ( H^{(2)}a^{(2)}_1 | H^{(2)}a^{(2)}_2 | H^{(2)}a^{(2)}_3 | … | H^{(2)}a^{(2)}_n ).
The first column is invariant (by property (b)), the first two columns of A^{(3)} have zeros below the diagonal and remain unchanged in all later steps, and a^{(3)}_k = H^{(2)}a^{(2)}_k for k = 3, …, n.
Repeat for j = 2, …, n−1:
H^{(j)} = H^{(j)}(a^{(j)}_j), A^{(j+1)} = H^{(j)}A^{(j)},
until the end:
A^{(n)} = H^{(n−1)} ⋯ H^{(2)}H^{(1)}A^{(1)}, an upper triangular matrix!
i.e. R = A^{(n)} = H^{(n−1)} ⋯ H^{(1)}A^{(1)}.
But the H^{(j)} are orthogonal and symmetric: (H^{(j)})^{−1} = (H^{(j)})^T = H^{(j)},
⟹ A = A^{(1)} = H^{(1)}H^{(2)} ⋯ H^{(n−1)} R = QR,
where Q := H^{(1)}H^{(2)} ⋯ H^{(n−1)} is orthogonal as the product of orthogonal matrices.
But Q is not necessarily a symmetric matrix, since the product of symmetric matrices need not be symmetric.
6.2.1 Example

A^{(1)} = A =
[ 1 2 3 ]
[ 0 0 1 ]
[ 2 3 4 ]

a^{(1)}_1 = (1, 0, 2)^T,
c^{(1)} = sgn(1) √(1² + 0² + 2²) = √5,
v = v^{(1)} = (1 + √5, 0, 2)^T,
v^T v = (1 + √5)² + 0² + 2² = 10 + 2√5 = 2√5 (1 + √5),

vv^T =
[ (1+√5)²  0  2(1+√5) ]
[ 0        0  0       ]
[ 2(1+√5)  0  4       ]

(2/(v^T v)) vv^T =
[ 1 + 1/√5  0  2/√5     ]
[ 0         0  0        ]
[ 2/√5      0  1 − 1/√5 ]

H^{(1)} = I − (2/(v^T v)) vv^T =
[ −1/√5  0  −2/√5 ]
[ 0      1   0    ]
[ −2/√5  0   1/√5 ]

A^{(2)} = H^{(1)}A^{(1)} =
[ −√5  −8/√5  −11/√5 ]
[ 0     0      1     ]
[ 0    −1/√5  −2/√5  ]

a^{(2)}_2 = (−8/√5, 0, −1/√5)^T,

c^{(2)} = sgn(0) √( 0² + (−1/√5)² ) = 1/√5 (with the convention sgn(0) := +1!),

v = v^{(2)} = (0, 1/√5, −1/√5)^T, v^T v = 0 + 1/5 + 1/5 = 2/5,

and

vv^T =
[ 0   0     0   ]
[ 0   1/5  −1/5 ]
[ 0  −1/5   1/5 ]

H^{(2)} = I − (2/(v^T v)) vv^T =
[ 1 0 0 ]
[ 0 0 1 ]
[ 0 1 0 ]

A^{(3)} = H^{(2)}A^{(2)} =
[ −√5  −8/√5  −11/√5 ]
[ 0    −1/√5  −2/√5  ]
[ 0     0      1     ]

(The first row is unchanged, while the second and third rows are exchanged here automatically!)

R = A^{(3)},

Q = H^{(1)}H^{(2)} =
[ −1/√5  −2/√5  0 ]
[ 0       0     1 ]
[ −2/√5   1/√5  0 ]

Q is not symmetric, but Q^T Q = I: orthogonal!
QR = A here!
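The whole construction can be sketched in Python; the sgn(0) := +1 convention follows the text, while the function names and list-based matrix helpers are my own:

```python
import math

def matmul(X, Y):
    """Plain list-of-lists matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def qr_householder(A):
    """Q = H^(1) H^(2) ... H^(n-1), R = A^(n) upper triangular."""
    n = len(A)
    R = [row[:] for row in A]
    Q = [[float(i == j) for j in range(n)] for i in range(n)]
    for j in range(n - 1):
        s = math.sqrt(sum(R[i][j] ** 2 for i in range(j, n)))
        c = -s if R[j][j] < 0 else s        # c = sgn(a_j) * s, sgn(0) := +1
        v = [0.0] * j + [c + R[j][j]] + [R[i][j] for i in range(j + 1, n)]
        vtv = sum(x * x for x in v)
        if vtv == 0.0:
            continue                        # H^(j) = I in this case
        H = [[float(i == k) - 2.0 * v[i] * v[k] / vtv for k in range(n)]
             for i in range(n)]
        R = matmul(H, R)
        Q = matmul(Q, H)
    return Q, R
```

For the matrix A of the worked example this reproduces R with R[0][0] = −√5 and R[2][2] = 1, and QR = A.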
Chapter 7
Iterative methods for linear
systems
Literature: Oevel, Kap. 5.17; Schwarz, Kap. 11.1; Stummel/Hainer, Kap. 8.1-8.2
The numerical solution of a linear system of equations
Ax = b
through the LU or QR decomposition needs only finitely many arithmetic operations, O(n³). But the computational cost is so high for n ≫ 1 that such a direct method is not practical. In such cases, in particular if the matrix has many zero components, an iterative method can be realistic although in principle it needs infinitely many iterations.
Let T be an n×n matrix and consider the iterative method
x^(k+1) = T x^(k) + b
with x^(k), x^(k+1), b ∈ R^n.
If the sequence {x^(k)} converges, then by continuity we have
x̄ = T x̄ + b,
i.e. the limit x̄ is a fixed point of the linear mapping F(x) = Tx + b on R^n, or
(I − T)x̄ = b, which is uniquely solvable with solution x̄ = (I − T)^{−1}b if and only if I − T is invertible.
Question: When does the iteration sequence {x^(k)} converge?
We can use the Banach contraction mapping (fixed point) theorem, i.e.,
F contraction ⟹ x^(k) → x̄, where x̄ is the unique fixed point of F.
In our case: F is a contraction ⟺ T is a contraction, because
‖F(x) − F(y)‖ = ‖Tx − Ty‖ ≤ ‖T‖ ‖x − y‖ < ‖x − y‖ if ‖T‖ < 1.
(We assume here that the matrix and vector norms are consistent!)
Question: What has this got to do with the linear system Ax = b?
Let A be invertible and let x̄ = A^{−1}b be the unique solution. Then we have
x̄ = x̄ + (Ax̄ − b).
This equation suggests the iterative method
x^(k+1) = (I + A)x^(k) − b,
i.e. with T = I + A and −b in place of b.
But T = I + A is almost never a contraction.
Assume now that a_{i,i} ≠ 0 for i = 1, …, n, and define
D = diag(a_{1,1}, a_{2,2}, …, a_{n,n}).
This D is invertible with
D^{−1} = diag(1/a_{1,1}, …, 1/a_{n,n}).
Consider the equation
b = Ax = (A − D)x + Dx
or
Dx = (D − A)x + b, i.e. x = D^{−1}(D − A)x + D^{−1}b.
The last equation suggests the iterative method
x^(k+1) = (I − D^{−1}A)x^(k) + D^{−1}b, k = 0, 1, 2, …
or, componentwise,
x_i^(k+1) = ( −Σ_{j=1, j≠i}^n a_{i,j} x_j^(k) + b_i ) / a_{i,i}, i = 1, …, n, k = 0, 1, 2, …
This is called the Jacobi iteration method.
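One step of this componentwise formula, as a Python sketch (the function name and list conventions are mine):

```python
def jacobi_step(A, b, x):
    """One Jacobi iteration x^(k) -> x^(k+1) for Ax = b (lists of lists)."""
    n = len(A)
    return [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)]
```

Iterating from x^(0) = 0 converges for strictly diagonally dominant A, the sufficient condition derived next.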
In this case we have
T = D^{−1}(D − A) = I − D^{−1}A.
Question: When is this T a contraction?
Consider the max matrix norm (row sum norm)
‖T‖_∞ = max_{i=1,…,n} Σ_{j=1}^n |t_{i,j}|.
For T = I − D^{−1}A we have
t_{i,j} = δ_{i,j} − a_{i,j}/a_{i,i} = 0 if i = j, and −a_{i,j}/a_{i,i} if i ≠ j,
where δ_{i,j} is the Kronecker delta symbol.
Hence we have
Σ_{j=1}^n |t_{i,j}| = Σ_{j=1, j≠i}^n |a_{i,j}| / |a_{i,i}| = (1/|a_{i,i}|) Σ_{j=1, j≠i}^n |a_{i,j}|
as well as
‖T‖_∞ = max_{i=1,…,n} (1/|a_{i,i}|) Σ_{j=1, j≠i}^n |a_{i,j}|.
We will have ‖T‖_∞ < 1 when
(1/|a_{i,i}|) Σ_{j=1, j≠i}^n |a_{i,j}| < 1, i = 1, …, n,
i.e.
Σ_{j=1, j≠i}^n |a_{i,j}| < |a_{i,i}|, i = 1, …, n.
Such a matrix A is called strictly diagonally dominant.
Under this sufficient condition, all iteration sequences converge to the unique solution x̄ = A^{−1}b.
The convergence order is p = 1, i.e., linear convergence, as for any contractive iteration method.
Question: Can we speed up this convergence?
As a variation of the Jacobi method, we can use the components of the vector x^(k+1) which we have just calculated in the calculation of the following components of x^(k+1), i.e.
x_1^(k+1) = ( −Σ_{j=2}^n a_{1,j} x_j^(k) + b_1 ) / a_{1,1}
x_i^(k+1) = ( −Σ_{j=1}^{i−1} a_{i,j} x_j^(k+1) − Σ_{j=i+1}^n a_{i,j} x_j^(k) + b_i ) / a_{i,i}, i = 2, …, n−1
x_n^(k+1) = ( −Σ_{j=1}^{n−1} a_{n,j} x_j^(k+1) + b_n ) / a_{n,n}
This new method is called the Gauss-Seidel iteration method.
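A sketch of one Gauss-Seidel sweep in Python (names are mine); the only change from the Jacobi step is that updated components are used immediately:

```python
def gauss_seidel_step(A, b, x):
    """One Gauss-Seidel sweep; x is updated in place and returned."""
    n = len(A)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]   # x[j] for j < i is already updated
    return x
```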
We can represent it in a matrix-vector form. Consider the following additive decomposition of A:
A = D − L − R,
where D = diag[a_{i,i}] (as above) and
L =
[ 0                              ]
[ −a_{2,1}  0                    ]
[ −a_{3,1} −a_{3,2}  0           ]
[ ⋮                     ⋱        ]
[ −a_{n,1}  …  −a_{n,n−1}  0     ]
R =
[ 0 −a_{1,2} −a_{1,3} … −a_{1,n} ]
[   0        −a_{2,3} … −a_{2,n} ]
[            ⋱                   ]
[              0      −a_{n−1,n} ]
[                      0         ]
The Gauss-Seidel method then reads
Dx^(k+1) = Lx^(k+1) + Rx^(k) + b
or
x^(k+1) = (D − L)^{−1}Rx^(k) + (D − L)^{−1}b, k = 0, 1, 2, …,
where
D − L =
[ a_{1,1}                       ]
[ a_{2,1} a_{2,2}               ]
[ ⋮               ⋱             ]
[ a_{n,1} a_{n,2} …  a_{n,n}    ]
is invertible, because det(D − L) = a_{1,1}a_{2,2} ⋯ a_{n,n} ≠ 0.
In this case the iteration method is
x^(k+1) = T x^(k) + b̃
with T = (D − L)^{−1}R and b̃ = (D − L)^{−1}b.
theorem: (See Oevel, Theorem 5.35, pages 177-178)
Let A be strictly diagonally dominant. Then
‖(D − L)^{−1}R‖_∞ ≤ ‖I − D^{−1}A‖_∞ < 1,
i.e. T is a contraction and the iteration sequences converge to the unique solution x̄ = A^{−1}b.
proof: We have already seen in the Jacobi method above that
K := ‖I − D^{−1}A‖_∞ = max_{i=1,…,n} (1/|a_{i,i}|) Σ_{j=1, j≠i}^n |a_{i,j}| < 1
when A is strictly diagonally dominant. Consider now the mapping
y = Tx.
For i = 1,
y_1 = −(1/a_{1,1}) Σ_{j=2}^n a_{1,j} x_j
⟹ |y_1| ≤ (1/|a_{1,1}|) Σ_{j=2}^n |a_{1,j}| |x_j| ≤ max_{i=1,…,n} (1/|a_{i,i}|) Σ_{j=1, j≠i}^n |a_{i,j}| · ‖x‖_∞ = K‖x‖_∞,
i.e. |y_1| ≤ K‖x‖_∞.
The proof continues by induction. Let i = 2, …, n and assume that |y_j| ≤ K‖x‖_∞ for j = 1, …, i−1. Then we have
y_i = −(1/a_{i,i}) Σ_{j=1}^{i−1} a_{i,j} y_j − (1/a_{i,i}) Σ_{j=i+1}^n a_{i,j} x_j
⟹ |y_i| ≤ (1/|a_{i,i}|) Σ_{j=1}^{i−1} |a_{i,j}| |y_j| + (1/|a_{i,i}|) Σ_{j=i+1}^n |a_{i,j}| |x_j|
≤ ‖x‖_∞ ( K (1/|a_{i,i}|) Σ_{j=1}^{i−1} |a_{i,j}| + (1/|a_{i,i}|) Σ_{j=i+1}^n |a_{i,j}| ).
But K < 1 ⟹
|y_i| ≤ ‖x‖_∞ (1/|a_{i,i}|) Σ_{j=1, j≠i}^n |a_{i,j}| ≤ max_{i=1,…,n} (1/|a_{i,i}|) Σ_{j=1, j≠i}^n |a_{i,j}| · ‖x‖_∞ = K‖x‖_∞.
By induction we have
|y_i| ≤ K‖x‖_∞, i = 1, …, n,
⟹ ‖y‖_∞ = max_{i=1,…,n} |y_i| ≤ K‖x‖_∞,
i.e. ‖Tx‖_∞ ≤ K‖x‖_∞. Hence
‖T‖_∞ = max_{x≠0} ‖Tx‖_∞ / ‖x‖_∞ ≤ K,
i.e.
‖(D − L)^{−1}R‖_∞ (the contraction constant of the Gauss-Seidel method)
≤ ‖I − D^{−1}A‖_∞ (the contraction constant of the Jacobi method) < 1.
⟹ the Gauss-Seidel method should converge quicker than the Jacobi method, because its contraction constant is smaller.
(But the convergence order still remains linear, because the GS method is a contraction iteration method.)
7.1 Relaxation methods
Literature: Oevel, Kap. 5.18; Schwarz, Kap. 11.1; Stummel/Hainer, Kap. 8.3
We can often modify an iteration method to ensure the convergence to a desired fixed point or to accelerate the convergence.
Consider the following mappings in R¹:
(1) f₁(x) = (1/2)x + 2
(2) f₂(x) = 2x − 1
The function f₁ has a unique fixed point x̄ = 4, and the iterations
x^(k+1) = f₁(x^(k))
converge to x̄ = 4, since f₁ is a contraction with contraction constant 1/2. We can accelerate this convergence with the iteration method
x^(k+1) = f₁,ω(x^(k)),
where
f₁,ω(x) = (1 − ω)x + ω( (1/2)x + 2 )
and ω is a parameter with ω ≠ 0.
x̄ = 4 is the unique fixed point of the function f₁,ω for each ω ≠ 0. But
f₁,ω(x) = (1 − ω/2)x + 2ω.
f₁,ω is a contraction with contraction constant |1 − ω/2| for 0 < ω < 4, because
|1 − ω/2| < 1 for 0 < ω < 4.
But |1 − ω/2| < 1/2, i.e. −1/2 < 1 − ω/2 < 1/2, when 1 < ω < 3.
For such ω the iteration method converges quicker!
Consider now f₂(x) = 2x − 1 with the unique fixed point x̄ = 1. This mapping is not a contraction and the iteration sequences x^(k+1) = f₂(x^(k)) diverge. Instead consider now
x^(k+1) = f₂,ω(x^(k))
with
f₂,ω(x) = (1 − ω)x + ω f₂(x) = (1 − ω)x + ω(2x − 1).
x̄ = 1 is the unique fixed point for each ω ≠ 0.
But f₂,ω(x) = (1 + ω)x − ω is a contraction with contraction constant
|1 + ω| < 1,
provided −1 < 1 + ω < 1, i.e. when −2 < ω < 0. For such ω the iteration sequences converge to the desired fixed point x̄ = 1.
Such modified iteration methods are called relaxation methods and the parameter ω is called the relaxation parameter.
Now consider a linear system of equations
Ax = b,
with A an invertible n×n matrix. The unique solution is x̄ = A^{−1}b.
Assume that
a_{i,i} ≠ 0, i = 1, …, n,
so the diagonal matrix D = diag[a_{i,i}] is invertible. The Jacobi method here is
x^(k+1) = D^{−1}(D − A)x^(k) + D^{−1}b,
i.e. with the mapping
F(x) = D^{−1}(D − A)x + D^{−1}b = x + D^{−1}(b − Ax).
Clearly x̄ = F(x̄), but without additional assumptions we cannot be certain that the iterations of the Jacobi method will converge to x̄.
Consider instead the mapping
F_ω(x) = (1 − ω)x + ωF(x) = x + ωD^{−1}(b − Ax),
which has a unique fixed point (when ω ≠ 0)
x̄ = F_ω(x̄) = A^{−1}b.
We want to choose ω so that the iteration method
x^(k+1) = F_ω(x^(k)) = x^(k) + ωD^{−1}(b − Ax^(k))
converges to x̄.
For this we assume that A is positive definite and symmetric. Then a_{i,i} > 0 for i = 1, …, n, and the diagonal matrix
D = diag[a_{i,i}]
is also positive definite and invertible. We define the following scalar product (·,·)_D and the corresponding norm ‖·‖_D by
(x, y)_D = x^T Dy = Σ_{j=1}^n a_{j,j} x_j y_j, ‖x‖_D = √(x^T Dx) = √((x, x)_D) = √( Σ_{j=1}^n a_{j,j} x_j² )
for x, y ∈ R^n.
theorem: (Stummel/Hainer, Theorem 8, page 165)
Let A ∈ R^{n×n} be positive definite and symmetric. The relaxed Jacobi method
x^(k+1) = F_ω(x^(k)) := x^(k) + ωD^{−1}( b − Ax^(k) )
converges to x̄ = A^{−1}b for each initial vector x^(0) ∈ R^n if and only if the parameter ω and the largest eigenvalue λ₁ of D^{−1}A satisfy
0 < ω < 2/λ₁.
proof: The mapping
x ↦ D^{−1}Ax, x ∈ R^n,
is symmetric and positive definite w.r.t. the scalar product (·,·)_D in the sense that
(D^{−1}Ax, y)_D = (D^{−1}Ax)^T Dy = x^T A^T D^{−1}Dy = x^T Ay = x^T D(D^{−1}Ay) = (x, D^{−1}Ay)_D
and
(D^{−1}Ax, x)_D = x^T Ax > 0 for all x ≠ 0, and = 0 if and only if x = 0.
The matrix D^{−1}A thus has a complete orthonormal system of eigenvectors v₁, …, v_n in the space R^n w.r.t. the scalar product (·,·)_D, with associated real positive eigenvalues λ₁ ≥ … ≥ λ_n > 0, i.e.
D^{−1}Av_j = λ_j v_j, (v_i, v_j)_D = δ_{i,j}, Av_j = λ_j Dv_j.
The matrix T_ω = I − ωD^{−1}A has eigenvalues 1 − ωλ_j with associated eigenvectors v₁, …, v_n, and
‖T_ω x‖²_D = Σ_{j=1}^n |1 − ωλ_j|² |(x, v_j)_D|²
≤ max_{j=1,…,n} |1 − ωλ_j|² Σ_{j=1}^n |(x, v_j)_D|²
= q² ‖x‖²_D, with q := max_{j=1,…,n} |1 − ωλ_j|,
for each x ∈ R^n.
If λ₁ is the largest eigenvalue of D^{−1}A and 0 < ω < 2/λ₁, then we have
1 > 1 − ωλ_j ≥ 1 − ωλ₁ > −1, j = 1, …, n ⟹ q < 1,
i.e. T_ω (and hence F_ω) is a contraction w.r.t. the norm ‖·‖_D on R^n. For ω ≤ 0 or ω ≥ 2/λ₁ we have
q = |1 − ωλ₁| ≥ 1.
Thus for the initial value x^(0) = v₁ (the eigenvector for λ₁):
‖T_ω^k v₁‖_D = q^k ≥ 1.
In this case the necessary and sufficient condition T_ω^k → 0 for the convergence of the method is violated.
Remarks:
(1) The a-posteriori and a-priori error estimates
‖x^(k) − x̄‖_D ≤ (q/(1 − q)) ‖x^(k−1) − x^(k)‖_D ≤ (q^k/(1 − q)) ‖x^(0) − x^(1)‖_D
hold under the condition that 0 < ω < 2/λ₁.
(2) We can choose the parameter ω so that the factor
q_ω = max_{j=1,…,n} |1 − ωλ_j|
is as small as possible. Namely, for ω* = 2/(λ₁ + λ_n):
q_{ω*} = (λ₁ − λ_n)/(λ₁ + λ_n) ≤ q_ω for all ω ∈ (0, 2/λ₁).
(3) The eigenvalues of D^{−1}A are often unknown, but we have the following estimate:
λ₁ ≤ ‖D^{−1}A‖_∞ = 1 + max_{i=1,…,n} Σ_{j=1, j≠i}^n |a_{i,j}/a_{i,i}|
⟹ convergence if
0 < ω < 2/‖D^{−1}A‖_∞ ≤ 2/λ₁.
If A is strictly diagonally dominant, then ‖D^{−1}A‖_∞ < 2. But we also have convergence when A is not strictly diagonally dominant (provided ω is small enough!).
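The relaxed Jacobi iteration is a two-line loop; a Python sketch (names mine; the choice of ω is left to the user, subject to 0 < ω < 2/λ₁):

```python
def relaxed_jacobi(A, b, omega, x0, steps):
    """Iterate x <- x + omega * D^{-1} (b - A x) for 'steps' steps."""
    n = len(A)
    x = list(x0)
    for _ in range(steps):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + omega * r[i] / A[i][i] for i in range(n)]
    return x
```

With omega = 1 this is exactly the Jacobi method.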
7.2 The SOR method
Literature: Oevel, Kap. 5.18; Schwarz, Kap. 11.1; Stummel/Hainer, Kap. 8.3
The SOR method is a relaxed version of the Gauss-Seidel method.
Let A be an n×n invertible matrix with a_{i,i} ≠ 0 for i = 1, …, n and consider the additive decomposition
A = D − L − R
(see the first part of the chapter).
The Gauss-Seidel method is
x^(k+1) = (D − L)^{−1}Rx^(k) + (D − L)^{−1}b
and has the fixed point x̄ = A^{−1}b, i.e.
Dx^(k+1) = Lx^(k+1) + Rx^(k) + b
with
Dx̄ = Lx̄ + Rx̄ + b
or
ωDx̄ = ω[ Lx̄ + Rx̄ + b ] (ω ≠ 0)
Dx̄ = (1 − ω)Dx̄ + ω[ Lx̄ + Rx̄ + b ].
As with the Gauss-Seidel method we can introduce an iteration method which uses the newly computed components of x^(k+1) straight away, i.e.
Dx^(k+1) = (1 − ω)Dx^(k) + ω[ Lx^(k+1) + Rx^(k) + b ]
or
(D − ωL)x^(k+1) = [ (1 − ω)D + ωR ]x^(k) + ωb
or
( (1/ω)D − L ) x^(k+1) = ( (1/ω − 1)D + R ) x^(k) + b.
We have
det( (1/ω)D − L ) = det
[ a_{1,1}/ω                        ]
[ a_{2,1} a_{2,2}/ω                ]
[ ⋮               ⋱                ]
[ a_{n,1} a_{n,2} …  a_{n,n}/ω     ]
= (1/ω^n) a_{1,1} ⋯ a_{n,n} ≠ 0,
⟹ (1/ω)D − L is invertible for all ω ≠ 0. Therefore we have
x^(k+1) = ( (1/ω)D − L )^{−1} ( (1/ω − 1)D + R ) x^(k) + ( (1/ω)D − L )^{−1} b.
This method is called the SOR method [SOR = Successive Over-Relaxation].
Let A be positive definite and symmetric. From Theorem 16 in Stummel/Hainer (page 167) we have
x^(k) → x̄ = A^{−1}b for all x^(0) ∈ R^n ⟺ 0 < ω < 2.
The parameter region (0, 2) here is more favourable than that for the relaxed Jacobi method, i.e., (0, 2/λ₁), because it does not depend on the particular matrix A.
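Componentwise, one SOR sweep is a Gauss-Seidel update blended with the old value; a Python sketch (names mine):

```python
def sor_step(A, b, x, omega):
    """One SOR sweep: x_i <- (1 - omega)*x_i + omega*(Gauss-Seidel value)."""
    n = len(A)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (1.0 - omega) * x[i] + omega * (b[i] - s) / A[i][i]
    return x
```

Setting omega = 1 recovers the plain Gauss-Seidel sweep.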
Chapter 8
Krylov space methods
Literature Plato, Kap. 11
8.1 Krylov spaces
We consider again the approximation of the solution of a linear system of equations Ax = b, where A ∈ R^{N×N} is regular (i.e. invertible) and b ∈ R^N, with the unique solution x* = A^{−1}b ∈ R^N.
Let
{0} ⊂ D₁ ⊂ D₂ ⊂ ⋯ ⊂ R^N (8.1)
be linear subspaces (finitely or infinitely many), which we will specify in more detail later.
We will investigate the following approaches to determining different sequences of vectors x_n ∈ D_n, n = 1, 2, ….
Definition 1 (Orthogonal residual approach)
x_n ∈ D_n, Ax_n − b ∈ D_n^⊥, n = 1, 2, …. (8.2)
Definition 2 (Minimal residual approach)
x_n ∈ D_n, ‖Ax_n − b‖₂ = min_{x∈D_n} ‖Ax − b‖₂, n = 1, 2, …. (8.3)
Here
M^⊥ := { y ∈ R^N : ⟨y, x⟩₂ = 0 for each x ∈ M }
denotes the orthogonal complement of an arbitrary set M ⊂ R^N, while ‖·‖₂ denotes the Euclidean vector norm and ⟨·,·⟩₂ the corresponding Euclidean scalar product.
The vector Ax − b is called the residual of x ∈ R^N (with respect to the system of equations Ax = b).
Krylov spaces play a leading role in the choice of the subspaces in the above approaches.
Definition 3 The Krylov spaces corresponding to a given matrix A ∈ R^{N×N} and vector b are defined by
K_n(A, b) := span{ b, Ab, …, A^{n−1}b } ⊂ R^N, n = 1, 2, …,
with K₀(A, b) := {0}.
8.1.1 Properties of Krylov spaces
Krylov spaces are clearly increasing:
{0} = K₀(A, b) ⊂ K₁(A, b) ⊂ K₂(A, b) ⊂ ⋯,
and there is a uniquely determined integer 0 ≤ n* ≤ N such that
K_{n*−1}(A, b) ⊊ K_{n*}(A, b) = K_{n*+1}(A, b) = ⋯,
x* = A^{−1}b ∈ K_{n*}(A, b), x* ∉ K_n(A, b) for n = 0, 1, …, n* − 1.
These properties follow immediately from the next lemma.
Lemma 4 Given a regular matrix A ∈ R^{N×N} and a vector b ∈ R^N, the following statements are equivalent for each integer n ≥ 1:
(a) the vectors b, Ab, …, A^n b are linearly dependent;
(b) K_n(A, b) = K_{n+1}(A, b);
(c) AK_n(A, b) ⊂ K_n(A, b);
(d) there exists a linear subspace M ⊂ R^N with dim M ≤ n such that b ∈ M and M is invariant w.r.t. A, i.e. A(M) ⊂ M;
(e) x* := A^{−1}b ∈ K_n(A, b).
Proof hint: The Cayley-Hamilton theorem says that the matrix A ∈ R^{N×N} is a zero of its own characteristic polynomial, i.e.
p_A(A) = A^N + α_{N−1}A^{N−1} + ⋯ + α₁A¹ + α₀I_N = 0 ∈ R^{N×N},
where
p_A(z) = det(zI_N − A) = z^N + α_{N−1}z^{N−1} + ⋯ + α₁z¹ + α₀.
8.2 The OR approach for symmetric, positive definite matrices
Here we consider the Orthogonal Residual approach for general subspaces under the additional assumption that A ∈ R^{N×N} is a symmetric and positive definite matrix, i.e.,
A ∈ R^{N×N}, A = A^T > 0. (8.4)
Define
⟨x, y⟩_A = x^T Ay, ‖x‖_A = √(x^T Ax), x, y ∈ R^N.
Since A = A^T > 0, it follows that ⟨·,·⟩_A is a scalar product on R^N with corresponding norm ‖·‖_A.
In the Orthogonal Residual approach it is often more convenient to derive error estimates in the norm ‖·‖_A rather than the Euclidean norm ‖·‖₂.
8.2.1 Existence, uniqueness and minimality
Here we discuss the existence and uniqueness of the vectors x_n that arise in the Orthogonal Residual approach.
Theorem 5 For a given symmetric and positive definite matrix A ∈ R^{N×N} the vectors x_n, n = 1, 2, … in the Orthogonal Residual approach with general subspaces D_n are unique and
‖x_n − x*‖_A = min_{x∈D_n} ‖x − x*‖_A, n = 1, 2, …. (8.5)
Proof
1. Uniqueness: Assume for a fixed n that two vectors x_n and x̃_n satisfy property (8.2). Then
x_n − x̃_n ∈ D_n and A(x_n − x̃_n) ∈ D_n^⊥,
and it follows that
‖x_n − x̃_n‖²_A = ⟨x_n − x̃_n, A(x_n − x̃_n)⟩₂ = 0 ⟹ x_n = x̃_n.
2. Solvability: Consider an arbitrary basis d₀, d₁, …, d_{m−1} of D_n. Then a vector
x_n = Σ_{j=0}^{m−1} γ_j d_j ∈ D_n
satisfies property (8.2) if and only if
Ax_n − b ∈ D_n^⊥ ⟺ ⟨Ax_n − b, d_k⟩₂ = 0 for k = 0, …, m−1,
i.e. if and only if the m coefficients γ₀, …, γ_{m−1} satisfy the system of m linear equations
Σ_{j=0}^{m−1} ⟨Ad_j, d_k⟩₂ γ_j = ⟨b, d_k⟩₂ for k = 0, …, m−1. (8.6)
This system of equations is uniquely solvable due to the uniqueness of the solution which we showed above. (There are just three possibilities: no solution at all, exactly one solution, or infinitely many solutions.)
3. Minimality: Finally, for an arbitrary vector x ∈ D_n we have
‖x − x*‖²_A = ‖x_n − x* + x − x_n‖²_A
= ‖x_n − x*‖²_A + 2⟨A(x_n − x*), x − x_n⟩₂ + ‖x − x_n‖²_A
≥ ‖x_n − x*‖²_A,
where the middle term vanishes because A(x_n − x*) = Ax_n − b ∈ D_n^⊥ and x − x_n ∈ D_n;
i.e., the unique solution x_n satisfies the minimality property in (8.5).
8.2.2 The OR approach for an A-conjugate basis
In the proof of Theorem 5 we used an arbitrary basis for the subspace D_n. The resulting system of equations (8.6) is much easier to solve when we use a particular basis.
Definition 6 Let A ∈ R^{N×N} be a symmetric and positive definite matrix. Then the vectors d₀, d₁, …, d_{m−1} ∈ R^N \ {0} are said to be A-conjugate if
⟨Ad_k, d_j⟩₂ = 0 for k ≠ j.
Remark: A-conjugacy and pairwise orthogonality w.r.t. the scalar product ⟨·,·⟩_A are the same.
The Orthogonal Residual approach is simple to implement for symmetric, positive definite matrices A ∈ R^{N×N} when an A-conjugate basis of D_n is given.
Theorem 7 Suppose that for a given symmetric, positive definite matrix A ∈ R^{N×N} and given A-conjugate vectors d₀, d₁, … ∈ R^N \ {0} we define
D_n = span{ d₀, …, d_{n−1} }, n = 1, 2, ….
Then the vectors x_n in the Orthogonal Residual approach w.r.t. these D_n have the representation, for n = 1, 2, …,
x_n = Σ_{j=0}^{n−1} γ_j d_j with γ_j = −⟨r_j, d_j⟩₂ / ⟨Ad_j, d_j⟩₂, (8.7)
where r_j := Ax_j − b for j ≥ 1 and r₀ := −b.
Proof We make use of the A-conjugacy in the proof of Theorem 5 and obtain
x_n = Σ_{j=0}^{n−1} γ_j d_j with γ_j = ⟨b, d_j⟩₂ / ⟨Ad_j, d_j⟩₂, j = 0, 1, …, n−1. (8.8)
The number γ_j in (8.8) agrees with (8.7), which is clear for j = 0 and follows for j ≥ 1 from
⟨−r_n, d_n⟩₂ = ⟨b − Ax_n, d_n⟩₂ = ⟨b, d_n⟩₂ − Σ_{j=0}^{n−1} γ_j ⟨Ad_j, d_n⟩₂ = ⟨b, d_n⟩₂, n = 1, 2, …,
because
⟨Ad_j, d_n⟩₂ = 0 for j ≠ n.
Remark 1: We see from the representation (8.7) that the number γ_j is independent of n, so with the stepsize α_n := γ_n we have
x_{n+1} = x_n + α_n d_n, (8.9)
r_{n+1} = r_n + α_n Ad_n (8.10)
with x₀ := 0 and r₀ = −b.
This gives a further simplification of the implementation of (8.7), because we have already calculated the matrix-vector product Ad_n which is needed for the determination of α_n. Thus we do not need any more matrix-vector products in order to calculate the residual r_{n+1} through (8.10).
This representation is important because most stopping criteria are based on the value of the residual.
Remark 2: In view of formula (8.9) the vector d_n is called the search direction and the number α_n is called the stepsize. They are optimal in the following sense:
‖x_{n+1} − x*‖_A = min_{t∈R} ‖x_n + t d_n − x*‖_A.
8.3 The CG method for positive definite matrices
We will now use the Krylov spaces in the Orthogonal Residual approach.
Definition 8 For a symmetric, positive definite matrix A ∈ R^{N×N}, the conjugate gradient method is given by (8.9)-(8.10) with the special choice of subspaces
D_n = K_n(A, b), n = 0, 1, …. (8.11)
This method is often abbreviated as the CG method.
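Anticipating the formulas (8.12) and (8.15)-(8.16) derived in the next subsection, the CG method can be sketched in Python, with the sign conventions x₀ = 0, r₀ = −b, d₀ = b of this text (the function name and list format are mine):

```python
def cg(A, b, tol=1e-12):
    """Conjugate gradient sketch for symmetric positive definite A:
    x <- x + alpha*d, r <- r + alpha*A d, d <- -r + beta*d."""
    n = len(b)
    x = [0.0] * n
    r = [-bi for bi in b]                                 # r_0 = -b
    d = list(b)                                           # d_0 = b
    rr = sum(ri * ri for ri in r)
    for _ in range(n):                                    # at most N steps
        if rr <= tol:
            break
        Ad = [sum(A[i][j] * d[j] for j in range(n)) for i in range(n)]
        alpha = rr / sum(Ad[i] * d[i] for i in range(n))  # stepsize (8.15)
        x = [x[i] + alpha * d[i] for i in range(n)]       # (8.9)
        r = [r[i] + alpha * Ad[i] for i in range(n)]      # (8.10)
        rr_new = sum(ri * ri for ri in r)
        beta = rr_new / rr                                # (8.16)
        d = [-r[i] + beta * d[i] for i in range(n)]       # (8.12)
        rr = rr_new
    return x
```

In exact arithmetic the loop terminates after at most n* ≤ N iterations with x = x*.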
8.3.1 Computing A-conjugate search directions in K_n(A, b)
Here we use the notation from Theorem 7. Starting from an already constructed A-conjugate basis d₀, …, d_{n−1} of K_n(A, b), we will construct an A-conjugate basis for K_{n+1}(A, b) by the Gram-Schmidt orthogonalisation of the vectors
d₀, …, d_{n−1}, r_n ∈ R^N
with respect to the scalar product ⟨·,·⟩_A. In the proof of Lemma 9 we will see that a Gram-Schmidt orthogonalisation of the two vectors d_{n−1}, r_n ∈ R^N suffices.
Lemma 9 Suppose for a given symmetric, positive definite matrix A ∈ R^{N×N} that the search directions are chosen so that
d_n := −r_n + β_{n−1} d_{n−1} with β_{n−1} := ⟨Ar_n, d_{n−1}⟩₂ / ⟨Ad_{n−1}, d_{n−1}⟩₂ (8.12)
for n = 1, …, n* − 1, and d₀ := b, where n* denotes the first index for which r_{n*} = 0.
These vectors d₀, d₁, …, d_{n*−1} ∈ R^N \ {0} are A-conjugate and
span{ d₀, …, d_{n−1} } = span{ b, r₁, r₂, …, r_{n−1} } = K_n(A, b) (8.13)
holds for n = 1, …, n*.
Proof: We will show the A-conjugacy of the vectors d_0, d_1, d_2, …, d_{n∗-1} ∈ ℝ^N as well as the two identities in (8.13) by means of mathematical induction
over n = 1, 2, …, n∗. The first step is clear from

span {d_0} = span {b} = K_1(A, b).

Now consider a fixed index 1 ≤ n ≤ n∗ - 1 and assume that the procedure
(8.12) delivers a system d_0 = b, d_1, d_2, …, d_{n-1} of A-conjugate vectors with
the property (8.13). From (8.2) we have r_n ⊥ K_n(A, b) and, in the case r_n ≠ 0, the vectors d_0, …, d_{n-1}, r_n are linearly independent. A Gram-Schmidt
orthogonalisation of these vectors w.r.t. the scalar product ⟨·, ·⟩_A delivers
the vector
d_n := -r_n + ∑_{j=0}^{n-1} (⟨A r_n, d_j⟩₂ / ⟨A d_j, d_j⟩₂) d_j (∗)= -r_n + β_{n-1} d_{n-1}, (8.14)

where (∗) follows from the facts that A K_{n-1}(A, b) ⊆ K_n(A, b) and r_n ⊥ K_n(A, b):

⟨A r_n, d_j⟩₂ = ⟨r_n, A d_j⟩₂ = 0, j = 0, 1, …, n - 2.
The vectors d_0, …, d_{n-1}, d_n are A-conjugate by the construction and

span {d_0, …, d_{n-1}, d_n} = span {b, r_1, r_2, …, r_n}

holds. In view of equation (8.10) we also have span {b, r_1, r_2, …, r_n} ⊆ K_{n+1}(A, b).
The required equality then follows on dimensional grounds. □
Remark: The solution of the system of equations Ax = b is obtained simul-
taneously with the stopping criterion described in Lemma 9: r_{n∗} = 0 means x_{n∗} = x.
Since the two vector systems in (8.13) are linearly independent, it follows that

dim K_n(A, b) = n, for n = 0, 1, …, n∗,

and hence, necessarily, that n∗ ≤ N.
As an immediate consequence of the proof of Lemma 9 we obtain the follow-
ing representation for the stepsize, which is typically used in numerical
implementations.
Corollary 10 In the notation of Lemma 9 we have the representations

α_n = ‖r_n‖₂² / ⟨A d_n, d_n⟩₂, n = 0, 1, …, n∗ - 1, (8.15)

β_{n-1} = ‖r_n‖₂² / ‖r_{n-1}‖₂², n = 1, …, n∗ - 1, (8.16)

with r_0 := -b.
Proof: With r_n ⊥ K_n(A, b) and the expression (8.12) for the search direction
d_n we obtain ⟨r_n, d_n⟩₂ = -‖r_n‖₂². Together with (8.7) this yields (8.15).
This representation (8.15) for α_n, together with the identity r_n = r_{n-1} +
α_{n-1} A d_{n-1} (i.e., the identity (8.10) with n replaced by n - 1), gives

‖r_n‖₂² = ⟨r_n, r_{n-1}⟩₂ + α_{n-1} ⟨r_n, A d_{n-1}⟩₂ = β_{n-1} ‖r_{n-1}‖₂²,

where ⟨r_n, r_{n-1}⟩₂ = 0 and the last equality follows from (8.15) and the
definition of β_{n-1} in (8.12); thus (8.16) also holds for β_{n-1}. □
8.3.2 The Algorithm for the CG method
We combine the above considerations and results to obtain the following algo-
rithm:
Algorithm for the CG method

Step 0: Set r_0 = A x_0 - b.

Step n = 0, 1, …:

(a) If r_n = 0, then stop; n∗ = n.

(b) If, on the other hand, r_n ≠ 0, then in Step n + 1 proceed as follows:

d_n = -r_0 if n = 0, and
d_n = -r_n + β_{n-1} d_{n-1} with β_{n-1} = ‖r_n‖₂² / ‖r_{n-1}‖₂² if n ≥ 1,

x_{n+1} = x_n + α_n d_n with α_n = ‖r_n‖₂² / ⟨A d_n, d_n⟩₂,

r_{n+1} = r_n + α_n A d_n.
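The steps above can be sketched in Python with numpy; this is a minimal illustration (the function and variable names are ours, not from the script), using the script's sign convention r_n = A x_n - b:

```python
import numpy as np

def cg(A, b, x0=None, tol=1e-10, maxiter=None):
    """Conjugate gradient sketch following the algorithm above.

    Sign convention of the script: r_n = A x_n - b, so the search
    direction is d_n = -r_n + beta_{n-1} d_{n-1}.
    """
    N = len(b)
    x = np.zeros(N) if x0 is None else x0.astype(float)
    r = A @ x - b                      # Step 0: r_0 = A x_0 - b
    d = -r                             # d_0 = -r_0 (= b when x_0 = 0)
    rr = r @ r
    for _ in range(maxiter or N):
        if np.sqrt(rr) <= tol:         # stopping criterion on the residual
            break
        Ad = A @ d                     # the only matrix-vector product per step
        alpha = rr / (Ad @ d)          # (8.15): alpha_n = ||r_n||^2 / <A d_n, d_n>
        x = x + alpha * d              # (8.9)
        r = r + alpha * Ad             # (8.10): no extra product needed
        rr_new = r @ r
        beta = rr_new / rr             # (8.16): beta_{n-1} = ||r_n||^2 / ||r_{n-1}||^2
        d = -r + beta * d              # (8.12)
        rr = rr_new
    return x
```

Note how (8.10) lets the loop reuse the single product A d_n both for the stepsize and for the residual update, as discussed in Remark 1.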
Remark: The expression "conjugate gradient method" originates in the follow-
ing two properties:

(a) For each index n the residual r_n is identical to the gradient of the energy
functional

J(x) = (1/2) ⟨Ax, x⟩₂ - ⟨x, b⟩₂

at x_n, i.e., r_n = ∇J(x_n).

(b) As an immediate consequence of (8.2) and (8.13), the residuals are pairwise
orthogonal:

⟨r_k, r_j⟩₂ = 0 for k ≠ j.
8.3.3 The CG method for the normal equations

If the regular system of linear equations Ax = b is symmetric and indefinite
or nonsymmetric, then we can apply the classical CG method to the normal
equations

AᵀA x = Aᵀb.
In this case the method is called the CGNR method.
As a direct consequence of Theorem 7, the following minimality property is
obtained for the iterates of the CGNR method:

‖A x_n - b‖₂ = min_{x ∈ K_n(AᵀA, Aᵀb)} ‖Ax - b‖₂. (8.17)

This property justifies the letter R ("residual") in the CGNR notation, while
the letter N stands for "normal equations". It is also clear from this property
that with the special choice of subspaces D_n = K_n(AᵀA, Aᵀb) for n = 0, 1, …, the
CGNR method coincides with the Minimal Residual approach (8.3).
Two matrix-vector multiplications are required in each iteration step of the
CG algorithm applied to the normal equations AᵀAx = Aᵀb (i.e., to calculate
A d_n and Aᵀ(A d_n)), but the numerically more expensive calculation of the
matrix AᵀA itself is not required.
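As a sketch of this remark, CGNR can be written so that only the two products A d_n and Aᵀ(A d_n) appear and AᵀA is never formed (names are illustrative, assuming numpy):

```python
import numpy as np

def cgnr(A, b, tol=1e-10, maxiter=None):
    """CG applied to the normal equations A^T A x = A^T b (CGNR sketch).

    A^T A is never formed; each iteration uses exactly the two
    matrix-vector products A d_n and A^T (A d_n).
    """
    N = A.shape[1]
    x = np.zeros(N)
    r = -(A.T @ b)                 # residual of the normal equations at x_0 = 0
    d = -r
    rr = r @ r
    for _ in range(maxiter or N):
        if np.sqrt(rr) <= tol:
            break
        Ad = A @ d                 # first matrix-vector product
        AtAd = A.T @ Ad            # second product; avoids forming A^T A
        alpha = rr / (AtAd @ d)
        x = x + alpha * d
        r = r + alpha * AtAd
        rr_new = r @ r
        d = -r + (rr_new / rr) * d
        rr = rr_new
    return x
```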
8.4 The GMRES method and Arnoldi process
The GMRES method provides another possibility for solving a regular system of
linear equations Ax = b with a symmetric indefinite or a nonsymmetric matrix
A ∈ ℝ^{N×N}.
Definition 11 The GMRES method is defined by the Minimal Residual ap-
proach (8.3) with the special choice of subspaces D_n = K_n(A, b), i.e., so that

x_n ∈ K_n(A, b), ‖A x_n - b‖₂ = min_{x ∈ K_n(A, b)} ‖Ax - b‖₂, n = 0, …, n∗. (8.18)
The abbreviation GMRES stands for generalized minimal residual method.
For n = 1, 2, …, the basic procedure for realising the GMRES method is as
follows:

(a) Use the Arnoldi process (which will be described below) to generate an
orthogonal basis for K_n(A, b) with respect to the Euclidean scalar product.

(b) With this orthogonal basis, the minimization problem (8.18) can be refor-
mulated as a simpler minimization problem which can be quickly solved.
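Steps (a)-(b) can be sketched naively with numpy: build an orthonormal basis Q of K_n(A, b) and minimize ‖A Q y - b‖₂ by a small least-squares solve. This is only an illustration of (8.18) under our own naming; it does not use the efficient reformulation alluded to in (b):

```python
import numpy as np

def gmres_naive(A, b, n_steps):
    """Naive GMRES sketch: minimize ||A x - b||_2 over x in K_n(A, b).

    Builds an orthonormal basis Q of K_n(A, b) by Gram-Schmidt
    (the Arnoldi vectors) and solves the small least-squares problem
    for x = Q y directly.
    """
    Q = [b / np.linalg.norm(b)]          # q_1 = b / ||b||_2
    for _ in range(n_steps - 1):
        v = A @ Q[-1]                    # next direction A q_n
        for q in Q:                      # Gram-Schmidt orthogonalization
            v = v - (v @ q) * q
        norm = np.linalg.norm(v)
        if norm == 0.0:                  # A q_n in span{q_1, ..., q_n}: stop
            break
        Q.append(v / norm)
    Qn = np.column_stack(Q)              # basis of K_n(A, b)
    y, *_ = np.linalg.lstsq(A @ Qn, b, rcond=None)
    return Qn @ y                        # x_n = Q_n y minimizes ||A x - b||_2
```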
8.4.1 The Arnoldi process
The Arnoldi process is easy to explain: starting from a given normalized vector
q_1 ∈ ℝ^N, a sequence of pairwise orthonormal vectors q_1, q_2, … w.r.t. the classi-
cal scalar product ⟨·, ·⟩₂ is generated by Gram-Schmidt orthogonalization of
the vectors q_1, A q_1, A q_2, …. (The vectors required are generated in the course
of the process and are not known a priori.)
The following algorithm describes the exact procedure.
Algorithm for the Arnoldi process
Starting from a given vector b ≠ 0, b ∈ ℝ^N, set

q_1 = b / ‖b‖₂ ∈ ℝ^N

and proceed as follows for n = 1, 2, …:

(1) Orthogonalization: Set

h_{k,n} := (A q_n)ᵀ q_k ∈ ℝ, k = 1, 2, …, n, (8.19)

q̃_{n+1} := A q_n - ∑_{k=1}^{n} h_{k,n} q_k ∈ ℝ^N. (8.20)

(2) Normalization: The process stops if q̃_{n+1} = 0, in which case the
stopping index is denoted by n∗ = n. On the other hand, if q̃_{n+1} ≠ 0, then set

h_{n+1,n} := ‖q̃_{n+1}‖₂ ∈ ℝ, q_{n+1} := q̃_{n+1} / h_{n+1,n} ∈ ℝ^N. (8.21)
Remark: The definitions (8.19)-(8.20) imply that the Arnoldi process will
stop the first time that A q_n ∈ span{q_1, …, q_n} holds.
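Assuming numpy, the process above might be implemented as follows; the names are illustrative, and the exact breakdown test q̃_{n+1} = 0 is replaced by a small threshold, as is usual in floating-point arithmetic:

```python
import numpy as np

def arnoldi(A, b, max_steps=None, tol=1e-12):
    """Arnoldi process sketch following (8.19)-(8.21).

    Returns the orthonormal vectors q_1, q_2, ... as the columns of Q,
    together with the coefficients h_{k,n} collected in a
    Hessenberg-shaped array H.
    """
    max_steps = max_steps or len(b)
    Q = [b / np.linalg.norm(b)]           # q_1 = b / ||b||_2
    H = np.zeros((max_steps + 1, max_steps))
    m = 0                                 # number of completed steps
    for n in range(max_steps):
        w = A @ Q[n]                      # A q_n
        for k in range(n + 1):
            H[k, n] = w @ Q[k]            # (8.19): h_{k,n}
            w = w - H[k, n] * Q[k]        # (8.20): orthogonalize
        H[n + 1, n] = np.linalg.norm(w)
        m = n + 1
        if H[n + 1, n] <= tol:            # stop: A q_n in span{q_1, ..., q_n}
            break
        Q.append(w / H[n + 1, n])         # (8.21): normalization
    return np.column_stack(Q), H[:m + 1, :m]
```

The inner loop subtracts each projection immediately (modified Gram-Schmidt), which computes the same h_{k,n} in exact arithmetic but is numerically more stable.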
The following Lemma summarizes the most important properties associated
with the Arnoldi process.
Lemma 12 The vectors q_1, …, q_{n∗} ∈ ℝ^N produced by the Arnoldi pro-
cess are pairwise orthonormal, and

span{q_1, …, q_n} = span{q_1, …, q_{n-1}, A q_{n-1}} = K_n(A, b) (8.22)

for n = 1, …, n∗. If the matrix A is regular, then the unique solution
x ∈ ℝ^N of the system of equations Ax = b satisfies

x ∈ K_{n∗}(A, b). (8.23)
Proof: The pairwise orthonormality is obtained by mathematical induction
w.r.t. n using (8.19):

⟨q_{n+1}, q_k⟩₂ = (1 / h_{n+1,n}) ((A q_n)ᵀ q_k - h_{k,n}) = 0, k = 1, 2, …, n,

for n = 1, 2, …, n∗ - 1. The property ‖q_{n+1}‖₂ = 1 follows from (8.21).
The two identities in (8.22) will now be proved, also by mathematical induc-
tion w.r.t. n. In view of q_1 = b/‖b‖₂, the assertion is true for n = 1.
The induction step n - 1 → n (for 1 ≤ n - 1 < n ≤ n∗) will now be verified. Since n ≤ n∗,
the vectors q_1, …, q_{n-1}, A q_{n-1} ∈ ℝ^N are linearly independent, and thus by the
construction the first identity in (8.22) is true. The second identity in (8.22)
is obtained as follows: the relation ⊆ follows from A q_{n-1} ∈ A K_{n-1}(A, b) ⊆ K_n(A, b), and the identity = then results from the dimension argument

n = dim span {q_1, …, q_{n-1}, A q_{n-1}} ≤ dim K_n(A, b) ≤ n.
We prove the statement (8.23) as follows: from the definition of n∗ we have

A q_{n∗} ∈ span {q_1, …, q_{n∗}} = K_{n∗}(A, b),

and by construction

A q_k ∈ K_{k+1}(A, b) ⊆ K_{n∗}(A, b), k = 1, …, n∗ - 1,

which combine to give A(K_{n∗}(A, b)) ⊆ K_{n∗}(A, b). Again by a dimension argu-
ment, the mapping A : K_{n∗}(A, b) → K_{n∗}(A, b) is bijective, so, since b ∈ K_{n∗}(A, b),
we then have x = A⁻¹b ∈ K_{n∗}(A, b). □
Remark 1: It is clear from (8.22) that

dim K_n(A, b) = n for n = 1, …, n∗,

so the Arnoldi process necessarily stops after at most N steps: n∗ ≤ N.
Remark 2: If the matrix A is symmetric, then the identity

h_{k,n} = q_nᵀ A q_k = 0

holds for k ≤ n - 2, because

A q_k ∈ K_{k+1}(A, b) ⊆ K_{n-1}(A, b)

and q_n ⊥ K_{n-1}(A, b). The Gram-Schmidt orthogonalization (8.