Symposium on the Theory of Numerical Analysis
Lecture Notes in Mathematics. A collection of informal reports and seminars. Edited by A. Dold, Heidelberg and B. Eckmann, Zürich.
193
Symposium on the Theory of Numerical Analysis Held in Dundee/Scotland, September 15-23, 1970
Edited by John Ll. Morris, University of Dundee, Dundee/Scotland
Springer-Verlag Berlin · Heidelberg · New York 1971
AMS Subject Classifications (1970): 65M05, 65M10, 65M15, 65M30, 65N05, 65N10, 65N15, 65N20, 65N25
ISBN 3-540-05422-7 Springer-Verlag Berlin · Heidelberg · New York
ISBN 0-387-05422-7 Springer-Verlag New York · Heidelberg · Berlin
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks.
Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher.
© by Springer-Verlag Berlin • Heidelberg 1971. Library of Congress Catalog Card Number 70-155916. Printed in Germany.
Offsetdruck: Julius Beltz, Hemsbach
Foreword
This publication by Springer-Verlag represents the proceedings of a series
of lectures given by four eminent Numerical Analysts, namely Professors Golub,
Thomée, Wachspress and Widlund, at the University of Dundee between September
15th and September 23rd, 1970.
The lectures marked the beginning of the British Science Research Council's
sponsored Numerical Analysis Year which is being held at the University of Dundee
from September 1970 to August 1971. The aim of this year is to promote the theory
of numerical methods and in particular to upgrade the study of Numerical Analysis
in British universities and technical colleges. This is being effected by the
arranging of lecture courses and seminars which are being held in Dundee through-
out the Year. In addition to lecture courses research conferences are being
held to allow workers in touch with modern developments in the field of Numerical
Analysis to hear and discuss the most recent research work in their field. To
achieve these aims, some thirty four Numerical Analysts of international repute
are visiting the University of Dundee during the Numerical Analysis Year. The
complete project is financed by the Science Research Council, and we acknowledge
with gratitude their generous support. The present proceedings contain a great
deal of theoretical work which has been developed over recent years. There are,
however, new results contained within the notes. In particular, the lectures pre-
sented by Professor Golub represent results recently obtained by him and his co-
workers. Consequently, a detailed account of the methods outlined in Professor
Golub's lectures will appear in a forthcoming issue of the SIAM Journal on
Numerical Analysis, in a paper written jointly by Golub, Buzbee and Nielson.
In the main the lecture notes have been provided by the authors and the
proceedings have been produced from these original manuscripts. The exception
is the course of lectures given by Professor Golub. These notes were taken at
the lectures by members of the staff and research students of the Department of
Mathematics, the University of Dundee. In this context it is a pleasure to
acknowledge the invaluable assistance provided to the editor by Dr. A. Watson,
Mr. R. Wait, Mr. K. Brodlie and Mr. G. McGuire.
Finally we owe thanks to Misses Y. Nedelec and F. Duncan, secretaries in
the Mathematics Department, for their patient typing and retyping of the manu-
scripts and notes.
J. Ll. Morris
Dundee, January 1971
Contents

G. Golub: Direct Methods for Solving Elliptic Difference Equations . . . 1
  1. Introduction . . . 2
  2. Matrix Decomposition . . . 2
  3. Block Cyclic Reduction . . . 6
  4. Applications . . . 10
  5. The Buneman Algorithm and Variants . . . 12
  6. Accuracy of the Buneman Algorithms . . . 14
  7. Non-Rectangular Regions . . . 15
  8. Conclusion . . . 18
  9. References . . . 18

G. Golub: Matrix Methods in Mathematical Programming . . . 21
  1. Introduction . . . 22
  2. Linear Programming . . . 22
  3. A Stable Implementation of the Simplex Algorithm . . . 24
  4. Iterative Refinement of the Solution . . . 28
  5. Householder Triangularization . . . 28
  6. Projections . . . 31
  7. Linear Least-Squares Problem . . . 33
  8. Least-Squares Problem with Linear Constraints . . . 35
  Bibliography . . . 37

V. Thomée: Topics in Stability Theory for Partial Difference Operators . . . 41
  Preface . . . 42
  1. Introduction . . . 43
  2. Initial-Value Problems in L_2 with Constant Coefficients . . . 51
  3. Difference Approximations in L_2 to Initial-Value Problems with Constant Coefficients . . . 59
  4. Estimates in the Maximum-Norm . . . 70
  5. On the Rate of Convergence of Difference Schemes . . . 79
  References . . . 89

E. L. Wachspress: Iteration Parameters in the Numerical Solution of Elliptic Problems . . . 93
  1. A Concise Review of the General Topic and Background Theory . . . 95
  2. Successive Overrelaxation: Theory . . . 98
  3. Successive Overrelaxation: Practice . . . 100
  4. Residual Polynomials: Chebyshev Extrapolation: Theory . . . 102
  5. Residual Polynomials: Practice . . . 103
  6. Alternating-Direction-Implicit Iteration . . . 106
  7. Parameters for the Peaceman-Rachford Variant of ADI . . . 107

O. Widlund: Introduction to Finite Difference Approximations to Initial Value Problems for Partial Differential Equations . . . 111
  1. Introduction . . . 112
  2. The Form of the Partial Differential Equations . . . 114
  3. The Form of the Finite Difference Schemes . . . 117
  4. An Example of Divergence. The Maximum Principle . . . 121
  5. The Choice of Norms and Stability Definitions . . . 124
  6. Stability, Error Bounds and a Perturbation Theorem . . . 133
  7. The von Neumann Condition, Dissipative and Multistep Schemes . . . 138
  8. Semibounded Operators . . . 142
  9. Some Applications of the Energy Method . . . 145
  10. Maximum Norm Convergence for L_2 Stable Schemes . . . 149
  References . . . 151
Direct Methods for Solving Elliptic Difference Equations
GENE GOLUB
Stanford University
1. Introduction
General methods exist for solving elliptic partial differential equations of
general type in general regions. However, it is often the case that physical
problems, such as those of plasma physics, give rise to several elliptic equations
which require to be solved many times. It is not uncommon that the elliptic
equations which arise reduce to Poisson's equation with differing right hand
sides. For this reason it is judicious to use direct methods which take advantage
of this structure and which thereby yield fast and accurate techniques for
solving the associated linear equations.

Direct methods for solving such equations are attractive since in theory they
yield the exact solution to the difference equation, whereas commonly used methods
seek to approximate the solution by iterative procedures [12]. Hockney [8] has
devised an efficient direct method which uses the reduction process. Also, Buneman
[2] recently developed an efficient direct method for solving the reduced system
of equations. Since these methods offer considerable economy over older tech-
niques [5], the purpose of this paper is to present a unified mathematical
development and generalization of them. Additional generalizations are given by
George [6].
2. Matrix Decomposition
Consider the system of equations

    M x = y ,                                                       (2.1)

where M is an N×N real symmetric matrix of block tridiagonal form,

        | A  T          |
        | T  A  T       |
    M = |    .  .  .    | .                                         (2.2)
        |       T  A  T |
        |          T  A |

The matrices A and T are p×p symmetric matrices and we assume that

    A T = T A .

This situation arises in many systems. However, other direct methods which are
applicable for more general systems are less efficient to implement in this case.
Moreover, the classical methods require more computer storage than the methods to
be discussed here, which will require only the storage of the vector y. Since A
and T commute and are symmetric, it is well known [1] that there exists an
orthogonal matrix Q such that

    Q^T A Q = Λ ,   Q^T T Q = Ω ,                                   (2.3)

where Λ and Ω are real diagonal matrices. The columns of Q are the common
eigenvectors of A and T, and Λ = diag(λ_1, ..., λ_p) and Ω = diag(ω_1, ..., ω_p)
are the diagonal matrices of the p distinct eigenvalues of A and T, respectively.

To conform with the matrix M, we write the vectors x and y in partitioned form,

    x = ( x_1, x_2, ..., x_q )^T ,   y = ( y_1, y_2, ..., y_q )^T ,  (2.4)

where each block x_j = ( x_{1j}, x_{2j}, ..., x_{pj} )^T is a vector of length p.

System (2.1) may then be written

    A x_1 + T x_2 = y_1 ,                                           (2.5a)
    T x_{j-1} + A x_j + T x_{j+1} = y_j ,   j = 2, 3, ..., q-1 ,    (2.5b)
    T x_{q-1} + A x_q = y_q .                                       (2.5c)
From Eq. (2.3) we have

    A = Q Λ Q^T   and   T = Q Ω Q^T .

Substituting into Eq. (2.5) and pre-multiplying by Q^T, we obtain

    Λ x̄_1 + Ω x̄_2 = ȳ_1 ,
    Ω x̄_{j-1} + Λ x̄_j + Ω x̄_{j+1} = ȳ_j ,   j = 2, 3, ..., q-1 ,    (2.6)
    Ω x̄_{q-1} + Λ x̄_q = ȳ_q ,

where

    x̄_j = Q^T x_j ,   ȳ_j = Q^T y_j ,   j = 1, 2, ..., q .

If x̄_j and ȳ_j are partitioned as before, then the i-th components of Eq. (2.6)
satisfy

    λ_i x̄_{i1} + ω_i x̄_{i2} = ȳ_{i1} ,
    ω_i x̄_{i,j-1} + λ_i x̄_{ij} + ω_i x̄_{i,j+1} = ȳ_{ij} ,   j = 2, ..., q-1 ,   (2.7)
    ω_i x̄_{i,q-1} + λ_i x̄_{iq} = ȳ_{iq} ,

for i = 1, 2, ..., p. If we regroup the equations by reversing the roles of i and
j, writing

    x̃_i = ( x̄_{i1}, x̄_{i2}, ..., x̄_{iq} )^T ,   ỹ_i = ( ȳ_{i1}, ȳ_{i2}, ..., ȳ_{iq} )^T ,

then Eq. (2.7) is equivalent to the block diagonal system of equations

    Γ_i x̃_i = ỹ_i ,   i = 1, 2, ..., p ,                             (2.8)

where

          | λ_i  ω_i           |
          | ω_i  λ_i  ω_i      |
    Γ_i = |      .    .    .   |
          |      ω_i  λ_i  ω_i |
          |           ω_i  λ_i |

is of order q. Thus each vector x̃_i satisfies a symmetric tridiagonal system of
equations that has a constant diagonal element and constant super- and sub-
diagonal elements. After Eq. (2.8) has been solved block by block, it is possible
to solve for x_j = Q x̄_j. Thus we have:

Algorithm 1

1. Compute or determine the eigensystem of A and T.
2. Compute ȳ_j = Q^T y_j  (j = 1, 2, ..., q).
3. Solve Γ_i x̃_i = ỹ_i  (i = 1, 2, ..., p).
4. Compute x_j = Q x̄_j  (j = 1, 2, ..., q).

For our system the eigenvalues of Γ_i may be written down explicitly as

    ν_r^(i) = λ_i + 2 ω_i cos( rπ/(q+1) ) ,   r = 1, 2, ..., q .

It should be noted that only Q and the y_j, j = 1, 2, ..., q, have to be stored,
since the ȳ_j can overwrite the y_j, the x̃_i can overwrite the ỹ_i, and the x_j
can overwrite the x̄_j. A simple calculation will show that approximately
2p²q + 5pq arithmetic operations are required for the algorithm when step 3 is
solved using Gaussian elimination for a tridiagonal matrix, the Γ_i being positive
definite. The arithmetic operations are dominated by the 2p²q multiplications
arising from the matrix multiplications of steps 2 and 4. It is not easy to
reduce this number unless the matrix Q has special properties (as in Poisson's
equation), when the fast Fourier transform can be used (see Hockney [8]).

Note further that

    Γ_i = Z V_i Z^T ,

where V_i is the diagonal matrix of eigenvalues of Γ_i and
Z_{rs} = σ_s sin( rsπ/(q+1) ), with σ_s a normalizing factor. Since Γ_i and Γ_j
have the same set of eigenvectors,

    Γ_i Γ_j = Γ_j Γ_i .

Because of this decomposition, step 3 could be performed by computing

    x̃_i = Z V_i^{-1} Z^T ỹ_i ,

where the same Z serves for each Γ_i. This, however, requires of the order of
2pq² multiplications, which approximately doubles the computing time for the
algorithm. Thus performing the fast Fourier transform in step 3 as well as in
steps 2 and 4 is not advisable.
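The four steps of Algorithm 1 can be sketched in modern terms as follows. This is an illustrative NumPy translation, not part of the original notes: it assumes A has distinct eigenvalues (so the eigenvector matrix Q of A also diagonalizes T, as in the text), and it uses a dense solve in place of the tridiagonal Gaussian elimination of step 3.

```python
import numpy as np

def matrix_decomposition_solve(A, T, Y):
    """Sketch of Algorithm 1 for the block tridiagonal system (2.2).

    A, T : commuting symmetric p x p matrices (diagonal and off-diagonal blocks).
    Y    : p x q array whose columns are the right-hand-side blocks y_1, ..., y_q.
    Returns the p x q array whose columns are the solution blocks x_1, ..., x_q.
    """
    p, q = Y.shape
    # Step 1: eigensystem of A; since AT = TA and the eigenvalues of A are
    # assumed distinct, Q also diagonalizes T.
    lam, Q = np.linalg.eigh(A)
    omega = np.diag(Q.T @ T @ Q).copy()
    # Step 2: transform the right-hand side, ybar_j = Q^T y_j.
    Ybar = Q.T @ Y
    # Step 3: for each i solve the q x q tridiagonal system
    # Gamma_i xtilde_i = ytilde_i, Gamma_i = tridiag(omega_i; lambda_i; omega_i).
    Xbar = np.empty_like(Ybar)
    for i in range(p):
        Gamma = (np.diag(np.full(q, lam[i]))
                 + np.diag(np.full(q - 1, omega[i]), 1)
                 + np.diag(np.full(q - 1, omega[i]), -1))
        Xbar[i, :] = np.linalg.solve(Gamma, Ybar[i, :])
    # Step 4: back-transform, x_j = Q xbar_j.
    return Q @ Xbar
```

A quick consistency check is to assemble the full matrix M of (2.2) as a Kronecker sum and compare with a direct dense solve.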
3. Block Cyclic Reduction

In Section 2, we gave a method for which one had to know the eigenvalues and
eigenvectors of some matrix. We now give a more direct method for solving the
system of Eq. (2.1).

We assume again that A and T are symmetric and that A and T commute. Further-
more, we assume that q = m-1 and

    m = 2^(k+1) ,

where k is some positive integer. Let us rewrite Eq. (2.5b) as follows:

    T x_{j-2} + A x_{j-1} + T x_j     = y_{j-1} ,
    T x_{j-1} + A x_j     + T x_{j+1} = y_j ,
    T x_j     + A x_{j+1} + T x_{j+2} = y_{j+1} .

Multiplying the first and third equations by T, the second equation by -A, and
adding, we have

    T² x_{j-2} + (2T² - A²) x_j + T² x_{j+2} = T y_{j-1} - A y_j + T y_{j+1} .

Thus if j is even, the new system of equations involves x_j's with even indices.
Similar equations hold for x_2 and x_{m-2}. The process of reducing the equations
in this fashion is known as cyclic reduction. Then Eq. (2.1) may be written as
the following equivalent system:

    | (2T²-A²)    T²                      | | x_2     |   | T y_1 - A y_2 + T y_3         |
    |    T²    (2T²-A²)    T²             | | x_4     |   | T y_3 - A y_4 + T y_5         |
    |             .         .        .    | |  .      | = |              .                |   (3.1)
    |             T²    (2T²-A²)     T²   | |  .      |   |              .                |
    |                       T²   (2T²-A²) | | x_{m-2} |   | T y_{m-3} - A y_{m-2} + T y_{m-1} |

and

    A x_j = y_j - T ( x_{j-1} + x_{j+1} ) ,   j = 1, 3, 5, ..., m-1 ,   (3.2)

with the convention x_0 = x_m = 0. Since m = 2^(k+1) and the new system of
Eq. (3.1) involves x_j's with even indices, the block dimension of the new system
of equations is 2^k - 1. Note that once Eq. (3.1) is solved, it is easy to solve
for the x_j's with odd indices as evidenced by Eq. (3.2). We shall refer to the
system of Eq. (3.2) as the eliminated equations.

Also, note that Algorithm 1 may be applied to System (3.1). Since A and T
commute, the matrix (2T² - A²) has the same set of eigenvectors as A and T. Also,
if λ_i(A) = λ_i and λ_i(T) = ω_i for i = 1, 2, ..., p, then

    λ_i(2T² - A²) = 2ω_i² - λ_i² .

Hockney [8] has advocated this procedure.

Since System (3.1) is block tridiagonal and of the form of Eq. (2.2), we can
apply the reduction repeatedly until we have one block. However, as noted above,
we can stop the process after any step and use the method of Section 2 to solve
the resulting equations.
To define the procedure recursively, let

    A^(0) = A ,   T^(0) = T ;   y_j^(0) = y_j   ( j = 1, 2, ..., m-1 ) .   (3.3)

Then for r = 0, 1, ..., k-1,

    A^(r+1) = 2 (T^(r))² - (A^(r))² ,
    T^(r+1) = (T^(r))² ,                                                   (3.4)
    y_j^(r+1) = T^(r) ( y_{j-2^r}^(r) + y_{j+2^r}^(r) ) - A^(r) y_j^(r) .

The eliminated equations at each stage are the solution of the block diagonal
system

    A^(r-1) x_{j·2^r - 2^(r-1)}
        = y_{j·2^r - 2^(r-1)}^(r-1) - T^(r-1) ( x_{(j-1)·2^r} + x_{j·2^r} ) ,   (3.5)

    j = 1, 2, ..., 2^(k+1-r) ,

with the convention x_0 = x_{2^(k+1)} = 0. After all of the k steps, we must
solve the system of equations

    A^(k) x_{2^k} = y_{2^k}^(k) .                                          (3.6)

In either case, we must solve Eq. (3.5) to find the eliminated unknowns, just as
in Eq. (3.2). If it is done by direct solution, an ill-conditioned system may
arise. Furthermore, A = A^(0) is tridiagonal, A^(1) is quindiagonal, and so on,
destroying the simple structure of the original system. Alternatively, polynomial
factorization retains the simple structure of A.

From Eq. (3.4), we note that A^(1) is a polynomial of degree 2 in A and T. By
induction, it is easy to show that A^(r) is a polynomial of degree 2^r in the
matrices A and T, so that

    A^(r) = Σ_{j=0}^{2^(r-1)} c_{2j}^(r) A^{2j} T^{2^r - 2j} ≡ p_{2^r}(A, T) .

We shall proceed to determine the linear factors of p_{2^r}(A, T). Let

    p_{2^r}(a, t) = Σ_{j=0}^{2^(r-1)} c_{2j}^(r) a^{2j} t^{2^r - 2j} .

For t ≠ 0, we make the substitution

    a/t = -2 cos θ .                                                       (3.7)

From Eq. (3.4), we note that

    p_{2^(r+1)}(a, t) = 2 t^{2^(r+1)} - ( p_{2^r}(a, t) )² .               (3.8)

It is then easy to verify, using Eqs. (3.7) and (3.8), that

    p_{2^r}(a, t) = -2 t^{2^r} cos 2^r θ ,

and, consequently,

    p_{2^r}(a, t) = - Π_{j=1}^{2^r} ( a + 2 t cos θ_j^(r) ) ,

and, hence,

    A^(r) = - Π_{j=1}^{2^r} ( A + 2 cos θ_j^(r) · T ) ,                    (3.9)

where θ_j^(r) = (2j-1)π / 2^(r+1). Thus to solve the original system it is only
necessary to solve the factored system recursively. For example, when r = 1, we
obtain

    A^(1) = 2T² - A² = ( √2 T - A )( √2 T + A ) ,

whence the simple tridiagonal systems

    ( √2 T - A ) z = w ,
    ( √2 T + A ) x = z ,

are used to solve the system

    A^(1) x = w .

We call this method the cyclic odd-even reduction and factorization (CORF)
algorithm.
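The reduction (3.3)-(3.4), the top-level solve (3.6), and the back-substitution through the eliminated equations (3.5) can be sketched as follows. This is an illustrative NumPy version, not from the original notes: for clarity it forms each A^(r) explicitly and uses dense solves, so it does not avoid the bandwidth growth (or employ the factorization (3.9)) that the text discusses.

```python
import numpy as np

def block_cyclic_reduction(A, T, Y):
    """Sketch of the cyclic reduction of Section 3 (dense blocks).

    Solves the block tridiagonal system (2.2) with diagonal block A and
    off-diagonal block T, where Y holds the q = 2**(k+1) - 1 right-hand-side
    blocks as columns.
    """
    p, q = Y.shape
    m = q + 1                                  # m = 2**(k+1)
    # pad with zero blocks x_0 = x_m = 0 for uniform indexing
    y = [np.zeros(p)] + [Y[:, j].copy() for j in range(q)] + [np.zeros(p)]
    Ar, Tr = A.copy(), T.copy()
    levels = []                                # A^(r), T^(r) for back-substitution
    step = 1                                   # step = 2**r
    while step < m // 2:                       # reduce until one block remains
        levels.append((Ar.copy(), Tr.copy()))
        for j in range(2 * step, m, 2 * step): # update even-indexed rhs, Eq. (3.4)
            y[j] = Tr @ (y[j - step] + y[j + step]) - Ar @ y[j]
        Ar, Tr = 2 * Tr @ Tr - Ar @ Ar, Tr @ Tr
        step *= 2
    x = [np.zeros(p) for _ in range(m + 1)]
    x[m // 2] = np.linalg.solve(Ar, y[m // 2])            # Eq. (3.6)
    for Ar, Tr in reversed(levels):            # eliminated equations, Eq. (3.5)
        step //= 2
        for j in range(step, m, 2 * step):     # odd multiples of 2**r
            x[j] = np.linalg.solve(Ar, y[j] - Tr @ (x[j - step] + x[j + step]))
    return np.column_stack(x[1:m])
```

Because an index j that is an odd multiple of 2^r is never modified at levels r or later, the in-place right-hand-side array still holds y_j^(r) when the back-substitution for level r needs it.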
4. Applications
Example I. Poisson's Equation with Dirichlet Boundary Conditions.

It is instructive to apply the results of Section 3 to the solution of the
finite-difference approximation to Poisson's equation on a rectangle, R, with
specified boundary values. Consider the equation

    u_xx + u_yy = f(x, y)   for (x, y) ∈ R ,                        (4.1)
    u(x, y) = g(x, y)       for (x, y) ∈ ∂R .

(Here ∂R indicates the boundary of R.) We assume that the reader is familiar
with the general technique of imposing a mesh of discrete points onto R and
approximating Eq. (4.1). The equation u_xx + u_yy = f(x, y) is approximated at
(x_i, y_j) by

    ( v_{i-1,j} - 2 v_{i,j} + v_{i+1,j} ) / (Δx)²
        + ( v_{i,j-1} - 2 v_{i,j} + v_{i,j+1} ) / (Δy)² = f_{i,j}
                                  ( 1 ≤ i ≤ n-1 ,  1 ≤ j ≤ m-1 ) ,

with appropriate values taken on the boundary,

    v_{0,j} = g_{0,j} ,   v_{n,j} = g_{n,j}   ( 1 ≤ j ≤ m-1 ) ,

and

    v_{i,0} = g_{i,0} ,   v_{i,m} = g_{i,m}   ( 1 ≤ i ≤ n-1 ) .

Then v_{i,j} is an approximation to u(x_i, y_j), and f_{i,j} = f(x_i, y_j),
g_{i,j} = g(x_i, y_j). Hereafter, we assume that

    m = 2^(k+1) .

When u(x, y) is specified on the boundary, we have the Dirichlet boundary
condition. For simplicity, we shall assume hereafter that Δx = Δy. Then

        | -4   1         |
        |  1  -4   1     |
    A = |      .   .   . |    and    T = I_{n-1} ,
        |      1  -4   1 |
        |          1  -4 |

both of order (n-1), where I_{n-1} indicates the identity matrix of order (n-1).
A and T are symmetric and commute, and thus the results of Sections 2 and 3 are
applicable. In addition, since A is tridiagonal, the use of the factorization
(3.9) is greatly simplified.

The nine-point difference formula for the same Poisson equation can be treated
similarly; when Δx = Δy it leads to

        | -20   4         |               | 4  1        |
        |   4 -20   4     |               | 1  4  1     |
    A = |       .   .   . |    and    T = |    .  .  .  | ,
        |   4 -20   4     |               |    1  4  1  |
        |       4 -20     |               |       1  4  |

both of order (n-1).
Example II

The method can also be used for Poisson's equation in rectangular regions
under natural boundary conditions, provided one uses the central difference
approximation

    ∂u/∂x ≈ ( u(x + h, y) - u(x - h, y) ) / 2h ,

and similarly for ∂u/∂y, at the boundaries.

Example III

Poisson's equation in a rectangle with doubly periodic boundary conditions is
an additional example where the algorithm can be applied.

Example IV

The method can be extended successfully to three dimensions for Poisson's
equation.

For all the above examples the eigensystems are known, and the fast Fourier
transform can be applied.

Example V

An equation of the form

    ( K(x) u_x )_x + ( K(y) u_y )_y + u(x, y) = q(x, y)

on a rectangular region can be solved by the CORF algorithm provided the
eigensystem is calculated, since it is not generally known.

The counterparts in cylindrical polar co-ordinates can also be solved using
CORF on the rectangle in the appropriate co-ordinates.
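For Example I, the blocks and their eigensystem can be written down directly; the closed-form eigenvalues of the symmetric tridiagonal Toeplitz block A are exactly the special property that makes the FFT variants possible. The following small check (illustrative, not from the original notes) verifies the standard formula λ_r = -4 + 2 cos(rπ/n) numerically.

```python
import numpy as np

def poisson_blocks(n):
    """Blocks of the five-point discretization of Example I (Dirichlet, dx = dy).

    Returns the (n-1) x (n-1) matrices A = tridiag(1, -4, 1) and T = I.
    """
    A = (np.diag(np.full(n - 1, -4.0))
         + np.diag(np.ones(n - 2), 1)
         + np.diag(np.ones(n - 2), -1))
    return A, np.eye(n - 1)

# The eigensystem of A is known in closed form:
#   lambda_r = -4 + 2*cos(r*pi/n),  eigenvector components ~ sin(r*s*pi/n).
n = 10
A, T = poisson_blocks(n)
lam = np.sort(np.linalg.eigvalsh(A))
lam_closed = np.sort(-4.0 + 2.0 * np.cos(np.arange(1, n) * np.pi / n))
assert np.allclose(lam, lam_closed)
```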
5. The Buneman Algorithm and Variants

In this section, we shall describe in detail the Buneman algorithm [2] and a
variation of it. The difference between the Buneman algorithm and the CORF
algorithm lies in the way the right hand side is calculated at each stage of the
reduction. Henceforth, we shall assume that in the system of Eqs. (2.5), T = I_p,
the identity matrix of order p.

Again consider the system of equations as given by Eqs. (2.5) with
q = 2^(k+1) - 1. After one stage of cyclic reduction, we have

    x_{j-2} + ( 2I_p - A² ) x_j + x_{j+2} = y_{j-1} + y_{j+1} - A y_j   (5.1)

for j = 2, 4, ..., q-1, with x_0 = x_{q+1} = 0, the null vector. Note that the
right hand side of Eq. (5.1) may be written as

    y_j^(1) = y_{j-1} + y_{j+1} - A y_j
            = A^(1) A^{-1} y_j + y_{j-1} + y_{j+1} - 2 A^{-1} y_j ,     (5.2)

where A^(1) = 2I_p - A². Let us define

    p_j^(1) = A^{-1} y_j ;   q_j^(1) = y_{j-1} + y_{j+1} - 2 p_j^(1) .

(These are easily calculated since A is a tridiagonal matrix.) Then

    y_j^(1) = A^(1) p_j^(1) + q_j^(1) .                                 (5.3)

After r reductions, we have by Eq. (3.4)

    y_j^(r+1) = y_{j-2^r}^(r) + y_{j+2^r}^(r) - A^(r) y_j^(r) .         (5.4)

Let us write

    y_j^(r) = A^(r) p_j^(r) + q_j^(r)                                   (5.5)

in a fashion similar to Eq. (5.3). Substituting Eq. (5.5) into Eq. (5.4) and
making use of the identity (A^(r))² = 2I_p - A^(r+1) from Eq. (3.4), we have the
following relationships:

    p_j^(r+1) = p_j^(r) - (A^(r))^{-1} ( p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ) ,   (5.6a)
    q_j^(r+1) = q_{j-2^r}^(r) + q_{j+2^r}^(r) - 2 p_j^(r+1) ,                          (5.6b)

for j = i·2^(r+1) ( i = 1, 2, ..., 2^(k-r) - 1 ) with

    p_0^(r) = p_{2^(k+1)}^(r) = q_0^(r) = q_{2^(k+1)}^(r) = 0 .

Because the number of vectors q_j^(r) is reduced by a factor of two for each
successive r, the computer storage requirement becomes equal to almost twice the
number of data points.

To compute (A^(r))^{-1} ( p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ) in
Eq. (5.6a), we solve the system of equations

    A^(r) ( p_j^(r) - p_j^(r+1) ) = p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ,

where A^(r) is given by the factorization Eq. (3.9); namely,

    A^(r) = - Π_{j=1}^{2^r} ( A + 2 cos θ_j^(r) · I_p ) ,
    θ_j^(r) = (2j-1)π / 2^(r+1) .

After k reductions, one has the equation

    A^(k) x_{2^k} = y_{2^k}^(k) = A^(k) p_{2^k}^(k) + q_{2^k}^(k) ,

and hence we solve the system

    x_{2^k} = p_{2^k}^(k) + (A^(k))^{-1} q_{2^k}^(k) .

Again one uses the factorization of A^(k) for computing (A^(k))^{-1} q_{2^k}^(k).
To back-solve, we use the relationship

    x_{j-2^r} + A^(r) x_j + x_{j+2^r} = A^(r) p_j^(r) + q_j^(r)

for j = i·2^r ( i = 1, 2, ..., 2^(k+1-r) - 1 ) with x_0 = x_{2^(k+1)} = 0. For
j = 2^r, 3·2^r, ..., 2^(k+1) - 2^r, we solve the system of equations

    A^(r) ( x_j - p_j^(r) ) = q_j^(r) - ( x_{j-2^r} + x_{j+2^r} ) ,     (5.7)

using the factorization of A^(r); hence

    x_j = p_j^(r) + ( x_j - p_j^(r) ) .                                 (5.8)

Thus, to summarise, the Buneman algorithm proceeds as follows:

1. Compute the sequence { p_j^(r), q_j^(r) } by Eq. (5.6) for r = 1, ..., k, with
   p_j^(0) = 0 for j = 0, ..., 2^(k+1), and q_j^(0) = y_j for j = 1, 2, ..., 2^(k+1) - 1.

2. Back-solve for x_j using Eqs. (5.7) and (5.8).

The use of the p_j^(r) and q_j^(r) produces a stable algorithm. Numerical experi-
ments by the author and his colleagues have shown that computationally the Buneman
algorithm requires approximately 30% less time than the fast Fourier transform
method of Hockney.
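The two phases above can be sketched as follows. This NumPy version is illustrative only (not from the original notes): it forms each A^(r) explicitly and uses dense solves instead of the factorization (3.9), and it stores all the p_j^(r), q_j^(r) in place.

```python
import numpy as np

def buneman_solve(A, Y):
    """Sketch of the Buneman algorithm of Section 5, with T = I_p.

    Solves the block tridiagonal system with diagonal block A and identity
    off-diagonal blocks; Y holds the q = 2**(k+1) - 1 right-hand-side blocks
    as columns.
    """
    p_dim, q = Y.shape
    m = q + 1                                     # m = 2**(k+1)
    I = np.eye(p_dim)
    # p_j^(0) = 0 and q_j^(0) = y_j, padded so that index 0 and m are null
    P = [np.zeros(p_dim) for _ in range(m + 1)]
    Q = [np.zeros(p_dim)] + [Y[:, j].copy() for j in range(q)] + [np.zeros(p_dim)]
    Ar = A.copy()
    levels = []
    step = 1                                      # step = 2**r
    while step < m // 2:                          # reduction phase, Eq. (5.6)
        levels.append(Ar.copy())
        for j in range(2 * step, m, 2 * step):
            P[j] = P[j] - np.linalg.solve(Ar, P[j - step] + P[j + step] - Q[j])
            Q[j] = Q[j - step] + Q[j + step] - 2 * P[j]
        Ar = 2 * I - Ar @ Ar                      # A^(r+1) = 2I - (A^(r))^2
        step *= 2
    x = [np.zeros(p_dim) for _ in range(m + 1)]
    x[m // 2] = P[m // 2] + np.linalg.solve(Ar, Q[m // 2])   # top-level solve
    for Ar in reversed(levels):                   # back-substitution, (5.7)-(5.8)
        step //= 2
        for j in range(step, m, 2 * step):
            x[j] = P[j] + np.linalg.solve(Ar, Q[j] - x[j - step] - x[j + step])
    return np.column_stack(x[1:m])
```

Note how the right-hand-side vectors y_j^(r) never appear explicitly: only the pairs (p_j^(r), q_j^(r)) of (5.5) are carried, which is precisely the source of the algorithm's stability.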
6. Accuracy of the Buneman Algorithms

As was shown in Section 5, the Buneman algorithms consist of generating the
sequence of vectors { p_j^(r), q_j^(r) }. Using Eqs. (5.6a) and (5.6b), one may
write

    p_j^(r) = x_j + S^(r) w_j^(r) ,                                  (6.1a)
    q_j^(r) = x_{j-2^r} + x_{j+2^r} - S^(r) A^(r) w_j^(r) ,          (6.1b)

where

    w_j^(r) = x_{j-2^(r-1)} + x_{j+2^(r-1)}                          (6.2)

and

    S^(r) = ( A^(r-1) ··· A^(0) )^{-1} .                             (6.3)

Then

    ‖ p_j^(r) - x_j ‖_2 ≤ ‖ S^(r) ‖_2 ‖x‖'                           (6.4)

and

    ‖ q_j^(r) - ( x_{j-2^r} + x_{j+2^r} ) ‖_2 ≤ ‖ S^(r) A^(r) ‖_2 ‖x‖' ,   (6.5)

where ‖v‖_2 indicates the Euclidean norm of a vector v, ‖C‖_2 indicates the
spectral norm of a matrix C, and

    ‖x‖' = Σ_{j=1}^{q} ‖ x_j ‖_2 .

Thus for A = A^T,

    ‖ S^(r) ‖_2 ≤ Π_{j=0}^{r-1} ‖ (A^(j))^{-1} ‖_2 ,

and since the A^(j) are polynomials of degree 2^j in A, we have

    ‖ S^(r) ‖_2 ≤ Π_{j=0}^{r-1} max_{λ_i} | p_{2^j}(λ_i) |^{-1} ,    (6.6)

where the p_{2^j}(λ_i) are polynomials in the λ_i, the eigenvalues of A.

For Poisson's equation it may be shown that

    ‖ S^(r) ‖_2 < e^{-c σ_r} ,

where σ_r = 2^(r-1) and c > 0. Thus ‖S^(r)‖_2 → 0, and hence

    ‖ p_j^(r) - x_j ‖_2 → 0 .

That is, p_j^(r) tends to the exact solution with increasing r. Since it can be
shown that ‖ q_j^(r) ‖_2 remains bounded throughout the calculation, the Buneman
algorithm leads to numerically stable results.
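The decay of ‖S^(r)‖_2 can be checked numerically. Since every A^(r) is a polynomial in A, the product A^(r-1)···A^(0) is diagonalized by the eigenvectors of A, so its smallest magnitude is obtained from the scalar recurrence b ← 2 - b² applied to each eigenvalue; this avoids ever inverting the (extremely ill-conditioned) product matrix. The snippet below is an illustration for the Poisson block, not part of the original notes.

```python
import numpy as np

# ||S^(r)||_2 = max_i 1 / |prod_{j<r} p_{2^j}(lambda_i)| for A = tridiag(1,-4,1).
# The norms decay roughly like exp(-c 2^(r-1)), which is why the Buneman
# quantities p_j^(r) approach the exact solution blocks.
n = 9
lam = -4.0 + 2.0 * np.cos(np.arange(1, n + 1) * np.pi / (n + 1))  # eigenvalues of A
b = lam.copy()            # b holds the eigenvalues of A^(r)
prod = np.ones(n)
norms = []
for r in range(1, 7):
    prod *= b             # eigenvalues of A^(r-1) ... A^(0)
    norms.append(np.max(1.0 / np.abs(prod)))
    b = 2.0 - b * b       # eigenvalue recurrence for A^(r+1) = 2I - (A^(r))^2
assert all(later < earlier for earlier, later in zip(norms, norms[1:]))
assert norms[-1] < 1e-6
```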
7. Non-Rectangular Regions

In many situations, one wishes to solve an elliptic equation over a region

[Figure: a region R formed by the union of two overlapping rectangles R_1 and R_2]

where there are n_1 data points in R_1, n_2 data points in R_2, and n_0 data
points in R_1 ∩ R_2. We shall assume that Dirichlet boundary conditions are
given. When Δx is the same throughout the region, one has a matrix equation of
the form

    |  G    E  | | x^(1) |   | y^(1) |
    |          | |       | = |       | ,                             (7.1)
    | E^T   H  | | x^(2) |   | y^(2) |

where

        | A  T          |                | B  S          |
        | T  A  T       |                | S  B  S       |
    G = |    .  .  .    |    and    H =  |    .  .  .    |           (7.2)
        |       T  A  T |                |       S  B  S |
        |          T  A |                |          S  B |

are of orders n_1 and n_2, respectively, and the only nonzero entries of the
coupling block E form an n_0 × n_0 matrix P.

Also, we write the partitioned vectors

    x^(1) = ( x_1^(1), x_2^(1), ..., x_r^(1) )^T ,
    x^(2) = ( x_1^(2), x_2^(2), ..., x_s^(2) )^T .                   (7.3)

We assume again that AT = TA and BS = SB.

From Eq. (7.1), we see that

    x^(1) = G^{-1} y^(1) - G^{-1} E x^(2)                            (7.4)

and

    x^(2) = H^{-1} y^(2) - H^{-1} E^T x^(1) .                        (7.5)

Now let us write

    G z^(1) = y^(1) ,   H z^(2) = y^(2) ,                            (7.6)

and

    G W^(1) = E ,   H W^(2) = E^T .                                  (7.7)

Then, as we partition the vectors z^(1), z^(2) and the matrices W^(1), W^(2) as
in Eq. (7.3), Eqs. (7.4) and (7.5) become

    x_j^(1) = z_j^(1) - W_j^(1) x_1^(2)   ( j = 1, 2, ..., r ) ,
                                                                     (7.8)
    x_j^(2) = z_j^(2) - W_j^(2) x_r^(1)   ( j = 1, 2, ..., s ) .

From Eq. (7.8), we have

    |  I       W_r^(1) | | x_r^(1) |   | z_r^(1) |
    |                  | |         | = |         | .                 (7.9)
    | W_1^(2)     I    | | x_1^(2) |   | z_1^(2) |

It can be noted that W^(1) and W^(2) are dependent only on the given region, and
hence the algorithm becomes useful if many problems on the same region are to be
considered.

Thus, the algorithm proceeds as follows.

1. Solve for z^(1) and z^(2) using the methods of Section 2 or 3.

2. Solve for W^(1) and W^(2) using the methods of Section 2 or 3.

3. Solve Eq. (7.9) using Gaussian elimination. Save the LU decomposition of
   Eq. (7.9).

4. Solve for the unknown components of x^(1) and x^(2) using Eq. (7.8).
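The partitioned solve (7.4)-(7.9) can be sketched as follows. This NumPy stand-in is illustrative only: plain dense solves replace the fast solvers of Sections 2-3 that would be used for G and H in practice, and for simplicity the coupled system is solved in full rather than restricted to the n_0 interface unknowns.

```python
import numpy as np

def coupled_region_solve(G, H, E, y1, y2):
    """Sketch of the partitioned solve of Section 7 for the block system (7.1):

        [ G   E ] [x1]   [y1]
        [ E^T H ] [x2] = [y2]
    """
    z1 = np.linalg.solve(G, y1)          # Eq. (7.6)
    z2 = np.linalg.solve(H, y2)
    W1 = np.linalg.solve(G, E)           # Eq. (7.7): region-dependent only,
    W2 = np.linalg.solve(H, E.T)         # reusable for many right-hand sides
    n1, n2 = G.shape[0], H.shape[0]
    # Eqs. (7.8)-(7.9): x1 = z1 - W1 x2 and x2 = z2 - W2 x1, i.e. a linear
    # system in (x1, x2) with identity diagonal blocks.
    K = np.block([[np.eye(n1), W1], [W2, np.eye(n2)]])
    sol = np.linalg.solve(K, np.concatenate([z1, z2]))   # Gaussian elimination
    return sol[:n1], sol[n1:]
```

Since W1 and W2 depend only on the region, they (and the factorization of the coupled system) can be saved and reused when many problems are posed on the same region, exactly as the text notes.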
8. Conclusion
Numerous applications require the repeated solution of a Poisson equation.
The operation counts given by Dorr [5] indicate that the methods we have discussed
should offer significant economies over older techniques, and this has been veri-
fied in practice by many users. Computational experiments comparing the Buneman
algorithm, the MD (matrix decomposition) algorithm, the Peaceman-Rachford
alternating direction algorithm, and the point successive over-relaxation
algorithm are given by Buzbee et al. [3]. We conclude that the method of matrix
decomposition, the Buneman algorithm, and Hockney's algorithm (when used with
care) are valuable methods.

This paper has benefited greatly from the comments of Dr. F. Dorr,
Mr. J. Alan George, Dr. R. Hockney and Professor O. Widlund.
9. References

1. Richard Bellman, Introduction to Matrix Analysis, McGraw-Hill, New York, 1960.

2. Oscar Buneman, Stanford University Institute for Plasma Research, Report
   No. 294, 1969.

3. B.L. Buzbee, G.H. Golub and C.W. Nielson, "The Method of Odd/Even Reduction
   and Factorization with Application to Poisson's Equation, Part II," LA-4288,
   Los Alamos Scientific Laboratory. (To appear in SIAM J. Numer. Anal.)

4. J.W. Cooley and J.W. Tukey, "An algorithm for the machine calculation of
   complex Fourier series," Math. Comp., Vol. 19, No. 90 (1965), pp. 297-301.

5. F.W. Dorr, "The direct solution of the discrete Poisson equation on a
   rectangle," to appear in SIAM Review.

6. J.A. George, "An Embedding Approach to the Solution of Poisson's Equation on
   an Arbitrary Bounded Region," to appear as a Stanford Report.

7. G.H. Golub, R. Underwood and J. Wilkinson, "Solution of Ax = λBx when B is
   positive definite," (to be published).

8. R.W. Hockney, "A fast direct solution of Poisson's equation using Fourier
   analysis," J. ACM, Vol. 12, No. 1 (1965), pp. 95-113.

9. R.W. Hockney, in Methods in Computational Physics (B. Alder, S. Fernbach and
   M. Rotenberg, Eds.), Vol. 9, Academic Press, New York and London, 1969.

10. R.E. Lynch, J.R. Rice and D.H. Thomas, "Direct solution of partial difference
    equations by tensor product methods," Numer. Math., Vol. 6 (1964), pp. 185-199.

11. R.S. Varga, Matrix Iterative Analysis, Prentice-Hall, New York, 1962.
Matrix Methods in Mathematical Programming
GENE GOLUB
Stanford University
I. Introduction
With the advent of modern computers, there has been a great development in
matrix algorithms. A major contributor to this advance is J. H. Wilkinson [30].
Simultaneously, a considerable growth has occurred in the field of mathematical
programming. However, in this field, until recently, very little analysis has been
carried out for the matrix algorithms involved.
In the following lectures, matrix algorithms will be developed which can be
efficiently applied in certain areas of mathematical programming and which give
rise to stable processes.
We consider problems of the following types:

    maximize φ(x) ,   where x = ( x_1, x_2, ..., x_n )^T ,
    subject to A x = b ,
               G x ≥ h ,

where the objective function φ(x) is linear or quadratic.
2. Linear Programming
The linear programming problem can be posed as follows:

    maximize φ(x) = c^T x
    subject to A x = b ,     (2.1)
               x ≥ 0 .       (2.2)

We assume that A is an m × n matrix, with m < n, which satisfies the Haar
condition (that is, every m × m submatrix of A is non-singular). The vector x is
said to be feasible if it satisfies the constraints (2.1) and (2.2).

Let I = { i_1, i_2, ..., i_m } be a set of m indices such that, on setting
x_j = 0 for j ∉ I, we can solve the remaining m equations in (2.1) and obtain a
solution such that

    x_{i_j} > 0 ,   j = 1, 2, ..., m .

This vector x is said to be a basic feasible solution. It is well known that
the vector x which maximizes φ(x) = c^T x is a basic feasible solution, and this
suggests a possible algorithm for obtaining the optimum solution, namely, examine
all possible basic feasible solutions.
Such a process is generally inefficient. A more systematic procedure, due to
Dantzig, is the Simplex Algorithm. In this algorithm, a series of basic feasible
solutions is generated by changing one variable at a time in such a way that the
value of the objective function is increased at each step. There seems to be no
way of determining the rate of convergence of the simplex method; however, it
works well in practice.

The steps involved may be given as follows:

(i) Assume that we can determine a set of m indices I = { i_1, i_2, ..., i_m }
such that the corresponding x_{i_j} are the non-zero variables in a basic
feasible solution. Define the basis matrix

    B = [ a_{i_1}, a_{i_2}, ..., a_{i_m} ] ,

where the a_{i_j} are the columns of A corresponding to the basic variables.

(ii) Solve the system of equations

    B x̂ = b ,   where x̂^T = [ x_{i_1}, x_{i_2}, ..., x_{i_m} ] .

(iii) Solve the system of equations

    B^T w = ĉ ,

where ĉ^T = [ c_{i_1}, c_{i_2}, ..., c_{i_m} ] holds the coefficients of the
basic variables in the objective function.

(iv) Calculate

    max_{j ∉ I} ( c_j - a_j^T w ) = c_r - a_r^T w , say.

If c_r - a_r^T w ≤ 0, then the optimum solution has been reached. Otherwise, a_r
is to be introduced into the basis.

(v) Solve the system of equations

    B t = -a_r .

If t_k ≥ 0 for k = 1, 2, ..., m, then this indicates that the optimum solution is
unbounded. Otherwise determine the component s for which

    - x_{i_s} / t_s = min_{ 1 ≤ k ≤ m , t_k < 0 } ( - x_{i_k} / t_k ) .

Eliminate the column a_{i_s} from the basis matrix and introduce column a_r.
This process is continued from step (ii) until an optimum solution is obtained
(or shown to be unbounded).

We have defined the complete algorithm explicitly, provided a termination rule,
and indicated how to detect an unbounded solution. We now show how the simplex
algorithm can be implemented in a stable numerical fashion.
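Steps (i)-(v) can be sketched as follows. This is an illustrative NumPy version, not from the original notes: it assumes a starting basic feasible basis is known and that the problem is non-degenerate, and it performs a fresh dense solve of each system rather than updating triangular factors of B (the updating is precisely the subject of the next section).

```python
import numpy as np

def simplex(A, b, c, basis, max_iter=100):
    """Sketch of the simplex steps (i)-(v): maximize c^T x s.t. A x = b, x >= 0.

    `basis` is a list of m column indices of a basic feasible solution.
    Returns (x, basis) at optimality; raises ValueError if unbounded.
    """
    m, n = A.shape
    for _ in range(max_iter):
        B = A[:, basis]                        # (i)  basis matrix
        xb = np.linalg.solve(B, b)             # (ii) basic variables
        w = np.linalg.solve(B.T, c[basis])     # (iii) simplex multipliers
        nonbasic = [j for j in range(n) if j not in basis]
        reduced = [c[j] - A[:, j] @ w for j in nonbasic]   # (iv) c_j - a_j^T w
        if max(reduced) <= 1e-12:              # optimum reached
            x = np.zeros(n)
            x[basis] = xb
            return x, basis
        r = nonbasic[int(np.argmax(reduced))]
        t = np.linalg.solve(B, -A[:, r])       # (v) direction
        if np.all(t >= -1e-12):
            raise ValueError("objective is unbounded")
        # ratio test: -x_{i_s}/t_s = min over t_k < 0 of -x_{i_k}/t_k
        ratios = [(-xb[k] / t[k], k) for k in range(m) if t[k] < -1e-12]
        _, s = min(ratios)
        basis[s] = r                           # exchange a_{i_s} for a_r
    raise RuntimeError("iteration limit reached")
```

For example, maximizing x_1 + x_2 subject to x_1 + 2x_2 ≤ 4 and 3x_1 + x_2 ≤ 6 (with slacks appended) starts from the all-slack basis and reaches the vertex (8/5, 6/5).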
3. A stable implementation of the simplex algorithm

Throughout the algorithm, there are three systems of linear equations to be
solved at each iteration. These are:

    B x̂ = b ,
    B^T w = ĉ ,
    B t = -a_r .

Assuming Gaussian elimination is used, this requires about m³/3 multiplica-
tions for each system. However, if it is assumed that the triangular factors of B
are available, then only O(m²) multiplications are needed. An important considera-
tion is that only one column of B is changed in one iteration, and it seems
reasonable to assume that the number of multiplications can be reduced if use is
made of this. We would hope to reduce the m³/3 multiplications to O(m²)
multiplications per step. This is the basis of the classical simplex method. The
disadvantage of this method is that the pivoting strategy which is generally used
does not take numerical stability into consideration. We now show that it is
possible to implement the simplex algorithm in a more stable manner, the cost
being that more storage is required.
Consider methods for the solution of a set of linear equations. It is well-
known that there exists a permutation matrix n such that
HB = LU
where L is a lower triangular matrix, and U is an upper triangular matrix.
If Gaussian elimination with partial (row) pivoting is used, then we proceed
as follows :
Choose a permutation matrix H, such that the maximum modulus element of the
25
first column of B becomes the (I, 1) - element of 1"] 1 B.
Define an elementary lower triangular matrix F k as
k ~ | -
r k = I ' ! - ! f
" i |
". ~ I
'LL I ' l , I ' | " ~ , J ".
Now Γ_1 can be chosen so that

Γ_1 Π_1 B

has all elements below the diagonal in the first column equal to zero.
Now choose Π_2 so that

Π_2 Γ_1 Π_1 B

has the maximum modulus element of the second column in position (2, 2), and
choose Γ_2 so that

Γ_2 Π_2 Γ_1 Π_1 B

has all elements below the diagonal in the second column equal to zero. This
can be done without affecting the zeros already computed in the first column.
Continuing in this way we obtain

Γ_{m-1} Π_{m-1} ... Γ_2 Π_2 Γ_1 Π_1 B = U ,
where U is an upper triangular matrix.
Note that permuting the rows of the matrix B merely implies a re-ordering of
the right-hand-side elements. Thus, no actual permutation need be performed;
merely a record is kept. Further, any product of elementary lower triangular matrices
is itself a lower triangular matrix, as may easily be shown. Thus on the left-hand side
we have essentially a lower triangular matrix, and hence the required factorization.
The relevant elements of the successive matrices Γ_k can be stored in the
lower triangle of B, in the space where zeros have been introduced. Thus the
method is economical in storage.
To return to the linear programming problem, we require to solve a system of
equations of the form

B^(i) x = v ,          (3.1)

where B^(i) and B^(i-1) differ in only one column (although the columns may be re-
ordered).
Consider the first iteration of the algorithm. Suppose that we have obtained
the factorization

B^(0) = L^(0) U^(0) ,

where the right-hand-side vector has been re-ordered to take account of the permuta-
tions. The solution to (3.1) with i = 0 is obtained by computing

y = (L^(0))^{-1} v

and solving the triangular system

U^(0) x = y ,

each of which requires m^2/2 + O(m) multiplications.
Suppose that the column b^(0)_{s_0} is eliminated from B^(0) and the column g^(0) is
introduced as the last column; then

B^(1) = [ b^(0)_1, ..., b^(0)_{s_0 - 1}, b^(0)_{s_0 + 1}, ..., b^(0)_m, g^(0) ] .

Therefore,

(L^(0))^{-1} B^(1) = H^(1) ,

where H^(1) is upper triangular except for nonzero subdiagonal elements in columns
s_0 through m - 1.
Such a matrix is called an upper Hessenberg matrix. Only the last column need be
computed, as all others are available from the previous step. We require to apply
a sequence of transformations to restore the upper triangular form. It is clear
that we have a particularly simple case of the LU factorization procedure as
previously described, where Γ^(1)_i is the identity matrix with a single nonzero
subdiagonal element γ^(1)_i in position (i + 1, i),
only one element requiring to be calculated. On applying a sequence of transformation
matrices and permutation matrices as before, we obtain

Γ^(1)_{m-1} Π^(1)_{m-1} ... Γ^(1)_{s_0} Π^(1)_{s_0} H^(1) = U^(1) ,

where U^(1) is upper triangular.

Note that in this case, to obtain Π^(1)_j it is only necessary to compare two
elements. Thus the storage required is very small: (m - s_0) multipliers γ^(1)_i and
(m - s_0) bits to indicate whether or not interchanges are necessary.
All elements in the computation are bounded, and so we have good numerical
accuracy throughout. The whole procedure compares favourably with standard forms,
for example the product form of the inverse, where no account of numerical accuracy
is taken. Further, this procedure requires fewer operations than the method which
uses the product form of the inverse. If we consider the steps involved, forward
and backward substitution with L^(0) and U^(i) require a total of m^2 multiplications,
and the application of the remaining transformations in (L^(i))^{-1} requires at most
i(m - 1) multiplications. (If we assume that on the average the middle column of
the basis matrix is eliminated, then this will be closer to i(m - 1)/2.) Thus
a total of m^2 + i(m - 1) multiplications is required to solve the system at each
stage, assuming an initial factorization is available. Note that if the matrix A
is sparse, then the algorithm can make use of this structure, as is done in the
method using the product form of the inverse.
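The restoration of triangular form described above can be sketched as follows. This is an illustrative implementation, not the lecturers' code: it eliminates the subdiagonal of a Hessenberg matrix H in place, interchanging the two candidate rows whenever that yields the larger pivot (the "compare two elements" rule above), and records one multiplier and one interchange bit per column.

```python
def restore_triangular(H):
    # Reduce an upper Hessenberg matrix H (in place) to upper triangular form
    # with stabilized pairwise elimination: O(m^2) multiplications in all.
    m = len(H)
    mult, swaps = [], []             # the multipliers and interchange bits
    for k in range(m - 1):
        if H[k + 1][k] == 0.0:
            mult.append(0.0)
            swaps.append(False)
            continue
        swap = abs(H[k + 1][k]) > abs(H[k][k])   # compare just two elements
        if swap:
            H[k], H[k + 1] = H[k + 1], H[k]
        g = H[k + 1][k] / H[k][k]    # single multiplier per column
        for j in range(k, m):
            H[k + 1][j] -= g * H[k][j]
        mult.append(g)
        swaps.append(swap)
    return H, mult, swaps
```

Because every multiplier satisfies |g| ≤ 1, all elements stay bounded, which is the source of the stability claimed in the text.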
4. Iterative refinement of the solution
Consider the set of equations

B x = v ,

and suppose that x̃ is a computed approximation to x. Let

x = x̃ + c .

Therefore,

B(x̃ + c) = v ,

that is,

B c = v - B x̃ .

We can now solve for c very efficiently, since the LU decomposition of B is
available. This process can be repeated until x is obtained to the required accuracy.
The algorithm can be outlined as follows:

(i) Compute r_j = v - B x_j
(ii) Solve B c_j = r_j
(iii) Compute x_{j+1} = x_j + c_j
It is necessary for r_j to be computed in double precision and then rounded to
single precision. Note that step (ii) requires O(m^2) operations, since the LU de-
composition of B is available. This procedure can be used in the following sections.
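Steps (i)-(iii) can be sketched as follows. In the text the residual is to be accumulated in higher precision than the working precision; Python floats are already double, so the sketch merely uses `math.fsum` to accumulate the residual without intermediate rounding. The function names are illustrative assumptions, and `solve` stands for a solver that applies the already-computed LU factors of B.

```python
import math

def refine(solve, B, b, x, steps=3):
    # Iterative refinement: `solve` applies B^{-1} via existing LU factors,
    # so each pass costs only O(m^2) operations.
    m = len(b)
    for _ in range(steps):
        # (i) r_j = v - B x_j, summed without intermediate rounding
        r = [math.fsum([b[i]] + [-B[i][j] * x[j] for j in range(m)])
             for i in range(m)]
        c = solve(r)                            # (ii) solve B c_j = r_j
        x = [x[i] + c[i] for i in range(m)]     # (iii) x_{j+1} = x_j + c_j
    return x
```

Each pass reduces the error by roughly the condition-dependent factor of the original solve, so a few steps usually suffice.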
5. Householder Triangularization
Householder transformations have been widely discussed in the literature. In
this section we are concerned with their use in reducing a matrix A to upper-
triangular form, and in particular we wish to show how to update the decomposition
of A when its columns are changed one by one. This will open the way to implemen-
tation of efficient and stable algorithms for solving problems involving linear
constraints.
Householder transformations are symmetric orthogonal matrices of the form
P_k = I - β_k u_k u_k^T, where u_k is a vector and β_k = 2/(u_k^T u_k). Their utility in this
context is due to the fact that for any non-zero vector a it is possible to choose
u_k in such a way that the transformed vector P_k a is zero except for its first
element. Householder [15] used this property to construct a sequence of transfor-
mations to reduce a matrix to upper-triangular form. In [29], Wilkinson describes
the process, and his error analysis shows it to be very stable.
Given any A, we can construct a sequence of transformations such that A is
reduced to upper triangular form. Premultiplying by P_0 annihilates (m - 1)
elements in the first column. Similarly, premultiplying by P_1 eliminates (m - 2)
elements in the second column, and so on. Therefore,

P_{n-1} P_{n-2} ... P_1 P_0 A = [ R ; 0 ] ,          (5.1)

where R is an upper triangular matrix and [ R ; 0 ] denotes R stacked above a
zero block. Since the product of orthogonal matrices is an orthogonal matrix, we can
write (5.1) as

Q A = [ R ; 0 ] ,     that is,     A = Q^T [ R ; 0 ] .
The above process is close to the Gram-Schmidt process in that it produces
a set of orthogonal vectors spanning the column space of A. In addition, the
Householder transformation produces a complementary set of vectors which is often
useful. Since this process has been shown to be numerically stable, it does produce
an orthogonal matrix, in contrast to the Gram-Schmidt process.
If A = (a_1, ..., a_n) is an m x n matrix of rank r, then at the k-th stage of the
triangularization (k < r) we have

A^(k) = P_{k-1} P_{k-2} ... P_0 A = [ R_k  S_k ; 0  T_k ] ,

where R_k is an upper-triangular matrix of order k.

The next step is to compute A^(k+1) = P_k A^(k), where P_k is chosen to reduce the
first column of T_k to zero except for the first component. This component becomes
the last diagonal element of R_{k+1}, and since its modulus is equal to the Euclidean
length of the first column of T_k, it should in general be maximized by a suitable
interchange of the columns of [ S_k ; T_k ]. After r steps, T_r will be effectively
zero (the length of each of its columns will be smaller than some tolerance) and
the process stops.
Hence we conclude that if rank(A) = r, then for some permutation matrix Π the
Householder decomposition (or "QR decomposition") of A is

Q A Π = P_{r-1} P_{r-2} ... P_0 A Π = [ R  S ; 0  0 ] ,

where Q = P_{r-1} P_{r-2} ... P_0 is an m x m orthogonal matrix and R is upper-triangular
and non-singular.
We are now concerned with the manner in which Q should be stored and the
means by which Q, R, S may be updated if the columns of A are changed. We will
suppose that a column a_p is deleted from A and that a column a_q is added. It will
be clear what is to be done if only one or the other takes place.

Since the Householder transformations P_k are defined by the vectors u_k, the
usual method is to store the u_k's in the area beneath R, with a few extra words of
memory being used to store the β_k's and the diagonal elements of R. The product
Q x for some vector x is then easily computed in the form P_{r-1} P_{r-2} ... P_0 x where,
for example, P_0 x = (I - β_0 u_0 u_0^T) x = x - β_0 (u_0^T x) u_0. The updating is best
accomplished as follows. The first p - 1 columns of the new R are the same as before;
the other columns p through n are simply overwritten by columns a_{p+1}, ..., a_n, a_q
and transformed by the product P_{p-1} P_{p-2} ... P_0 to obtain a new

[ S_{p-1} ; T_{p-1} ] ;

then T_{p-1} is triangularized as usual.
This method allows Q to be kept in product form always, and there is no accumula-
tion of errors. Of course, if p = 1 the complete decomposition must be re-done,
and since with m ≥ n the work is roughly proportional to (m - n/3) n^2, this can mean
a lot of work. But if p ≈ n/2 on the average, then only about 1/8 of the original
work must be repeated at each updating.
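The triangularization and the product-form storage of Q can be sketched as follows. This is illustrative pure Python, not the lecturers' code: `qr` returns R together with the pairs (u_k, β_k) that define Q = P_{n-1} ... P_0, and `apply_Q` forms Qx without ever assembling Q. Columns are assumed non-zero at each stage; the column interchanges used above for rank detection are omitted.

```python
import math

def householder_vector(a):
    # Choose u, beta so that (I - beta u u^T) a = (alpha, 0, ..., 0)^T.
    alpha = -math.copysign(math.sqrt(sum(t * t for t in a)), a[0])
    u = a[:]
    u[0] -= alpha
    beta = 2.0 / sum(t * t for t in u)
    return u, beta

def qr(A):
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]
    factors = []                         # the (k, u_k, beta_k) that define Q
    for k in range(n):
        u, beta = householder_vector([R[i][k] for i in range(k, m)])
        factors.append((k, u, beta))
        for j in range(k, n):            # apply P_k to the trailing columns
            s = sum(u[i - k] * R[i][j] for i in range(k, m))
            for i in range(k, m):
                R[i][j] -= beta * s * u[i - k]
    return R, factors

def apply_Q(factors, x):
    # Q x = P_{n-1} ... P_0 x, each P_k applied as x - beta (u^T x) u.
    y = x[:]
    for k, u, beta in factors:
        s = sum(u[i - k] * y[i] for i in range(k, len(y)))
        for i in range(k, len(y)):
            y[i] -= beta * s * u[i - k]
    return y
```

Only the u_k's and β_k's are stored, exactly the "area beneath R plus a few extra words" scheme described above.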
Assume that we have a matrix A which is to be replaced by a matrix Ā formed
from A by eliminating a column a_p and inserting a new vector g as the last column.
As in the simplex method, we can produce an updating procedure using Householder
transformations. If Ā is premultiplied by Q, the resulting matrix Q Ā has upper
Hessenberg form, as before, and can again be reduced to an upper triangular matrix
in O(m^2) multiplications.
6. Projections
In optimization problems involving linear constraints it is often necessary
to compute the projections of some vector either into or orthogonal to the space
defined by a subset of the constraints (usually the current "basis"). In this
section we show how Householder transformations may be used to compute such pro-
jections. As we have shown, it is possible to update the Householder decomposi-
tion of a matrix when the number of columns in the matrix is changed, and thus we
will have an efficient and stable means of orthogonalizing vectors with respect to
basis sets whose component vectors are changing one by one.
Let the basis set of vectors a_1, a_2, ..., a_n form the columns of an m x n
matrix A, and let S_r be the sub-space spanned by {a_i}. We shall assume that the
first r vectors are linearly independent and that rank(A) = r. In general,
m ≥ n ≥ r, although the following is true even if m < n.
Given an arbitrary vector z we wish to compute the projections

u = P z ,     v = (I - P) z

for some projection matrix P, such that

(a) z = u + v
(b) u^T v = 0
(c) u ∈ S_r (i.e., there exists x such that u = A x)
(d) v is orthogonal to S_r (i.e., A^T v = 0).
One method is to write P as A A^+, where A^+ is the n x m generalized inverse of A,
and in [7] Fletcher shows how A^+ may be updated upon changes of basis. In contrast,
the method based on Householder transformations does not deal with A^+ explicitly
but instead keeps A A^+ in factorized form and simply updates the orthogonal matrix
required to produce this form. Apart from being more stable and just as efficient,
the method has the added advantage that there are always two orthonormal sets of
vectors available, one spanning S_r and the other spanning its complement.
As already shown, we can construct an m x m orthogonal matrix Q such that

Q A = [ R  S ; 0  0 ] ,

where R is an r x r upper-triangular matrix. Let

w = Q z = [ w_1 ; w_2 ] ,          (6.1)

where w_1 has r components and w_2 has m - r components, and define

u = Q^T [ w_1 ; 0 ] ,     v = Q^T [ 0 ; w_2 ] .          (6.2)

Then it is easily verified that u, v are the required projections of z, which is to
say they satisfy the above four properties. Also, the x in (c) is readily shown
to be x = ( R^{-1} w_1 ; 0 ).
In effect, we are representing the projection matrices in the form

P = Q^T [ I_r  0 ; 0  0 ] Q          (6.3)

and

I - P = Q^T [ 0  0 ; 0  I_{m-r} ] Q ,          (6.4)

and we are computing u = P z, v = (I - P) z by means of (6.1), (6.2). The first r
columns of Q^T span S_r and the remaining m - r span its complement. Since Q and R may
be updated accurately and efficiently if they are computed using Householder
transformations, we have as claimed the means of orthogonalizing vectors with re-
spect to varying bases.
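The projection formulae (6.1), (6.2) can be sketched directly. For illustration Q is supplied here as an explicit orthogonal matrix; in practice it would be held in Householder product form as described above. The function name is an assumption.

```python
def project(Q, z, r):
    # Given an m x m orthogonal Q with Q A = [R S; 0 0] and rank r, return
    # u = P z (in S_r) and v = (I - P) z (orthogonal to S_r) via (6.1)-(6.2).
    m = len(Q)
    w = [sum(Q[i][j] * z[j] for j in range(m)) for i in range(m)]  # w = Q z
    w1 = w[:r] + [0.0] * (m - r)          # (w_1 ; 0)
    w2 = [0.0] * r + w[r:]                # (0 ; w_2)
    u = [sum(Q[j][i] * w1[j] for j in range(m)) for i in range(m)]  # Q^T (w_1; 0)
    v = [sum(Q[j][i] * w2[j] for j in range(m)) for i in range(m)]  # Q^T (0; w_2)
    return u, v
```

Properties (a)-(d) follow immediately: u + v = Q^T w = z, and u^T v = 0 because (w_1; 0) and (0; w_2) are orthogonal.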
As an example of the use of the projection (6.4), consider the problem of
finding the stationary values of x^T A x subject to x^T x = 1 and C^T x = 0, where A is a
real symmetric matrix of order n and C is an n x p matrix of rank r, with r ≤ p < n.
It is shown in [12] that if the usual Householder decomposition of C is

Q C = [ R  S ; 0  0 ] ,

then the problem is equivalent to that of finding the eigenvalues and eigenvectors
of the matrix P A, where

P = Q^T [ 0  0 ; 0  I_{n-r} ] Q

is the projection matrix of the form (6.4). Note that, although P A is not symmetric,
since P^2 = P we have

P A = P^2 A ,

and further the eigenvalues of P^2 A are equal to the eigenvalues of the symmetric
matrix P A P. The dimensionality of the problem is not reduced; some of the eigen-
values will be zero.
7. Linear least-squares problem

The least-squares problem to be considered here is:

min_x || b - A x ||_2 ,

where we assume that the rank of A is n.
Since length is invariant under an orthogonal transformation we have

|| b - A x ||_2^2 = || Q b - Q A x ||_2^2 ,

where Q A = [ R ; 0 ]. Let

Q b = [ c_1 ; c_2 ] ,

where c_1 has n components and c_2 has m - n components. Then

|| b - A x ||_2^2 = || c_1 - R x ||_2^2 + || c_2 ||_2^2 ,

and the solution to the least-squares problem is given by

x = R^{-1} c_1 .

Thus it is easy to solve the least-squares problem using orthogonal transformations.
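Once Q A = [R; 0] and Q b = (c_1, c_2) are available, only the back substitution R x = c_1 remains; the residual norm is ||c_2||. A minimal sketch (the function name is ours):

```python
def back_substitute(R, c1):
    # Solve the upper triangular system R x = c1 (R nonsingular),
    # n^2/2 + O(n) multiplications.
    n = len(c1)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (c1[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))) / R[i][i]
    return x
```

Combined with a Householder factorization of A, this solves the full-rank least-squares problem without ever forming A^T A.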
Alternatively, the least-squares problem can be solved by constructing the
normal equations

A^T A x = A^T b .

However, these are well known to be ill-conditioned.
Nevertheless the normal equations can be used in the following way.
Let the residual vector r be defined by

r = b - A x .

Then

A^T r = A^T b - A^T A x = 0 .

These equations can be written as the augmented system:
[ I  A ; A^T  0 ] [ r ; x ] = [ b ; 0 ] .          (7.1)

Multiplying out by the orthogonal transformation, with Q A = [ R ; 0 ], the system
becomes

r̃_1 + R x = c_1 ,     r̃_2 = c_2 ,     R^T r̃_1 = 0 ,

where r̃ = Q r = ( r̃_1 ; r̃_2 ) and c = Q b = ( c_1 ; c_2 ).
This system can easily be solved for x and r. The method of iterative refine-
ment may be applied to obtain a very accurate solution.
This method has been analysed by Björck [2].
8. Least-squares problem with linear constraints
Here we consider the problem

minimize || b - A x ||_2^2
subject to G x = h .

Using Lagrange multipliers z, we may incorporate the constraints into
equation (7.1) and obtain

( 0    0    G  ) ( z )   ( h )
( 0    I    A  ) ( r ) = ( b )
( G^T  A^T  0  ) ( x )   ( 0 )

The methods of the previous sections can be applied to obtain the solution of this
system of equations, without actually constructing the above matrix. The problem
simplifies and a very accurate solution may be obtained.
Now we consider the problem

minimize || b - A x ||_2^2
subject to G x ≥ h .

Such a problem might arise in the following manner. Suppose we wish to approximate
given data by the polynomial

y(t) = α t^3 + β t^2 + γ t + δ

such that y(t) is convex. This implies

y''(t) = 6 α t + 2 β ≥ 0 .

Thus, we require

6 α t_i + 2 β ≥ 0 ,

where the t_i are the data points. (This does not necessarily guarantee that the poly-
nomial will be convex throughout the interval.) Introduce slack variables w such
that

G x - w = h ,

where w ≥ 0.
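Building G and h for the convexity constraints 6αt_i + 2β ≥ 0 is mechanical; a sketch with the unknowns ordered as x = (α, β, γ, δ) (the function name is an illustrative assumption):

```python
def convexity_constraints(ts):
    # One row 6*t_i*alpha + 2*beta >= 0 per data point t_i, i.e. G x >= h.
    G = [[6.0 * t, 2.0, 0.0, 0.0] for t in ts]
    h = [0.0] * len(ts)
    return G, h
```

As the text notes, enforcing convexity only at the data points does not guarantee convexity over the whole interval.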
Introducing Lagrange multipliers as before, we may write the system as:

( 0    0    G   -I ) ( z )   ( h )
( 0    I    A    0 ) ( r ) = ( b )
( G^T  A^T  0    0 ) ( x )   ( 0 )
                     ( w )

At the solution, we must have

z ≥ 0 ,   w ≥ 0 ,   z^T w = 0 .
This implies that when a Lagrange multiplier is non-zero, the corresponding
constraint holds with equality.
Conversely, corresponding to a non-zero w_i the Lagrange multiplier must be
zero. Therefore, if we knew which constraints held with equality at the solution,
we could treat the problem as a linear least-squares problem with linear equality
constraints. A technique, due to Cottle and Dantzig [5], exists for solving the
problem in this way.
Bibliography
[1] Beale, E.M.L., "Numerical Methods", in Nonlinear Programming, J. Abadie (ed.),
John Wiley, New York, 1967; pp. 133-205.
[2] Björck, Å., "Iterative Refinement of Linear Least Squares Solutions II", BIT 8
(1968), pp. 8-30.
[3] and G. H. Golub, "Iterative Refinement of Linear Least Squares
Solutions by Householder Transformations", BIT 7 (1967), pp. 322-37.
[4] and V. Pereyra, "Solution of Vandermonde Systems of Equations",
Publication 70-02, Universidad Central de Venezuela, Caracas, Venezuela, 1970.
[5] Cottle, R. W., and G. B. Dantzig, "Complementary Pivot Theory of Mathematical
Programming", Mathematics of the Decision Sciences, Part 1, G. B. Dantzig and
A. F. Veinott (eds.), American Mathematical Society (1968), pp. 115-136.
[6] Dantzig, G. B., R. P. Harvey, R. D. McKnight, and S. S. Smith, "Sparse Matrix
Techniques in Two Mathematical Programming Codes", Proceedings of the Symposium
on Sparse Matrices and Their Applications, T. J. Watson Research Publications
RAI, no. 11707, 1969.
[7] Fletcher, R., "A Technique for Orthogonalization", J. Inst. Maths. Applics. 5
(1969), pp. 162-66.
[8] Forsythe, G. E., and G. H. Golub, "On the Stationary Values of a Second-Degree
Polynomial on the Unit Sphere", J. SIAM, 13 (1965), pp. 1050-68.
[9] and C. B. Moler, Computer Solution of Linear Algebraic Systems,
Prentice-Hall, Englewood Cliffs, New Jersey, 1967.
[10] Francis, J., "The QR Transformation. A Unitary Analogue to the LR Transforma-
tion," Comput. J. 4 (1961-62), pp. 265-71.
[11] Golub, G. H., and C. Reinsch, "Singular Value Decomposition and Least Squares
Solutions", Numer. Math., 14(1970), pp. 403-20.
[12] and R. Underwood, "Stationary Values of the Ratio of Quadratic
Forms Subject to Linear Constraints", Technical Report No. CS 142, Computer
Science Department, Stanford University, 1969.
[13] Hanson, R. J., "Computing Quadratic Programming Problems: Linear Inequality
and Equality Constraints", Technical Memorandum No. 240, Jet Propulsion
Laboratory, Pasadena, California, 1970.
[14] and C. L. Lawson, "Extensions and Applications of the House-
holder Algorithm for Solving Linear Least Squares Problems", Math. Comp., 23
(1969), pp. 787-812.
[15] Householder, A.S., "Unitary Triangularization of a Nonsymmetric Matrix",
J. Assoc. Comp. Mach., 5 (1958), pp. 339-42.
[16] Lanczos, C., Linear Differential Operators, Van Nostrand, London, 1961,
Chapter 3.
[17] Leringe, Ö., and P. Wedin, "A Comparison Between Different Methods to Compute
a Vector x Which Minimizes ||Ax - b||_2 When Gx = h", Technical Report, Department
of Computer Sciences, Lund University, Sweden.
[18] Levenberg, K., "A Method for the solution of Certain Non-Linear Problems in
Least Squares", Quart. Appl. Math., 2 (1944), pp. 164-68.
[19] Marquardt, D. W., "An Algorithm for Least-Squares Estimation of Non-Linear
Parameters", J. SIAM, 11 (1963), pp. 431-41.
[20] Meyer, R. R., "Theoretical and Computational Aspects of Nonlinear Regression",
P-181 9, Shell Development Company, Emeryville, California.
[21] Penrose, R., "A Generalized Inverse for Matrices", Proceedings of the
Cambridge Philosophical Society, 51 (1955), pp. 406-13.
[22] Peters, G., and J. H. Wilkinson, "Eigenvalues of Ax = λBx with Band Symmetric
A and B", Comput. J., 12 (1969), pp. 398-404.
[23] Powell, M.J.D., "Rank One Methods for Unconstrained Optimization", T. P. 372,
Atomic Energy Research Establishment, Harwell, England, (1969).
[24] Rosen, J. B., "Gradient Projection Method for Non-linear Programming. Part
I. Linear Constraints", J. SIAM, 8 (1960), pp. 181-217.
[25] Shanno, D. C. "Parameter Selection for Modified Newton Methods for Function
Minimization", J. SIAM, Numer. Anal., Ser. B,7 (1970).
[26] Stoer, J., "On the Numerical Solution of Constrained Least Squares Problems",
(private communication), 1970.
[27] Tewarson, R. P., "The Gaussian Elimination and Sparse Systems", Proceedings
of the Symposium on Sparse Matrices and Their Applications, T. J. Watson
Research Publication RA1, no. 11707, 1969.
[28] Wilkinson, J. H., "Error Analysis of Direct Methods of Matrix Inversion",
J. Assoc. Comp. Mach., 8 (1961), pp. 281-330.
[29] "Error Analysis of Transformations Based on the Use of
Matrices of the Form I - 2ww^H", in Error in Digital Computation, Vol. II, L.
B. Rall (ed.), John Wiley and Sons, Inc., New York, 1965, pp. 77-101.
[30] The Algebraic Eigenvalue Problem, Clarendon Press, Oxford,
1965.
[31] Zoutendijk, G., Methods of Feasible Directions, Elsevier Publishing Company,
Amsterdam (1960), pp. 80-90.
Topics in Stability Theory for Partial Difference Operators
VIDAR THOMÉE
University of Gothenburg
PREFACE
The purpose of these lectures is to present a short introduction to some aspects
of the theory of difference schemes for the solution of initial value problems for
linear systems of partial differential equations. In particular, we shall discuss
various stability concepts for finite difference operators and the related question
of convergence of the solution of the discrete problem to the solution of the con-
tinuous problem. Special emphasis will be given to the strong relationship between
stability of difference schemes and correctness of initial value problems.
In practice, most important applications deal with mixed initial boundary value
problems for non-linear equations. It will not be possible in this short course to
develop the theory in such a general context. However, the results in the particular
cases we shall treat have intuitive implications for the more complicated situations.
The two most important methods in stability theory for difference operators have been
the Fourier method and the energy method. The former applies in its pure form only
to equations with constant coefficients whereas the latter is more directly appli-
cable to variable coefficients and even to non-linear situations. Often different
methods have to be combined so that for instance Fourier methods are first used to
analyse the linearized equations with coefficients fixed at some point and then the
energy method, or some other method, is applied to appraise the error committed by
treating the simplified case. We have elected in these lectures to concentrate on
Fourier techniques.
These notes were developed from material used previously by the author for a
similar course held in the summer of 1968 in a University of Michigan engineering
summer conference on numerical analysis, and also used for the author's survey paper
[36]. Some of the relevant literature is collected in the list of references. A
thorough account of the theory can be obtained by combining the book by Richtmyer
and Morton [28] with the above mentioned survey paper [36]. Both these sources
contain extensive lists of further references.
1. Introduction

Let C be the set of uniformly continuous, bounded functions of x, and let C^k
be the set of functions v with (d/dx)^j v in C for j ≤ k. For v ∈ C set

|| v || = sup_x | v(x) | .

For any v ∈ C, any k, and ε > 0 we can find ṽ ∈ C^k such that

|| v - ṽ || < ε ;

C^k is dense in C.
Consider the initial-value problem

∂u/∂t = ∂²u/∂x² ,   t ≥ 0 ,          (1)
u(x,0) = v(x) .                      (2)

If v ∈ C² this problem admits one and only one solution in C, namely

u(x,t) = (4πt)^{-1/2} ∫ exp( -(x-y)²/4t ) v(y) dy ,   t > 0 .          (3)

It is clear that the solution u depends for fixed t linearly on v; we define a
linear operator E_0(t) by

E_0(t) v = u(·,t) ,

where u is defined by (3) and where v ∈ C². The solution operator E_0(t) has the
properties

E_0(0) = I

and

|| E_0(t) v || ≤ || v || .

In particular, the inequality means that a small change in v only causes a small
change in the solution u.

Although for v ∉ C² the function u defined by (3) is not a "genuine" or
classical solution, the integral still converges if v ∈ C, and it is
natural to define a "generalized" solution operator by taking E(t)v to be the
integral in (3) for t > 0, and E(0)v = v.
The operator E(t) still has the properties

E(0) = I ,                        (4)
|| E(t) v || ≤ || v || ,          (5)

and is continuous in t for t ≥ 0. For this particular equation we actually get a
classical solution for t > 0 even if v is only in C; we have E(t) v ∈ ∩_{k=0}^∞ C^k
for t > 0.
Consider now the initial-value problem

∂u/∂t = ∂u/∂x ,   t ≥ 0 ,          (6)
u(x,0) = v(x) .                    (7)

For v ∈ C¹ this problem admits one and only one genuine solution, namely

u(x,t) = v(x + t) .

Clearly || u(·,t) || ≤ || v || (actually we have equality), and it is again natural
to define a generalized solution operator, continuous in t, by

E(t) v(x) = v(x + t) .

This has again the properties (4), (5). In this case, the solution is as irregular
for t > 0 as it is for t = 0.
Both these problems are thus "correctly posed" in C; they can be uniquely
solved for a dense subset of C and the solution operator is bounded.

We could instead of C also have considered other basic classes of functions.
Thus let L² be the set of square integrable functions with

|| v || = ( ∫ | v(x) |² dx )^{1/2} .

Consider again the initial-value problem (1), (2) and assume that u(x,t) is a classi-
cal solution and that u(x,t) tends to zero as fast as necessary when |x| → ∞ for
the following to hold. Assume for simplicity that u is real-valued. We then have

d/dt ∫ u² dx = 2 ∫ u u_t dx = 2 ∫ u u_xx dx = -2 ∫ (u_x)² dx ≤ 0 ,          (8)
so that for t ≥ 0,

|| u(·,t) || ≤ || v || .          (9)

Relative to the present framework it is also possible to define genuine and gene-
ralized solution operators; the latter is defined on the whole of L² and satisfies
(4), (5).

For the problem (6), (7) the calculation corresponding to (8) goes similarly:

d/dt ∫ u² dx = 2 ∫ u u_x dx = ∫ (u²)_x dx = 0 .
One other way of looking at this is to introduce the Fourier transform; for
integrable v, set

v̂(ξ) = ∫ e^{-iξx} v(x) dx .          (10)

Notice the Parseval relation: for v in addition in L² we have v̂ ∈ L² and

|| v̂ || = √(2π) || v || .

For the Fourier transform û(ξ,t) with respect to x of the solution u(x,t) we then
get initial-value problems for ordinary differential equations, namely

dû/dt = -ξ² û ,   û(ξ,0) = v̂(ξ)

for (1), (2), and

dû/dt = iξ û ,   û(ξ,0) = v̂(ξ)

for (6), (7). These have the solutions

û(ξ,t) = e^{-ξ²t} v̂(ξ)          (11)

and

û(ξ,t) = e^{iξt} v̂(ξ) ,         (12)

respectively, and the actual solutions can be obtained, under certain conditions,
by the inverse Fourier transform. Also, by Parseval's formula we have for both (11)
and (12),

|| u(·,t) || ≤ || v || ,

which is again (9).
For the purpose of approximate solution of the initial-value problem (1), (2),
we replace the derivatives by difference quotients:

( u(x,t+k) - u(x,t) ) / k = ( u(x+h,t) - 2u(x,t) + u(x-h,t) ) / h² ,

where h, k are small positive numbers which we shall later make tend to zero in such
a fashion that λ = k/h² is kept constant. Solving for u(x,t+k), we get

u(x,t+k) = λ u(x-h,t) + (1 - 2λ) u(x,t) + λ u(x+h,t) ≡ (E_k u(·,t))(x) .     (13)

This suggests that for the exact (generalized) solution to (1), (2),

E(t+k) v ≈ E_k E(t) v ,

or after n steps,

E(nk) v ≈ E_k^n v .

We shall prove that this is essentially correct for any v ∈ C if, but only if, λ ≤ ½.
Thus, let us first notice that if λ ≤ ½, then the coefficients of E_k are all non-
negative and add up to 1, so that (the norm is again the sup-norm)

|| E_k v || ≤ || v || ,

or generally

|| E_k^n v || ≤ || v || .

The boundedness of the powers of E_k is referred to as stability of E_k.
Assume now that v ∈ C⁴. We then know that the classical solution of (1), (2)
exists, and if u(x,t) = E(t)v = E_0(t)v, then u(·,t) ∈ C⁴ for t ≥ 0 and
|| ∂⁴u/∂x⁴(·,t) || ≤ || v⁽⁴⁾ ||. We shall prove that, if nk = t, then

|| E_k^n v - E(t) v || ≤ C t k || v⁽⁴⁾ || .

To see this, let us consider the local truncation error: by Taylor expansion about
(x,t), using u_t = u_xx and h² = k/λ,

|| E_k u(·,t) - u(·,t+k) || ≤ C k² || v⁽⁴⁾ || .

Notice now that we can write

E_k^n v - E(nk) v = Σ_{j=0}^{n-1} E_k^{n-1-j} ( E_k - E(k) ) E(jk) v .

Therefore, by the stability inequality,

|| E_k^n v - E(t) v || ≤ n C k² || v⁽⁴⁾ || = C t k || v⁽⁴⁾ || ,

which we wanted to prove.
We shall now prove that, for v not necessarily in C⁴ but only in C, we still
have, for nk = t,

|| E_k^n v - E(t) v || → 0     when k → 0 .

To see this, let ε > 0 be arbitrary, and choose ṽ ∈ C⁴ such that

|| v - ṽ || < ε / 3 .

We then have

|| E_k^n v - E(t) v || ≤ || E_k^n (v - ṽ) || + || E_k^n ṽ - E(t) ṽ || + || E(t) (ṽ - v) ||
                       ≤ 2 || v - ṽ || + C t k || ṽ⁽⁴⁾ || .

Therefore, choosing k ≤ ε ( 3 C t || ṽ⁽⁴⁾ || )^{-1}, we have || E_k^n v - E(t) v || ≤ ε,
which concludes the proof.
Consider now the case λ > ½. The middle coefficient in E_k is then negative.
Taking

v(x) = cos(πx/h) ,

we get, since cos(π(x ± h)/h) = -cos(πx/h),

(E_k v)(x) = ( (1 - 2λ) - 2λ ) v(x) = (1 - 4λ) v(x) ,

so that the effect of E_k is multiplication by (1 - 4λ). We generally get

E_k^n v = (1 - 4λ)^n v .

Since λ > ½ we have 1 - 4λ < -1, and it follows that it is not possible to have
an inequality of the form

|| E_k^n v || ≤ C || v || .

This can also be interpreted to mean that small errors in the initial data are blown
up to an extent where they overshadow the real solution. This phenomenon is called
instability.
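The stability condition can be observed numerically. The sketch below applies (13) on a periodic grid (an assumption made for simplicity; the lecture works on the whole line) to the most oscillatory data u_j = (-1)^j, for which each step multiplies u by 1 - 4λ:

```python
def step(u, lam):
    # One application of E_k from (13) on a periodic grid.
    m = len(u)
    return [lam * u[(j - 1) % m] + (1 - 2 * lam) * u[j] + lam * u[(j + 1) % m]
            for j in range(m)]

def sup_norm_after(lam, steps=20, m=16):
    # Start from u_j = (-1)^j and report the sup-norm after `steps` steps.
    u = [(-1.0) ** j for j in range(m)]
    for _ in range(steps):
        u = step(u, lam)
    return max(abs(t) for t in u)
```

For λ ≤ ½ the sup-norm never exceeds its initial value 1, while for λ > ½ it grows like |1 - 4λ|^n, exactly the blow-up described above.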
Instead of the simple difference scheme (13) we could study a more general
type of operator, e.g.

(E_k v)(x) = Σ_j a_j v(x + jh) .          (14)

If we want this to be "consistent" with the equation (1) we have to demand that E_k
approximates E(k), or, if u(x,t) is a solution, then

(E_k u(·,t))(x) - u(x,t+k) = o(k) .

Taylor series development gives, for smooth u, the consistency conditions

Σ_j a_j = 1 ,     Σ_j j a_j = 0 ,     Σ_j j² a_j = 2λ .

Assuming these consistency relations to hold, and assuming that all the a_j are ≥ 0,
we get as above

|| E_k^n v || ≤ || v || ,          (15)

and the convergence analysis above can be carried over to this more general case
with few changes.
However, the reason for choosing an operator of the form (14) which is not our
old operator (13) would be to obtain higher accuracy in the approximation, and it
will then turn out that in general not all the coefficients are non-negative. We
cannot have (15) then, but we may still have

|| E_k^n v || ≤ C || v || ,     n k ≤ T ,

for some C depending on T.
When we work with the L²-norm rather than the maximum norm, Fourier transforms
are again helpful; indeed, in most of the subsequent lectures Fourier analysis will
be the foremost tool.

Thus, let v̂ be the Fourier transform of v defined by (10). We then have

(E_k v)^(ξ) = Σ_j a_j e^{ijhξ} v̂(ξ) ,

or, introducing the characteristic (trigonometric) polynomial of the operator E_k,

a(ξ) = Σ_j a_j e^{ijξ} ,

we find that the effect of E_k on the Fourier transform side is multiplication by
a(hξ). One easily finds that, similarly, the effect of E_k^n is multiplication by
a(hξ)^n. Using Parseval's relation, one then easily finds (the norm is now the L²-
norm)

|| E_k^n v || ≤ sup_ξ | a(ξ) |^n || v || ,

and that this inequality is the best possible. It follows that we have stability if
and only if | a(ξ) | ≤ 1 for all real ξ. We then actually have (15) in the L²-norm.
Consider again the special operator (13). We have in this case

a(ξ) = 1 - 2λ + 2λ cos ξ = 1 - 4λ sin²(ξ/2) ,

and a(ξ) takes all values in the interval [1 - 4λ, 1]. We therefore find that also in
L² we have stability if and only if 1 - 4λ ≥ -1, that is λ ≤ ½.

Difference approximations to the initial-value problem (6), (7) can be analysed
similarly.
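The L² criterion max_ξ |a(ξ)| ≤ 1 for (13) can be checked directly by sampling the symbol a(ξ) = 1 - 4λ sin²(ξ/2) over a period (an illustrative sketch):

```python
import math

def max_symbol(lam, samples=2001):
    # Sample |a(xi)| = |1 - 4*lam*sin(xi/2)**2| over one period [0, 2*pi].
    xs = [2.0 * math.pi * i / (samples - 1) for i in range(samples)]
    return max(abs(1.0 - 4.0 * lam * math.sin(x / 2.0) ** 2) for x in xs)
```

At λ = ½ the symbol just touches -1 at ξ = π, the borderline case; for larger λ its modulus exceeds 1 there and the powers a(ξ)^n blow up.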
We shall put the above considerations in a more general setting and discuss an
initial-value problem in a Banach space B. Thus let A be a linear operator with
domain D(A) and let v ∈ B. Consider then the problem of finding u(t) ∈ B, t ≥ 0,
such that

du/dt = A u(t) ,   t ≥ 0 ,          (16)
u(0) = v .                          (17)

More precisely, we shall say that u(t), t ≥ 0, is a genuine solution of (16), (17)
if (17) holds and

(i) u(t) ∈ D(A) for t ≥ 0 ,
(ii) || ( u(t+k) - u(t) ) / k - A u(t) || → 0 when k → 0,
uniformly for 0 ≤ t ≤ T, for any T > 0.
Let D_0 be a subspace of B such that for v ∈ D_0 the problem (16), (17) has a
unique genuine solution. Then u(t) can be seen to depend linearly on v, so that
u(t) = E_0(t)v defines a linear operator with D(E_0(t)) = D_0. We say that the problem
(16), (17) is correctly posed if D_0 can be chosen to be dense in B so that E_0(t) is
a bounded operator for any t ≥ 0, and for any T > 0 there is a C with

|| E_0(t) v || ≤ C || v || ,   0 ≤ t ≤ T .          (18)

Clearly E_0(t) then has a uniquely defined bounded linear extension E(t) with
D(E(t)) = B such that (18) still holds. We call E(t) the (generalized) solution
operator. Thus for v ∈ B, E(t)v = u(t) is a generalized solution of (16), (17),
and this solution depends continuously on v.
One can show that the solution operator has the semi-group property

E(t+s) = E(t) E(s) ,   s, t ≥ 0 ;

in the terminology of semi-group theory one can define (16), (17) to be correctly
posed if the densely defined operator A generates a strongly continuous semi-group
for t ≥ 0.
We shall now study the approximation of a solution u(t) = E(t)v of a correctly
posed initial-value problem (16), (17). We will then, for small k, k ≤ k_0, consider
an approximation E_k of E(k), where E_k is a bounded linear operator with D(E_k) = B
which depends continuously on k for 0 < k ≤ k_0. The thought is then that E_k^n v
is going to approximate E(nk)v = E(k)^n v.

We say that the operator E_k is consistent with the initial-value problem (16),
(17) if there is a set U of genuine solutions of (16), (17) such that

(i) the set { u(0) : u ∈ U } is dense in B ;
(ii) for u ∈ U, k^{-1} || E_k u(t) - u(t+k) || → 0 when k → 0, uniformly for
0 ≤ t ≤ T, for any T > 0.

If the operator E_k is consistent with (16), (17), we say that it is convergent
(in B) if for any v ∈ B, any t ≥ 0, and any pair of sequences {k_j}, {n_j}
with k_j → 0, n_j k_j → t for j → ∞, we have

|| E_{k_j}^{n_j} v - E(t) v || → 0     when j → ∞ .

We say that the operator E_k is stable (in B) if for any T > 0 there is a con-
stant C such that

|| E_k^n v || ≤ C || v || ,     n k ≤ T ,   k ≤ k_0 .
It turns out that consistency alone does not guarantee convergence; we have the
following theorem, which is referred to as Lax's equivalence theorem [22].

Theorem. Assume that (16), (17) is correctly posed and that E_k is a consistent
approximation operator. Then stability is necessary and sufficient for convergence.

The proof of the sufficiency of stability for convergence is similar to the
proof in the particular case treated above; the proof of the necessity depends on
the Banach-Steinhaus theorem.
2. Initial-value problems in L² with constant coefficients

We begin with some notation. We shall work here with the Banach space
L² = L²(R^d) with the norm

|| v || = ( ∫_{R^d} | v(x) |² dx )^{1/2} .

For a multi-index α = (α_1, ..., α_d) with α_j non-negative integers, setting
D_j = ∂/∂x_j, we then have the following notation for general derivatives of order
|α| = Σ_j α_j, namely

D^α = D_1^{α_1} ... D_d^{α_d} .
We denote by C^∞ the set of infinitely differentiable complex-valued functions in
R^d, and by C_0^∞ the subset of functions with compact support. We also introduce the
set S of u ∈ C^∞ such that for any multi-indices α, β,

sup_x | x^β D^α u(x) | < ∞ .

Clearly C_0^∞ ⊂ S ⊂ C^∞, and it is well known that C_0^∞ and S are dense in L².
For u integrable on R^d we define the Fourier transform

û(ξ) = ∫ e^{-i⟨ξ,x⟩} u(x) dx .

We recall that if u ∈ S then û ∈ S. Further, for u ∈ S we have Fourier's
inversion formula

u(x) = (2π)^{-d} ∫ e^{i⟨ξ,x⟩} û(ξ) dξ

and Parseval's relation

|| û || = (2π)^{d/2} || u || ,

and, as a consequence of the latter, the set Ĉ_0^∞ of functions in S with Fourier
transforms in C_0^∞ is dense in L².
In the sequel we shall consider N-vector valued functions u(x) = (u₁(x),…,u_N(x)). It is clearly natural to define u(x) ∈ L², S, C₀^∞, etc. by demanding that this holds for each component u_j, j = 1,…,N. Single bars will denote norms with respect to N-vectors, e.g.

    |v|² = Σ_{j=1}^N |v_j|²,

and for N×N matrices,

    |A| = sup_{v≠0} |Av| / |v|,

and double bars will indicate norms with respect to L², so that for the N-vector u(x) ∈ L²,

    ‖u‖² = ∫_{Rᵈ} |u(x)|² dx.
For later use we need the following

Lemma 1 Let D be a dense subset of L² and let A(ξ) be a continuous N×N matrix. Then

    sup { ‖A(·)v̂(·)‖ / ‖v‖ : v ∈ D, v ≠ 0 } = sup_ξ |A(ξ)|.
Let u(x,t) be an N-vector-function defined for x ∈ Rᵈ and t ≥ 0. Consider the initial-value problem

    ∂u/∂t = P(D)u = Σ_{|α|≤M} P_α D^α u,   t > 0,        (1)

    u(x,0) = v(x),        (2)

where the P_α are constant N×N matrices and where we can consider Pu to be defined for u ∈ S. Let

    P(iξ) = Σ_{|α|≤M} P_α (iξ)^α.

We have:
Theorem 1 The initial-value problem (1), (2) is correctly posed in L² if and only if, for any T ≥ 0, there is a C such that

    |e^{tP(iξ)}| ≤ C,   ξ ∈ Rᵈ, 0 ≤ t ≤ T.        (3)

Proof
Assume that (3) holds. Let v ∈ S and consider

    u(x,t) = (2π)^{−d/2} ∫_{Rᵈ} e^{tP(iξ)} v̂(ξ) e^{i⟨x,ξ⟩} dξ.        (4)

By differentiation under the integral sign we find that u(x,t) satisfies (1), and so is a solution to (1), (2). Since u(x,t) ∈ S for t ≥ 0, it is a genuine solution in the sense of Lecture 1 and is also unique. Thus E₀(t)v = u(x,t), with D = S. By Fourier's inversion formula and Parseval's theorem

    ‖E₀(t)v‖ = ‖e^{tP(i·)} v̂‖ ≤ C ‖v‖.

Since S is dense in L² it follows that the initial-value problem is correctly posed.
We now want to prove the necessity of (3) for correctness. Let now v ∈ Ĉ₀^∞ and define u(x,t) by (4). We find at once that u(x,t) satisfies the initial-value problem (1), (2), and so u(x,t) = E(t)v. Again, by Fourier's inversion formula and Parseval's theorem,

    ‖E(t)v‖ = ‖e^{tP(i·)} v̂‖ ≤ C ‖v‖,

so that by Lemma 1,

    sup_ξ |e^{tP(iξ)}| ≤ C,

which proves the necessity of (3), since Ĉ₀^∞ is dense in L².
Ex. 1 Consider the symmetric hyperbolic system

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j,   A_j hermitian.        (5)

Then the initial-value problem for (5) is correctly posed in L², for

    e^{tP(iξ)} = exp( it Σ_{j=1}^d ξ_j A_j ),

since this is a unitary matrix.
Before proceeding to the next example we state a lemma. For an arbitrary N×N matrix A with eigenvalues λ_j, j = 1,…,N, we introduce

    Λ(A) = max_j Re λ_j.

We then have

Lemma 2 If A is an N×N matrix we have for t ≥ 0

    |e^{tA}| ≤ e^{tΛ(A)} Σ_{j=0}^{N−1} (2t|A|)^j / j!.        (6)

Proof See [9].
Ex. 2 Consider the system (1) and consider also the principal part P₀ of P, which corresponds to the polynomial

    P₀(iξ) = Σ_{|α|=M} P_α (iξ)^α.

We say that the system (1) is parabolic in Petrovskii's sense if there is a δ > 0 such that

    Λ(P₀(iξ)) ≤ −δ|ξ|^M,   ξ ∈ Rᵈ.

By homogeneity this is equivalent to the existence of a δ > 0 and a C such that

    Λ(P(iξ)) ≤ −δ|ξ|^M + C,   ξ ∈ Rᵈ.

We then have that if (1) is parabolic in Petrovskii's sense, the corresponding initial-value problem is correctly posed in L². For by Lemma 2 we have, for 0 ≤ t ≤ T,

    |e^{tP(iξ)}| ≤ C e^{Ct − δt|ξ|^M} ( 1 + (t|ξ|^M)^{N−1} ),

which is clearly bounded. In particular, the heat equation

    ∂u/∂t = Δu = Σ_{j=1}^d ∂²u/∂x_j²

clearly falls into this category.
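The Petrovskii condition for the heat equation can be checked mechanically. The following is a minimal numerical illustration (not from the lectures; the grid of test frequencies is an assumption): here N = 1, P(iξ) = −|ξ|², so Λ(P(iξ)) = −|ξ|² and the symbol e^{tP(iξ)} of the solution operator never exceeds 1.

```python
# Check (illustration) that u_t = Δu is parabolic in Petrovskii's sense
# with M = 2, delta = 1, and that |e^{tP(i xi)}| stays bounded.
import numpy as np

def P_heat(xi):
    # symbol of the Laplacian in d dimensions: P(i xi) = -(xi_1^2 + ... + xi_d^2)
    return -np.sum(np.asarray(xi, dtype=float) ** 2)

xis = [np.array([x1, x2]) for x1 in np.linspace(-10, 10, 21)
                          for x2 in np.linspace(-10, 10, 21)]
delta = 1.0
petrovskii = all(P_heat(xi) <= -delta * np.dot(xi, xi) for xi in xis)
bounded = all(abs(np.exp(t * P_heat(xi))) <= 1.0 + 1e-15
              for xi in xis for t in (0.1, 1.0, 10.0))
print(petrovskii, bounded)   # True True
```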
Solutions of parabolic systems are smooth for t > 0; we have

Theorem 2 Assume that (1) is parabolic in Petrovskii's sense. Then for t > 0, D^α E(t)v ∈ L² for any α, and for any T > 0 and any α there is a C such that

    ‖D^α E(t)v‖ ≤ C t^{−|α|/M} ‖v‖,   0 < t ≤ T.

Proof Via the Fourier transform and Parseval's relation this reduces to

    |ξ^α e^{tP(iξ)}| ≤ C t^{−|α|/M},   0 < t ≤ T.

But this follows at once by (6).
Ex. 3 Consider the Schrödinger equation (N = 1)

    ∂u/∂t = iΔu.

The initial-value problem for this equation is also correctly posed in L², for

    |e^{tP(iξ)}| = |e^{−it|ξ|²}| = 1.
Ex. 4 The Cauchy-Riemann equations can be written (d = 1, N = 2)

    ∂u/∂t = A ∂u/∂x,   A = [0 −1; 1 0].

For these we shall prove the negative result that the corresponding initial-value problem is not correctly posed in L². For here

    P(iξ) = iξ [0 −1; 1 0],

whose eigenvalues are ±ξ, and a simple calculation yields

    |e^{tP(iξ)}| = e^{t|ξ|},

which is not bounded for any t > 0 when ξ → ∞.
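The exponential growth of the symbol can be seen numerically. The sketch below is an illustration (the sign convention for A is the one assumed above): it computes e^{tP(iξ)} by eigendecomposition and shows that its spectral norm grows like e^{t|ξ|}.

```python
# Illustration: the Cauchy-Riemann symbol exp(t P(i xi)) has spectral norm
# exp(t |xi|), unbounded in xi for every fixed t > 0.
import numpy as np

def expm(M):
    # matrix exponential of a diagonalizable matrix via eigendecomposition
    w, V = np.linalg.eig(M)
    return V @ np.diag(np.exp(w)) @ np.linalg.inv(V)

A = np.array([[0.0, -1.0], [1.0, 0.0]])
t = 0.5
for xi in (1.0, 5.0, 20.0):
    E = expm(t * 1j * xi * A)
    norm = np.linalg.norm(E, 2)          # spectral norm |e^{tP(i xi)}|
    print(xi, norm, np.exp(t * xi))      # norm grows like e^{t |xi|}
```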
Ex. 5 Although our theory only deals with systems which are first-order with respect to t, it is actually possible to consider also higher-order systems by reducing them to first-order systems. We shall only exemplify this in one particular case. Consider the initial-value problem (d = 1)

    ∂²w/∂t² = ∂²w/∂x²,   t > 0.        (7)

Introducing

    u = (u₁, u₂),   u₁ = ∂w/∂x,   u₂ = ∂w/∂t,        (8)

we have for u the initial-value problem

    ∂u/∂t = [0 1; 1 0] ∂u/∂x,   t > 0,        (9)

    u(x,0) = v(x). Here

    e^{tP(iξ)} = [cos tξ   i sin tξ; i sin tξ   cos tξ],

which is unitary, so that the initial-value problem (9) obtained by the transformation (8) from (7) is correctly posed in L².
In order that an initial-value problem of the type (1), (2) be correctly posed in L², it is necessary that it be correctly posed in the sense of Petrovskii; more precisely:

Theorem 3 If (1), (2) is correctly posed in L² then there is a constant C such that

    Λ(P(iξ)) ≤ C,   ξ ∈ Rᵈ.        (10)

Proof Follows at once by

    e^{tΛ(P(iξ))} ≤ |e^{tP(iξ)}| ≤ C,   0 ≤ t ≤ 1.

We shall see at once by the following example that (10) is not sufficient for correctness in L².
Ex. 6 Take the initial-value problem corresponding to (d = 1)

    ∂u/∂t = [1 1; 0 1] ∂u/∂x,   i.e.   P(iξ) = iξ [1 1; 0 1].

We then get Λ(P(iξ)) = 0, so that (10) holds. However, a simple calculation yields

    e^{tP(iξ)} = e^{itξ} [1  itξ; 0  1],

which is easily seen to be unbounded for 0 < t ≤ 1 when ξ → ∞.
Necessary and sufficient conditions for correctness have been given by Kreiss [19]. The main contents of Kreiss' result are concentrated in the following lemma. Here, for an N×N matrix A we denote by Re A the matrix

    Re A = ½(A + A*).

Also recall that for hermitian matrices A and B, A ≤ B means

    (Av, v) ≤ (Bv, v)

for all N-vectors v. We denote the resolvent of A by R(A; z),

    R(A; z) = (zI − A)⁻¹.

It will be implicitly assumed, when we write down R(A; z), that z is not an eigenvalue of A.
Lemma 3 Let F be a family of N×N matrices. Then the following four conditions are equivalent.

(i) There is a constant C such that for A ∈ F and t ≥ 0,

    |e^{tA}| ≤ C.

(ii) There is a constant C such that for A ∈ F and Re z > 0, R(A; z) exists and

    |R(A; z)| ≤ C / Re z.

(iii) For A ∈ F, Λ(A) ≤ 0, and there are two constants C₁ and C₂ and for each A a matrix S = S(A) such that

    max( |S|, |S⁻¹| ) ≤ C₁,

and such that

    B = SAS⁻¹ = (b_{jl})

is an upper triangular matrix with

    |b_{jl}| ≤ C₂ min( |Re b_{jj}|, |Re b_{ll}| ),   j < l.

(iv) There is a constant C > 0 such that for each A ∈ F there is a hermitian matrix H = H(A) with

    C⁻¹I ≤ H ≤ CI   and   Re(HA) ≤ 0.

Proof See [19].
To be able to apply this lemma to our problem we need:

Lemma 4 Assume that (1), (2) is correctly posed in L². Then there exist constants γ and C such that for t ≥ 0, ξ ∈ Rᵈ,

    |e^{t(P(iξ) − γI)}| ≤ C.

Proof Let C be such that |e^{tP(iξ)}| ≤ C for 0 ≤ t ≤ 1, and set γ = log C. For arbitrary t > 0 let [t] be its integral part. We have

    |e^{tP(iξ)}| ≤ |e^{(t−[t])P(iξ)}| · |e^{P(iξ)}|^{[t]} ≤ C^{[t]+1} ≤ C e^{γt},

which proves the lemma.
Combining Lemmas 3 and 4 we have at once:

Theorem 4 If (1), (2) is correctly posed in L² then there is a constant γ such that the family

    F = { P(iξ) − γI : ξ ∈ Rᵈ }        (11)

satisfies the conditions of Lemma 3. On the other hand, if there is a constant γ such that F satisfies at least one of the conditions of Lemma 3, then (1), (2) is correctly posed in L².
One commonly used criterion is:

Theorem 5 Let P(iξ) be a normal matrix for each ξ. Then (1), (2) is correctly posed if and only if (10) holds.

Proof By Theorem 3 we only have to prove the sufficiency. Since P(iξ) is normal we can find a unitary U(ξ) such that

    U(ξ) P(iξ) U(ξ)*

is diagonal. Hence

    |e^{tP(iξ)}| = e^{tΛ(P(iξ))} ≤ e^{tC},

which proves the result.
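The normal-matrix criterion is easy to check in a concrete case. The sketch below is an assumed example, not from the lectures: for the symmetric hyperbolic symbol P(iξ) = iξA with A hermitian, P(iξ) is normal, Λ(P(iξ)) = 0, and |e^{tP(iξ)}| = 1 for all t and ξ.

```python
# Illustration of Theorem 5 for a normal symbol (assumed example).
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])   # hermitian

def spectral_norm_exp(M):
    # |e^M| for a diagonalizable matrix M, via eigendecomposition
    w, V = np.linalg.eig(M)
    return np.linalg.norm(V @ np.diag(np.exp(w)) @ np.linalg.inv(V), 2)

for xi in (-7.0, 0.5, 13.0):
    P = 1j * xi * A
    assert np.allclose(P @ P.conj().T, P.conj().T @ P)   # P(i xi) is normal
    Lam = max(np.linalg.eigvals(P).real)                  # Lambda(P(i xi))
    nrm = spectral_norm_exp(3.0 * P)                      # |e^{3 P(i xi)}|
    print(xi, Lam, nrm)   # Lambda = 0 and the norm equals 1
```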
For later use we state:

Theorem 6 If (1), (2) is correctly posed in L² then (10) holds and there are positive constants C₁ and C₂ and for each ξ ∈ Rᵈ a positive definite hermitian matrix H(ξ) such that

    C₁⁻¹ I ≤ H(ξ) ≤ C₁ I        (12)

and

    Re( H(ξ) P(iξ) ) ≤ C₂ I.        (13)

Proof By Theorem 4 there is a constant γ such that the family F in (11) satisfies condition (iv) of Lemma 3 with C = C₁. Thus for each ξ ∈ Rᵈ there is a positive definite H(ξ) satisfying (12) such that

    Re( H(ξ)(P(iξ) − γI) ) ≤ 0,   i.e.   Re( H(ξ) P(iξ) ) ≤ γ H(ξ).

But by (12) this implies (13).
3. Difference approximations in L² to initial-value problems with constant coefficients

Consider again the initial-value problem

    ∂u/∂t = P(D)u = Σ_{|α|≤M} P_α D^α u,   t > 0,        (1)

    u(x,0) = v(x).        (2)
For the approximate solution of (1), (2) we consider explicit difference operators of the form

    E_k v(x) = Σ_β a_β(h) v(x − βh),

where h is a small positive parameter, β = (β₁,…,β_d) with the β_j integers, the a_β(h) are N×N matrices which are polynomials in h, and the summation is over a finite set of β. We introduce the symbol of the operator E_k,

    Ê_k(ξ) = Σ_β a_β(h) e^{−i⟨βh,ξ⟩},

which is periodic with period 2π/h in each ξ_j, and notice that for v ∈ L² the Fourier transform of E_k v is

    (E_k v)^(ξ) = Ê_k(ξ) v̂(ξ).
Assume that the initial-value problem (1), (2) is correctly posed. We then want to choose E_k so that it approximates the solution operator E(k) when k is a positive parameter tied to h by the relation

    k / h^M = λ = constant;

we actually want to approximate u(x,nk) = E(nk)v = E(k)^n v by E_k^n v. In the future we shall emphasise the dependence on k rather than h and write E_k as in Lecture 1.

To accomplish this, we shall assume that E_k satisfies the condition in the following definition. We say that E_k is consistent with (1) if for any sufficiently smooth solution u of (1),

    E_k u(·, t) = u(·, t+k) + o(k)   in L², as k → 0.

If o(k) can be replaced by O(k h^μ), we say that E_k is accurate of order μ. Clearly any consistent scheme is accurate of order at least 1.
We can express consistency and accuracy in terms of the symbol (cf. [35]):

Lemma 1 The operator E_k is consistent with (1) if and only if

    Ê_k(h⁻¹ξ) = e^{kP(ih⁻¹ξ)} + o(h^M),        (3)

uniformly for ξ in a compact set. The operator E_k is accurate of order μ if and only if

    Ê_k(h⁻¹ξ) = e^{kP(ih⁻¹ξ)} + O(h^{M+μ}),

uniformly for ξ in a compact set.

The proof of (3), say, consists in proving, as in the special case in Lecture 1, that consistency is equivalent to a number of algebraic conditions for the coefficients, which turn out to be equivalent to the analytic functions exp(kP(ih⁻¹ξ)) and Ê_k(h⁻¹ξ) having the same coefficients for the powers h^j up to a certain order.
Using Lemma 1 it is easy to deduce that if E_k is consistent with (1) in the present sense then we also have consistency in the sense of Lecture 1. For the set D of genuine solutions in the previous definition we can for instance take the ones corresponding to v ∈ S. From Lax's equivalence theorem it is clear that we want to discuss the stability of operators E_k of the form described. We have

Theorem 1 The operator E_k is stable if and only if for any T > 0 there is a C such that

    |Ê_k(ξ)^n| ≤ C,   ξ ∈ Rᵈ, 0 ≤ nk ≤ T.

Proof We notice that Ê_k(ξ)^n is the symbol of E_k^n. It follows in the same way as in Lecture 2 that

    ‖E_k^n‖ = sup_ξ |Ê_k(ξ)^n|,

which proves the theorem.

We now turn to the algebraic characterization of stability. We first prove the necessity of the von Neumann condition. For any N×N matrix A we denote by ρ(A) its spectral radius, the maximum of the moduli of the eigenvalues of A.

Theorem 2 If E_k is stable in L², there exists a constant γ such that

    ρ(Ê_k(ξ)) ≤ 1 + γk,   ξ ∈ Rᵈ, 0 < k ≤ 1.        (4)

Proof We have for nk ≤ 1,

    ρ(Ê_k(ξ))^n = ρ(Ê_k(ξ)^n) ≤ |Ê_k(ξ)^n| ≤ C,

and so

    ρ(Ê_k(ξ)) ≤ C^{1/n} ≤ 1 + γk.

The condition (4) is referred to as the von Neumann condition. It is easy to prove by counter-examples that (4) is not sufficient for stability. Necessary and sufficient conditions for stability have been given by Kreiss [18] and Buchanan [5]; we quote here Kreiss' result. The main content of Kreiss' theorem is concentrated in the following lemma. Here we have introduced the following notation: for H hermitian and positive definite, we set

    |u|_H² = (Hu, u).

Recall again that for hermitian matrices, A ≤ B means (Au, u) ≤ (Bu, u).
Lemma 2 Let F be a family of N×N matrices. Then the following four conditions are equivalent.

(i) There is a constant C such that for A ∈ F and n = 1, 2, …,

    |A^n| ≤ C.

(ii) There is a constant C such that for A ∈ F and |z| > 1, R(A; z) exists and

    |R(A; z)| ≤ C / (|z| − 1).

(iii) For A ∈ F, ρ(A) ≤ 1, and there are two constants C₁ and C₂ and for each A ∈ F a matrix S = S(A) with

    max( |S|, |S⁻¹| ) ≤ C₁,

and such that

    B = SAS⁻¹ = (b_{jl})

is an upper triangular matrix with

    |b_{jl}| ≤ C₂ min( 1 − |b_{jj}|, 1 − |b_{ll}| ),   j < l.

(iv) There is a constant C > 0 such that for each A ∈ F there is a hermitian matrix H = H(A) with

    C⁻¹I ≤ H ≤ CI

and

    A* H A ≤ H.

Proof See [28].
To be able to apply this lemma to our problem we need the following analogue of Lemma 2.4.

Lemma 3 Assume that E_k is stable in L². Then there exists a constant γ such that for F_k(ξ) = e^{−γk} Ê_k(ξ) one has

    |F_k(ξ)^n| ≤ C,   ξ ∈ Rᵈ, 0 < k ≤ 1, n = 1, 2, ….

An alternative way of expressing this result is that for some γ, any k ≤ 1, and any n we have

    |Ê_k(ξ)^n| ≤ C e^{γnk}.
Combining Lemmas 2 and 3 we have at once:

Theorem 3 If the operator E_k is stable in L², then there is a γ such that the family

    F = { e^{−γk} Ê_k(ξ) : ξ ∈ Rᵈ, 0 < k ≤ 1 }

satisfies the conditions of Lemma 2. On the other hand, if there is a constant γ such that F satisfies at least one of the conditions of Lemma 2, then E_k is stable in L².
One commonly used criterion is:

Theorem 4 Let E_k be such that Ê_k(ξ) is a normal matrix. Then von Neumann's condition is necessary and sufficient for stability.

Proof By Theorem 2 we only have to prove the sufficiency. Since Ê_k(ξ) is normal there is for each k ≤ 1 and ξ ∈ Rᵈ a unitary matrix U_k(ξ) such that

    U_k(ξ) Ê_k(ξ) U_k(ξ)*

is diagonal. Hence

    |Ê_k(ξ)^n| = ρ(Ê_k(ξ))^n ≤ (1 + γk)^n ≤ e^{γnk},

which proves the result. To see the relation with Lemmas 2 and 3, we could also have formulated this as follows. We have, with the same γ as in (4), for F_k(ξ) = e^{−γk} Ê_k(ξ), that

    U_k(ξ) F_k(ξ) U_k(ξ)*

is diagonal with eigenvalues of modulus ≤ 1. Thus, a fortiori, it is triangular, and the estimates in condition (iii) of Lemma 2 hold.
As for the existence of stable operators, we have (cf. [17]):

Theorem 5 There exist L²-stable operators consistent with (1), (2) if and only if (1), (2) is correctly posed in L².

Proof We first prove that the correctness is necessary. It follows by Lemma 1 and the stability that, for fixed ξ and nk → t,

    |e^{tP(iξ)}| = lim_{n→∞} |Ê_k(ξ)^n| ≤ C e^{γt},

which implies correctness.

On the other hand, if (1), (2) is correctly posed one can construct a consistent difference operator, or, which is equivalent, its symbol, by setting the symbol equal to a truncated Taylor expansion of e^{kP(iξ)} together with an added dissipative term (5). Using Kreiss' stability theorems one can prove that this E_k is stable for small λ = k/h^M. The part of this operator corresponding to the second term in (5) is referred to as an artificial viscosity.
We shall consider some examples. Consider the initial-value problem for a symmetric hyperbolic system

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j,   A_j hermitian.        (6)

We know from Lecture 2 that this problem is correctly posed in L². Consider as before a difference operator

    E_k v(x) = Σ_β a_β v(x − βh),        (7)

where for simplicity we assume the a_β independent of h. We have the following result by Friedrichs [8].

Theorem 6 If the a_β are hermitian, positive semi-definite and Σ_β a_β = I, then

    |Ê_k(ξ)| ≤ 1,

and thus E_k is stable.

Proof We have the generalized Cauchy-Schwarz inequality: for hermitian positive semi-definite a_β,

    | Σ_β (a_β u_β, w) | ≤ ( Σ_β (a_β u_β, u_β) )^{1/2} ( Σ_β (a_β w, w) )^{1/2},

where (u, v) = Σ_j u_j v̄_j. Therefore, with u_β = e^{−i⟨βh,ξ⟩} v and w arbitrary,

    |(Ê_k(ξ)v, w)| = | Σ_β (a_β u_β, w) | ≤ ( Σ_β (a_β v, v) )^{1/2} ( Σ_β (a_β w, w) )^{1/2} = |v| |w|,

since (a_β u_β, u_β) = (a_β v, v) and Σ_β a_β = I. Hence |Ê_k(ξ)v| ≤ |v|, which proves the theorem.
As an application, take the Friedrichs scheme

    E_k v(x) = Σ_{j=1}^d [ (1/(2d))I + (λ/2)A_j ] v(x + he_j) + Σ_{j=1}^d [ (1/(2d))I − (λ/2)A_j ] v(x − he_j),

with k = λh. This operator is consistent with (6) and accurate of order 1. It is clear that if

    0 < λ ≤ ( d max_j |A_j| )⁻¹,

the coefficients are hermitian, positive semi-definite, and sum to I, and so by Theorem 6 the operator E_k is stable.
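Friedrichs' condition is easy to verify numerically. The following sketch is an illustrative check with an assumed one-dimensional example (d = 1, a particular hermitian A and λ): the coefficients I/2 ± (λ/2)A are positive semi-definite when λ|A| ≤ 1, and the symbol then has norm at most 1.

```python
# Illustration of Theorem 6 for the 1-d Friedrichs scheme (assumed example).
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])       # hermitian, |A| = 1
lam = 0.9                                     # lam * |A| <= 1
I = np.eye(2)
a_plus  = 0.5 * I + 0.5 * lam * A
a_minus = 0.5 * I - 0.5 * lam * A

# both coefficients are hermitian positive semi-definite and sum to I
assert min(np.linalg.eigvalsh(a_plus))  >= -1e-12
assert min(np.linalg.eigvalsh(a_minus)) >= -1e-12

norms = []
for xi in np.linspace(-np.pi, np.pi, 181):
    E = a_plus * np.exp(1j * xi) + a_minus * np.exp(-1j * xi)
    norms.append(np.linalg.norm(E, 2))
print(max(norms))   # at most 1, so the scheme is stable in L2
```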
The operator E_k can be considered as obtained from replacing (6) by

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j + (h/(2λd)) Σ_{j=1}^d ∂²u/∂x_j²

and discretizing by forward and central differences. Consider for a moment the perhaps more natural approximation in which the added term is dropped, which gives the consistent operator

    E_k v(x) = v(x) + (λ/2) Σ_{j=1}^d A_j [ v(x + he_j) − v(x − he_j) ],        (8)

with symbol

    Ê_k(ξ) = I + iλ Σ_{j=1}^d A_j sin(hξ_j).
We shall prove that this operator is not stable in L² if any of the A_j is non-zero. Assume e.g. A₁ ≠ 0, and set ξ_j = 0 for j ≠ 1, ξ₁h = π/2. With this choice,

    Ê_k(ξ) = I + iλA₁,

which has the eigenvalues

    1 + iλμ_l,

where the real numbers μ_l are the eigenvalues of A₁, so that

    ρ(Ê_k(ξ)) = max_l (1 + λ²μ_l²)^{1/2} > 1,

independently of h. Thus the von Neumann condition is not satisfied and the operator is unstable for any λ.
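The violation of the von Neumann condition is immediate to check. The sketch below is an illustration with an assumed hermitian A₁ and λ: at ξ₁h = π/2 the spectral radius of the symbol is √(1 + λ²μ²) > 1, independent of h.

```python
# Illustration: the centered operator (8) violates the von Neumann condition.
import numpy as np

A1 = np.array([[1.0, 0.0], [0.0, -2.0]])     # hermitian, non-zero (assumed)
lam = 0.5
E_hat = np.eye(2) + 1j * lam * A1 * np.sin(np.pi / 2)
rho = max(abs(np.linalg.eigvals(E_hat)))      # spectral radius
print(rho)   # sqrt(1 + lam^2 * 4) = sqrt(2) > 1, independent of h
```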
It can be shown that in general the operator E_k defined in (8) is accurate of order exactly 1. We shall now look at an operator which is accurate of order 2 in the case of one space dimension (d = 1). We thus have the system

    ∂u/∂t = A ∂u/∂x,   A hermitian.        (9)

Consider the difference operator

    E_k v(x) = v(x) + (λ/2) A [ v(x+h) − v(x−h) ] + (λ²/2) A² [ v(x+h) − 2v(x) + v(x−h) ],        (10)

with symbol

    Ê_k(ξ) = I + iλA sin(hξ) − λ²A²(1 − cos(hξ)).

This operator is often referred to as the Lax-Wendroff operator. We have

    Ê_k(h⁻¹ξ) = I + iλAξ − ½λ²A²ξ² + O(|ξ|³) = e^{iλAξ} + O(|ξ|³),

and so E_k is consistent with (9), and in general accurate of order 2. We shall prove:

Theorem 7 Let μ_j, j = 1,…,N, be the eigenvalues of A. Then the operator E_k in (10) is stable in L² if and only if

    λ max_j |μ_j| ≤ 1.        (11)

Proof It is easy to see that the eigenvalues of Ê_k(h⁻¹ξ) are

    1 + iλμ_j sin ξ − λ²μ_j²(1 − cos ξ),

and we obtain after a simple calculation

    |1 + iλμ_j sin ξ − λ²μ_j²(1 − cos ξ)|² = 1 − λ²μ_j²(1 − λ²μ_j²)(1 − cos ξ)² ≤ 1

for all ξ if and only if (11) holds. Since Ê_k is clearly normal, this proves the theorem.
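The identity in the proof can be confirmed numerically. The following is an illustrative check (the sample values of λμ are assumptions): the computed modulus of the amplification factor matches the closed form, is at most 1 when λ|μ| ≤ 1, and exceeds 1 otherwise.

```python
# Check of the Lax-Wendroff amplification-factor identity and the CFL bound.
import numpy as np

xi = np.linspace(-np.pi, np.pi, 400)
ok = True
for lam_mu in (0.3, 0.9, 1.0):
    g = 1 + 1j * lam_mu * np.sin(xi) - lam_mu**2 * (1 - np.cos(xi))
    lhs = np.abs(g) ** 2
    rhs = 1 - lam_mu**2 * (1 - lam_mu**2) * (1 - np.cos(xi)) ** 2
    ok = ok and np.allclose(lhs, rhs)

stable = np.max(np.abs(1 + 1j * 0.9 * np.sin(xi)
                       - 0.81 * (1 - np.cos(xi)))) <= 1 + 1e-12
unstable = np.max(np.abs(1 + 1j * 1.2 * np.sin(xi)
                         - 1.44 * (1 - np.cos(xi)))) > 1
print(ok, stable, unstable)   # True True True
```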
For an N×N matrix A consider the numerical range

    W(A) = { (Av, v) : |v| = 1 }.

We have:

Theorem 8 If F is a family of N×N matrices such that

    W(A) ⊂ { z : |z| ≤ 1 },   A ∈ F,

then F is a stable family, that is, there is a constant C such that

    |A^n| ≤ C,   A ∈ F, n = 1, 2, ….

Proof We shall prove that condition (ii) in Kreiss' theorem is satisfied. Clearly we have ρ(A) ≤ 1, so that R(A; z) exists for |z| > 1. Since for w arbitrary and v = R(A; z)w we have

    (|z| − 1)|v|² ≤ |z||v|² − |(Av, v)| ≤ |(zv − Av, v)| = |(w, v)| ≤ |w||v|,

we obtain

    |R(A; z)w| = |v| ≤ |w| / (|z| − 1).

Therefore, since w is arbitrary,

    |R(A; z)| ≤ 1 / (|z| − 1),

which proves the result.

Remark One can actually prove that |A^n| ≤ 2 for A ∈ F, n = 1, 2, ….
This result can be used to prove the stability of certain generalizations of
the Lax-Wendroff operator to two dimensions (see [2~]).
Consider again the symmetric hyperbolic system (6) and a difference operator of the form (7), consistent with (6). Then A(ξ) = Ê_k(h⁻¹ξ) is independent of h. We say with Kreiss that E_k is dissipative of order σ (σ even) if there is a δ > 0 such that

    ρ(A(ξ)) ≤ 1 − δ|ξ|^σ   for |ξ_j| ≤ π, j = 1,…,d.

We shall prove

Theorem 9 Under the above assumptions, if E_k is accurate of order σ − 1 and dissipative of order σ, it is stable in L².

Proof By the definition of accuracy, we have

    A(ξ) = exp( iλ Σ_{j=1}^d ξ_j A_j ) + O(|ξ|^σ)   as ξ → 0.

Let U = U(ξ) be a unitary matrix which triangulates A(ξ), so that

    B(ξ) = U A(ξ) U*

is upper triangular. Since B(ξ) is upper triangular it follows that the below-diagonal elements in U exp(iλ Σ_j ξ_j A_j) U* are O(|ξ|^σ). Since this matrix is unitary, the same can easily be proved to hold for its above-diagonal terms, and thus the same holds for the above-diagonal terms in B(ξ), so that

    B(ξ) = D(ξ) + O(|ξ|^σ),   D(ξ) diagonal.

The diagonal elements have moduli bounded by ρ(A(ξ)) ≤ 1 − δ|ξ|^σ, so that the off-diagonal elements b_{jl} satisfy

    |b_{jl}| ≤ C|ξ|^σ ≤ (C/δ)(1 − |b_{jj}|),

and the stability follows by condition (iii) in Kreiss' theorem.
Consider now the initial-value problem for a Petrovskii parabolic system

    ∂u/∂t = P(D)u = Σ_{|α|≤M} P_α D^α u,   t > 0,        (12)

so that

    Λ(P(iξ)) ≤ −δ|ξ|^M + C,   δ > 0.

We know from Lecture 2 that this problem is correctly posed in L². Consider a difference operator

    E_k v(x) = Σ_β a_β(h) v(x − βh),   k/h^M = λ.

We say, following John [15] and Widlund [38], that E_k is a parabolic difference operator if there are constants C and δ > 0 such that

    ρ(Ê_k(h⁻¹ξ)) ≤ e^{Ck − δ|ξ|^M},   |ξ_j| ≤ π.        (13)

Notice the close analogy with the concept of a dissipative operator.
Theorem 10 Let E_k be consistent with (12) and parabolic. Then it is stable in L².

We shall base a proof on the following lemma, which we shall also need later for other purposes.

Lemma 4 There exists a constant C_N depending only on N such that for any N×N matrix A with spectral radius ρ we have for n ≥ N,

    |A^n| ≤ C_N n^{N−1} |A|^{N−1} ρ^{n−N+1}.

Proof See [35].
Proof of Theorem 10 By consistency we have, for some ε > 0 and |ξ| ≤ ε,

    |Ê_k(h⁻¹ξ)| ≤ |e^{kP(ih⁻¹ξ)}| + o(k) ≤ e^{Ck − δ'|ξ|^M},

so that |Ê_k(h⁻¹ξ)^n| is bounded for nk ≤ T in this range. For ε ≤ |ξ|, |ξ_j| ≤ π, we therefore have, for n ≥ N, nk ≤ T, by Lemma 4 and (13),

    |Ê_k(h⁻¹ξ)^n| ≤ C_N n^{N−1} |Ê_k(h⁻¹ξ)|^{N−1} e^{(n−N+1)(Ck − δε^M)} ≤ C,

which proves the stability.
Consider forward difference quotients

    ∂_j v(x) = h⁻¹ ( v(x + he_j) − v(x) ),

and, for a general α,

    ∂^α = ∂₁^{α₁} ⋯ ∂_d^{α_d}.

We then easily have the following discrete analogue of Theorem 2.2.

Theorem 11 Assume that E_k is parabolic. Then for any α and T > 0 there is a C such that

    ‖∂^α E_k^n v‖ ≤ C (nk)^{−|α|/M} ‖v‖,   0 < nk ≤ T.

Proof By Fourier transformation this reduces to proving

    Π_j | h⁻¹(e^{ihξ_j} − 1) |^{α_j} · |Ê_k(ξ)^n| ≤ C (nk)^{−|α|/M},

and the result therefore easily follows by (13).
We know by Lax's equivalence theorem that the stability of the parabolic difference operators considered above implies convergence. We shall now see that the difference quotients also converge to the corresponding derivatives, which we know to exist for t > 0 since the systems are parabolic.

Theorem 12 Assume that (12) is parabolic and that E_k is consistent with (12) and parabolic. Then for any t > 0, any α, and any v ∈ L² we have, for nk = t,

    ‖∂^α E_k^n v − D^α E(t)v‖ → 0   as k → 0.        (14)

Proof By Theorems 2.2 and 11 one finds that it is sufficient to prove (14) for v in the dense subset Ĉ₀^∞. But then, by Parseval's relation,

    ‖∂^α E_k^n v − D^α E(t)v‖ = ‖ ( Π_j (h⁻¹(e^{ihξ_j} − 1))^{α_j} Ê_k(ξ)^n − (iξ)^α e^{tP(iξ)} ) v̂ ‖.

The result therefore follows by the following lemma, which is a simple consequence of Lemma 1.

Lemma 5 If E_k is consistent with (12) then

    Ê_k(ξ)^n → e^{tP(iξ)}   as k → 0, nk = t,

uniformly for ξ in a compact set.
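Lemma 5 can be watched happening for the standard explicit scheme for the heat equation. The sketch below is an assumed example, not from the lectures: the scalar symbol 1 − 4λ sin²(hξ/2) (with k = λh²) raised to the power n = t/k approaches e^{−tξ²} on a compact ξ-interval as the mesh is refined.

```python
# Illustration of Lemma 5 for u_t = u_xx (assumed example): symbol powers
# of the explicit scheme converge to exp(-t xi^2) uniformly on compacts.
import numpy as np

lam, t = 0.4, 1.0
xi = np.linspace(-3.0, 3.0, 61)
errs = []
for m in (10, 40, 160):
    h = 1.0 / m
    k = lam * h * h
    n = int(round(t / k))
    symbol = 1 - 4 * lam * np.sin(h * xi / 2) ** 2
    errs.append(np.max(np.abs(symbol ** n - np.exp(-(n * k) * xi**2))))
print(errs)   # decreasing toward 0 as h -> 0
```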
4. Estimates in the maximum-norm

Consider the initial-value problem for a symmetric hyperbolic system with constant coefficients,

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j,   A_j hermitian,        (1)

    u(x,0) = v(x).        (2)

As we recall from Lecture 2, this problem is correctly posed in L². However, this is not necessarily the case in other natural Banach spaces.

In this lecture we shall consider the Banach space C of bounded, uniformly continuous functions in Rᵈ with norm

    ‖v‖_C = sup_x |v(x)|.

In C one has the somewhat surprising result by Brenner [2]:

Theorem 1 The initial-value problem (1), (2) is correctly posed in C if and only if the matrices A_j commute,

    A_j A_l = A_l A_j,   j, l = 1,…,d.        (3)
Let us comment that it is well known that the condition (3) is equivalent to the simultaneous diagonalizability of the A_j; that is, (3) is satisfied if and only if there exists a unitary matrix U such that

    U A_j U*

is a real diagonal matrix for all j = 1,…,d. This means that if we introduce ũ = Uu as a new variable in (1) we can write (1) in the form

    ∂ũ/∂t = Σ_{j=1}^d Λ_j ∂ũ/∂x_j,   Λ_j = U A_j U* diagonal.        (4)

But this is a system of N uncoupled first-order differential equations. Thus, only in the case that (1) can be transformed into a system of uncoupled equations is (1), (2) correctly posed in C.
It can be shown that in the case of non-correctness, that is, when (3) is not satisfied, there are no consistent difference operators which are stable in the maximum-norm.
We shall now consider a very special case of a system of the form (4), namely one single equation with d = 1,

    ∂u/∂t = a ∂u/∂x,   a real.        (5)

We then want to discuss the stability in the maximum-norm of consistent explicit operators of the form

    E_k v(x) = Σ_j a_j v(x − jh),   k = λh,

where the a_j are constants and only a finite number of terms occur. Introducing the characteristic polynomial

    a(ξ) = Σ_j a_j e^{−ijξ},

we have stability in L² if and only if |a(ξ)| ≤ 1 for real ξ.
We have

Lemma 1 The norm of the operator E_k in C is

    ‖E_k‖_C = Σ_j |a_j|.

Proof We clearly have

    |E_k v(x)| ≤ Σ_j |a_j| ‖v‖_C,

so that

    ‖E_k‖_C ≤ Σ_j |a_j|.

On the other hand, let v(x) ∈ C be a function with |v(x)| ≤ 1 such that

    v(−jh) = ā_j / |a_j|   whenever a_j ≠ 0.

Then

    E_k v(0) = Σ_j a_j v(−jh) = Σ_j |a_j|,

so that

    ‖E_k‖_C ≥ Σ_j |a_j|.

This proves the lemma.
We have earlier observed that E_k^n has the symbol a(hξ)^n, that is, the characteristic polynomial a(ξ)^n. If

    a(ξ)^n = Σ_j a_{nj} e^{−ijξ},        (6)

we therefore have

    E_k^n v(x) = Σ_j a_{nj} v(x − jh).

It follows from Lemma 1 above that

    ‖E_k^n‖_C = Σ_j |a_{nj}|,

and the discussion of the stability will depend on estimates for the a_{nj}.

We now state the main result for this problem.
Theorem 2 The operator E_k is stable in the maximum-norm if and only if one of the following two conditions is satisfied.

(α) |a(ξ)| ≡ 1, in which case a(ξ) = c e^{−imξ} with |c| = 1 and m an integer, so that E_k is a multiple of a translation operator.

(β) |a(ξ)| < 1 except for at most a finite number of points ξ_q, q = 1,…,Q, in |ξ| ≤ π, where |a(ξ_q)| = 1. For q = 1,…,Q there are constants α_q, β_q, ν_q, where α_q is real, Re β_q > 0, and ν_q is an even natural number, such that

    a(ξ_q + ξ) = a(ξ_q) exp( iα_q ξ − β_q ξ^{ν_q} + o(ξ^{ν_q}) )   as ξ → 0.        (7)

We shall sketch a proof of the theorem in the case that E_k satisfies the additional assumption

    a(0) = 1,   |a(ξ)| < 1 for 0 < |ξ| ≤ π.        (8)
We have

Lemma 2 Assume that a(ξ) is a trigonometric polynomial such that (8) is satisfied and such that

    a(ξ) = exp( iαξ − βξ^ν + o(ξ^ν) )   as ξ → 0,        (9)

where α is real, Re β > 0, and ν is even. Then, if a_{nj} is defined by (6), there is a positive constant C independent of n and j such that

    |a_{nj}| ≤ C n^{−1/ν} ( 1 + n^{−1/ν} |j + nα| )^{−2}.

Proof By (8) and (9) there is a c > 0 such that

    |a(ξ)| ≤ e^{−cξ^ν},   |ξ| ≤ π.

We therefore get

    |a_{nj}| = | (2π)⁻¹ ∫_{−π}^{π} a(ξ)^n e^{ijξ} dξ | ≤ C ∫_0^π e^{−cnξ^ν} dξ ≤ C n^{−1/ν},

which proves the first half of the lemma. To prove the second half, we write

    a(ξ)^n e^{ijξ} = g(ξ)^n e^{i(j+nα)ξ},   g(ξ) = a(ξ) e^{−iαξ},

so that by (9), g(ξ) = exp(−βξ^ν + o(ξ^ν)) as ξ → 0. After two integrations by parts, using the periodicity of a(ξ) and the fact that the boundary terms are exponentially small by (8), we get

    |a_{nj}| ≤ C (j + nα)^{−2} ∫_{−π}^{π} | (g^n)''(ξ) | dξ + O(e^{−cn}).

According to (9) we have

    |g'(ξ)| ≤ C|ξ|^{ν−1},   |g''(ξ)| ≤ C|ξ|^{ν−2},

and it follows that

    | (g^n)'' | ≤ C ( n|ξ|^{ν−2} + n²|ξ|^{2ν−2} ) e^{−cnξ^ν}.

We thus get

    |a_{nj}| ≤ C (j + nα)^{−2} n^{1/ν},

and since

    n^{1/ν} (j + nα)^{−2} = n^{−1/ν} ( n^{−1/ν} |j + nα| )^{−2},

the result follows.
We then have

Corollary Assume that E_k has a characteristic polynomial a(ξ) which satisfies the assumptions of Lemma 2. Then E_k is stable in C.

Proof We have, by Lemmas 1 and 2,

    ‖E_k^n‖_C = Σ_j |a_{nj}| ≤ C n^{−1/ν} Σ_j ( 1 + n^{−1/ν} |j + nα| )^{−2} ≤ C n^{−1/ν} ( 1 + n^{1/ν} ) ≤ C,

which proves the corollary.
Consider now the necessity of the condition (7), which under the assumption (8) reduces to (9). Assume that (9) is not satisfied. We then must have

    a(ξ) = exp( iαξ + i ξ^r q(ξ) − βξ^ν + o(ξ^ν) )   as ξ → 0,        (10)

with α real, q(ξ) a real polynomial with q(0) ≠ 0, 1 < r < ν, ν even, Re β > 0. By Parseval's relation for periodic functions we have

    Σ_j |a_{nj}|² = (2π)⁻¹ ∫_{−π}^{π} |a(ξ)|^{2n} dξ,

and using (10) it is easy to deduce from this that

    Σ_j |a_{nj}|² ≥ c n^{−1/ν},   c > 0.

Using a lemma by van der Corput it is also possible to prove

    sup_j |a_{nj}| ≤ C n^{−1/r},

and so

    ‖E_k^n‖_C = Σ_j |a_{nj}| ≥ ( Σ_j |a_{nj}|² ) / sup_j |a_{nj}| ≥ c n^{1/r − 1/ν},

which tends to infinity with n since r < ν.
As an application, consider for the solution of the equation (5) the Lax-Wendroff operator, which in this case reduces to

    E_k v(x) = v(x) + (λa/2)[ v(x+h) − v(x−h) ] + (λ²a²/2)[ v(x+h) − 2v(x) + v(x−h) ].

We have

    a(ξ) = 1 + iλa sin ξ − λ²a²(1 − cos ξ),

and

    |a(ξ)|² = 1 − λ²a²(1 − λ²a²)(1 − cos ξ)²,

and so E_k is stable in L² if and only if |λa| ≤ 1. On the other hand, if 0 < |λa| < 1 we have |a(ξ)| < 1 for 0 < |ξ| ≤ π and

    a(ξ) = exp( iλaξ − (iλa/6)(1 − λ²a²)ξ³ − (λ²a²/8)(1 − λ²a²)ξ⁴ + o(ξ⁴) )   as ξ → 0,

which has the form (10) with r = 3, ν = 4. It follows from Theorem 2 that E_k is unstable in C. By the above proof we have

    ‖E_k^n‖_C ≥ c n^{1/3 − 1/4} = c n^{1/12}.
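The slow growth of Σ_j |a_{nj}| can be computed directly. The sketch below is a numerical illustration (the value of λa and the FFT grid size are assumptions, the grid being large enough that the coefficients of a(ξ)^n are recovered exactly for these n): the maximum-norm of E_k^n grows, but only very slowly.

```python
# Maximum-norm of Lax-Wendroff powers: ||E_k^n||_C = sum_j |a_nj|, computed
# from the Fourier coefficients of a(xi)^n by an FFT (illustration).
import numpy as np

mu = 0.5                       # lambda * a, with 0 < |lambda a| < 1 (assumed)
m = 2 ** 14                    # grid size, exceeds the polynomial degree n
xi = 2 * np.pi * np.arange(m) / m
a = 1 + 1j * mu * np.sin(xi) - mu**2 * (1 - np.cos(xi))

norms = []
for n in (10, 100, 1000):
    coeffs = np.fft.fft(a ** n) / m          # Fourier coefficients a_nj
    norms.append(np.sum(np.abs(coeffs)))     # the maximum-norm of E_k^n
print(norms)   # grows slowly with n, consistent with n^(1/12)
```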
Serdjukova [30], [31] and Hedstrom [11], [12] have, by using more refined techniques of estimating the a_{nj} above, been able to give more precise estimates of the growth of ‖E_k^n‖_C for the case when E_k is stable in L² but unstable in C. In the particular case of the Lax-Wendroff operator, the exact result is

    c₁ n^{1/12} ≤ ‖E_k^n‖_C ≤ c₂ n^{1/12};

more generally, when a(ξ) has the form (10) one has

    c₁ n^{(1/r)(1 − r/ν)} ≤ ‖E_k^n‖_C ≤ c₂ n^{(1/r)(1 − r/ν)}.        (11)

The instability present here is of course quite weak.

The proof of the sufficiency part of Theorem 2 is due to John [15] and Strang [32]. The proof of the necessity part can be found in [33]. Theorem 2 has also been generalized to variable coefficients in Thomée [34] and to L^p, 1 ≤ p ≤ ∞, p ≠ 2, in Brenner and Thomée [3]. The analogue of (11) then reads

    c₁ n^{(1/r)(1 − r/ν)|1 − 2/p|} ≤ ‖E_k^n‖_{L^p} ≤ c₂ n^{(1/r)(1 − r/ν)|1 − 2/p|}.

Consider now the initial-value problem for a system with constant coefficients

    ∂u/∂t = P(D)u = Σ_{|α|≤M} P_α D^α u,   t > 0,        (12)

    u(x,0) = v(x),        (13)

which is parabolic in Petrovskii's sense, that is,

    Λ(P(iξ)) ≤ −δ|ξ|^M + C,   δ > 0.        (14)

We recall from Lecture 2 that this problem is correctly posed in L² and also that derivatives can be estimated in L².
We shall now see that (12), (13) is actually correctly posed in C and that again also the derivatives can be estimated in the maximum-norm.

Theorem 3 Assume that (12) is parabolic in Petrovskii's sense. Then for t > 0 the solution operator has the form

    E(t)v(x) = ∫_{Rᵈ} G(x − y, t) v(y) dy,        (15)

where

    G(x, t) = (2π)^{−d} ∫_{Rᵈ} e^{tP(iξ)} e^{i⟨x,ξ⟩} dξ,        (16)

and there are constants c > 0 and C such that

    |D^α_x G(x, t)| ≤ C t^{−(d+|α|)/M} exp( −c ( |x| t^{−1/M} )^{M/(M−1)} ).        (17)

The problem is correctly posed in C, and for any T ≥ 0 and α there is a C such that

    ‖D^α E(t)v‖_C ≤ C t^{−|α|/M} ‖v‖_C,   0 < t ≤ T.        (18)
Proof To give a hint of the proof, we recall that if v ∈ S then after Fourier transformation the initial-value problem becomes a problem in ordinary differential equations with solution

    û(ξ, t) = e^{tP(iξ)} v̂(ξ),

which is the Fourier transform of (15) if G is given by (16). That this defines a bounded operator in C follows from (17), which still remains to be proved. On the other hand, once this is proved, the estimate (18) is a trivial consequence. To hint at the proof of (17) we need the following extension of (14) to complex arguments:

Lemma 3 If (12) is parabolic in Petrovskii's sense there are positive constants δ and C such that for any ζ = ξ + iη, ξ, η ∈ Rᵈ,

    Λ(P(iζ)) ≤ −δ|ξ|^M + C|η|^M + C.

Proof See [7].
To complete the proof of (17) we first notice that by Lemma 2.2 we have for 0 < t ≤ T,

    |e^{tP(iζ)}| ≤ C exp( −δ't|ξ|^M + C't|η|^M ),   ζ = ξ + iη.        (19)

Using this estimate one can see that the domain of integration in (16) may be moved into the complex, so that for any η ∈ Rᵈ,

    G(x, t) = (2π)^{−d} ∫_{Rᵈ} e^{tP(i(ξ+iη))} e^{i⟨x, ξ+iη⟩} dξ,

or, after differentiation,

    D^α_x G(x, t) = (2π)^{−d} ∫_{Rᵈ} ( i(ξ+iη) )^α e^{tP(i(ξ+iη))} e^{i⟨x, ξ+iη⟩} dξ.

Thus, by (19) we get for 0 < t ≤ T,

    |D^α_x G(x, t)| ≤ C t^{−(d+|α|)/M} ( 1 + t^{1/M}|η| )^{|α|} e^{−⟨x,η⟩ + C't|η|^M},

where the constants are independent of η. Now choose

    η = c₀ (x/|x|) ( |x|/t )^{1/(M−1)},

with c₀ small. We then have

    −⟨x, η⟩ + C't|η|^M ≤ −c ( |x| t^{−1/M} )^{M/(M−1)} + C,

and the estimate (17) now easily follows.
Consider now explicit difference operators

    E_k v(x) = Σ_β a_β(h) v(x − βh),   k/h^M = λ,

consistent with (12). We recall from Lecture 3 that E_k is called parabolic if

    ρ(Ê_k(h⁻¹ξ)) ≤ e^{Ck − δ|ξ|^M},   |ξ_j| ≤ π, δ > 0,

that such an operator is stable in L², and also that difference quotients may be estimated in L².
We shall now see that a corresponding result holds also in the maximum-norm.

Theorem 4 Assume that E_k is parabolic, and let

    a^n_β(h) = (2π)^{−d} ∫_Q Ê_k(h⁻¹ξ)^n e^{i⟨β,ξ⟩} dξ,   so that   E_k^n v(x) = Σ_β a^n_β(h) v(x − βh),

where Ê_k(ξ) is the symbol of E_k and Q = { ξ : |ξ_j| ≤ π }. Then the a^n_β(h) satisfy the following estimates (where difference quotients are taken with respect to β):

    |∂^α_β a^n_β(h)| ≤ C n^{−(d+|α|)/M} exp( −c ( |β| n^{−1/M} )^{M/(M−1)} ).        (20)

The operator E_k is stable in C, and for any α and T > 0 there is a C such that

    ‖∂^α E_k^n v‖_C ≤ C (nk)^{−|α|/M} ‖v‖_C,   0 < nk ≤ T.        (21)

Proof For details see [39]. The fact that (21) follows from (20) is due to the fact that

    Σ_β n^{−d/M} exp( −c ( |β| n^{−1/M} )^{M/(M−1)} )

can be considered as a Riemann sum for the integral

    ∫_{Rᵈ} exp( −c |y|^{M/(M−1)} ) dy,

and therefore this sum is bounded independently of n.
To prove (20) one goes through essentially the same steps as in the proof of (17) in Theorem 3, utilizing Lemma 3.4 instead of Lemma 2.2 and the following lemma instead of Lemma 3.

Lemma 4 Assume that E_k is parabolic, and let η₀ be given. Then there are positive constants δ and C such that for ζ = ξ + iη with |η| ≤ η₀,

    ρ( Ê_k(h⁻¹ζ) ) ≤ exp( Ck − δ|ξ|^M + C|η|^M ).

Again, the estimates for the difference quotients can be used to prove their convergence to the corresponding derivatives. We state the following result without proof.
Theorem 5 Assume that (12) is parabolic and that E_k is consistent with (12) and parabolic. Then for any t > 0, any α, and any v ∈ C we have, for nk = t,

    ‖∂^α E_k^n v − D^α E(t)v‖_C → 0   as k → 0.

The choice of ∂^α as the operator approximating D^α is again rather arbitrary. One can indeed show that the same results as in Theorems 4 and 5 hold for any difference operators consistent with D^α.

Theorems 3, 4 and 5 generalize to variable coefficients.
5. On the rate of convergence of difference schemes

In this lecture we shall work in a slightly more general setting than before. Let L^p = L^p(Rᵈ), 1 ≤ p < ∞, denote the set of measurable functions (or vector-functions) v such that

    ‖v‖_{L^p} = ( ∫_{Rᵈ} |v(x)|^p dx )^{1/p} < ∞,

and consider the family of Banach spaces

    W_p = L^p for 1 ≤ p < ∞,   W_∞ = C.

Consider also the Sobolev spaces W_p^m of distributions v such that D^α v ∈ W_p for |α| ≤ m. This is also a Banach space with norm

    ‖v‖_{W_p^m} = Σ_{|α|≤m} ‖D^α v‖_{W_p}.        (1)

For 1 ≤ p < ∞ this can be thought of as the closure with respect to the norm (1) of S or C₀^∞. Let finally W_p^∞ be the set of v which are in W_p^m for all m. This means that D^α v is continuous for any α and D^α v ∈ W_p. The set W_p^∞ is dense in W_p.
Consider again an initial-value problem

    ∂u/∂t = P(x, D)u = Σ_{|α|≤M} P_α(x) D^α u,   t > 0,        (2)

    u(x,0) = v(x),        (3)

where, as before, the coefficients P_α(x) have all derivatives bounded.

In the sequel we shall demand not only that the initial-value problem be correctly posed in W_p, but that it satisfy the stronger requirement of the following definition. We say that the initial-value problem is strongly correctly posed in W_p if for any positive m and T, v ∈ W_p^m implies E(t)v ∈ W_p^m, and there is a constant C such that for all v ∈ W_p^m,

    ‖E(t)v‖_{W_p^m} ≤ C ‖v‖_{W_p^m},   0 ≤ t ≤ T.

In particular, this definition implies that E(t) W_p^∞ ⊂ W_p^∞.

It can be proved that if P(x,D) has constant coefficients, or if it is of first order, then strong correctness in W_p is an automatic consequence of correctness. Further, systems which are parabolic in Petrovskii's sense are strongly correctly posed in W_p for any p with 1 ≤ p ≤ ∞.

Consider difference operators of the same form as before, namely

    E_k v(x) = Σ_β a_β(x, h) v(x − βh),   k/h^M = λ.

We have previously defined consistency of E_k with (2) to mean that for any sufficiently smooth solution u(x,t) of (2),

    E_k u(·, t) = u(·, t+k) + o(k)   in W_p;

more precisely, E_k is said to be accurate of order μ if for such u,

    E_k u(·, t) = u(·, t+k) + O(k h^μ),   k → 0.        (4)

When (2), (3) is strongly correctly posed in W_p, it is sufficient to assume this local condition to obtain the following global estimate:
Theorem 1 Assume that the initial-value problem (2), (3) is strongly correctly posed in W_p and that E_k is consistent with (2) and accurate of order μ. Then there exists a constant C such that for any v ∈ W_p^{M+μ},

    ‖E_k v − E(k)v‖_{W_p} ≤ C k h^μ ‖v‖_{W_p^{M+μ}}.

Proof See [27]. The proof consists in expanding E(k)v = u(x,k) and E_k v in Taylor series around the point (x,0), using (4), and estimating the remainder terms in integral form. In doing so it is sufficient to consider v in the dense subset W_p^∞ of W_p^{M+μ}.
We now easily obtain the following estimate for the rate of convergence:

Theorem 2 Assume that the initial-value problem (2), (3) is strongly correctly posed in W_p and that E_k is stable in W_p, consistent with (2), and accurate of order μ. Then for any T > 0 there is a constant C such that for v ∈ W_p^{M+μ}, nk ≤ T,

    ‖E_k^n v − E(nk)v‖_{W_p} ≤ C n k h^μ ‖v‖_{W_p^{M+μ}} ≤ C' h^μ ‖v‖_{W_p^{M+μ}}.

Proof We have

    E_k^n − E(nk) = Σ_{j=0}^{n−1} E_k^{n−1−j} ( E_k − E(k) ) E(jk),        (5)

and so, by the stability of E_k, Theorem 1, and the strong correctness,

    ‖E_k^n v − E(nk)v‖_{W_p} ≤ Σ_{j=0}^{n−1} C k h^μ ‖E(jk)v‖_{W_p^{M+μ}} ≤ C n k h^μ ‖v‖_{W_p^{M+μ}},

which proves the theorem.
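The O(h^μ) rate for smooth data is easy to observe experimentally. The sketch below is an assumed example, not from the lectures: the upwind scheme for u_t = u_x is accurate of order μ = 1, and halving h roughly halves the error at a fixed time.

```python
# Illustration of Theorem 2: first-order convergence of the upwind scheme
# for u_t = u_x with smooth periodic data (assumed example and parameters).
import numpy as np

def solve_upwind(m, lam, t):
    h = 2 * np.pi / m
    k = lam * h
    n = int(round(t / k))
    x = h * np.arange(m)
    u = np.sin(x)
    for _ in range(n):
        u = u + lam * (np.roll(u, -1) - u)    # E_k v = v + lam (T_h v - v)
    return np.max(np.abs(u - np.sin(x + n * k)))

e1, e2 = solve_upwind(100, 0.5, 1.0), solve_upwind(200, 0.5, 1.0)
rate = np.log2(e1 / e2)
print(e1, e2, rate)   # observed order of accuracy close to 1
```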
Thus, the situation is that for initial-values in W_p we have (by Lax's equivalence theorem) convergence without any added information on its rate, and if the initial-values are known to be in W_p^{M+μ} we can conclude that the rate of convergence is O(h^μ) when h → 0. It is natural to ask what one can say if the initial-values belong to a space "intermediate" to W_p and W_p^{M+μ}. To answer this question we shall introduce some spaces of functions which are interpolation spaces between W_p and W_p^m in the sense of the theory of interpolation of Banach spaces (cf. [27] and references).

Let s be a positive real number and write s = S + σ, S integer, 0 < σ ≤ 1. Set T_τ v(x) = v(x + τ). We then denote by B_p^s the space of v ∈ W_p^S such that the following norm is finite, namely

    ‖v‖_{B_p^s} = ‖v‖_{W_p^S} + Σ_{|α|=S} sup_{τ≠0} |τ|^{−σ} ‖ (T_τ − 2I + T_{−τ}) D^α v ‖_{W_p}.

Thus, B_p^s is defined by a Lipschitz type condition for the derivatives of order S;
82
these spaces are sometimes called Lipschitz spaces.
L we have for 1 ~ p < co
For the Heavysi&e function (4=1)
we h a v e S_
' -~ ~ = ~ • (7) ~< CC, c ~ ~I~II
Theorem 2 and (7) with A = ~-E(nk) prove immediately the following result:
Theorem ~ Assume that the initlal-value problem (2), (3) is strongly correctly
posed in W and that ~ is stable in W , consistent with (2), and accurate of order P p
• Then for O~ s < M+ ~ and T >0 there is a constant C such that for any V~Bp,
nk~T,
' E " ,< C k ~ ' ; - '~ - ~ . II k ~. -~(."~'Y~ v//',,'at, / / v l / ~ ~ X-r,, . .t . , (~)
t,. l", ',;."t " Notice that ~ = grows with~ and lim ~ = 1. This means that the
estimate (8) becomes increasingly better for fixed s when ~ grows. In other weras,
if for a given strongly correctly posed initial-value problem one can construct
stable difference schemes of arbitrarily high order of accuracy, then given a~ s ~ 0
one can obtain rates of convergence arbitrarily close to O(h s) when h *0, for all
initial-values in B s p"
As an application, consider an L_2 stable operator E_k with order of accuracy μ for the hyperbolic equation
  ∂u/∂t = ∂u/∂x;
it follows that if v ∈ B_2^s, 0 < s ≤ μ + 1, then
  ||E_k^n v − E(nk) v||_{L_2} ≤ C h^{sμ/(μ+1)} ||v||_{B_2^s}.
One can prove that B_p^{s_1} ⊂ B_p^{s_2} if s_1 ≥ s_2, and that for integer s and ε > 0 arbitrary, B_p^{s+ε} ⊂ W_p^s ⊂ B_p^s. The main property of these spaces that we will need is then the following interpolation property: Assume that 1 ≤ p ≤ ∞, m is a natural number, and s is a real number with 0 < s ≤ m. Then there is a constant C such that any bounded linear operator A in W_p with
  ||A v||_{L_p} ≤ M_0 ||v||_{L_p}  and  ||A v||_{L_p} ≤ M_1 ||v||_{W_p^m}
satisfies the estimate (7). In particular, taking v to be the Heaviside function (d = 1), which lies in B_p^{1/p}, one obtains in this case a rate of convergence O(h^{μ/((M+μ)p)}).
For dissipative operators E_k, stronger results have been obtained in Apelkrans [1] and in Brenner and Thomée [4], where also the spreading of discontinuities is discussed.
It is natural to ask if for a parabolic system, the smoothing property of the
solution operator can be used to reduce the regularity demands on the initial data
in Theorems 2 and 3. This is indeed the case. Before we state it we give the following auxiliary result, which follows easily from properties of fundamental solutions.
Theorem 4. Assume that (1) is parabolic in Petrovskii's sense. Then for any p with 1 ≤ p ≤ ∞, any m > 0 and T > 0 there is a constant C such that for 0 < t ≤ T,
  ||E(t) v||_{W_p^m} ≤ C t^{−m/M} ||v||_{L_p}.
We can now state and prove the result about the rate of convergence in the parabolic case.

Theorem 5. Assume that (2) is parabolic in Petrovskii's sense and that E_k is stable in W_p, consistent with (2), and accurate of order μ. Then for any s > 0 and T > 0 there is a constant C such that for v ∈ B_p^s, nk ≤ T,
  ||E_k^n v − E(nk) v||_{L_p} ≤ C h^{min(s,μ)} log(1/h) ||v||_{B_p^s}.  (9)
Proof. For details, see [27]. Here we shall only sketch the proof for the case v ∈ B_p^s with 0 < s < μ; the other cases can be treated similarly. We shall use (5). The term with j = 0 is estimated by the stability and Theorem 1, and then by (7); the terms with j > 0 are estimated by Theorems 1 and 4 together with (7). Adding over j, the sum contributes at most a factor log(1/h), which proves (9) in the case considered.
Investigations by Hedstrom [13], [14], Löfström [25], and Widlund [40] have shown that in special cases the factor log(1/h) can be removed from the estimate (9). In particular the following result has been proved by Widlund [40].
Theorem 6. Assume that (2) and E_k are parabolic in the sense of Petrovskii and John, respectively, and that E_k is accurate of order μ. Then for any T > 0 there is a constant C such that for v ∈ B_p^μ, nk ≤ T,
  ||E_k^n v − E(nk) v||_{L_p} ≤ C h^μ ||v||_{B_p^μ}.
The proof of this fact is considerably more complicated than the above and depends on estimates for the discrete fundamental solution. Using these estimates it is also possible to get estimates for the rate of convergence of difference quotients to the corresponding derivatives. We have thus the following more precise version of Theorem 4.5.
Theorem 7. Assume that (2) and E_k are parabolic in the sense of Petrovskii and John, respectively, and that E_k is accurate of order μ. Let Q_k be a finite difference operator which is consistent with the differential operator Q of order q and also accurate of order μ. Then for 0 < nk = t ≤ T, [the concluding estimate is illegible in the source].
In view of the fact that unsmooth initial-data give rise to lower rates of convergence it is natural to ask if the convergence can be made faster by first smoothing the initial-data. This is indeed the case for parabolic systems and we shall describe a result to this effect (Kreiss, Thomée and Widlund [21]).
We shall consider operators of convolution form,
  M_k v(x) = (φ_k * v)(x) = ∫ φ_k(x − y) v(y) dy,
where φ_k is a function whose Fourier transform satisfies conditions near ξ = 0 and at infinity involving functions b_0(ξ) and b_1(ξ) [the displayed conditions are illegible in the source]. Here b_j(ξ), j = 0, 1, are such that for some ρ ≥ 0, b_0(ξ) and b_1(ξ) coincide with multipliers on W_p for |ξ| ≤ 1 and |ξ| ≥ 1, respectively. Such an operator is said to be a smoothing operator of order ρ in W_p. Since the multipliers on L_2 are simply the functions in L_∞, the above condition can be seen to be satisfied for p = 2 if the b_j are bounded; for general p it suffices in addition that for any multi-index β ≥ 0,
  |D^β b_j(ξ)| ≤ C |ξ|^{−|β|}  uniformly in ξ.
Special examples of smoothing operators of orders 1 and 2, respectively, in the case d = 1 are given by mesh averages [the explicit formulas are illegible in the source], and for general ρ a smoothing operator of order ρ can easily be constructed in the form
  M_k v(x) = (1/k) ∫ ψ(y/k) v(x − y) dy,
where ψ is a function which is piecewise a polynomial of degree ρ − 1 and which vanishes outside an interval of length ρ [the precise supports for ρ odd and ρ even are illegible in the source]. For p = 2, the operator M_k corresponding to a sharp Fourier cutoff,
  φ̂_k(ξ) = 1 for |ξ| ≤ δ/k,  φ̂_k(ξ) = 0 for |ξ| > δ/k,  0 < δ < π,
is a smoothing operator of arbitrarily high order. Smoothing operators in higher dimensions can be obtained by taking products of one-dimensional operators,
  M_k = Π_{j=1}^d M_{k,j},
where M_{k,j} is a smoothing operator with respect to x_j.
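For illustration only (the averaging width is an arbitrary choice, and this is merely a discrete analogue of the convolution operators above): a cell average applied to Heaviside data reproduces the data away from the jump and replaces the jump by a monotone linear ramp a few mesh widths wide.

```python
import numpy as np

h = 0.01
x = np.arange(-1.0, 1.0, h)
v = (x >= 0.0).astype(float)                 # Heaviside initial data

# Cell average over a few mesh widths (discrete box-kernel convolution);
# the window of 5 cells is an illustrative choice, not from the text.
win = 5
v_smooth = np.convolve(v, np.ones(win) / win, mode="same")
# Away from the jump the data is reproduced exactly; across the jump it
# becomes a monotone linear ramp of width win*h.
```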
The result on the rate of convergence is then the following.

Theorem 8. Assume that (2) and E_k are parabolic in the sense of Petrovskii and John, respectively, and that E_k is accurate of order μ and M_k is a smoothing operator of order μ. Then there is a constant C = C_{p,T} such that for 0 < t = nk ≤ T, [the estimate is illegible in the source].
We shall complete this lecture by considering the case of the simple hyperbolic equation
  ∂u/∂t = a ∂u/∂x,  a a real constant,
and a consistent difference operator of the form
  E_k v(x) = Σ_j a_j v(x + jh),
with characteristic polynomial
  r(ξ) = Σ_j a_j e^{ijξ}.
We shall examine the case when E_k is stable in L_2 but unstable in L_p for p ≠ 2; more precisely we shall assume that
  |r(ξ)| ≤ 1 for all real ξ
[the remaining assumptions on r are illegible in the source]. The following result on the rate of convergence is due to Hedstrom [13] and Brenner and Thomée [3].

Theorem 9. Under the above assumptions on the operator E_k, for s > 0, s ≤ μ + 1, and nk ≤ T,
  ||E_k^n v − E(nk) v||_{L_p} ≤ C h^{g(s)} ||v||_{B_p^s},  (10)
[the exponent g(s) is illegible in the source].

It can also be proved that this result is best possible in the sense that the function g(s) above is the largest for which an estimate of the form (10) holds for all v ∈ B_p^s. In the stable cases, in particular p = 2, the order of convergence is sμ/(μ+1) when 0 < s < μ + 1, in agreement with Theorem 3. In the opposite case the error is larger for small s; g(s) is then negative, and for s = 0 we recognize the exponent in e.g. (4.11) (with some difference in notation).

It is interesting to note that if the irregularity of the initial-function stems from the behaviour at isolated points then the result above may be improved so that for p = ∞ we obtain the same result as if E_k were stable in L_∞. We shall formulate this result in terms of the Banach space B of functions v with support in [−M, 0] such that [the defining norm is illegible in the source] is finite. Using the L_2 convergence result, Sobolev's inequality, and the fact that B is continuously embedded in B_2^{s+½}, one can prove the following result.

Theorem 10. Consider an L_2 stable operator E_k for the equation (5). Then for given positive M and s ≤ μ + 1 and for nk ≤ T, [the estimate is illegible in the source].

This result also holds when a depends upon x.
REFERENCES
[i] M.Y.T. Apelkrans. On difference schemes for hyperbolic equations with dis-
continuous initial values. Math. Comp. 22 (1968), 525-539.
[2] Ph. Brenner. The Cauchy problem for symmetric hyperbolic systems in L_p. Math. Scand. 19 (1966), 27-37.
[3] Ph. Brenner and V. Thomée. Stability and convergence rates in L_p for certain difference schemes. Math. Scand. To appear.
[4] Ph. Brenner and V. Thomée. Estimates near discontinuities for some difference schemes. To appear.
[5] M.L. Buchanan. A necessary and sufficient condition for stability of difference schemes for initial-value problems. J. Soc. Indust. Appl. Math. 11 (1963), 919-935.
[6] R. Courant, K. Friedrichs and H. Lewy. Über die partiellen Differenzengleichungen der mathematischen Physik. Math. Ann. 100 (1928), 32-74.
[7] A. Friedman. Partial differential equations of parabolic type.
Prentice-Hall. Englewood Cliffs, New Jersey, 1964.
[8] K. Friedrichs. Symmetric hyperbolic linear differential equations.
Comm. Pure Appl. Math. 7 (1954), 345-392.
[9] I.M. Gelfand and G.E. Schilow. Verallgemeinerte Funktionen III. Deutscher Verlag der Wissenschaften, Berlin, 1964.
[10] S.K. Godunov and V.S. Ryabenkii. Introduction to the theory of difference schemes. Interscience, New York, 1964.
[11] G.W. Hedstrom. The near-stability of the Lax-Wendroff method. Numer. Math. 7 (1965), 73-77.
[12] G.W. Hedstrom. Norms of powers of absolutely convergent Fourier series. Michigan Math. J. 13 (1966), 393-416.
[13] G.W. Hedstrom. The rate of convergence of some difference schemes. SIAM J. Numer. Anal. 5 (1968), 363-406.
[14] G.W. Hedstrom. The rate of convergence of difference schemes with constant coefficients. BIT 9 (1969), 1-17.
[15] F. John. On integration of parabolic equations by difference methods. Comm. Pure Appl. Math. 5 (1952), 155-211.
[16] H.O. Kreiss. Über Matrizen die beschränkte Halbgruppen erzeugen. Math. Scand. 7 (1959), 71-80.
[17] H.O. Kreiss. Über die Lösung des Cauchyproblems für lineare partielle Differentialgleichungen mit Hilfe von Differenzengleichungen. Acta Math. 101 (1959), 179-199.
[18] H.O. Kreiss. Über die Stabilitätsdefinition für Differenzengleichungen die partielle Differentialgleichungen approximieren. BIT 2 (1962), 153-181.
[19] H.O. Kreiss. Über sachgemässe Cauchyprobleme. Math. Scand. 13 (1963), 109-128.
[20] H.O. Kreiss. On difference approximations of dissipative type for hyper-
bolic differential equations. Comm. Pure Appl. Math. 17(1964), 335-353.
[21] H.O. Kreiss, V. Thomée and O.B. Widlund. Smoothing of initial data and
rates of convergence for parabolic difference equations.
Comm. Pure Appl. Math. To appear.
[22] P.D. Lax and R.D. Richtmyer. Survey of the stability of linear finite
difference equations. Comm. Pure Appl. Math. 9 (1956), 267-293.
[23] P.D. Lax and B. Wendroff. Systems of conservation laws. Comm. Pure Appl.
Math. 13 (1960), 217-237.
[24] P.D. Lax and B. Wendroff. Difference schemes for hyperbolic equations with
high order of accuracy. Comm. Pure Appl. Math. 17 (1964), 381-398.
[25] J. Löfström. Besov spaces in theory of approximation. Ann. Mat. Pura Appl. 85 (1970), 93-184.
[26] G.G. O'Brien, M.A. Hyman and S. Kaplan. A study of the numerical solution
of partial differential equations. J. Math. and Phys. 29(1951), 223-251.
[27] J. Peetre and V. Thomée. On the rate of convergence for discrete initial-
value problems. Math. Scand. 21 (1967), 159-176.
[28] R.D. Richtmyer and K.W. Morton. Difference methods for initial-value
problems. 2nd ed., Interscience, New York, 1967.
[29] V.S. Ryabenkii and A.F. Filippov. Über die Stabilität von Differenzengleichungen. Deutscher Verlag der Wissenschaften, Berlin, 1960.
[30] S.I. Serdjukova. A study of stability of explicit schemes with constant real coefficients. Ž. Vyčisl. Mat. i Mat. Fiz. 3 (1963), 365-370.
[31] S.I. Serdjukova. On the stability in C of linear difference schemes with constant real coefficients. Ž. Vyčisl. Mat. i Mat. Fiz. 6 (1966), 477-486.
[32] W.G. Strang. Polynomial approximation of Bernstein type.
Trans. Amer. Math. Soc. 105 (1962), 525-535.
[33] V. Thomée. Stability of difference schemes in the maximum-norm.
J. Differential Equations 1 (1965), 273-292.
[34] V. Thomée. On maximum-norm stable difference operators. Numerical Solution of Partial Differential Equations (Proc. Sympos. Univ. Maryland, 1965), pp. 125-151. Academic Press, New York.
[35] V. Thomée. Parabolic difference operators. Math. Scand. 19 (1966), 77-107.
[36] V. Thomée. Stability theory for partial difference operators. SIAM Rev. 11 (1969), 152-195.
[37] V. Thomée. On the rate of convergence of difference schemes for hyperbolic equations. Numerical Solution of Partial Differential Equations (Proc. Sympos. Univ. Maryland, 1970). To appear.
[38] O.B. Widlund. On the stability of parabolic difference schemes. Math. Comp. 19 (1965), 1-13.
[39] O.B. Widlund. Stability of parabolic difference schemes in the maximum-norm. Numer. Math. 8 (1966), 186-202.
[40] O.B. Widlund. On the rate of convergence for parabolic difference schemes, II. Comm. Pure Appl. Math. 23 (1970), 79-96.
Iteration Parameters in the
Numerical Solution of Elliptic Problems
EUGENE L. WACHSPRESS
General Electric Company Schenectady, New York
These notes are intended to serve as a guide to a deeper study of material presented in a series of lectures delivered in September, 1970 at the University of Dundee as a part of the special one year symposium on The Theory of Numerical Analysis.
Lecture   Subject
1         A Concise Review of the General Topic and Background Theory
2         Successive Overrelaxation: Theory
3         Successive Overrelaxation: Practice
4         Residual Polynomials and Chebyshev Extrapolation: Theory
5         Residual Polynomials: Practice
6         Alternating-Direction-Implicit Iteration: Theory
7         Parameters for the Peaceman-Rachford Variant of ADI
Reference text: "Iterative Solution of Elliptic Systems," by Wachspress (Prentice Hall, 1966).
1. A CONCISE REVIEW OF THE GENERAL TOPIC AND BACKGROUND THEORY
We are concerned with iterative approximation to the vector x which satisfies the system of linear equations
  A x = k,  (1)
where: k is a known m-vector,
  A is a given nonsingular m×m matrix, and
  x is an m-vector which is to be found.
An approximation y to x is acceptable when
  ||y − x|| / ||x|| < E,  (2)
where E is some prescribed error bound and || · || a designated norm.
We shall first categorize various iteration procedures. We shall then des-
cribe a measure of efficiency or rate of convergence. Having done this, we will
indicate a rather general approach to demonstrating convergence for a wide class of
methods for iterative solution of linear systems. Finally, an example of each of
three major classes of linear iteration procedures will be given.
It is convenient to restrict matrix A in (1) to be real, symmetric, and positive definite. (Our definition of positive definite is such that a real matrix which is p.d. must be symmetric.) Although less restrictive conditions are subject to analysis, the discussion is greatly simplified in this manner.
A STATIONARY LINEAR ITERATION procedure is characterized by:
  x_0 = a known "trial" vector,  (3)
  x_n = T x_{n−1} + R k,  n = 1,2,…
This procedure is convergent if x_n → x = A^{−1}k for any x_0 and k. In order that x be a stationary point, we require:
  x = T x + R k.  (4)
Defining the error vector e_n = x_n − x, and subtracting (4) from (3), we get
  e_n = T e_{n−1} = T^n e_0,  (5)
e_n → 0 for arbitrary e_0 iff T^n → 0.
The spectral radius of T is r(T) = max_i |g_i(T)| where the g_i are eigenvalues of T. Thus, r(T) must be less than unity for convergence. From (4), (I − T)A^{−1}k = R k for any k, so that (I − T)A^{−1} = R, and (3) can be written
  x_n = T x_{n−1} + (I − T)A^{−1}k.  (6)
If we could compute A^{−1}k on the right hand side of (6), we would have no need for iteration. Thus, T must be such that the right hand side of (6) does not require computation of A^{−1} or A^{−1}k. To clarify this, suppose we can solve the system Bx = k for x, where B approximates A in some sense. We may attempt to iterate by:
  B x_n = (B − A) x_{n−1} + k,  or
  x_n = (I − B^{−1}A) x_{n−1} + B^{−1}k.
Here, T = I − B^{−1}A and (I − T)A^{−1} = B^{−1} = R. We note that B is a "good" approximation to A for this iteration when the spectral radius of (I − B^{−1}A) is much less than unity.
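A minimal numerical sketch of this splitting idea, with the classical Jacobi choice B = diag(A); the matrix and right-hand side are illustrative, not from the notes:

```python
import numpy as np

# Splitting iteration B x_n = (B - A) x_{n-1} + k with B = diag(A)
# (the classical Jacobi choice of an easily solvable approximation to A).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
k = np.array([1.0, 2.0, 3.0])
B = np.diag(np.diag(A))

T = np.eye(3) - np.linalg.solve(B, A)    # iteration matrix I - B^{-1}A
r = max(abs(np.linalg.eigvals(T)))       # spectral radius; here about 0.35

x = np.zeros(3)
for _ in range(100):
    x = np.linalg.solve(B, (B - A) @ x + k)
```

Since r < 1, the iterates converge to A^{−1}k.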
We will now derive a condition sufficient to assure convergence which has
application to many iteration techniques.
A is positive definite and hence has a unique positive-definite square root A^{½}. Thus, (I − B^{−1}A) is similar to (I − A^{½}B^{−1}A^{½}), whose spectral radius is bounded above by the square root of the largest eigenvalue of
  (I − A^{½}B^{−1}A^{½})(I − A^{½}B^{−T}A^{½}) = I − A^{½}B^{−1}(B + B^T − A)B^{−T}A^{½}.  (7)
The condition sufficient for convergence is that B + B^T − A be positive definite. For then we can define K = A^{½}B^{−1}(B + B^T − A)^{½} and M = (I − A^{½}B^{−1}A^{½}) and rewrite (7) as MM^T = I − KK^T; we note here that K is nonsingular, being the product of nonsingular matrices. Thus 0 ≤ g(MM^T) = g(I − KK^T) = 1 − g(KK^T) < 1, and we have proved that r(I − B^{−1}A) < 1 if B + B^T − A is positive definite. To convince ourselves that this is not a necessary condition, we need only find one counterexample: one can exhibit a pair A, B which yield r(T) < 1 while B + B^T − A is not positive definite.
If we choose our norm so that ||T|| ≈ r(T) and if x_0 = 0, then e_0 = −x and
  ||e_n|| = ||T^n e_0|| ≤ ||T||^n ||e_0|| ≈ r^n ||x||,
so that ||e_n|| / ||x|| < E for n > log E / log r. We note that when r is close to unity the number of iterations required to satisfy a prescribed convergence criterion varies as 1/(1 − r).
A PARTIALLY STATIONARY LINEAR procedure is one for which the approximation is obtained as a linear combination of vectors which could be obtained by a stationary procedure. Let x_j (j = 1,2,…,n) be iterates generated by (6). Then the n-th iterate of a partially stationary procedure based upon this stationary iteration would be
  y_n = Σ_j a_{jn} x_j.
If we define f_n = y_n − x, then f_n = P_n(T) e_0, where P_n(T) is a polynomial of degree n in T, normalized to unity when T = I. (I is the identity matrix of order the same as T.) Convergence is established by showing that for a given P_n and T:
  r = lim_{n→∞} [ sup_i |P_n(g_i)| ]^{1/n} < 1,
where the g_i are the eigenvalues of matrix T.
A NONSTATIONARY LINEAR iterative procedure is one for which the iteration matrix is a function of parameters which may change from iteration to iteration:
  x_n = T_n x_{n−1} + R_n k,  or  x_n = (Π_{j=1}^n T_j) x_0 + (I − Π_{j=1}^n T_j) A^{−1}k.
If r_n is the spectral radius of Π_{j=1}^n T_j, then the asymptotic convergence rate is lim_{n→∞} r_n^{1/n}. When the T_j all commute, analysis is quite similar to that applied to partially stationary schemes. When the T_j do not commute convergence theory is often less definitive.
In examining relative merits of iteration procedures, we endeavor first to
establish convergence for a range of iteration parameters, second to determine the
spectral radius as a function of these parameters, and third to ascertain how these
parameters may be chosen to minimise this spectral radius. Thus, a minimax problem
arises in the analysis of each iterative procedure. Three commonly used techniques
will be considered in subsequent lectures.
These are:
I. Successive Overrelaxation (stationary)
II. Chebyshev Extrapolation (partially stationary)
III. Alternating-Direction-Implicit Iteration (nonstationary).
Each of these is well documented in the literature and these notes are intended only
as an introduction to the subject rather than a detailed analysis.
2. SUCCESSIVE OVERRELAXATION: THEORY
We may "improve" the value of a component of the vector x by computing as a new value during iteration n that number which yields satisfaction of the p-th equation with values from iteration n−1 substituted in the equation for the remaining components of x:
  Σ_{j=1}^m a_{pj} x_j = k_p,  with x_j taken from iteration n−1 for j ≠ p and from iteration n for j = p.
An iteration consists in improvement of all components of x. We may call this
"Relaxation" or "simultaneous relaxation". On an array computer with many arithmetic
units working in parallel, it would be possible to improve all components simultan-
eously.
We may, alternatively, use new neighbour values as soon as they are computed. Then
  Σ_{j=1}^p a_{pj} x_j^n + Σ_{j=p+1}^m a_{pj} x_j^{n−1} = k_p,  p = 1,2,…,m.
Components are now improved in some order, and we call this "successive relaxation".
Although this latter approach is often better than simultaneous relaxation, a more
significant gain in efficiency is usually achieved by extrapolation. If we denote
the unextrapolated result of successive relaxation at point p by x̃_p^n, then before proceeding to the next point during the n-th iteration, we compute for a prescribed extrapolation parameter w:
  x_p^n = x_p^{n−1} + w ( x̃_p^n − x_p^{n−1} ).  (8)
Numerical solution of elliptic type difference equations is accomplished with w in (1,2), and since w is greater than unity this is called "successive overrelaxation." Among factors responsible for the extensive literature on SOR are its simplicity, wide applicability, and firm theoretical foundation.
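A minimal sketch of SOR as described by (8); the test matrix and the value of w are illustrative choices, not from the notes:

```python
import numpy as np

def sor(A, k, w, n_iter=200):
    """Successive overrelaxation: sweep the unknowns in natural order,
    forming the successive-relaxation value at each point and then
    extrapolating by w as in (8)."""
    m = len(k)
    x = np.zeros(m)
    for _ in range(n_iter):
        for p in range(m):
            # successive relaxation uses new values for j < p, old for j > p
            xp = (k[p] - A[p, :p] @ x[:p] - A[p, p + 1:] @ x[p + 1:]) / A[p, p]
            x[p] = x[p] + w * (xp - x[p])
    return x

# Illustrative model problem: 1-D Laplacian with 10 unknowns.
m = 10
A = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
k_vec = np.ones(m)
x = sor(A, k_vec, w=1.5)
```

For this model problem any w in (0,2) converges, in line with the theory below.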
D.M. Young's analysis provided a basis for efficient utilization of SOR. He
introduced the concept of a "consistent ordering" which is related to what is now
known as Young's Property A. The equations of a system having this property may be
ordered as follows:
An index s(p) is assigned to unknown x_p associated with the p-th equation. Then for every nonzero a_{pj}, the ordering s(p) is said to be consistent if
  s(j) = s(p) − 1 for j < p,
  s(j) = s(p) + 1 for j > p.
Components of x are improved in order of increasing s. Points (equations) with a
common s-value which are coupled to one another directly must be updated simultane-
ously. Many systems arising in practice may be consistently ordered without a need
for simultaneous improvement even though several components may have common s-values.
Five-point difference stars are an example.
Young established an intimate relationship between eigensolutions of the simultaneous relaxation and consistently ordered SOR iteration matrices derived therefrom. If g is an eigenvalue of the SR matrix, then there is a corresponding eigenvalue h of the SOR matrix satisfying:
  ( h + w − 1 )² = h w² g².  (9)
The optimum extrapolation parameter w_b is obtained by solving the minimax problem:
  H(w, G) = max_g |h(w, g)|,
  H(w_b, G) = min_w H(w, G),
where G denotes the spectral radius of the SR matrix. The solution to this minimax problem (which is a rather simple minimax problem) is
  w_b = 2 / ( 1 + √(1 − G²) )  and  H(w_b, G) = w_b − 1.  (10)
The remarkable gain in efficiency of SOR over SR is evidenced for the case G = 1 − r with r << 1 by comparing the relative number of iterations required by the two methods for an error reduction of E:
  n_SR = −ln E / r  while  n_SOR = −ln E / ( 2√(2r) ).  (11)
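Plugging numbers into (10) and (11) makes the gain concrete; the value of G below is an assumed example:

```python
import math

G = 0.99                                   # assumed SR spectral radius, G = 1 - r
r = 1.0 - G
wb = 2.0 / (1.0 + math.sqrt(1.0 - G * G))  # optimum parameter, eq. (10)
H = wb - 1.0                               # SOR spectral radius at w = wb

E = 1e-6                                   # required error reduction
n_sr = -math.log(E) / r                    # eq. (11): about 1.4e3 iterations
n_sor = -math.log(E) / (2.0 * math.sqrt(2.0 * r))   # about 49 iterations
```

Even for this modest G the iteration count drops by more than an order of magnitude.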
The potency of the convergence theorem given in the first lecture is illustrated by applying it to the SOR iteration matrix. Let the coefficient matrix be A = D − R − R^T where D is diagonal (and positive since A is positive-definite by hypothesis) and R is strictly lower triangular. If SOR is applied with the natural ordering (successive updating of components 1,2,…) then the SOR iteration matrix is
  L_w = (I − wD^{−1}R)^{−1} ( (1 − w)I + wD^{−1}R^T ).
This may be written in the alternative form
  L_w = I − ( D/w − R )^{−1} A.
Thus, B in equation (7) is equal to D/w − R in this case, and
  B + B^T − A = 2D/w − R − R^T − D + R + R^T = (2/w − 1) D.
The spectral radius of L_w is less than unity when 2/w − 1 is greater than zero, or when w is in (0,2).
It can also be shown by this approach that S0R converges even when the order-
ing and the extrapolation parameter are changed each iteration.
When one digs deeper into the theory, one finds that the S0R iteration matrix
with optimum parameter and consistent ordering does not have a diagonal Jordan form.
The resulting eigenvector deficiency has an adverse effect on convergence.
3. SUCCESSIVE OVERRELAXATION: PRACTICE
For efficient implementation of SOR, one must choose an appropriate ordering
and estimate the optimum parameter w b. One must also choose a strategy consistent
with the characteristics of the computer for which the iteration program is designed.
This latter point is sometimes overlooked. One illustration is that on the CDC-6600
there is a stack feature which leads to a gain in speed by a factor of ten in the
basic arithmetic when one programs the "inner arithmetic loop" in machine language,
taking full advantage of the stack feature rather than relying on FORTRAN. Another
consideration is relative efficiency of getting data in and out of fast memory and
of computation once the data is in memory. On some machines several iterations can
be performed in the time it takes to read the data in and out of memory. The method
of concurrent iterations enables one to perform several iterations with one pass over
the equations. This is particularly important when solving large problems where all the data cannot be contained in memory.
Periodic boundary conditions present minor difficulties. Line relaxation is
beneficial, and the proper choice of lines enables retention of consistent ordering.
(I often think in terms of problems arising from discretization of partial differential equations to obtain five-point or seven-point difference stars.)
A major consideration in any event is choice of w. As the spectral radius of
SR approaches unity, it becomes increasingly more important to choose a good w.
Several elaborate techniques have been described for estimating the extrapolation
parameter. However, a reasonably effective procedure which I have had success with for many years does not require any sophisticated additional programming.
One starts with a parameter w_0, chosen deliberately smaller than w_b, and iterates until an asymptotic convergence rate is established. This convergence rate is measured by comparing successive changes in the vector:
  ||x_n − x_{n−1}|| / ||x_{n−1} − x_{n−2}|| → H(w_0, G).
For w_0 < w_b:
  H(w_0, G) = ( w_0 G/2 + √( w_0²G²/4 + 1 − w_0 ) )².
To estimate w_b we estimate G, using the above equation:
  G̃ = ( H(w_0, G) + w_0 − 1 ) / ( w_0 √H(w_0, G) ).  (12)
We thus estimate after n iterations with w_0:
  w_1 = 2 / ( 1 + √(1 − G̃²) ).
The presence of higher modes decaying at the rate w_0 − 1 is such that in practice the asymptotic rate is approached from below and w_1 is less than w_b. Thus, the extrapolation parameter may be updated periodically in this fashion.
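The updating procedure can be sketched as follows; the model matrix, the starting parameter w_0, and the iteration count are illustrative assumptions, and Young's relation is used to recover G from the observed convergence rate:

```python
import math
import numpy as np

def sor_sweep(A, k, x, w):
    # one SOR sweep in the natural ordering
    for p in range(len(k)):
        xp = (k[p] - A[p, :p] @ x[:p] - A[p, p + 1:] @ x[p + 1:]) / A[p, p]
        x[p] = x[p] + w * (xp - x[p])
    return x

# Model problem (illustrative): 1-D Laplacian, m = 20 unknowns.
m = 20
A = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
k_vec = np.ones(m)

w0 = 1.2                      # deliberately smaller than w_b
x = np.zeros(m)
x_prev = x.copy()
ratio = None
for n in range(400):
    x_new = sor_sweep(A, k_vec, x.copy(), w0)
    if n > 0:                 # ratio of successive changes -> H(w0, G)
        ratio = np.linalg.norm(x_new - x) / np.linalg.norm(x - x_prev)
    x_prev, x = x, x_new

H = ratio                                    # observed H(w0, G)
G = (H + w0 - 1.0) / (w0 * math.sqrt(H))     # invert Young's relation, eq. (12)
w1 = 2.0 / (1.0 + math.sqrt(1.0 - G * G))    # updated estimate of w_b
```

For this Laplacian the Jacobi spectral radius is cos(π/21), so w1 should land very close to the true optimum near 1.74.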
If one encounters a class of problems for which w b is quite close to two (say,
greater than 1.95), then one should seek alternative procedures to either replace or
supplement the SOR iteration.
4. RESIDUAL POLYNOMIALS AND CHEBYSHEV EXTRAPOLATION: THEORY
We may append to the stationary procedure
  x*_n = T x_{n−1} + (I − T)A^{−1} k  (13)
the extrapolation
  x_n = x_{n−1} + w_n ( x*_n − x_{n−1} ).  (14)
This differs from SOR in that improved values do not appear on the right hand side of (13) and the extrapolation parameter varies with n. It is easily shown that
  e_n = [ Π_{j=1}^n ( (1 − w_j)I + w_j T ) ] e_0 = P_n(T) e_0  (15)
where P_n(T) is a polynomial of degree n in T (w_j ≠ 0) and P_n(1) = 1. To obtain a least upper bound on the error norm reduction for arbitrary e_0 and spectrum g of T, we solve the minimax problem
  μ_{P_n} = sup_{g_i} |P_n(g_i)|,
  μ_{C_n} = min_{P_n} μ_{P_n}  subject to  P_n(1) = 1.  (16)
This classic Chebyshev problem has as its solution for g_i arbitrary in the interval (a,b) with b less than unity:
  C_n(g) = cos( n cos^{−1}( (2g − (a+b)) / (b−a) ) ) / cosh( n cosh^{−1}( (2 − (a+b)) / (b−a) ) ).  (17)
The roots of this polynomial are
  z_j = [ (b−a) cos( (2j−1)π / 2n ) + (a+b) ] / 2,  j = 1,2,…,n.  (18)
By choosing w_j = 1/(1 − z_j) for use in (14), we generate the Chebyshev polynomial of degree n after n iterations.
This approach has some undesirable features. We must decide upon n in advance to compute the set of w_j. When b is close to unity, some of the w_j become quite large. In this case, large values of n are often required to yield an acceptable approximation. Roundoff error can become detrimental.
By utilizing the recursion formulas for Chebyshev polynomials, we can refine
the iteration. A three-term extrapolation formula is now used:
  x_j = x_{j−1} + p_j ( x*_j − x_{j−1} ) + q_j ( x_{j−1} − x_{j−2} ),  (20)
where the p_j and q_j are generated by:
  u = ( 2 − (a+b) ) / ( b − a ),
  p_1 = 2 / ( 2 − a − b ),  q_1 = 0,  (21)
  p_j = ( 4 / (b−a) ) C_{j−1}(u) / C_j(u)  and  q_j = C_{j−2}(u) / C_j(u)  for j > 1,  (22)
where C_j denotes the Chebyshev polynomial of degree j.
The p_j and q_j approach asymptotic values of order of magnitude unity as b approaches unity; roundoff error is no longer a serious problem, and we need no longer decide in advance upon the number of iterations.
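The three-term scheme (20)-(22) can be sketched as below; the model matrix is an illustrative choice, and the Chebyshev values C_j(u) are generated by their own three-term recurrence:

```python
import numpy as np

# Basic iteration x* = T x + c with T = I - A, so that c = (I - T)A^{-1}k = k;
# the model matrix is an illustrative choice with eigenvalues of T in (-1, 1).
m = 12
A = np.eye(m) - 0.45 * np.eye(m, k=1) - 0.45 * np.eye(m, k=-1)
k_vec = np.ones(m)
T = np.eye(m) - A

eig = np.linalg.eigvalsh(T)
a, b = eig.min(), eig.max()          # eigenvalue interval [a, b] of T, b < 1

u = (2.0 - (a + b)) / (b - a)
C2, C1 = 1.0, u                      # C_0(u), C_1(u)

x_old = np.zeros(m)
# first step: p_1 = 2/(2 - a - b), q_1 = 0, as in (21)
x = x_old + (2.0 / (2.0 - a - b)) * (T @ x_old + k_vec - x_old)
for j in range(2, 60):
    Cj = 2.0 * u * C1 - C2           # Chebyshev recurrence for C_j(u)
    p = 4.0 * C1 / ((b - a) * Cj)    # (22)
    q = C2 / Cj                      # (22)
    x_star = T @ x + k_vec
    x, x_old = x + p * (x_star - x) + q * (x - x_old), x
    C2, C1 = C1, Cj
```

After n steps the error is reduced by 1/C_n(u), the Chebyshev bound of (17).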
There are other polynomial extrapolation procedures in common use. One family
of iteration schemes based on Lanczos' work involves computation of the pj and qj
from certain inner products of the x_j and x*_j vectors. These include the steepest descent and conjugate gradient methods.
5. RESIDUAL POLYNOMIALS: PRACTICE
The eigenvalue interval (a,b) upon which the extrapolation parameters are based may be estimated from observed convergence with assumed bounds. Oscillatory error behavior indicates that the assumed lower bound is too large, while a uniformly signed error indicates the upper bound is too low (for the case where the assumed bounds lie within the true bounds). By noting the sign of the error and the rate of change, one can update the eigenvalue interval.
Although it is possible to start a new parameter cycle after each updating,
one may use the asymptotic values for pj and qj immediately. This has proven to be
quite satisfactory for many problems.
Systems which satisfy Young's Property A should be treated by the Golub-Varga method which reduces the arithmetic by a factor of two. (See pp. 155-6 in the reference text.)
The Chebyshev and S0R methods are comparable in efficiency. The Chebyshev
method does often yield a greater reduction in error norm for a given number of
iterations, but other factors often outweigh this. Computer and problem character-
istics often dictate which approach is better.
We may examine more precisely means by which the interval [a,b] may be esti-
mated. Although I have had no occasion to use the procedure which will now be
described, thus making this discussion more "theory" than "practice", the means by
which parameters are updated is one of the more practical aspects of iteration and
thus falls appropriately in this section.
When Chebyshev extrapolation is based upon an assumed eigenvalue interval [a',b'] which contains the true interval [a,b], the asymptotic convergence rate is
  r = lim_{n→∞} [ C_n(u') ]^{−1/n} = ( u' + √(u'² − 1) )^{−1},  where u' = ( 2 − (a'+b') ) / ( b' − a' ).
When [a,b] is not contained in [a',b'], error components associated with eigenvalues outside [a',b'] will eventually predominate. If it is known that either a' ≤ a or b' ≥ b, we seek only one bound and the procedure is analogous to that already described for SOR. We shall describe a more sophisticated approach for estimating both a and b.
After sufficiently many iterations, s, we suppose that successive iterates satisfy
  x_{s+t} ≈ x + λ_1^t z_1(s) + λ_2^t z_2(s),  t = 1,2,3,…
and define for k = 1,2,3,…:
  e_k = x_{s+k−1} − x_{s+k} = λ_1^{k−1}(1 − λ_1) z_1 + λ_2^{k−1}(1 − λ_2) z_2.
It is easily shown that
  e_3 − (λ_1 + λ_2) e_2 + λ_1 λ_2 e_1 = 0.
Let w(α,β) = e_3 − α e_2 + β e_1.
We may ascertain values for α and β which minimize ||w(α,β)||²:
  α_0 = [ (e_3,e_2)(e_1,e_1) − (e_1,e_2)(e_3,e_1) ] / [ (e_2,e_2)(e_1,e_1) − (e_1,e_2)² ],
  β_0 = [ (e_2,e_1)(e_3,e_2) − (e_2,e_2)(e_3,e_1) ] / [ (e_2,e_2)(e_1,e_1) − (e_1,e_2)² ].
We may, for example, compute α_0 and β_0 after every ten iterations (s = 10,20,30,…) until values are obtained which do not change appreciably with s. Having determined α_0 and β_0 we may estimate λ_1 and λ_2 by using the relationships λ_1 + λ_2 = α_0 and λ_1 λ_2 = β_0. Thus
  λ_1 = ( α_0 + √(α_0² − 4β_0) ) / 2  and  λ_2 = ( α_0 − √(α_0² − 4β_0) ) / 2
are the estimated values for λ.
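A sketch of this least-squares estimate on synthetic two-mode data; the modes and decay factors below are fabricated purely to check the algebra:

```python
import numpy as np

# Synthetic two-mode differences e_k with known decay factors (in practice
# the e_k come from successive iterate differences as defined above).
rng = np.random.default_rng(0)
z1 = rng.standard_normal(50)
z2 = rng.standard_normal(50)
lam1, lam2 = 0.9, -0.5
e1, e2, e3 = (lam1**(k - 1) * (1 - lam1) * z1 +
              lam2**(k - 1) * (1 - lam2) * z2 for k in (1, 2, 3))

# Normal equations for minimizing ||e3 - a*e2 + b*e1||^2:
D = (e2 @ e2) * (e1 @ e1) - (e1 @ e2) ** 2
a0 = ((e3 @ e2) * (e1 @ e1) - (e1 @ e2) * (e3 @ e1)) / D
b0 = ((e2 @ e1) * (e3 @ e2) - (e2 @ e2) * (e3 @ e1)) / D

# lam1 + lam2 = a0 and lam1*lam2 = b0, so the lam's are the quadratic roots:
disc = np.sqrt(a0 * a0 - 4.0 * b0)
l1, l2 = (a0 + disc) / 2.0, (a0 - disc) / 2.0
```

On this exact two-mode data the procedure recovers λ_1 = 0.9 and λ_2 = −0.5 to rounding error.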
Referring to the Chebyshev polynomials with eigenvalue Z of the basic iteration (SR) matrix, let
  x = ( 2Z − (a'+b') ) / ( b' − a' ).
The corresponding error component is damped at each iteration by the factor
  λ ≈ r · cosh[ (s+t) cosh^{−1} x ] / cosh[ (s+t−1) cosh^{−1} x ] ≈ r ( x ± √(x² − 1) ),
and in either case we have
  x = ( λ/r + r/λ ) / 2.
The estimates for a and b are obtained by substitution in the above equation:
  b ≈ ½ [ (a'+b') + (b'−a')( λ_1/r + r/λ_1 )/2 ]  if λ_1 ≥ r,
  a ≈ ½ [ (a'+b') + (b'−a')( λ_2/r + r/λ_2 )/2 ]  if λ_2 ≤ −r.
This analysis is intended primarily as a guide to further study of methods for
estimating extrapolation parameters. Numerical procedures should be molded to suit
specific problems. We have indicated how a comparison of observed convergence with
theoretical convergence provides a means for updating parameters.
6. ALTERNATING-DIRECTION-IMPLICIT ITERATION
A family of non-stationary procedures is generated by splitting A into the sum of two matrices so that a two step procedure of the following type is formed:
  A = H + V,
  (H + w_j I) x_{j−½} = −(V − w_j I) x_{j−1} + k,  (23)
  (V + z_j I) x_j = −(H − z_j I) x_{j−½} + k,
  j = 1,2,…
The matrices H and V are chosen so that this iteration is numerically con-
venient. For example, five-point difference equations can be handled when H
includes horizontal (along lines of constant y-value) coupling while V includes all
vertical (along lines of constant x-value) coupling. Then (23) involves a correction
of each horizontal line treated as a block followed by correction of each vertical
line. When H and V are both positive-definite and all the $w_j$ and $z_j$ are
equal to a single positive constant, say w, convergence is easily demonstrated:
$$\underline{e}_j = T^j \underline{e}_0\,, \quad\text{where}\quad T = (V+wI)^{-1}(H-wI)(H+wI)^{-1}(V-wI)\,.$$
$T$ is similar to $T' = K(H)\,K(V)$, where $K(X) = (X-wI)(X+wI)^{-1}$ has a spectral norm
less than unity for any positive-definite $X$. Hence, the spectral radius of $T'$ is
less than unity.
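A quick numerical check of this norm bound (the eigenvalue intervals of H and V below are illustrative assumptions):

```python
# For a symmetric positive-definite X, K(X) = (X - wI)(X + wI)^{-1} has
# spectral norm max over the eigenvalues l of X of |(l - w)/(l + w)|,
# which is below one for any w > 0.  Eigenvalues here are illustrative.
def k_norm(eigs, w):
    return max(abs((l - w) / (l + w)) for l in eigs)

eigs_H, eigs_V, w = [1.0, 3.0], [2.0, 5.0], 2.0
bound = k_norm(eigs_H, w) * k_norm(eigs_V, w)  # bounds the spectral radius of T'
```

Since each factor is below one, the product bounding the spectral radius of $T'$ is below one, which is exactly the convergence argument above.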
Convergence is not assured without further conditions when $w_j$ and $z_j$ vary
with j. Pearcy's theorem asserts that by using a large enough number of parameters
in a monotonically nonincreasing sequence (within the interval of the eigenvalues
of H and V) one can obtain a spectral radius less than unity. Repetitive application
of such a parameter cycle is convergent.
The theory for this procedure is not as useful in application as theory for
SOR and polynomial extrapolation.
The analysis of the model problem, wherein H and V commute, is quite elegant.
Optimum parameters are found by solving the minimax problem:
$$g(x, \underline{w}, \underline{z}) = \prod_{j=1}^{t} \frac{w_j - x}{z_j + x} \qquad (25)$$
$$H = \max_{\substack{a \le x \le b \\ c \le y \le d}} \left| g(x, \underline{w}, \underline{z})\, g(y, \underline{z}, \underline{w}) \right| \qquad (26)$$
$$H_{00} = \min_{\underline{w}, \underline{z}} H\,.$$
The existence of a unique solution to this minimax problem was recently
established in my thesis (see the IFIP '68 proceedings for a concise summary). Several
means are available for choosing nearly optimum parameters.
It is interesting to review the literature on this problem and note how the
theory has been developed during the past fifteen years. An analytic solution for
w and z found by W.B. Jordan culminated the search for optimum parameters. Nevertheless,
this minimax problem was actually solved about 100 years earlier (as observed
by J. Todd)! Jordan first devised a bilinear transformation of variables to
reduce the problem to an analogous one with identical eigenvalue intervals for both
variables. My thesis could then be used to establish that the set of $w_j$ is identical
to the $z_j$ (except for order) for the transformed problem.
7. PARAMETERS FOR THE PEACEMAN-RACHFORD VARIANT OF ADI
The optimum parameters are obtained by theory involving modular transformations
of elliptic functions. Numerical evaluation turns out to be quite easy. An approximation
valid over a wide range is:
$$w_j \doteq \frac{2b\,(a/4b)^{r_j}\left[1 + (a/4b)^{2(1-r_j)}\right]}{1 + (a/4b)^{2 r_j}}\,, \qquad r_j = \frac{2j-1}{2t}\,, \quad j = 1, 2, \ldots, t\,. \qquad (27)$$
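A simpler, widely used relative of such formulas is the classical geometric parameter sequence $w_j = b(a/b)^{(2j-1)/2t}$ for an eigenvalue interval $[a, b]$; the sketch below uses that sequence (with arbitrary illustrative values of a, b and t) and is not necessarily identical to (27):

```python
# Nearly optimum Peaceman-Rachford parameters as a geometric sequence
# w_j = b*(a/b)**((2j-1)/(2t)), a classical choice for the eigenvalue
# interval [a, b]; the values of a, b and t below are illustrative.
def adi_parameters(a, b, t):
    return [b * (a / b) ** ((2 * j - 1) / (2.0 * t)) for j in range(1, t + 1)]

params = adi_parameters(0.1, 10.0, 4)  # decreases from near b toward a
```

The parameters interlace the interval $[a, b]$ geometrically, which is what equal-ripple reasoning on the factors $(w_j - x)/(w_j + x)$ suggests.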
To illustrate the mathematical elegance of analysis of convergence rates of
(23), we will derive the equations for generating parameters $a_j$ which solve the
minimax problem:
$$g(x, \underline{a}) = \prod_{j=1}^{t} \frac{a_j - x}{a_j + x} \qquad (28)$$
$$H = \max_{a \le x \le b} |g(x, \underline{a})|\,, \qquad H_0 = \min_{\underline{a}} H\,. \qquad (29)$$
Multiplying numerator and denominator of each factor in the product on the right
hand side of (28) by $ab/(a_j x)$, we obtain
$$g(x, \underline{a}) = \prod_{j=1}^{t} \frac{ab/x - ab/a_j}{ab/x + ab/a_j}\,. \qquad (30)$$
As x varies from a to b, $ab/x$ varies from b to a. Hence the set $\{ab/a_j\}$ is the same
as the set $\{a_j\}$ by virtue of the uniqueness of the parameters for any given eigenvalue
interval. Combining the factors with $a_j$ and $ab/a_j$, we get
$$\frac{(a_j - x)(ab/a_j - x)}{(a_j + x)(ab/a_j + x)} = \frac{(x^2 + ab) - (a_j + ab/a_j)x}{(x^2 + ab) + (a_j + ab/a_j)x} = \frac{(x + ab/x) - (a_j + ab/a_j)}{(x + ab/x) + (a_j + ab/a_j)}\,.$$
Now let $x' = \tfrac{1}{2}(x + ab/x)$ and $a_j' = \tfrac{1}{2}(a_j + ab/a_j)$. Then
$$|g(x, \underline{a})| = \prod_{j=1}^{t/2} \left|\frac{a_j' - x'}{a_j' + x'}\right| \qquad (31)$$
where $(ab)^{1/2} = a' \le x' \le b' = (a+b)/2$.
Continuing in this fashion, we successively reduce the number of factors in the
product until we arrive at the one parameter problem:
$$g(x^{(n)}, a) = \frac{a_1^{(n)} - x^{(n)}}{a_1^{(n)} + x^{(n)}}\,, \qquad a^{(n)} \le x^{(n)} \le b^{(n)}\,.$$
This is solved by noting that
$$\frac{a_1^{(n)} - a^{(n)}}{a_1^{(n)} + a^{(n)}} = \frac{b^{(n)} - a_1^{(n)}}{b^{(n)} + a_1^{(n)}}\,, \quad\text{or}\quad a_1^{(n)} = \left(a^{(n)}\, b^{(n)}\right)^{1/2}\,. \qquad (32)$$
We may work backwards to obtain a parameter "tree" by successive solution of
quadratics:
$$a_{j'}^{(s-1)} = a_j^{(s)} \pm \sqrt{\left(a_j^{(s)}\right)^2 - \left(a^{(s)}\right)^2}\,. \qquad (33)$$
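A sketch of this backward construction for $t = 2^n$ parameters (pure Python; it combines the endpoint reduction $a' = \sqrt{ab}$, $b' = (a+b)/2$ with the single-parameter solution (32) and the splitting quadratic (33)):

```python
import math

# Build the parameter "tree" for t = 2**n parameters on [a, b]: reduce
# the interval n times via (a, b) -> (sqrt(a*b), (a+b)/2), start from the
# single optimum parameter sqrt(a_n * b_n) of eq. (32), and split each
# parameter by solving the quadratic of eq. (33) on the way back down.
def parameter_tree(a, b, n):
    intervals = [(a, b)]
    for _ in range(n):
        lo, hi = intervals[-1]
        intervals.append((math.sqrt(lo * hi), (lo + hi) / 2.0))
    lo_n, hi_n = intervals[-1]
    params = [math.sqrt(lo_n * hi_n)]
    for s in range(n, 0, -1):
        prod = intervals[s - 1][0] * intervals[s - 1][1]  # = (a^(s))**2
        params = [p + sign * math.sqrt(p * p - prod)
                  for p in params for sign in (-1.0, 1.0)]
    return sorted(params)

params = parameter_tree(1.0, 4.0, 2)  # four parameters inside [1, 4]
```

Each split produces a pair $a_{j'}$, $ab/a_{j'}$ whose product is the level product, so the parameters stay inside the eigenvalue interval.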
Although (27) looks a lot simpler, this technique was developed before the
elliptic function solution was known. There is an intimate connection between this
process and Landen transformations for evaluation of elliptic functions.
Introduction to Finite Difference Approximations
to Initial Value Problems for
Partial Differential Equations
OLOF WIDLUND
New York University
This work was in part supported by the U.S. Atomic Energy Commission, Contract AT(30-1)-1480 at the
Courant Institute of Mathematical Sciences, New York University
i. Introduction
The study of partial differential equations and methods for their exact and
approximate solution is a most important part of applied mathematics, mathematical
physics and numerical analysis. One of the reasons for this is of course that very
many mathematical models of continuum physics have the form of partial differential
equations. We can mention problems of heat transfer, diffusion, wave motion and
elasticity in this context. This field of study also seems to provide a virtually
inexhaustible source of research problems of widely varying difficulty. If in
particular we consider finite difference approximations for initial value problems we
find a rapidly growing body of knowledge and a theory which frequently is as
sophisticated as the theory for partial differential equations. The work in this
field, as in all of numerical analysis, has of course been greatly influenced by the
development of the electronic computers but also very much by the recent progress in
the development of mathematical tools for problems in the theory of partial diffe-
rential equations and other parts of mathematical analysis.
Much of this progress has centered around the development of sophisticated
Fourier techniques. A typical question is the extension of a result for equations
with constant coefficients, to problems with variable coefficients. In the constant
coefficient case exponential functions are eigenfunctions and such a problem can
therefore, via a Fourier-Laplace transform, be turned into a, frequently quite
difficult, algebraic one. Much recent work in the theory of finite difference
schemes, including much of that of the author, has been greatly influenced by this
development. These techniques are usually referred to as the Fourier method and will
be the topic of several of Thomée's lectures here in Dundee. The emphasis of these
lectures will be different. We will concentrate on explaining what is known as the
energy method after a discussion of the proper choice of norm, stability definition
etc. We will also try to make some effort in relating the mathematics to the under-
lying physics and attempt to explain a philosophy of constructing classes of useful
difference approximations.
We have decided to use as simple technical tools as possible, frequently con-
centrating on simple model problems, to illustrate our points. Some generality will
undoubtedly be lost but it will hopefully make things easier to understand and
simplify the notations. A considerable amount of time will be spent on analysing
the differential equations we are approximating. Experience has shown that this is
the most convenient way to teach and work with the material. The properties of the
differential equation are almost always easier to study and a preliminary analysis
of the differential equations can frequently be translated into finite difference
form. This is particularly useful when it comes to choosing proper boundary condi-
tions for our difference schemes.
The objective of our study is essentially to develop error bounds for finite
difference schemes, methods to tell useful from less useful schemes and to give
guidelines as to how reliable classes of schemes can be found. On the simplest
level finite difference methods are generated by replacing derivatives by divided
differences, just as in the definition of a derivative, discretizing coefficient
functions and data by evaluating them at particular points or as averages over small
neighbourhoods. As we will see there are many choices involved in such discretization
processes and the quality of the approximate solutions can vary most drastically.
The finite difference approach has some definite advantages as well as disadvantages.
Thus the most one can hope, using a finite difference scheme, is to be able
to get a computer program which for any given set of data will give an accurate
answer at a reasonable cost. The detailed structure of the mapping which transforms
the data into the solution will of course in general be much too complicated to
understand. Thus the classical approach giving closed form solutions to differential
equations frequently gives much more information about the influence on the solution
of changes in data or the model. The same is true perhaps to a somewhat lesser
extent, of methods of applied mathematics such as asymptotic and series expansions.
However finite difference schemes and the closely related finite element methods
have proved most useful in many problems where exact or asymptotic solutions are
unknown or prohibitively expensive as a computational tool.
The main reference in this field is a book by Richtmyer and Morton [1967]. It
is a second edition of a book by Richtmyer [1957] which, in its theoretical part, is
based to a great extent on work by Lax and Richtmyer. The new edition is heavily
influenced by the work of Kreiss. A second part of the book discusses many specific
applications of finite difference schemes to problems of continuum physics. There
is also a survey article by Kreiss and the author [1967], with few proofs, based on
lectures by Kreiss which still awaits publication by Springer Verlag. It may still
be available from the Computer Science Department in Uppsala, Sweden. Also to be
mentioned is a classical paper by Courant, Friedrichs and Lewy [1928] which has
appeared in English translation together with three survey articles containing
useful bibliographies [1967]. Another classical paper, by John [1952], is also
very much worth a study. Among recent survey articles we mention one by Thomée
[1969]. That paper essentially discusses the Fourier method.
2. The form of the partial differential equations
We will consider partial differential equations of the form,
$$\partial_t u = P(x, t, \partial_x)\, u\,, \qquad x \in \Omega\,, \quad t \in [0, T]\,, \quad T < \infty\,,$$
where u is a vector valued function of x and t. The variable $x = (x_1, \ldots, x_s)$
varies in a region $\Omega$ which is the whole or part of the real Euclidean space $R^s$.
When $\Omega$ is all of $R^s$ we speak of a pure initial or Cauchy problem; in the opposite
case we have a mixed initial boundary value problem. The differential operator P is
defined by
$$P(x, t, \partial_x) = \sum_{|\nu| \le m} A_\nu(x, t)\, \partial_{x_1}^{\nu_1} \cdots \partial_{x_s}^{\nu_s}$$
where $|\nu| = \sum_i \nu_i$ and the matrices $A_\nu(x,t)$ have sufficiently smooth elements. The
degree of the highest derivative present, m, is called the order of the equation.
If we let the coefficients depend on u and the derivatives of u as well we say that
the problem is nonlinear.
We will restrict our attention almost exclusively to linear problems and to the
approximate calculation of classical solutions, i.e. solutions u(x,t) which are
smooth enough to satisfy our equation in the obvious sense.
In order to turn our problem into one with a possible unique solution we provide
initial values u(x,O) = f(x). It is thus quite obvious that for the heat equation
$$\partial_t u = \partial_x^2 u\,, \qquad -\infty < x < \infty\,,$$
a specification of the temperature distribution at some given time is necessary in
order to single out one solution. Frequently when $\Omega$ is not the whole space we have
to provide boundary conditions on at least part of the boundary $\partial\Omega$ of $\Omega$. Sometimes
we also have extra conditions such as in the case of the Navier-Stokes equation
where conservation of mass requires the solution to be divergence free. The
boundary conditions, the form of which might vary between different parts of the
boundary, have the form of linear (or nonlinear) relations between the different
components of the solution and their derivatives. The essential mathematical
questions are of course, whether it is possible to find a unique continuation of the
data and to study the properties of such a solution.
As in the case of systems of ordinary differential equations, we can hope to
assure the existence of at least a local solution by providing initial data, etc.
It is however, often not immediately clear how many boundary conditions should be
supplied. Clearly the addition of a linearly independent boundary condition in a
situation where we already have a unique solution will introduce a contradiction
which in general leads to nonexistence of a solution. Similarly, the removal of
a boundary condition in the same situation will in general lead to a loss of
uniqueness. Existence is clearly necessary for the problem to make sense; similarly,
to require uniqueness is just to ask for a deterministic mathematical model. The
correct number of boundary conditions as well as their form is often suggested by
physical arguments (Cf. §3).
We now mention some simple examples which will be used in illustrating the
theory. The equation
$$\partial_t u = \partial_x u$$
is the simplest possible system of first order, a class of problems of great
importance. We have already mentioned the heat equation
$$\partial_t u = \partial_x^2 u\,.$$
The equation
$$\partial_t u = i\, \partial_x^2 u$$
is a simple model for equations of Schrodinger type. It has several features in
common with
$$\partial_t^2 u = -\partial_x^4 u$$
which arises in simplified time dependent elasticity theory. Finally we list the
wave equation with one and two space variables:
$$\partial_t^2 u = \partial_x^2 u\,, \qquad \partial_t^2 u = \partial_x^2 u + \partial_y^2 u\,.$$
The last three equations do not have the form we have considered until now being
second order in t. This can however be easily remedied by the introduction of new
variables. Thus let $v = \partial_t u$ and $w = \partial_x^2 u$. Then
$$\partial_t \begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} 0 & -\partial_x^2 \\ \partial_x^2 & 0 \end{pmatrix} \begin{pmatrix} v \\ w \end{pmatrix}$$
will be equivalent to the equation $\partial_t^2 u = -\partial_x^4 u$. Initial conditions have to be
provided. We first note that u and $\partial_t u$ both have to be given, as in the case of
ordinary differential equations of second order. This is also clear from the
analogy with a mechanical system with finitely many degrees of freedom. The initial
conditions for w can be formed by taking a second derivative of u(x,0).
The wave equation, in two space variables, can be transformed into the required
form in several ways. We will mention two of these because of their importance in
the following discussion. Let us first just introduce the new variable $v = \partial_t u$ and
rewrite the wave equation as
$$\partial_t \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ \partial_x^2 + \partial_y^2 & 0 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}.$$
As we will see later on this is not a convenient form and we will instead introduce
three new variables
$$u_1 = \partial_t u\,, \quad u_2 = \partial_x u\,, \quad u_3 = \partial_y u$$
which gives the equation the form
$$\partial_t \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \partial_x \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} \partial_y \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}.$$
Initial conditions for this new system are provided as above.
In order to illustrate our discussion on the number of boundary conditions we
consider
$$\partial_t u = \partial_x u\,, \qquad \Omega = \{x;\ x \ge 0\}\,, \quad t \ge 0\,, \qquad u(x, 0) = f(x)\,.$$
If f has a continuous derivative then f(x+t) will be a solution of the equation for
$x \ge 0$, $t \ge 0$ and it can be shown that this solution is unique. The boundary consists
of the origin only. If we introduce a boundary condition at this point say u(0,t) =
g(t), a given function, we will most likely get a contradiction and thus no solution.
The situation is quite different if $\Omega = \{x;\ x \le 0\}$ (or, what is essentially the same,
$\partial_t u = -\partial_x u$ and $\Omega = \{x;\ x \ge 0\}$). The solution is still $f(x+t)$ for $x + t \le 0$, $t \ge 0$,
but in order to determine it uniquely for other points on the left halfline a boundary
condition is required. It can be given in the form $u(0,t) = g(t)$, $f(0) = g(0)$, $g(t)$
once continuously differentiable. The solution for $0 \le x + t \le t$, $t \ge 0$, will be
$g(x+t)$. Thus different $g(t)$ will give different solutions and the specification of
a boundary condition is necessary for uniqueness. The condition $f(0) = g(0)$ assures
us that no jump occurs across the line $x + t = 0$.
3. The form of the finite difference schemes.
We begin by introducing some notations. We will be dealing with functions
defined on lattices of mesh points only. For simplicity we will consider uniform
meshes: $R_h = \{x;\ x_i = n_i h,\ n_i = 0, \pm 1, \pm 2, \ldots\}$. The mesh parameter h is a measure
of the fineness of our mesh. We also discretize in the t-direction: $R_k = \{t;\ t = nk,$
$n = 0, 1, 2, \ldots\}$. When we study the convergence of our difference schemes we will
let both h and k go to zero. It is then convenient to introduce a relationship
between the timestep k and the meshwidth h of the form $k = k(h)$, $k(h)$ monotonically
decreasing when $h \to 0$, $k(0) = 0$. Often this relationship is given in the form
$k = \lambda h^m$, m = the order of the differential equation and $\lambda$ a positive constant.
The divided differences, which replace the derivatives, can be written in
terms of translation operators $T_{h,i}$ defined by
$$T_{h,i}\, v(x) = v(x + h e_i)\,,$$
where $e_i$ is the unit vector in the direction of the positive $x_i$-axis. Forward,
backward and central divided differences are now defined by
$$D_{+i}\, v(x) = \frac{1}{h}\,(T_{h,i} - I)\, v(x)$$
$$D_{-i}\, v(x) = \frac{1}{h}\,(I - T_{h,i}^{-1})\, v(x)$$
$$D_{0i}\, v(x) = \frac{1}{2h}\,(T_{h,i} - T_{h,i}^{-1})\, v(x) = \tfrac{1}{2}(D_{+i} + D_{-i})\, v(x)\,.$$
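As a one-dimensional sketch (the test function and mesh width are illustrative choices), these operators and the identity $D_0 = \tfrac{1}{2}(D_+ + D_-)$ can be checked directly:

```python
# One-dimensional divided-difference operators acting on a function v;
# the test function and mesh width below are illustrative choices.
def D_plus(v, h):
    return lambda x: (v(x + h) - v(x)) / h

def D_minus(v, h):
    return lambda x: (v(x) - v(x - h)) / h

def D_zero(v, h):
    return lambda x: (v(x + h) - v(x - h)) / (2.0 * h)

v = lambda x: x ** 3          # smooth test function, v'(1) = 3
h = 1e-3
central = D_zero(v, h)(1.0)   # second order accurate: error is exactly h**2 here
average = 0.5 * (D_plus(v, h)(1.0) + D_minus(v, h)(1.0))
```

For smooth v the central difference is second order accurate, while the one-sided differences are only first order; this gap reappears below in the accuracy of the schemes built from them.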
These difference operators serve as building blocks for our finite difference
schemes. The form of the complete schemes will become apparent as we go along.
We will now look into the mathematical derivation of the heat equation in
order to illustrate a very useful technique for generating finite difference schemes.
Let us consider heat flow in a one dimensional medium. Denote the absolute
temperature by u(x,t). The law governing the heat flow involves physical quantities,
the specific heat per unit volume K(x,t) and the heat flow constant Q(x,t). The
heat energy per unit volume is K(x,t)u(x,t) at the point x at time t. The quantity
$Q(x,t)\,\partial_x u$ is the amount of heat energy that flows per unit time across a cross
section of unit area.
Consider a segment between x and $x + \Delta x$. The amount that flows into this
segment per unit area per unit time is
$$Q(x + \Delta x, t)\, \partial_x u(x + \Delta x, t) - Q(x, t)\, \partial_x u(x, t)$$
and it must in the absence of heat sources be balanced by
$$\partial_t \int_x^{x + \Delta x} K(x', t)\, u(x', t)\, dx'\,.$$
A simple passage to the limit, after a division by $\Delta x$, gives
$$\partial_t (K u) = \partial_x (Q\, \partial_x u)\,.$$
If the slab is of finite extent, lying between x = 0 and 1, physical considerations
lead to boundary conditions. The heat flow out of the slab at x = 0 is
proportional to the difference between the inside and outside temperature $u_e$. With
an appropriate heat flow constant $Q_e$ we have a flow of heat energy at x = 0 per unit
area which is $Q_e(u - u_e)$ and the balance condition is therefore
$$Q\, \partial_x u + Q_e (u - u_e) = 0 \quad\text{at } x = 0\,.$$
If $Q_e$ is very large we get the Dirichlet condition $u = u_e$ at x = 0. Similar considerations
give a boundary condition for x = 1.
This derivation already contained certain discrete features. In order to turn
it into a strict finite difference model we have to replace the derivatives and
integral in the balance conditions by difference quotients and a numerical quadrature
formula respectively. We can get essentially the same kind of schemes by starting
off with a discrete model, dividing the medium into cells of length $\Delta x$ and giving the
discretely defined variable u the interpretation of an average temperature of a cell.
The relation between the values of the discrete variable, i.e. the difference scheme,
is then derived by the use of the basic physical balance conditions.
It is clear that we can get many different discrete schemes this way. In particular
we do not have very much guidance when it comes to a choice of a good discretization
of the t derivative. We will now examine a few possible finite difference
schemes, specializing to the case K = Q = 1 and Dirichlet boundary conditions. The
first two schemes are
$$u(x, t+k) = u(x, t) + k D_- D_+ u(x, t)$$
and
$$u(x, t+k) = u(x, t-k) + 2k D_- D_+ u(x, t)$$
for $x = h, 2h, \ldots, 1-h$, $t \in R_k$. We assume that 1/h is an integer and we provide the
schemes with the obvious initial and boundary conditions.
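A minimal sketch of the first scheme (the mesh sizes and initial function are illustrative assumptions, not taken from the text):

```python
# Forward (Euler) scheme for u_t = u_xx on (0, 1) with homogeneous
# Dirichlet data; h, k and the initial function are illustrative.
h = 0.05
k = h * h / 4.0                   # respects the restriction k/h**2 <= 1/2
n = round(1.0 / h)                # 1/h is assumed to be an integer
u = [i * h * (1.0 - i * h) for i in range(n + 1)]  # u(x,0) = x(1-x)

for _ in range(200):              # march 200 time steps
    new = u[:]
    for i in range(1, n):
        new[i] = u[i] + (k / (h * h)) * (u[i + 1] - 2.0 * u[i] + u[i - 1])
    u = new
```

With $k/h^2 \le \tfrac{1}{2}$ each new value is a mean of three old values with positive weights, so the computed maximum cannot grow; this is the discrete maximum principle discussed in §4.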
The first scheme, known as Euler's method or the forward scheme, can immediately
be used in a successive calculation of the values of u(x,t) for t = k, 2k, etc. The
second scheme requires the knowledge of at least approximate values of u(x,k) before
we can start marching. The latter scheme is an example of a multistep scheme. The
extra initial values can be provided easily by the use of a one step scheme such as
Euler's method in a first step. We could also use the first few terms of a Taylor
expansion in t about t = 0 using the differential equation and the initial value
function f(x) to compute the derivatives with respect to t. Thus atu(x,o) = a~f(x),
8~u(x,O) : 8~f(x), eta. The possible advantage in introducing this extra complication
is that the replacement of the t derivative by a centered instead of a forward diffe-
rence quotient should help to ~ke the discrete model closer to the original one.
Such considerations frequently make a great deal of sense. We will, however, see
later that our second scheme is completely useless for computations.
The difference between a finite difference scheme and the corresponding diffe-
rential equation is expressed in terms of the local truncation error which is the
inhomogeneous term which appears when we put the exact solution of the differential
equation into the difference scheme. If the solution is sufficiently smooth we can
compute an expression for this error by Taylor series expansions. We will later see
that a small local truncation error will assure us of an accurate numerical procedure
provided the difference scheme is stable. Stability is essentially a requirement of
a uniformly continuous dependence of the discrete solution on its data and it is the
lack of stability which makes our second scheme useless. We will discuss stability
at some length in §6.
These two schemes are explicit, i.e. schemes for which the value at any given
point can be calculated with the help of a few values of the solution at the
immediately preceding time levels.
Our next scheme is implicit:
$$(I - k D_- D_+)\, u(x, t+k) = u(x, t)\,.$$
It is known as the backward scheme. Each time step requires the solution of a linear
system of equations. However this system is tridiagonal and positive definite and
can therefore be solved by Cholesky decomposition or some other factorization method
at an expense which is only a constant factor greater than taking a step with an
explicit scheme. We will see that the backward scheme has a considerable advantage
over the forward scheme by being unconditionally stable, which means that its
solution will vary continuously with the data for any relation between k and h. For
the forward scheme a restriction $k/h^2 \le \tfrac{1}{2}$ is necessary in order to assure stability.
This forces us to take very many time steps per unit time for small values of h.
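One backward step then amounts to a tridiagonal solve; a sketch using simple Gaussian elimination (the grid sizes and data are illustrative assumptions):

```python
# One step of the backward (implicit Euler) scheme for u_t = u_xx with
# homogeneous Dirichlet data: solve (I - k*D_-D_+) u_new = u_old, a
# tridiagonal positive definite system, by elimination and back substitution.
def backward_step(u, h, k):
    n = len(u) - 1                   # unknowns are u[1], ..., u[n-1]
    r = k / (h * h)
    b = [1.0 + 2.0 * r] * (n - 1)    # diagonal; both off-diagonals are -r
    d = u[1:n]                       # right-hand side (boundary values are 0)
    for i in range(1, n - 1):        # forward elimination
        m = -r / b[i - 1]
        b[i] -= m * (-r)
        d[i] -= m * d[i - 1]
    x = [0.0] * (n - 1)              # back substitution
    x[-1] = d[-1] / b[-1]
    for i in range(n - 3, -1, -1):
        x[i] = (d[i] + r * x[i + 1]) / b[i]
    return [0.0] + x + [0.0]

h = 0.1
u = [i * h * (1.0 - i * h) for i in range(11)]
u = backward_step(u, h, 0.1)         # k = h is fine: unconditionally stable
```

The cost per step is a fixed small multiple of an explicit step, and the deliberately large step k = h (far beyond the forward scheme's restriction) causes no difficulty, illustrating the unconditional stability claimed above.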
Our fourth scheme can be considered as a refined version of the backward scheme:
$$(I - \tfrac{k}{2}\, D_- D_+)\, u(x, t+k) = (I + \tfrac{k}{2}\, D_- D_+)\, u(x, t)\,.$$
This scheme, known as the Crank-Nicolson scheme, is also implicit and unconditionally
stable. It treats the two time levels more equally and this is reflected in a
smaller local truncation error.
We have already come across almost all of the basic schemes which are most
useful in practice. We complement the list with the well known Dufort-Frankel scheme:
$$(1 + 2k/h^2)\, u(x, t+k) = (2k/h^2)\big(u(x+h, t) + u(x-h, t)\big) + (1 - 2k/h^2)\, u(x, t-k)\,.$$
In order to see that this two step scheme is consistent, which means formally convergent,
to the heat equation we rewrite it as
$$\big(u(x, t+k) - u(x, t-k)\big)/2k = \big(u(x+h, t) - u(x, t+k) - u(x, t-k) + u(x-h, t)\big)/h^2$$
and find by the use of Taylor expansions that it is consistent if $k/h \to 0$ when $h \to 0$.
The scheme is unconditionally stable, explicit, and suffers from low accuracy, but it is
still quite useful because of its simplicity.
Another feature of the Dufort-Frankel scheme, worth pointing out, is that the
value of u(x,t+k) does not depend on u(x+2nh,t) or u(x+(2n+1)h,t-k), n = 0, ±1, ±2, ....
We therefore have two independent calculations and we can make a 50% saving by
carrying out only one of these using a so called staggered net.
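A sketch of the Dufort-Frankel scheme run with the deliberately large step k = h (mesh sizes and data are illustrative); the forward scheme would blow up here, while this solution stays bounded:

```python
# Dufort-Frankel scheme for u_t = u_xx, run with k = h, far beyond the
# forward scheme's restriction k/h**2 <= 1/2; h and the data below are
# illustrative.  The solution remains bounded (unconditional stability),
# although with k/h held fixed the scheme is no longer consistent.
h = 0.05
k = h
n = round(1.0 / h)
r = 2.0 * k / (h * h)                       # = 2k/h**2
u_old = [i * h * (1.0 - i * h) for i in range(n + 1)]
u = u_old[:]                                # start the two-step scheme

for _ in range(100):
    new = [0.0] * (n + 1)
    for i in range(1, n):
        new[i] = (r * (u[i + 1] + u[i - 1]) + (1.0 - r) * u_old[i]) / (1.0 + r)
    u_old, u = u, new
```

Boundedness here does not mean accuracy: with k/h fixed the scheme formally approximates a damped wave equation rather than the heat equation, which is the consistency caveat noted above.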
4. An example of divergence. The maximum principle.
We will now show that consistency is not enough to ensure useful answers. In
fact we will show by a simple general argument that the error can be arbitrarily
large for any explicit scheme, consistent with the heat equation, if we allow k to go
to zero at a rate not faster than h.
Consider a pure initial value problem. The fact that our schemes are explicit
and that k/h is bounded away from zero implies that only the data on a finite subset
of the line t = 0 will influence the solution at any given point. If now, for a
fixed point (x,t), we choose an initial value function which is infinitely many times
differentiable, not identically zero but equal to zero in the finite subset mentioned
above then the solution of the difference scheme will be zero at the point for all
mesh sizes. On the other hand the solution of the differential equation equals
$$u(x, t) = \frac{1}{\sqrt{4\pi t}} \int_{-\infty}^{\infty} e^{-(x-y)^2/4t}\, f(y)\, dy\,,$$
and thus for any non negative f it is different from zero for all x and t > 0.
Using this solution formula we can prove a maximum principle,
$$\max_x |u(x, t)| \le \max_x |f(x)| \quad\text{for all } t \ge 0\,.$$
Thus, after a simple change of variables,
$$|u(x, t)| \le \frac{1}{\sqrt{4\pi t}} \int_{-\infty}^{+\infty} e^{-s^2/4t}\, |f(x-s)|\, ds \le \max_x |f(x)| \cdot \frac{1}{\sqrt{4\pi t}} \int_{-\infty}^{+\infty} e^{-s^2/4t}\, ds = \max_x |f(x)|\,.$$
This shows that the solution varies continuously with the initial values in the
maximum norm sense. This property is most essential and has a natural physical
interpretation. It means, of course, that in the absence of heat sources the maximum
temperature cannot increase with time. Similar inequalities hold for a wide
class of problems known as parabolic in Petrowskii's sense, for Cauchy as well as
mixed initial value problems. Cf. Friedman [1964].
We will now show, by simple means, that our first and third difference schemes
satisfy similar inequalities, a fact which will be most essential in deriving useful
error bounds, etc. First consider Euler's method with the restriction that $k \le h^2/2$.
The value of the solution at any point is a linear combination of the three values
at the previous time level, the weights are all positive and add up to one. Thus
the maximum cannot increase. For the third scheme we can express the value of the
solution at any point as a similar mean value of one value at the previous time
level and at those of its two neighbours. Therefore a strict maximum is possible
only on the initial line or at a boundary point. This technique can be used for
problems with variable coefficients and also in some nonlinear cases. Unfortunately
it cannot be extended to very many other schemes because it requires a positivity
of coefficients which does not hold in general.
For the finite difference schemes discussed so far we have had no problems with
the boundary conditions. They were inherited in a natural way from the differential
equation and in our computation we were never interested in using more than the next
neighbours to any given point. We could however be interested in decreasing the
local truncation error by replacing $\partial_x^2 u$ by a difference formula which uses not three
but five or even more points. This creates problems next to the boundaries where
some extra conditions have to be supplied in order for us to be able to proceed with
the calculation. It is not obvious what these extra conditions should be like.
Perhaps the most natural approach, not always successful, is to require that a
divided difference of some order of the discrete solution should be equal to zero.
This problem is similar to that which arises by the introduction of extra initial
values for multistep schemes but frequently causes much more serious complications.
If we go back to our simple first order problem $\partial_t u = \partial_x u$, we see that there
are two possibilities. Either we use a one sided difference such as
$$u(x, t+k) = u(x, t) + k D_+ u(x, t)$$
for which no extra boundary condition is needed or we try a scheme like Euler's
$$u(x, t+k) = u(x, t) + k D_0 u(x, t)$$
for which a boundary condition has to be introduced at x = O. We leave to the
reader the simple proof that, for $k/h \le 1$, the solution of our first scheme depends
continuously on its data in the sense of the maximum norm. The second scheme is,
as we will later show, unstable even for the Cauchy case. The problem to provide
extra boundary data however still remains even if we start out with a scheme which
is stable for the Cauchy case.
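A quick sketch of the one-sided scheme: for $k/h \le 1$ the new value $u(x,t) + kD_+u(x,t) = (1 - k/h)\,u(x,t) + (k/h)\,u(x+h,t)$ is a convex combination of old values, so the maximum norm cannot grow (the grid, data, and the closing of the right end of the finite array are illustrative assumptions):

```python
# One-sided scheme for u_t = u_x: with lam = k/h <= 1 each new value is
# a convex combination of two old values, so the maximum norm cannot
# increase.  Grid size, lam and the initial hat function are illustrative;
# the finite array is closed off by repeating its last value.
h, lam, n = 0.01, 0.8, 200
u = [max(0.0, 1.0 - abs(i * h - 1.0)) for i in range(n + 1)]  # hat data
m0 = max(abs(t) for t in u)

for _ in range(100):
    u = [(1.0 - lam) * u[i] + lam * u[min(i + 1, n)] for i in range(n + 1)]

m1 = max(abs(t) for t in u)   # m1 <= m0
```

The same convex-combination argument fails for the centered scheme, whose weights are not all positive, which is one way to anticipate its instability.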
We also mention another method which has some very interesting features:
$$u(x, t+k) + u(x-h, t+k) + k D_- u(x, t+k) = u(x, t) + u(x-h, t) - k D_- u(x, t)\,,$$
$$u(0, t) = 0\,, \qquad u(x, 0) = f(x)\,, \quad 0 \le x < \infty\,.$$
This difference scheme approximates $\partial_t u = -\partial_x u$ on the right half line. It has been
studied by Thomée [1962], and is also discussed in Richtmyer and Morton [1967].
It is implicit but can be solved by marching in the x-direction and could therefore
be characterized as an effectively explicit method. It is unconditionally stable.
Finally we would like to point out a class of problems for which the boundary
conditions create no difficulties namely those which have periodic solutions. This
allows us to treat every point on the mesh as though it were an interior point. In
the constant coefficient case such problems can be studied successfully by Fourier
series. The analysis of a periodic case is frequently the simplest way to get the
first information about the usefulness of a particular difference scheme.
5. The choice of norms and stability definitions
In the systematic development of a theory for partial differential equations
questions about existence and uniqueness of solutions for equations with analytic
coefficients and data play an important role. Cf. Garabedian [1964]. The well
known Cauchy-Kowaleski theorem establishes the existence of unique local solutions
for a wide class of problems of this kind. As was pointed out by Hadamard [1921],
in a famous series of lectures, such a theory is not however precise enough when
we are interested in mathematical models for physics. We also have to require, among
other things, that the solution will be continuously influenced by changes of the
data, which we of course can never hope to measure exactly. The class of analytic
functions is too narrow for our purposes and we have to work with some wider class
of functions and make a choice of norm. In most cases the maximum norm must be
considered the ideal one. We have already seen that for the heat equation such a
choice is quite convenient and that the result on continuous dependence in this norm
has a nice physical interpretation. For other types of problems we also have to be
guided by physical considerations or by the study of simplified model problems.
Hadamard essentially discussed hyperbolic equations and much of our work in
this section will be concentrated on such problems. A study of available closed
form solutions of the wave equation naturally leads to the following definition.
Definition. An initial value problem for a system of partial differential
equations is well posed in Hadamard's sense if,
(i) there exists a unique classical solution for any sufficiently smooth initial
value function,
(ii) there exists a constant q and for every finite T > 0 a constant $C_T$ such that
$$\max_x |u(x, t)| \le C_T \max_{x,\ |\nu| \le q} |\partial_x^\nu u(x, 0)|\,, \qquad t \in [0, T]\,.$$
One can ask if it is always possible to choose q equal to zero. A study of our
simplest first order hyperbolic equation $\partial_t u = \partial_x u$ gives us hope that this might be
possible, and so does an examination of the wave equation in one space variable
$$\partial_t^2 u = c^2 \partial_x^2 u\,, \qquad u(x, 0) = f(x)\,, \quad \partial_t u(x, 0) = g(x)$$
which, after integration along the rays $x = x_0 \pm ct$, is seen to have the solution
$$u(x, t) = \frac{f(x+ct) + f(x-ct)}{2} + \frac{1}{2c} \int_{x-ct}^{x+ct} g(s)\, ds\,.$$
In fact these two equations are well posed in $L_p$, $1 \le p \le \infty$, in a sense we will
soon specify.
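The solution formula can be checked numerically; here with the illustrative choices $f(x) = \sin x$, $g = 0$, $c = 2$, comparing second divided differences of the d'Alembert solution:

```python
import math

# d'Alembert's solution for f(x) = sin(x), g(x) = 0 and c = 2 (all
# illustrative choices): u(x,t) = (f(x+ct) + f(x-ct))/2.  Second divided
# differences should nearly satisfy u_tt = c**2 * u_xx.
c = 2.0
def u(x, t):
    return 0.5 * (math.sin(x + c * t) + math.sin(x - c * t))

eps = 1e-4
x, t = 0.3, 0.7
u_tt = (u(x, t + eps) - 2.0 * u(x, t) + u(x, t - eps)) / eps ** 2
u_xx = (u(x + eps, t) - 2.0 * u(x, t) + u(x - eps, t)) / eps ** 2
residual = abs(u_tt - c * c * u_xx)   # small: O(eps**2) truncation error
```

The residual is governed by the truncation error of the centered second differences, the same quantity that enters the error analysis of the difference schemes of §3.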
We will soon see that a choice of q = 0 is not possible for the wave equation
in several space variables. Before we explain this further we introduce some
concepts which we will need repeatedly.
Because of the linearity of our equations there is, for any well posed initial
value problem, a linear solution operator $E(t, t_1)$, $0 \le t_1 \le t$, which maps the solution
at time $t_1$ into the one at time t. In particular
$$u(x, t) = E(t, 0) f(x) \quad\text{if}\quad u(x, 0) = f(x)\,.$$
0he of Hagam~ra's requirements for a proper mathematical model for physics is that
the solution operator forms a semigroup i.e.
E(t,r) E(r,tl) = E(t,t) for 0 < t, ~ T ~ t .
When we deal with completely reversible physical processes the semigroup is in fact
a group. Such is, for instance, the case for wave propagation without dissipation.
We now introduce the definition of well posedness with which we will finally
choose to work.
Definition An initial value problem is well posed in $L_p$ if,
(i) there exists a unique classical solution for any sufficiently smooth
initial value function,
(ii) there exist constants C and $\alpha$ such that
$$\|E(t,t_1)f\|_p \le C \exp(\alpha(t-t_1))\, \|f\|_p .$$
By the $L_p$ norm of a vector valued function we mean the $L_p$ norm, with respect
to x, of the $l_2$ norm of the vector.
Littman [1963] has shown, by a detailed study of the solution formulas for the
wave equations, that except for one space dimension they are well posed only for
p = 2. His result has been extended to all first order systems with symmetric
constant coefficient matrices by Brenner [1966].
Theorem (Brenner [1966]). Consider equations of the form
$$\partial_t u = \sum_{\nu=1}^{s} A_\nu \partial_{x_\nu} u ,$$
$A_\nu$ constant and symmetric matrices. This system is well posed in $L_p$ for a
p ≠ 2 if and only if the matrices $A_\nu$ commute.
This leaves us with only two possibilities. Either we only have one space
variable or there is a common set of eigenvectors for the matrices $A_\nu$. In the
latter case we can introduce new dependent variables so that the system becomes
entirely uncoupled, consisting of a number of scalar equations.
Brenner's proof is quite interesting but too technical to be explained here.
Instead we will first show the well posedness in $L_2$ of symmetric first order systems
and then proceed to show that the wave equation is not well posed in $L_\infty$ for several
space variables. This will of course answer the question about the possibility of
a choice of q = 0.
We note that most hyperbolic equations of physical interest can be written as
first order systems with symmetric coefficient matrices.
Introducing the standard $L_2$ inner product we see that for any possible solution
to the equation, which disappears for large values of x,
$$\partial_t(u,u) = \sum_{\nu=1}^{s} \left[ (A_\nu \partial_{x_\nu} u, u) + (u, A_\nu \partial_{x_\nu} u) \right] .$$
Therefore, after an integration by parts,
$$\partial_t(u(t), u(t)) = -\sum_{\nu=1}^{s} (u, (\partial_{x_\nu} A_\nu) u) \le \text{const.}\,(u(t), u(t))$$
if the elements of $A_\nu(x)$ have bounded first derivatives.
From this immediately follows
$$\|u(t)\|_2 \le \exp((\text{const.}/2)t)\,\|u(0)\|_2 .$$
In particular we see that the $L_2$ norm of u(x,t) is unchanged with t if the
coefficients are constant.
The restriction to solutions which disappear at infinity is not a serious one.
Any L 2 function can be approximated arbitrarily closely by a sequence of smooth
functions which are zero outside bounded, closed sets. A generalized solution can
therefore, for any initial value in $L_2$, be defined as a limit of the sequence of
solutions generated by the smooth data. This is of course just an application of a
very standard procedure in functional analysis; for details cf. Richtmyer and Morton.
Examining the solution formula for the wave equation in one space variable we
find that information is transmitted with a finite speed less than or equal to c.
This finite speed of propagation is a characteristic of all first order hyperbolic
equations. Thus the solution at a point (x,t) is influenced solely by the initial
values on a bounded subset of the plane t = O. This subset is known as the domain
of dependence of the point. Similarly any point on the initial plane will, for a
fixed t, only influence points in a bounded subset.
We also see, from the same solution formula, that the boundary of the domain
of dependence is of particular importance. This property is shared by other hyper-
bolic equations. In particular for the wave equation in 3, 5, 7, ... space variables
the value at any particular point on the initial plane will, for a fixed t, only
influence the solution on the surface of a certain sphere. This result, known as
Huygens' principle, can be proved by a careful study of the solution formula of the
wave equation. It is of course also well known from physics. Cf. Garabedian [1964].
We have now carried out the necessary preparations for our proof that the wave
equation in three space dimensions cannot be well posed in $L_\infty$. We write the
equation in the form of a symmetric first order system. We choose for all compo-
nents of the solution the same spherically symmetric class of initial values, namely
$C^\infty$ functions which are equal to one for $r \le \varepsilon/2$, zero for $r \ge \varepsilon$, and having values
between 0 and 1 for other values of r. The spherical symmetry of the initial values
will lead to solutions the values of which depend only on the distance r from the
origin and on the parameter $\varepsilon$. It is easy to show that $\partial_t u,\ \partial_{x_i}u$, i = 1,2,3, are
solutions of the wave equation and that they therefore satisfy Huygens' principle.
Therefore, the solution at t = 1/c will be zero except for values of r between $1 - \varepsilon$
and $1 + \varepsilon$. We know that the $L_2$ norm of the solution is unchanged and it is easy to
see that it is proportional to $\varepsilon^{3/2}$. Now suppose that the equation is well posed
in $L_\infty$. This means not only that the maximum of all components of the solution at
t = 1/c is bounded from above by a constant independent of $\varepsilon$ but also that the norm
of the solution is bounded away from zero uniformly. If this were not the case we
could solve our wave equation backwards and we could not have both well posedness
in $L_\infty$ and a solution of the order one at t = 0.
Denote by $C_1$ the maximum of the component of largest absolute value at t = 1/c.
This point has to have a neighbour at a distance no larger than constant $\times\ \varepsilon^3$ where
the value is less than $C_1/2$. In the opposite case the $L_2$ norm of the solution could
not be of the order $\varepsilon^{3/2}$ because, by the spherical symmetry, the volume for which the
solution is larger than $C_1/2$ would exceed constant $\times\ \varepsilon^3$. Our argument thus shows
that the signals have to become sharper; in other words, the gradient of the solution
increases. At t = 0 it is of the order $1/\varepsilon$ and it has to be proportional to $\varepsilon^{-3}$
at t = 1/c. This however contradicts our assumptions, because first derivatives of a
solution are also solutions of the wave equation and their maximum cannot grow by more
than a constant factor. Thus the wave equation in three space variables is not well
posed in $L_\infty$ and a choice of q > 0 sometimes has to be made.
The well posedness of the wave equation in $L_2$ has a nice interpretation in terms
of physics. In one dimension, for example, the kinetic energy is $(\rho/2)\int(\partial_t u)^2\,dx$ and
the potential energy is $(T/2)\int(\partial_x u)^2\,dx$, where $T/\rho = c^2$ = the square of the speed of
propagation. The total energy is therefore $(\rho/2)\int\big((\partial_t u)^2 + c^2(\partial_x u)^2\big)\,dx$ and it
remains unchanged in time in the absence of energy sources, i.e. inhomogeneous or
boundary terms. In fact we note that our proof of the well posedness in $L_2$ of
symmetric first order systems is our first application of what is known as the energy
method.
The fact that all first order hyperbolic equations have finite speeds of propa-
gation has immediate implications for explicit finite difference schemes. Thus,
just as for explicit methods for parabolic problems, we have to impose certain
restrictions on the relation between k and h in order to avoid divergence. The
appropriate condition has the form k/h sufficiently small. It is thus less restric-
tive than the condition in the parabolic case. This is known as the Courant-Friedrichs-
Lewy condition and simply means that, for sufficiently small values of h, any point
of the domain of dependence of the differential equation is arbitrarily close to
points belonging to the domain of dependence of the difference scheme. It is easy
to understand how in the opposite case we can construct initial value functions
which will give us arbitrarily large errors at certain points.
Experience also shows that it is advisable to use schemes and values of k/h which
allow us to have the domains of dependence coincide as much as possible while satisfying
the Courant-Friedrichs-Lewy condition. This is related to the particular importance
of the boundary of the domain of dependence mentioned above. However it should be
pointed out that this is hard to achieve to any great extent when several propagation
speeds are involved and when they vary from point to point.
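The Courant-Friedrichs-Lewy restriction can be observed experimentally with the one-sided scheme $v(x,t+k) = v(x,t) + \lambda(v(x+h,t) - v(x,t))$, $\lambda = k/h$, for $\partial_t u = \partial_x u$ on a periodic mesh (a sketch; the grid size, step count, and point-source data are our own arbitrary choices):

```python
# One-sided ("upwind") scheme for u_t = u_x on a periodic grid.  The domain
# of dependence argument requires lam = k/h <= 1; for lam > 1 a point source
# is amplified violently, for lam <= 1 the scheme is a convex combination of
# old values and the maximum cannot grow.
def upwind_max(lam, m=50, steps=60):
    v = [0.0] * m
    v[0] = 1.0  # mesh "point source" containing all frequencies
    for _ in range(steps):
        v = [(1 - lam) * v[j] + lam * v[(j + 1) % m] for j in range(m)]
    return max(abs(x) for x in v)
```

For lam = 0.9 the maximum stays bounded by one; for lam = 1.5 it reaches astronomical size within a few dozen steps.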
It should be mentioned that one can show that any first order problem which is
well posed in $L_2$ (or any $L_p$ space) is well posed in Hadamard's sense. Therefore
Hadamard's definition is less restrictive than the other one. The proof is by
showing that derivatives of solutions also satisfy well posed first order problems
and the use of a so called Sobolev inequality.
A result by Lax [1957] gives an interesting sidelight on the close relation
between the questions of well posedness, existence and uniqueness. We describe it
without a proof. Thus if a first order system with analytic coefficients, and not
necessarily hyperbolic, has a unique solution for any infinitely differentiable
initial value then it must be properly posed in Hadamard's sense. A corollary is
that Cauchy's problem for Laplace's equation, which can be rewritten as the Cauchy-
Riemann equations and which is the most common example of a problem which is ill
posed, cannot be solved for all smooth initial data.
Another interesting fact is that it can be shown that homogeneous wave motion,
satisfying obvious physical conditions such as finite speed of propagation etc.,
has to satisfy a hyperbolic differential equation of first order. This gives added
insight into the importance of partial differential equations in the description of
nature. For details we refer to Lax [1963].
We still face a choice between the two definitions of well posedness. Hadamard's
choice has the advantage of being equivalent to the Petrowskii condition in the case of
constant coefficients. The Petrowskii condition states that the real part of the
eigenvalues of the symbol $\hat{P}$ of our differential operator P should be bounded from
above. The symbol is defined by
$$\hat{P}(\xi) = \exp(-i\langle \xi, x\rangle)\, P\, \exp(i\langle \xi, x\rangle) , \qquad \xi \in R^s , \quad \langle \xi, x\rangle = \sum_{i=1}^{s} \xi_i x_i ,$$
and is thus a matrix valued polynomial in $\xi$. This algebraic condition is most
natural for the constant coefficient case. We are immediately led to it if we start
looking for special solutions of the form
$$\exp(\lambda t)\, \exp(i\langle \xi, x\rangle)\, \varphi , \qquad \varphi \text{ some vector} .$$
Thomée will probably discuss these matters in much more detail. For a proof of the
equivalence between the Hadamard and Petrowskii conditions cf. Gelfand and Shilov
[1964].
Due to the efforts of Kreiss [1959], [1963] four algebraic conditions which are
equivalent to well posedness in $L_2$ are known in the constant coefficient case. The
full story is quite involved and subtle. We only mention one of these conditions.
Thus a constant coefficient problem is well posed in $L_2$ if for some constants $\alpha$ and
K and for all s such that $\mathrm{Re}\, s > \alpha$
$$\|(sI - \hat{P}(\xi))^{-1}\| \le K/(\mathrm{Re}\, s - \alpha) .$$
We leave it to the reader to verify, using this and the Petrowskii conditions, that our
first attempt to rewrite the wave equation is well posed in Hadamard's sense but not
in $L_2$. The intuitive reason why our second attempt was more successful is that the
new variables naturally defined a norm which defines the energy of the system while
no similar physical interpretation can be made in the first case.
The algebraic conditions just introduced are about the simplest possible criteria
we can hope to find to test whether or not a differential equation is well posed.
Analogous criteria have been developed for finite difference schemes. We will now
try to find out if they could be used for problems with variable coefficients as well.
It is known from computational experience that instabilities tend to develop
locally and it is therefore natural to hope that a detailed knowledge of problems
with constant coefficients, obtained by freezing the coefficients at fixed points,
should provide a useful guide to problems with variable or even nonlinear coefficients.
The constant coefficient problems can be treated conclusively by the Fourier trans-
form.
This idea is quite sound for first order and parabolic problems provided our
theory is based on the second definition of well posedness. It is however not true
for all problems. This is illustrated by the following example, due to Strang [1966],
$$\partial_t u = i\partial_x(\sin x\, \partial_x u) = i \sin x\, \partial_x^2 u + i \cos x\, \partial_x u .$$
This is well posed in $L_2$ because, using the scalar product $(u,v) = \int u \bar{v}\, dx$ and
integration by parts,
$$\partial_t(u(t), u(t)) = 0$$
for any possible solution. However if we freeze the coefficients at x = 0 we get
$$\partial_t u = i\partial_x u$$
which violates the Petrowskii condition.
For a more detailed discussion we refer to Strang's paper.
The main criticism of the Petrowskii condition is that it is not stable against
perturbations or a change of variables. This is illustrated by the following example,
due to Kreiss [1963],
$$\partial_t u = U(t) \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} U^{-1}(t)\, \partial_x u , \qquad U(t) = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix} .$$
It is easy to see that the eigenvalues of the symbol, for all t, lie on the imaginary
axis. The equation is however far from well posed. To see this we change the
dependent variables by introducing $v(t) = U^{-1}(t)u(t)$. This gives us a system with
constant coefficients
$$\partial_t v = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \partial_x v - \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} v ,$$
after some calculations. The eigenvalues of its symbol equal $i\xi \pm i\sqrt{1 - i\xi}$
and the Petrowskii condition is therefore violated.
In itself there is nothing wrong with Hadamard's definition. It is however
much more convenient to base the theory on the other definition. We will soon see
that an addition of a zero order term, which is essentially what happens in our
example above, will not change a problem well posed in $L_p$ into an ill posed one.
It is possible, by present day techniques, to answer questions on admissible
perturbations for certain problems, even with variable coefficients, for which a
loss of derivatives as in Hadamard's definition is unavoidable. A class of problems
of this nature is the so called weakly hyperbolic equations. Some of them are of
physical interest. These questions are very difficult and we therefore conclude
that if there is a chance, possibly by a change of variables, to get a problem which
is well posed in $L_p$ we should take it.
One of the main conclusions of this long story is that we have to live with $L_2$
norms in the first order case. It is well known that an $L_2$ function might be
unbounded in $L_\infty$ and the error bounds in $L_2$ which we will derive shortly might there-
fore look quite pointless. At the end of this series of talks we will however see
that an assumption of some extra smoothness of the solution of the differential
equation will enable us to get quite satisfactory bounds in the maximum norm as well.
In this section we have seen examples of the use of conservation laws; the
energy was conserved for the wave equation. Similar considerations went into the
derivation of the heat equation. There is an ongoing controversy whether the discrete
models necessarily should be made to satisfy one or more laws of this kind. First
of all, it is of course not always possible to build all conservation laws into a
difference scheme because the differential equation might have an infinite number of
them. Secondly a distinction should be made between problems which have sufficiently
smooth solutions and those which do not. In the latter case the fulfilment of the
most important conservation laws often seems an almost necessary requirement,
especially in nonlinear problems. When we have smooth solutions we are however
frequently better off choosing from a wider class of schemes. The error bounds,
soon to be developed, give quite good information on convergence, etc., and it
might even be argued that the accuracy of a scheme, not designed to fulfil a certain
conservation law, might conveniently be checked during the course of a computation
by calculating the appropriate quantity.
6. Stability, error bounds and a perturbation theorem
As in the case of a linear differential equation we can introduce a solution
operator $E_h(nk, n_1 k)$, $0 \le n_1 \le n$, for any finite difference scheme. It is the
mapping of the approximate solution at $t = n_1 k$ into the one at $t = nk$. For explicit
schemes the solution operator is just a product of the particular difference
operators on the various time levels.
Let us write a one step implicit scheme symbolically as
$$(I + kQ_{-1})\, u(x,t+k) = (I + kQ_0)\, u(x,t)$$
where $Q_0$ and $Q_{-1}$ are difference operators, and assume that $(I + kQ_{-1})^{-1}$ exists and
is uniformly bounded in the norm to be considered. Then
$$E_h(t+k,t) = (I + kQ_{-1})^{-1} (I + kQ_0)$$
and there is no difficulty in writing up a formula for $E_h(nk, n_1 k)$.
A simple device enables us to write multistep schemes as one step systems. We
illustrate this by changing the second difference scheme of Section 3 into this form.
Introducing the vector variable
$$v(x,t) = \begin{pmatrix} u(x,t+k) \\ u(x,t) \end{pmatrix}$$
the difference scheme takes the form
$$v(x,t+k) = \begin{pmatrix} 2kD_-D_+ & I \\ I & 0 \end{pmatrix} v(x,t) .$$
The same device works for any multistep scheme and also when we have a system of
difference equations. When we speak about the solution operator for a multistep
scheme we will always mean the solution operator of the corresponding one step system.
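The block step above can be sketched in a few lines on a periodic mesh (a sketch; the function name and grid parameters are our own choices):

```python
# One-step (companion) form of the two-step scheme
#   u(x,t+2k) = u(x,t) + 2k D-D+ u(x,t+k)
# on a periodic grid; v stacks the two most recent time levels (u_new, u_old).
def step_once(u_new, u_old, k, h):
    """One application of the block scheme to v = (u_new, u_old)."""
    m = len(u_new)
    # D-D+ applied to the newest level: the standard second difference.
    d2 = [(u_new[(j + 1) % m] - 2 * u_new[j] + u_new[(j - 1) % m]) / h**2
          for j in range(m)]
    return [u_old[j] + 2 * k * d2[j] for j in range(m)], u_new
```

Constant data is a steady state of the scheme, which gives an immediate sanity check.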
We will now introduce our stability definitions. Stability is nothing but the
proper finite difference analogue of well posedness.
Definition A finite difference scheme is stable in $L_p$ if there exist constants $\alpha$
and C such that
$$\|E_h(nk, n_1 k)f\|_{p,h} \le C \exp(\alpha(nk - n_1 k))\, \|f\|_{p,h} .$$
The finite difference schemes are defined at mesh points only. Therefore we
use a discrete $L_p$ norm in this context, defined by
$$\|u\|_{p,h} = \Big( \sum_{x \in R_h} h^s\, |u(x)|^p \Big)^{1/p} .$$
It should be stressed that, for each individual mesh size, all our operators
are bounded. Therefore the non trivial feature of the definitions is that the
constants C and $\alpha$ are independent of the mesh sizes as well as the initial values.
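The discrete norm just defined is trivial to compute (a sketch; the function name and the flat list representation of the mesh function are our own):

```python
def lp_h_norm(u, h, p=2, s=1):
    """Discrete L_p norm ||u||_{p,h} = (sum_x h^s |u(x)|^p)^(1/p)
    for a mesh function u given as a list of values on a mesh in R^s."""
    return sum(h**s * abs(x)**p for x in u) ** (1.0 / p)
```

For example, with h = 1 the values (3, 4) have discrete $L_2$ norm 5.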
Frequently we will use another stability definition.
Definition A scheme is strongly stable with respect to a norm $|||\cdot|||_{p,h}$,
uniformly equivalent to $\|\cdot\|_{p,h}$, if there exists a constant $\alpha$ such that
$$|||E_h((n+1)k, nk)f|||_{p,h} \le (1 + \alpha k)\, |||f|||_{p,h} .$$
We recall that $\|\cdot\|_{p,h}$ and $|||\cdot|||_{p,h}$ are uniformly equivalent norms if there
exists a constant C > 0, independent of h, such that
$$(1/C)\,\|f\|_{p,h} \le |||f|||_{p,h} \le C\,\|f\|_{p,h}$$
for all $f \in L_{p,h}$.
It is easy to verify that a scheme strongly stable with respect to some norm is
stable. The strong stability reflects an effort to control the growth of the solution
on a local level. Note that we have already established that certain finite diffe-
rence approximations to the heat equation are strongly stable with respect to the
maximum norm. Our proof that the $L_2$ norm of any solution of a symmetric first order
system has a limited growth rate gives hope that certain difference schemes for such
problems will turn out to be strongly stable with respect to the $L_{2,h}$ norm.
In many cases we will however be forced to choose a norm different from $\|\cdot\|_{p,h}$
in order to assure strong stability. For a discussion of this difficult subject we
refer to Kreiss [1962] and Richtmyer and Morton [1967].
For any stable scheme, the coefficients of which do not depend on n, there exists
a norm with respect to which the scheme is strongly stable. This can be shown by
the following trick which the author learned from Vidar Thomée.
The fact that the coefficients of the difference schemes do not depend on time
makes $E_h(nk, n_1 k)$ a function of $n - n_1$ only. We can therefore write it as $E_h(nk - n_1 k)$.
Introduce
$$|||f|||_{p,h} = \sup_{l \ge 0} \|e^{-\alpha lk} E_h(lk)f\|_{p,h} .$$
It is easy to show that this is a norm. It is equivalent to $\|\cdot\|_{p,h}$ because, by
stability and a choice of l = 0,
$$\|f\|_{p,h} \le |||f|||_{p,h} \le C\,\|f\|_{p,h} .$$
Our difference scheme is clearly strongly stable because
$$|||E_h(k)f|||_{p,h} = \sup_{l \ge 0} \|e^{-\alpha lk} E_h(lk)E_h(k)f\|_{p,h}
= e^{\alpha k} \sup_{l \ge 0} \|e^{-\alpha(l+1)k} E_h((l+1)k)f\|_{p,h} \le e^{\alpha k}\, |||f|||_{p,h} .$$
We could consider using a weaker stability definition. A closer study gives the
following analogue of the Hadamard condition.
Definition A finite difference scheme is weakly stable in $L_p$ if there exist
constants $\alpha$, C and q such that
$$\|E_h(nk, n_1 k)f\|_{p,h} \le C(n - n_1 + 1)^q \exp(\alpha(nk - n_1 k))\, \|f\|_{p,h} .$$
A theory based on this definition would however suffer from the same weakness
as one based on the Hadamard definition of well posedness. For a detailed discussion
cf. Kreiss [1962] or Richtmyer and Morton [1967]. It is also clear that, in general,
we will stay closer to the laws of physics if we choose to work with the stronger
stability definitions.
This far we have only dealt with homogeneous problems. Going over to the inhomo-
geneous case is however quite simple. We demonstrate this for an explicit scheme
$$u(x,t+k) = u(x,t) + kQ_0 u(x,t) + kF(t) , \qquad u(x,0) = f(x) .$$
Using the solution operator we get
$$u(x,nk) = E_h(nk,0)f(x) + k\sum_{\nu=1}^{n} E_h(nk,\nu k)\, F((\nu-1)k) .$$
If the scheme is stable in $L_p$ we get
$$\|u(nk)\|_{p,h} \le C\Big(\exp(\alpha nk)\,\|f\|_{p,h} + k\sum_{\nu=1}^{n} e^{\alpha(nk-\nu k)} \max_{t\in[0,nk]} \|F(t)\|_{p,h}\Big) .$$
Now
$$k\sum_{\nu=1}^{n} e^{\alpha(nk-\nu k)} = \begin{cases} k(1 - e^{\alpha nk})/(1 - e^{\alpha k}) , & \text{if } \alpha \ne 0 , \\ nk , & \text{if } \alpha = 0 . \end{cases}$$
Notice that this is really essentially a matter of computing compound interest
on an original capital $\|f\|$ and periodic savings $\|F(t)\|$. The formalism is known
as Duhamel's principle.
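The closed form of the geometric sum just used can be checked against direct summation (a sketch; the function name and the sample parameters are our own):

```python
import math

def duhamel_weight(alpha, k, n):
    """k * sum_{nu=1}^{n} exp(alpha*(nk - nu*k)) in closed form:
    k(1 - e^{alpha n k})/(1 - e^{alpha k}) for alpha != 0, and nk for alpha = 0."""
    if alpha == 0.0:
        return n * k
    return k * (1 - math.exp(alpha * n * k)) / (1 - math.exp(alpha * k))
```

Direct summation of the twenty terms for, say, alpha = 0.3 and k = 0.1 agrees with the closed form to rounding error.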
Its most common application is to error bounds for finite difference schemes.
Put the solution of the differential equation into the difference scheme. As was
pointed out before we will then get an extra inhomogeneous term, the local truncation
error, of the form $k\tau(x,nk,h)$. Introduce the error, which is the difference between
the approximate and the exact solution. Subtract the two difference equations.
The error will then satisfy the same difference equation with the truncation error
as an inhomogeneous term. It is easy to see from our estimate that we have conver-
gence in $L_p$ for $0 \le t \le T$ if $\max_{0\le nk\le T} \|\tau(nk,h)\|_{p,h}$ goes uniformly to zero with the
mesh size and that we have a rate of convergence $h^r$ if $\tau(nk,h) = O(h^r)$ uniformly.
We recall that $\tau$ can be computed using just Taylor series.
We now turn to the theorem on perturbation mentioned in the previous section.
For simplicity we give a proof only for time independent coefficients.
Theorem Consider a finite difference scheme
$$u(x,(n+1)k) = Q\, u(x,nk) ,$$
stable in $L_p$, satisfying
$$\|u(nk)\|_{p,h} \le C \exp(\alpha nk)\, \|u(0)\|_{p,h} .$$
Let $Q'$ be an operator, uniformly bounded in $L_{p,h}$. Then solutions of
$$v(x,(n+1)k) = (Q + kQ')\, v(x,nk)$$
satisfy
$$\|v(\nu k)\|_{p,h} \le C \exp(\gamma \nu k)\, \|v(0)\|_{p,h}$$
with
$$\gamma = \alpha + C e^{-\alpha k}\, \|Q'\|_{p,h} .$$
Proof Consider the $2^\nu$ terms in the development of $e^{-\alpha\nu k}(Q + kQ')^\nu = (e^{-\alpha k}Q + ke^{-\alpha k}Q')^\nu$.
A term containing j factors $ke^{-\alpha k}Q'$ contains at most (j+1) factors of the form
$(e^{-\alpha k}Q)^m$, m some natural number.
The norm of such a term is by our assumption bounded by $k^j C^{j+1} \|Q'\|^j e^{-\alpha kj}$.
Thus
$$e^{-\alpha\nu k}\,\|(Q + kQ')^\nu\| \le \sum_{j=0}^{\nu} \binom{\nu}{j} k^j C^{j+1} \|Q'\|^j e^{-\alpha kj}
= C(1 + kCe^{-\alpha k}\|Q'\|)^\nu \le C e^{\tilde{\gamma}\nu k} , \quad \tilde{\gamma} = Ce^{-\alpha k}\|Q'\| ,$$
which is the desired estimate with $\gamma = \alpha + \tilde{\gamma}$.
Thus we have verified a finite difference version of our perturbation theorem.
A differential equation theorem can now be derived by taking, for any particular
differential equation, a stable difference scheme, applying the theorem just proved
and taking a limit by letting h and k go to zero.
We have not shown that a stable scheme can always be found for any well posed
differential equation but that is in fact the case. Also notice that there is
never any chance to derive stronger inequalities for a finite difference scheme than
those which hold for the corresponding differential equation. The most we can hope
is to get exact analogues of what is true in the continuous case.
We leave it to the reader to give another proof of our latest theorem using
the norm $|||\cdot|||_{p,h}$ which we constructed in our proof that any stable scheme is
strongly stable with respect to at least one norm.
7. The von Neumann condition, dissipative and multistep schemes.
In this section we will use Fourier series to derive stability conditions and
also introduce a number of useful schemes for hyperbolic equations.
We first consider the periodic problem $\partial_t u = \partial_x u$, $t \ge 0$, $u(x,0) = f(x) = f(x+2\pi)$.
If f(x) is sufficiently smooth it can be developed in a convergent Fourier
series
$$f(x) = \sum_{\nu=-\infty}^{+\infty} a_\nu e^{i\nu x} .$$
The solution of the equation takes the form
$$u(x,t) = \sum_{\nu=-\infty}^{+\infty} a_\nu e^{i\nu(x+t)} .$$
The Euler difference scheme
$$v(x,t+k) = v(x,t) + kD_0 v(x,t) , \qquad v(x,0) = f(x)$$
can be studied in the same way. Its solution is
$$v(x,nk) = \sum_{\nu=-\infty}^{+\infty} a_\nu (1 + i\lambda \sin \nu h)^n e^{i\nu x}$$
where $\lambda = k/h$.
In contrast to the differential equation case the amplitude of the Fourier com-
ponents will grow. This is so because $|1 + i\lambda \sin \nu h| = \sqrt{1 + \lambda^2 \sin^2 \nu h} > 1$ for
$\sin \nu h \ne 0$. From our discussion of the Courant-Friedrichs-Lewy condition we know that
$\lambda = 1$ would be ideal and we see that choosing $\lambda$ equal to a constant will lead to
very rapidly growing high frequency components. It is easy to show that for many
very smooth initial value functions this very strong amplification of high frequency
components will lead to arbitrarily large and wildly divergent approximate solutions.
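The growth of a single Fourier mode under the Euler scheme is easily computed (a sketch; the function name and sample frequencies are our own choices):

```python
import cmath, math

def euler_growth(lam, nu_h, n):
    """Modulus of (1 + i*lam*sin(nu*h))**n, the amplification after n steps
    of the Fourier mode with phase angle nu_h = nu*h."""
    return abs((1 + 1j * lam * math.sin(nu_h)) ** n)
```

The constant mode is untouched, while for lam = 1 the mode with $\nu h = \pi/2$ grows by $\sqrt{2}$ per step and is astronomically large after a hundred steps.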
The amplification of the lowest frequency modes is however not very large.
This might lead us to the following idea. Replace the initial value function by a
fixed partial sum of its Fourier series. If the initial data is sufficiently smooth
we can do this, changing the values of the initial value function and the solution
by an arbitrarily small amount. If we use the new initial value for the finite
difference scheme we can see, from the explicit solution formula, that the discrete
solution will converge to the correct one when h goes to zero. In fact the same
argument shows that we could proceed with a $\lambda$ larger than 1 because for any constant $\lambda$,
$(1 + i\lambda \sin \nu h)^n$ will converge to $e^{i\nu nk}$ for any fixed value of $\nu$.
This approach however suffers from the same weakness as a theory for differen-
tial equations based on analytic functions only. In fact a finite Fourier series
represents an analytic function and much of Hadamard's criticism of the Cauchy-
Kowaleski type theory carries over to the finite difference case.
To see that something is drastically wrong with our argument above we urge the
reader to carry out a few steps with the Euler method using $\lambda = 10$ and an initial
value function which is $\varepsilon$ at one mesh point and zero elsewhere. The rapid growth
of this special solution will assure us of a totally unacceptable growth of round
off errors. One could say that the errors of measurement which played an important
part in Hadamard's argument are replaced by the round off errors. From the error
bound in Section 6 we see that we will not be seriously affected by round off errors
if a difference scheme is stable. This is thus a reason, perhaps the most impor-
tant one, why we insist on using only stable schemes for computations.
There is a simple remedy for the lack of stability of the Euler scheme namely
the addition of a so called dissipation term. The term corresponds to a finite
difference approximation of yet another term in the Taylor expansion of u(x,t+k)
with respect to t. This way we get the Lax-Wendroff scheme.
$$v(x,t+k) = v(x,t) + kD_0 v(x,t) + (k^2/2) D_-D_+ v(x,t) , \qquad v(x,0) = f(x) .$$
The coefficient of $e^{i\nu x}$ is now amplified by a factor
$$1 + i\lambda \sin \nu h - 2\lambda^2 \sin^2(\nu h/2)$$
in each step.
It is elementary to verify that this factor is less than or equal to one in absolute
value for $\lambda \le 1$. This clearly ensures strong $L_2$ stability. From this, convergence
follows as well as a relative insensitivity to round off errors.
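The claim about the Lax-Wendroff amplification factor can be checked by scanning the frequencies (a sketch; the function name and sampling resolution are our own):

```python
import math

def lw_max_amplification(lam, samples=1000):
    """Maximum over sampled frequencies th = nu*h of
    |1 + i*lam*sin(th) - 2*lam^2*sin^2(th/2)|, the Lax-Wendroff factor."""
    best = 0.0
    for j in range(samples + 1):
        th = 2 * math.pi * j / samples
        g = complex(1 - 2 * lam**2 * math.sin(th / 2) ** 2,
                    lam * math.sin(th))
        best = max(best, abs(g))
    return best
```

For lam below one the maximum never exceeds one, while already at lam = 1.4 the mode at $\nu h = \pi$ is amplified by almost three per step.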
We have now developed a simple technical tool which allows us to decide the
qualities of all the schemes suggested for the heat equation. The results will be
revealed shortly.
The dissipation of the Lax-Wendroff scheme acts to damp out the higher frequency
modes of the solution and this is frequently just as well because they must contain
rather serious phase errors. For sufficiently smooth solutions, which means quickly
decreasing Fourier coefficients with increasing $\nu$, these modes play a very unimpor-
tant part in the representation of the solution. Similar considerations frequently
make very much sense in cases when we do not have constant coefficients. Heuristi-
cally we can argue that the variability of the coefficients will make various
Fourier modes interact in a way which is very hard to analyse. However we can
expect serious phase errors for high modes. Not only will these components be in
error but they will interact with other components in a totally erroneous way. In
such a case it seems advisable to damp out such modes in the discrete model.
Sometimes however we are quite anxious to have an energy preserving discrete
model. This is for instance the case when we have to calculate over long periods of
time and with only very weak forcing functions. One simple scheme which preserves
the energy for the case $\partial_t u = \partial_x u$ is the leap frog scheme, also known as the mid-point
rule when it is used for ordinary differential equations,
$$v(x,t+k) = v(x,t-k) + 2kD_0\, v(x,t) .$$
Another one is the Crank-Nicolson scheme
$$(I - (k/2)D_0)\, v(x,t+k) = (I + (k/2)D_0)\, v(x,t) .$$
Fourier analysis shows that the amplification per step for the Crank-Nicolson scheme
is $(1 + i(\lambda/2)\sin \nu h)/(1 - i(\lambda/2)\sin \nu h)$ and thus that the amplitude is preserved.
For the leap frog scheme we look for solutions of the form $v(x,nk) = \mu^n e^{i\nu x}$ and
get, by the solution of a quadratic equation, the two roots
$$\mu_{1,2} = i\lambda \sin \nu h \pm \sqrt{1 - \lambda^2 \sin^2 \nu h}$$
which, for $\lambda \le 1$, both lie on the unit circle. The multistep
character is reflected in the existence of two independent solutions.
A similar analysis for the backward scheme shows that $|1 - i\lambda \sin \nu h|^{-1} \le 1$, which
again implies strong stability.
We now turn to a Fourier analysis of the schemes suggested for the heat equation.
Using $\lambda$ for $k/h^2$ we find that the amplification factor for Euler's method is
$1 - 4\lambda \sin^2(\nu h/2)$; thus it is stable for $\lambda \le \tfrac{1}{2}$. For the Crank-Nicolson scheme it is
$(1 - 2\lambda \sin^2(\nu h/2))/(1 + 2\lambda \sin^2(\nu h/2))$; unconditionally stable. For the backward
scheme $1/(1 + 4\lambda \sin^2(\nu h/2))$; unconditionally stable. For the mid-point rule an
Ansatz of the form $\mu^n e^{i\nu x}$ leads to $\mu_{1,2} = -4\lambda \sin^2(\nu h/2) \pm \sqrt{1 + (4\lambda \sin^2(\nu h/2))^2}$,
which shows that the scheme is unstable for any constant $\lambda$. In a similar way we
could also show that the Dufort-Frankel scheme is unconditionally stable.
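The three one step amplification factors just listed can be tabulated side by side (a sketch; the function name is our own):

```python
import math

def heat_factors(lam, th):
    """Amplification factors (lam = k/h^2, th = nu*h) of Euler's method,
    Crank-Nicolson and the backward scheme for the heat equation."""
    s2 = math.sin(th / 2) ** 2
    euler = 1 - 4 * lam * s2
    crank_nicolson = (1 - 2 * lam * s2) / (1 + 2 * lam * s2)
    backward = 1 / (1 + 4 * lam * s2)
    return euler, crank_nicolson, backward
```

At the worst frequency $\nu h = \pi$, Euler's factor leaves [-1, 1] as soon as lam exceeds one half, while the other two factors stay inside for every lam.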
Another interesting method, of fourth order accuracy in time, is Milne's method
$$(I - \tfrac{k}{3}Q)\, v(x,t+k) = (I + \tfrac{k}{3}Q)\, v(x,t-k) + \tfrac{4k}{3}Q\, v(x,t)$$
where Q stands for $D_0$ or $D_+D_-$. The roots of the corresponding quadratic equation
are
$$\mu_{1,2} = \Big(\tfrac{2}{3}kq \pm \sqrt{1 + \tfrac{1}{3}(kq)^2}\,\Big)\Big/\Big(1 - \tfrac{1}{3}kq\Big)$$
where kq stands for $i\lambda \sin \nu h$ or $-4\lambda \sin^2(\nu h/2)$.
Thus the method preserves energy for the hyperbolic case, provided $\lambda \le \sqrt{3}$, and is
violently unstable for the parabolic case.
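The two behaviours can be seen directly from the roots of the quadratic (a sketch; the function name and sample values of kq are our own choices):

```python
import cmath

def milne_roots(z):
    """Roots of (1 - z/3) mu^2 - (4z/3) mu - (1 + z/3) = 0, z = k*q,
    the characteristic equation of Milne's method."""
    disc = cmath.sqrt(1 + z * z / 3)
    return ((2 * z / 3 + disc) / (1 - z / 3),
            (2 * z / 3 - disc) / (1 - z / 3))
```

For purely imaginary z (the hyperbolic case) with $|z| \le \sqrt{3}$ both roots lie exactly on the unit circle; for real negative z of moderate size (the parabolic case) one root already has modulus well above one.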
The instability of the Milne and mid-point methods is closely related to the
well known weak stability of these methods when applied to ordinary differential
equations. Cf. Dahlquist [1956], [1963]. In fact parabolic equations are very stiff
equations and weakly stable schemes are therefore quite useless.
We are now well prepared for the following definition.
Definition Consider a linear finite difference scheme
$$u_{n+1} = Q u_n$$
and define its symbol by
$$\hat{Q}(\xi) = \exp(-i\langle \xi, x\rangle)\, Q\, \exp(i\langle \xi, x\rangle) .$$
The difference scheme satisfies the von Neumann condition if the spectral radius of
its symbol is bounded by $e^{\alpha k}$ for some constant $\alpha$.
The von Neumann condition is the finite difference analogue of the Petrowskii
condition. It is a necessary condition for stability. It was shown by Kreiss [1962]
to be equivalent to weak stability in $L_2$ in the case of constant coefficients. It
is not a sufficient stability condition for problems with variable coefficients, for
it suffers from the same inadequacies as the Petrowskii condition. Cf. Kreiss [1962].
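For a multistep scheme written in one step form the symbol is a matrix, and the von Neumann condition amounts to a bound on its spectral radius. For the leap frog scheme for $\partial_t u = \partial_x u$ the one step symbol is the 2 x 2 matrix with rows $(2i\lambda \sin \nu h,\ 1)$ and $(1,\ 0)$; its spectral radius can be computed by hand (a sketch; the function names are our own, and the companion-matrix form is the one derived earlier in these notes):

```python
import cmath, math

def spectral_radius_2x2(a, b, c, d):
    """Spectral radius of [[a, b], [c, d]] via the quadratic formula."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr / 4 - det)
    return max(abs(tr / 2 + disc), abs(tr / 2 - disc))

def leapfrog_symbol_radius(lam, th):
    """Spectral radius of the one-step leap frog symbol at th = nu*h."""
    return spectral_radius_2x2(2j * lam * math.sin(th), 1, 1, 0)
```

For $\lambda \le 1$ the radius equals one at every frequency (the von Neumann condition holds with $\alpha = 0$); for $\lambda > 1$ it exceeds one at $\nu h = \pi/2$.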
8. Semibounded operators
We now formulate an abstract condition on the differential operator P and its
boundary conditions in order to assure well posedness. We will also see that we
can use the same argument to prove stability for finite difference schemes. We
begin by forming, using the $L_2$ inner product $\int u \bar{v}\, dx$,
$$\partial_t(u,u) = (u, Pu) + (Pu, u) = 2\,\mathrm{Re}(Pu, u) .$$
Definition An operator P, with its boundary conditions, is semibounded if
$$\mathrm{Re}(Pu, u) \le \text{Const.}\, \|u\|^2$$
for all sufficiently smooth functions u satisfying the boundary conditions.
For a semibounded operator we clearly get the a priori inequality
    ‖u(t)‖ ≤ exp(Const. t) ‖u(0)‖
and the problem is well posed in L₂ if solutions exist. Uniqueness immediately
follows from the inequality. In order to assure existence we have to be sure that
we do not have too many boundary conditions. Cf. the discussion in §2. For certain
types of equations it is known that a solution exists if the problem is minimally
semibounded.
Definition An operator P, with its boundary conditions, is minimally semibounded
if P is semibounded and this property is lost by removing any of the linearly
independent boundary conditions.
In periodic cases it is easy to verify that the following expressions are semi-
bounded:
    A∂_x + ∂_xA,    A hermitian (the constant = 0);
    ∂_xA∂_x,        A real, symmetric and positive definite;
    i∂_xA∂_x,       A hermitian (the constant = 0);
    ∂_xA∂_x,        A skew hermitian (the constant = 0).
A sum of such expressions is also semibounded. We note that we are in no way
restricted to only one space variable. We leave it to the reader to verify which of
our examples in §2 are semibounded for periodic cases.
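The first entry of this list can be checked numerically in the periodic case. The sketch below (the coefficient A(x) and the test function u are made up) uses spectral differentiation on a periodic grid and confirms that Re(Pu, u) vanishes for P = A∂_x + ∂_xA with A real, i.e. hermitian:

```python
import numpy as np

N = 64
x = np.arange(N) / N

def ddx(f):
    # Spectral derivative on the periodic unit interval; exact for
    # band-limited functions such as the ones used below.
    kfreq = 2j * np.pi * np.fft.fftfreq(N, d=1.0 / N)
    return np.fft.ifft(kfreq * np.fft.fft(f))

A = 2.0 + np.cos(2 * np.pi * x)                 # real (hermitian) coefficient
u = np.exp(2j * np.pi * x) + 0.5 * np.exp(-8j * np.pi * x)

Pu = A * ddx(u) + ddx(A * u)                    # P = A d/dx + d/dx A
inner = np.mean(Pu * np.conj(u))                # (Pu, u) over one period
print(abs(inner.real))                          # Re(Pu, u) = 0 up to rounding
```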
To illustrate the theory we consider the heat equation on 0 ≤ x ≤ 1 with the
boundary conditions α₀u(0) + β₀∂_xu(0) = 0 and α₁u(1) + β₁∂_xu(1) = 0, α_i, β_i real.
The corresponding periodic problem is easy to treat because
    ∂_t(u(t), u(t)) = 2Re(u, ∂_x²u) = -2‖∂_xu‖² ≤ 0.
No term comes from the boundary because of the periodicity.
For the actual case we get
    ∂_t(u(t), u(t)) = -2‖∂_xu‖² + 2Re ∂_xu(1)ū(1) - 2Re ∂_xu(0)ū(0).
This illustrates the necessity of boundary conditions. The first expression is
negative but the L₂ norm of ∂_xu cannot be used to give a bound for ∂_xu at a
particular point. Without the boundary conditions we could therefore not conclude
that the right hand side is bounded by Const. ‖u‖² for all u. Using the boundary
conditions we see that
    2Re ∂_xu(1)ū(1) - 2Re ∂_xu(0)ū(0) = -2(α₁/β₁)|u(1)|² + 2(α₀/β₀)|u(0)|²
if β₀ and β₁ ≠ 0. Our first observation is that we clearly have semiboundedness if
α₁β₁ ≥ 0 and α₀β₀ ≤ 0. We now ask whether we can do without such conditions.
In fact we can, because the pointwise value of a function can, in one dimension, be
estimated by the L₂ norms of the function and its first derivative. This is a
simple special case of a Sobolev inequality. To get this inequality carry out the
following calculation:
    max_x |u(x)|² - min_x |u(x)|² = 2Re ∫_{x_min}^{x_max} ū ∂_xu dx
where x_max (x_min) is a value for which the maximum (minimum) of |u(x)|² is
attained.
A standard inequality shows that
    |∫_{x_min}^{x_max} ū ∂_xu dx| ≤ ∫₀¹ |u ∂_xu| dx ≤ (ε/2)‖∂_xu‖² + (1/(2ε))‖u‖².
Therefore
    max_x |u(x)|² ≤ ε‖∂_xu‖² + (1/ε)‖u‖² + min_x |u(x)|²
but
    min_x |u(x)|² ≤ ∫₀¹ |u(x)|² dx = ‖u‖²
and thus
    max_x |u(x)|² ≤ ε‖∂_xu‖² + (1 + 1/ε)‖u‖².
All we have to do to use this result for our heat equation is to make a choice
of a sufficiently small ε.
The argument can also be carried out when β₀ or β₁, or both, are zero.
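A quick numerical sanity check of the inequality just derived, for one made-up smooth function on [0, 1] and several values of ε:

```python
import numpy as np

# Check  max|u|^2 <= eps*||u_x||^2 + (1 + 1/eps)*||u||^2  on [0, 1].
x = np.linspace(0.0, 1.0, 20001)
dx = x[1] - x[0]
u = np.sin(3 * np.pi * x) + x**2
ux = 3 * np.pi * np.cos(3 * np.pi * x) + 2 * x   # exact derivative

norm_u2 = np.sum(u * u) * dx                     # L2 norms (Riemann sums)
norm_ux2 = np.sum(ux * ux) * dx
max_u2 = np.max(u * u)

ok = all(max_u2 <= eps * norm_ux2 + (1.0 + 1.0 / eps) * norm_u2
         for eps in (0.05, 0.5, 5.0))
print(ok)
```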
We have used an L₂ norm instead of a maximum norm in this study of the heat
equation. At this point mathematical convenience has thus been allowed to dominate
over our physical intuition.
It is obvious that we can prove well posedness in L₂ if we can find any norm,
equivalent to the L₂ norm, with respect to which the operator is semibounded. One
of the big problems with the energy method is often the difficulty of finding an
appropriate norm for which an operator is semibounded. The failure to show that a
certain problem, which satisfies the necessary Petrowskii condition, is semibounded
may therefore be due to the problem being ill posed, but it may also reflect a lack
of expertise on our part in dealing with Sobolev inequalities, a wrong choice of
inner product, or even a wrong choice of dependent variables.
We will now indicate how we can use these ideas to construct difference
expressions which satisfy similar bounds. This in fact will provide us with a most
useful guideline for the construction of stable schemes. It is thus appropriate
to replace an expression of the form A(x)∂_x + ∂_xA(x), A hermitian, by A(x)D₀ +
D₀A(x). In the periodic case this will give us an antisymmetric operator just as
for the differential operator. Similarly ∂_xA(x)∂_x could be replaced by D₀A(x)D₀
or, even better, by D₋A(x+h/2)D₊. Higher order accuracy can also be obtained. Thus
D₀ - (h²/6)D₀D₋D₊ is a more accurate approximation to ∂_x but it still preserves the
crucial quality of being an antisymmetric operator. The proof that these difference
operators have the required properties follows easily from summation by parts. When
the boundary conditions are more involved it is advisable to start by carrying out a
detailed analysis of the boundary terms for the differential equation to assure
semiboundedness, and then try to pick sufficiently accurate difference analogues of
the boundary conditions to ensure a preservation of semiboundedness.
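These claims can be verified directly on a periodic mesh, where D₀, D₊ and D₋ become matrices (a sketch; the grid size and the coefficient A(x) are made up): AD₀ + D₀A and the higher order operator should be antisymmetric, and D₋A(x+h/2)D₊ symmetric and negative semidefinite.

```python
import numpy as np

N = 40
h = 1.0 / N
x = np.arange(N) * h
I = np.eye(N)
Ep = np.roll(I, 1, axis=1)              # (Ep u)_i = u_{i+1}, periodic shift
Em = Ep.T
D0, Dp, Dm = (Ep - Em) / (2 * h), (Ep - I) / h, (I - Em) / h

A = np.diag(2.0 + np.cos(2 * np.pi * x))               # A(x) > 0, symmetric
Ah = np.diag(2.0 + np.cos(2 * np.pi * (x + h / 2)))    # A(x + h/2)

Q1 = A @ D0 + D0 @ A                    # replaces A d/dx + d/dx A
Q2 = Dm @ Ah @ Dp                       # replaces d/dx (A d/dx)
Q4 = D0 - (h**2 / 6) * D0 @ Dm @ Dp     # higher order replacement for d/dx

anti1 = np.allclose(Q1.T, -Q1)          # antisymmetric: (Q1 u, u) = 0
anti4 = np.allclose(Q4.T, -Q4)
sym2 = np.allclose(Q2.T, Q2)
negdef2 = np.linalg.eigvalsh(Q2).max() <= 1e-8   # Re(Q2 u, u) <= 0
print(anti1, anti4, sym2, negdef2)
```

All four properties are exactly the discrete analogues of the summation-by-parts identities mentioned above.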
For a very good presentation of these matters we refer to Kreiss [1963]. Cf.
also Richtmyer and Morton [1967].
9. Some applications of the energy method
The purpose of this section is to use the building blocks from §7 and §8, the
semibounded difference operators and the stable multistep scheme, in order to con-
struct stable schemes for a wide class of problems.
We first consider the backward scheme
    (I - kQ)u_{n+1} = u_n
where we assume that Re(Qu, u) ≤ C‖u‖².
Take the scalar product of both sides with u_{n+1} and use a series of simple
inequalities:
    ‖u_{n+1}‖ ‖u_n‖ ≥ Re(u_{n+1}, u_n) = Re(u_{n+1}, u_{n+1} - kQu_{n+1})
        = ‖u_{n+1}‖² - k Re(Qu_{n+1}, u_{n+1}) ≥ ‖u_{n+1}‖² - Ck‖u_{n+1}‖².
Therefore, if Ck < 1,
    ‖u_{n+1}‖ ≤ (1 - Ck)⁻¹ ‖u_n‖
and the scheme is L₂ stable for small enough k.
This calculation also shows that (I - kQ) always has a bounded inverse in L₂
for k sufficiently small if Q is semibounded, a fact most important for all implicit
schemes.
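A small experiment (made-up grid and step sizes) illustrates the conclusion: with the semibounded Q = D₋D₊ we have C = 0, and the backward scheme keeps the L₂ norm from growing even for k/h² far beyond the explicit stability limit.

```python
import numpy as np

N = 32
h = 1.0 / N
I = np.eye(N)
Ep = np.roll(I, 1, axis=1)
Q = (Ep - 2.0 * I + Ep.T) / h**2      # periodic D-D+, (Qu, u) <= 0, so C = 0

k = 10.0 * h                          # k/h^2 = 320, far beyond the explicit limit
u = np.random.default_rng(0).standard_normal(N)
norms = [np.linalg.norm(u)]
for _ in range(50):
    u = np.linalg.solve(I - k * Q, u)     # backward step: solve (I - kQ) u_new = u
    norms.append(np.linalg.norm(u))

monotone = all(b <= a * (1.0 + 1e-12) for a, b in zip(norms, norms[1:]))
print(monotone)
```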
We now consider the Crank-Nicolson scheme
    (I - (k/2)Q)u_{n+1} = (I + (k/2)Q)u_n.
Rewrite it as
    u_{n+1} - u_n = (k/2)Q(u_{n+1} + u_n)
and take the scalar product with u_{n+1} + u_n. The left hand side comes out to be
‖u_{n+1}‖² - ‖u_n‖² and the right hand side is less than (k/2)C‖u_{n+1} + u_n‖², which
is less than or equal to kC(‖u_{n+1}‖² + ‖u_n‖²). Thus
    ‖u_{n+1}‖² ≤ (1 + kC)/(1 - kC) ‖u_n‖².
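The same experiment for Crank-Nicolson (a sketch, parameters made up): with C = 0 the bound says ‖u_{n+1}‖ ≤ ‖u_n‖, i.e. the L₂ operator norm of (I - (k/2)Q)⁻¹(I + (k/2)Q) is at most 1 for every k.

```python
import numpy as np

N = 32
h = 1.0 / N
I = np.eye(N)
Ep = np.roll(I, 1, axis=1)
Q = (Ep - 2.0 * I + Ep.T) / h**2      # periodic D-D+: Re(Qu, u) <= 0, C = 0
k = 5.0 * h                           # k/h^2 = 160: no step restriction needed

# One-step operator of the Crank-Nicolson scheme.
CN = np.linalg.solve(I - 0.5 * k * Q, I + 0.5 * k * Q)
growth = np.linalg.norm(CN, ord=2)    # spectral norm = worst-case growth per step
print(growth)
```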
Of great interest is the study of schemes of the form
    (I - kQ₁)u_{n+1} = 2kQ₀u_n + (I + kQ₁)u_{n-1}.
The leap frog and Milne's methods have this form and if Q₀ = 0 it reduces to the
Crank-Nicolson scheme. For Q₀ and Q₁ we want to choose semibounded difference
operators such as those arising from a proper discretization of the spatial operator
of a hyperbolic, parabolic or Schrödinger equation, or of any other problem which
is semibounded in L₂. Frequently we also find both first order antisymmetric
operators as well as even order ones with symmetric positive definite coefficients
in the same problem. Where should we put these different pieces?
We know from a previous discussion that the mid-point rule is bad for stiff
problems such as parabolic ones. We should therefore avoid making Q₀ elliptic. We
therefore require that Q₀ is antisymmetric, i.e. (Q₀u,v) = -(u,Q₀v) for all u and v.
This is equivalent, for an operator with real coefficients, to requiring that
(Q₀u, u) = 0. We note that Q₁ could also contain antisymmetric parts, as in
Milne's method for ∂_tu = ∂_xu. For the operator Q₁ we only require that it is
semibounded. To simplify our arguments we assume that (Q₁u, u) ≤ 0.
We want to establish the L₂ stability of this class of schemes. For a two
step scheme the norm (‖u_n‖² + ‖u_{n+1}‖²)^{1/2} suggests itself. However we will
find that
    L_n = ‖u_{n+1}‖² + ‖u_n‖² - 2k(Q₀u_n, u_{n+1})
will be the appropriate expression, in the sense that L_n is equivalent to the one
first suggested and is a non increasing function of n. We have to put a restriction
on Q₀, namely k‖Q₀‖ ≤ 1 - δ, δ a constant > 0, in order to make L_n strictly
positive. We will see by an example that this is a most natural restriction.
We first show that L_n ≤ L_{n-1}. Rewrite the equation as
    u_{n+1} - u_{n-1} = kQ₁(u_{n+1} + u_{n-1}) + 2kQ₀u_n
and take the scalar product with u_{n+1} + u_{n-1}. Then
    ‖u_{n+1}‖² - ‖u_{n-1}‖² = (u_{n+1} + u_{n-1}, kQ₁(u_{n+1} + u_{n-1})) + 2k(u_{n+1} + u_{n-1}, Q₀u_n).
The first term on the right hand side is less than or equal to zero because of one
of our assumptions. Rearranging, using the antisymmetry of Q₀, and adding ‖u_n‖² on
both sides we get L_n ≤ L_{n-1}.
To show that L_n is positive and equivalent to the natural L₂ norm we start by
observing that
    |2k(Q₀u_n, u_{n+1})| ≤ 2(1-δ)‖u_{n+1}‖ ‖u_n‖ ≤ (1-δ)(‖u_{n+1}‖² + ‖u_n‖²).
Therefore
    δ(‖u_{n+1}‖² + ‖u_n‖²) ≤ L_n ≤ (2-δ)(‖u_{n+1}‖² + ‖u_n‖²).
To see that k‖Q₀‖ ≤ 1 - δ is a natural condition, consider the case Q₀ = D₀ and
Q₁ = 0. This Q₀ has, as is easily verified, an L₂ norm equal to 1/h. Thus the
restriction just means k/h ≤ 1 - δ, essentially the Courant-Friedrichs-Lewy
condition. This is a natural condition in terms of Q₀ alone because in the case
Q₁ = 0 the method is explicit.
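For leap frog with Q₀ = D₀ and Q₁ = 0, the estimate L_n ≤ L_{n-1} actually holds with equality, since the Q₁ term vanishes. A computation (sketch with made-up data) confirms that L_n is conserved up to rounding when k‖Q₀‖ = 1 - δ:

```python
import numpy as np

# Leap frog for u_t = u_x: Q0 = D0 (antisymmetric, ||Q0|| = 1/h), Q1 = 0.
N = 64
h = 1.0 / N
I = np.eye(N)
Ep = np.roll(I, 1, axis=1)
Q0 = (Ep - Ep.T) / (2.0 * h)

delta = 0.2
k = (1.0 - delta) * h                  # k/h = 1 - delta: the CFL restriction
x = np.arange(N) * h
um = np.sin(2.0 * np.pi * x)           # u_0
un = np.sin(2.0 * np.pi * (x + k))     # u_1: exact solution one step later

def L(un, unp1):
    # The energy L_n = ||u_{n+1}||^2 + ||u_n||^2 - 2k (Q0 u_n, u_{n+1}).
    return (np.dot(unp1, unp1) + np.dot(un, un)
            - 2.0 * k * np.dot(Q0 @ un, unp1))

Ls = []
for _ in range(300):
    unp1 = um + 2.0 * k * (Q0 @ un)    # leap frog step
    Ls.append(L(un, unp1))
    um, un = un, unp1

drift = max(Ls) - min(Ls)
print(drift)
```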
For a more general discussion and a comparison of the growth rates of the exact
and approximate solutions we refer to Johansson and Kreiss [1963].
Schemes of Dufort-Frankel type can be discussed in very much the same way.
We will now show that these ideas can be used to design stable and efficient
schemes, so called alternating direction implicit schemes, for certain two
dimensional equations. We suppose that our problem has the form
    ∂_tu = P₁u + P₂u
and that the operators and the boundary conditions are such that P₁ and P₂ are
semibounded. For simplicity we assume that
    Re(u, P_ju) ≤ 0,  j = 1,2,
and that we have finite difference approximations Q_j to P_j, j = 1,2, such that
    Re(u, Q_ju) ≤ 0,  j = 1,2.
We will consider the following two schemes
    (I - kQ₁)(I - kQ₂)u_{n+1} = u_n
and
    (I - (k/2)Q₁)(I - (k/2)Q₂)u_{n+1} = (I + (k/2)Q₁)(I + (k/2)Q₂)u_n.
These schemes are particularly convenient if Q₁ and Q₂ are one dimensional finite
difference operators. In that case we only have to invert one dimensional operators
of the form (I - akQ_i) and this frequently leads to considerable savings. This
becomes clear if we compare the work involved in solving a two dimensional heat
equation using an alternating direction implicit method with Q₁ = D₋ₓD₊ₓ,
Q₂ = D₋ᵧD₊ᵧ, with the application of the standard backward or Crank-Nicolson scheme
with Q = D₋ₓD₊ₓ + D₋ᵧD₊ᵧ. The former approach only involves solutions of linear
systems of tridiagonal type while the other, in general, requires more work.
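A sketch of the first scheme for the two dimensional heat equation (here with homogeneous Dirichlet data rather than periodicity; the grid and step sizes are made up) shows the promised structure: each step is two sweeps of one dimensional tridiagonal solves, and the L₂ norm decreases monotonically.

```python
import numpy as np

N = 16
h = 1.0 / (N + 1)
k = 0.1                                             # large step: scheme is implicit

# 1-D second difference D-D+ with homogeneous Dirichlet data: tridiagonal.
A = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
     + np.diag(np.ones(N - 1), -1)) / h**2
B = np.eye(N) - k * A                               # I - k Q_i, one-dimensional

xs = np.arange(1, N + 1) * h
u = np.outer(np.sin(np.pi * xs), np.sin(2.0 * np.pi * xs))   # initial data

norms = [np.linalg.norm(u)]
for _ in range(10):
    # (I - kQ1)(I - kQ2) u_new = u_old: two sweeps of 1-D solves.
    u = np.linalg.solve(B, u)                       # x-sweep: one solve per column
    u = np.linalg.solve(B, u.T).T                   # y-sweep
    norms.append(np.linalg.norm(u))

monotone = all(b <= a for a, b in zip(norms, norms[1:]))
print(monotone)
```

In production code each sweep would of course use a tridiagonal solver rather than a dense one; the point here is only the factored, dimension-by-dimension structure.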
The L₂ stability of the first scheme is very simple to prove, for I - kQ_i,
i = 1,2, both have inverses the L₂ norms of which are bounded by 1. The proof of the
stability of the other scheme is more involved. Let
    y_{n+1} = (I - (k/2)Q₂)u_{n+1}
and
    z_n = (I + (k/2)Q₂)u_n.
Then
    (I - (k/2)Q₁)y_{n+1} = (I + (k/2)Q₁)z_n
or
    y_{n+1} - z_n = (k/2)Q₁(y_{n+1} + z_n).
Forming the inner product with y_{n+1} + z_n, just as in the proof of the stability
of the Crank-Nicolson method, we get
    ‖y_{n+1}‖² - ‖z_n‖² = (k/2)Re(Q₁(y_{n+1} + z_n), y_{n+1} + z_n) ≤ 0.
Now
    ‖y_{n+1}‖² = ‖u_{n+1}‖² - k Re(Q₂u_{n+1}, u_{n+1}) + (k²/4)‖Q₂u_{n+1}‖²
and
    ‖z_n‖² = ‖u_n‖² + k Re(Q₂u_n, u_n) + (k²/4)‖Q₂u_n‖².
Therefore, because Re(Q₂u, u) ≤ 0,
    ‖u_{n+1}‖² + (k²/4)‖Q₂u_{n+1}‖² ≤ ‖u_n‖² + (k²/4)‖Q₂u_n‖².
It is easy to see that this implies L₂ stability if kQ₂ is a bounded operator. If
kQ₂ is not bounded we instead get stability with respect to a stronger norm, a
result which serves our purpose equally well.
We refer to an interesting paper by Strang [1968] for the construction of other
stable, accurate methods based on one dimensional operators.
10. Maximum norm convergence for L₂ stable schemes
In this section we will explain a result by Strang [1960] which shows that
solutions of L₂ stable schemes of a certain accuracy converge in maximum norm with
the same rate of convergence as in L₂, provided the solution of the differential
equation is sufficiently smooth.
Let
    u_{n+1} = Qu_n,   u₀(x) = f(x)
be a finite difference approximation to a linear problem,
    ∂_tu = Pu,   u(x,0) = f(x),
well posed in L₂.
To simplify matters we assume that the two problems are periodic. We also
assume that we have an L₂ stable scheme. It is known that if f is a sufficiently
smooth function the solution will also be quite smooth. We now attempt to establish
an asymptotic expansion of the error. Make the Ansatz
    u_n(x) = u(x,nk) + h^r e_r(x,nk) + h^{r+1} e_{r+1}(x,nk) + ...
where we choose r as the rate of convergence in L₂. If we substitute this
expression into the difference equation we find that the appropriate choices for
e_r, e_{r+1}, ... are solutions of equations of the form
    ∂_t e_j = P e_j + L_j u,
    e_j(0) = 0,
where the L_j are differential operators which appear in the formal expansion of the
truncation error. The solutions of a finite number of these equations are, under
our assumptions, quite smooth.
To end our discussion we have to verify that
    ε_{n,N}(x,h) = u_n(x) - u(x,nk) - Σ_{j=r}^{r+N} h^j e_j(x,nk)
is O(h^r) in the maximum norm for some finite N, i.e. that h^r e_r is indeed the
leading error term. This is done by a slight modification of the error estimate of
§6. We derive a difference equation for ε_{n,N} and find that its L₂ norm is
O(h^{r+N+1}). By assumption we have a periodic problem. The maximum norm of a mesh
function is therefore bounded by h^{-s/2} times its L₂,h norm over a period, where s
is the number of space dimensions. This concludes our proof.
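The inverse estimate invoked in the last step is elementary and easy to confirm (sketch): for a mesh function v on a periodic grid in s dimensions, max|v| ≤ h^{-s/2}‖v‖_{L₂,h}, simply because max|v_j|² ≤ Σ|v_j|².

```python
import numpy as np

# max|v| <= h^(-s/2) * ||v||_{L2,h}, with ||v||_{L2,h}^2 = h^s * sum(v^2).
rng = np.random.default_rng(1)
ok = True
for s in (1, 2):                        # number of space dimensions
    for N in (8, 32):
        h = 1.0 / N
        v = rng.standard_normal((N,) * s)
        l2h = np.sqrt(h**s * np.sum(v * v))     # discrete L2 norm over one period
        ok &= np.max(np.abs(v)) <= h ** (-s / 2.0) * l2h * (1.0 + 1e-12)
print(ok)
```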
We remark that an almost identical argument shows that we can relax our
stability requirements, requiring only weak stability (cf. §6), and still get the
same results for sufficiently smooth solutions.
REFERENCES
Brenner, P.; 1966, Math. Scand., V.19, 27-37.
Courant, R., Friedrichs, K. and Lewy, H.; 1928, Math. Annal., V.100, 32-74; also
1967, IBM J. of Research and Development, V.11, 213-247.
Dahlquist, G.; 1956, Math. Scand., V.4, 33-53.
Dahlquist, G.; 1963, Proc. Sympos. Appl. Math., V.15, 147-158.
Friedman, A.; 1964, Partial Differential Equations of Parabolic Type. Prentice-Hall.
Garabedian, P.; 1964, Partial Differential Equations. Wiley.
Gelfand, I.M., Shilov, G.E.; 1967, Generalized Functions, V.3. Academic Press.
Hadamard, J.; 1921, Lectures on Cauchy's Problem in Linear Partial Differential
Equations. Yale University Press.
Johansson, O., Kreiss, H.O.; 1963, BIT, V.3, 97-107.
John, F.; 1952, Comm. Pure Appl. Math., V.5, 155-211.
Kreiss, H.O.; 1959, Math. Scand., V.7, 71-80.
Kreiss, H.O.; 1962, BIT, V.2, 153-181.
Kreiss, H.O.; 1963, Math. Scand., V.13, 109-128.
Kreiss, H.O.; 1963, Numer. Math., V.5, 27-77.
Kreiss, H.O., Widlund, O.; 1967, Report, Computer Science Department, Uppsala,
Sweden.
Lax, P.D.; 1957, Duke Math. J., V.24.
Lax, P.D.; 1963, Lectures on Hyperbolic Partial Differential Equations, Stanford
University (lecture notes).
Littman, W.; 1963, J. Math. Mech., V.12, 55-68.
Richtmyer, R.D.; 1957, Difference Methods for Initial-Value Problems. Wiley
Interscience.
Richtmyer, R.D., Morton, K.W.; 1967, Difference Methods for Initial-Value Problems,
2nd Edition. Wiley Interscience.
Strang, W.G.; 1960, Duke Math. J., V.27, 221-231.
Strang, W.G.; 1966, J. Diff. Eq., V.2, 107-114.
Strang, W.G.; 1968, SIAM J. Numer. Anal., V.5, 506-517.
Thomée, V.; 1962, J. SIAM, V.10, 229-245.
Thomée, V.; 1969, SIAM Review, V.11, 152-195.