Symposium on the Theory of Numerical Analysis
Lecture Notes in Mathematics. A collection of informal reports and seminars. Edited by A. Dold, Heidelberg and B. Eckmann, Zürich.
193
Symposium on the Theory of Numerical Analysis Held in Dundee/Scotland, September 15-23, 1970
Edited by John Ll. Morris, University of Dundee, Dundee/Scotland
Springer-Verlag Berlin · Heidelberg · New York 1971
AMS Subject Classifications (1970): 65M05, 65M10, 65M15, 65M30, 65N05, 65N10, 65N15, 65N20, 65N25
ISBN 3-540-05422-7 Springer-Verlag Berlin · Heidelberg · New York
ISBN 0-387-05422-7 Springer-Verlag New York · Heidelberg · Berlin
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks.
Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher.
© by Springer-Verlag Berlin • Heidelberg 1971. Library of Congress Catalog Card Number 70-155916. Printed in Germany.
Offsetdruck: Julius Beltz, Hemsbach
Foreword
This publication by Springer-Verlag represents the proceedings of a series
of lectures given by four eminent Numerical Analysts, namely Professors Golub,
Thomée, Wachspress and Widlund, at the University of Dundee between September
15th and September 23rd, 1970.
The lectures marked the beginning of the British Science Research Council's
sponsored Numerical Analysis Year which is being held at the University of Dundee
from September 1970 to August 1971. The aim of this year is to promote the theory
of numerical methods and in particular to upgrade the study of Numerical Analysis
in British universities and technical colleges. This is being effected by the
arranging of lecture courses and seminars which are being held in Dundee through-
out the Year. In addition to lecture courses research conferences are being
held to allow workers in touch with modern developments in the field of Numerical
Analysis to hear and discuss the most recent research work in their field. To
achieve these aims, some thirty four Numerical Analysts of international repute
are visiting the University of Dundee during the Numerical Analysis Year. The
complete project is financed by the Science Research Council, and we acknowledge
with gratitude their generous support. The present proceedings contain a great
deal of theoretical work which has been developed over recent years. There are,
however, new results contained within the notes. In particular, the lectures pre-
sented by Professor Golub represent results recently obtained by him and his co-
workers. Consequently, a detailed account of the methods outlined in Professor
Golub's lectures will appear in a forthcoming issue of the SIAM Journal on
Numerical Analysis, in a paper written jointly by Golub, Buzbee and Nielson.
In the main the lecture notes have been provided by the authors and the
proceedings have been produced from these original manuscripts. The exception
is the course of lectures given by Professor Golub. These notes were taken at
the lectures by members of the staff and research students of the Department of
Mathematics, the University of Dundee. In this context it is a pleasure to
acknowledge the invaluable assistance provided to the editor by Dr. A. Watson,
Mr. R. Wait, Mr. K. Brodlie and Mr. G. McGuire.
Finally we owe thanks to Misses Y. Nedelec and F. Duncan, secretaries in
the Mathematics Department, for their patient typing and retyping of the manu-
scripts and notes.
J. Ll. Morris
Dundee, January 1971
Contents

G. Golub: Direct Methods for Solving Elliptic Difference Equations . . . 1
  1. Introduction . . . 2
  2. Matrix Decomposition . . . 2
  3. Block Cyclic Reduction . . . 6
  4. Applications . . . 10
  5. The Buneman Algorithm and Variants . . . 12
  6. Accuracy of the Buneman Algorithms . . . 14
  7. Non-Rectangular Regions . . . 15
  8. Conclusion . . . 18
  9. References . . . 18

G. Golub: Matrix Methods in Mathematical Programming . . . 21
  1. Introduction . . . 22
  2. Linear Programming . . . 22
  3. A Stable Implementation of the Simplex Algorithm . . . 24
  4. Iterative Refinement of the Solution . . . 28
  5. Householder Triangularization . . . 28
  6. Projections . . . 31
  7. Linear Least-Squares Problem . . . 33
  8. Least-Squares Problem with Linear Constraints . . . 35
  Bibliography . . . 37

V. Thomée: Topics in Stability Theory for Partial Difference Operators . . . 41
  Preface . . . 42
  1. Introduction . . . 43
  2. Initial-Value Problems in L_2 with Constant Coefficients . . . 51
  3. Difference Approximations in L_2 to Initial-Value Problems with Constant Coefficients . . . 59
  4. Estimates in the Maximum-Norm . . . 70
  5. On the Rate of Convergence of Difference Schemes . . . 79
  References . . . 89

E. L. Wachspress: Iteration Parameters in the Numerical Solution of Elliptic Problems . . . 93
  1. A Concise Review of the General Topic and Background Theory . . . 95
  2. Successive Overrelaxation: Theory . . . 98
  3. Successive Overrelaxation: Practice . . . 100
  4. Residual Polynomials: Chebyshev Extrapolation: Theory . . . 102
  5. Residual Polynomials: Practice . . . 103
  6. Alternating-Direction-Implicit Iteration . . . 106
  7. Parameters for the Peaceman-Rachford Variant of ADI . . . 107

O. Widlund: Introduction to Finite Difference Approximations to Initial Value Problems for Partial Differential Equations . . . 111
  1. Introduction . . . 112
  2. The Form of the Partial Differential Equations . . . 114
  3. The Form of the Finite Difference Schemes . . . 117
  4. An Example of Divergence. The Maximum Principle . . . 121
  5. The Choice of Norms and Stability Definitions . . . 124
  6. Stability, Error Bounds and a Perturbation Theorem . . . 133
  7. The von Neumann Condition, Dissipative and Multistep Schemes . . . 138
  8. Semibounded Operators . . . 142
  9. Some Applications of the Energy Method . . . 145
  10. Maximum Norm Convergence for L_2 Stable Schemes . . . 149
  References . . . 151
Direct Methods for Solving Elliptic Difference Equations
GENE GOLUB
Stanford University
1. Introduction
General methods exist for solving elliptic partial differential equations of
general type in general regions. However, it is often the case that physical
problems, such as those of plasma physics, give rise to several elliptic equations
which require to be solved many times. It is not uncommon that the elliptic
equations which arise reduce to Poisson's equation with differing right hand
sides. For this reason it is judicious to use direct methods which take advantage
of this structure and which thereby yield fast and accurate techniques for
solving the associated linear equations.

Direct methods for solving such equations are attractive since in theory they
yield the exact solution to the difference equation, whereas commonly used methods
seek to approximate the solution by iterative procedures [12]. Hockney [8] has
devised an efficient direct method which uses the reduction process. Also, Buneman
[2] recently developed an efficient direct method for solving the reduced system
of equations. Since these methods offer considerable economy over older tech-
niques [5], the purpose of this paper is to present a unified mathematical
development and generalization of them. Additional generalizations are given by
George [6].
2. Matrix Decomposition
Consider the system of equations

    M x = y ,                                                       (2.1)

where M is an N×N real symmetric matrix of block tridiagonal form,

        | A  T          |
        | T  A  T       |
    M = |    .  .  .    | .                                         (2.2)
        |       T  A  T |
        |          T  A |

The matrices A and T are p×p symmetric matrices and we assume that

    A T = T A .

This situation arises in many systems. However, other direct methods which are
applicable for more general systems are less efficient to implement in this case.
Moreover, the classical methods require more computer storage than the methods to
be discussed here, which will require only the storage of the vector y. Since A
and T commute and are symmetric, it is well known [1] that there exists an
orthogonal matrix Q such that

    Q^T A Q = Λ ,   Q^T T Q = Ω ,                                   (2.3)

where Λ and Ω are real diagonal matrices. The columns of Q are the common
eigenvectors of A and T, and Λ = diag(λ_1, ..., λ_p) and Ω = diag(ω_1, ..., ω_p)
are the diagonal matrices of the p distinct eigenvalues of A and T, respectively.

To conform with the matrix M, we write the vectors x and y in partitioned form,

    x = ( x_1, x_2, ..., x_q )^T ,   y = ( y_1, y_2, ..., y_q )^T ,  (2.4)

where each block x_j = ( x_{1j}, x_{2j}, ..., x_{pj} )^T is a vector of length p.

System (2.1) may then be written

    A x_1 + T x_2 = y_1 ,                                           (2.5a)
    T x_{j-1} + A x_j + T x_{j+1} = y_j ,   j = 2, 3, ..., q-1 ,    (2.5b)
    T x_{q-1} + A x_q = y_q .                                       (2.5c)
From Eq. (2.3) we have

    A = Q Λ Q^T   and   T = Q Ω Q^T .

Substituting into Eq. (2.5) and pre-multiplying by Q^T, we obtain

    Λ x̄_1 + Ω x̄_2 = ȳ_1 ,
    Ω x̄_{j-1} + Λ x̄_j + Ω x̄_{j+1} = ȳ_j ,   j = 2, 3, ..., q-1 ,    (2.6)
    Ω x̄_{q-1} + Λ x̄_q = ȳ_q ,

where

    x̄_j = Q^T x_j ,   ȳ_j = Q^T y_j ,   j = 1, 2, ..., q .

If x̄_j and ȳ_j are partitioned as before, then the i-th components of Eq. (2.6)
satisfy

    λ_i x̄_{i1} + ω_i x̄_{i2} = ȳ_{i1} ,
    ω_i x̄_{i,j-1} + λ_i x̄_{ij} + ω_i x̄_{i,j+1} = ȳ_{ij} ,   j = 2, ..., q-1 ,   (2.7)
    ω_i x̄_{i,q-1} + λ_i x̄_{iq} = ȳ_{iq} ,

for i = 1, 2, ..., p. If we regroup the equations by reversing the roles of i and
j, writing

    x̃_i = ( x̄_{i1}, x̄_{i2}, ..., x̄_{iq} )^T ,   ỹ_i = ( ȳ_{i1}, ȳ_{i2}, ..., ȳ_{iq} )^T ,

then Eq. (2.7) is equivalent to the block diagonal system of equations

    Γ_i x̃_i = ỹ_i ,   i = 1, 2, ..., p ,                             (2.8)

where

          | λ_i  ω_i           |
          | ω_i  λ_i  ω_i      |
    Γ_i = |      .    .    .   |
          |      ω_i  λ_i  ω_i |
          |           ω_i  λ_i |

is of order q. Thus each vector x̃_i satisfies a symmetric tridiagonal system of
equations that has a constant diagonal element and constant super- and sub-
diagonal elements. After Eq. (2.8) has been solved block by block, it is possible
to solve for x_j = Q x̄_j. Thus we have:

Algorithm 1

1. Compute or determine the eigensystem of A and T.
2. Compute ȳ_j = Q^T y_j  (j = 1, 2, ..., q).
3. Solve Γ_i x̃_i = ỹ_i  (i = 1, 2, ..., p).
4. Compute x_j = Q x̄_j  (j = 1, 2, ..., q).

For our system the eigenvalues of Γ_i may be written down explicitly as

    ν_r^(i) = λ_i + 2 ω_i cos( rπ/(q+1) ) ,   r = 1, 2, ..., q .

It should be noted that only Q and the y_j, j = 1, 2, ..., q, have to be stored,
since the ȳ_j can overwrite the y_j, the x̃_i can overwrite the ỹ_i, and the x_j
can overwrite the x̄_j. A simple calculation will show that approximately
2p²q + 5pq arithmetic operations are required for the algorithm when step 3 is
solved using Gaussian elimination for a tridiagonal matrix, the Γ_i being positive
definite. The arithmetic operations are dominated by the 2p²q multiplications
arising from the matrix multiplications of steps 2 and 4. It is not easy to
reduce this number unless the matrix Q has special properties (as in Poisson's
equation), when the fast Fourier transform can be used (see Hockney [8]).

Note further that

    Γ_i = Z V_i Z^T ,

where V_i is the diagonal matrix of eigenvalues of Γ_i and
Z_{rs} = σ_s sin( rsπ/(q+1) ), with σ_s a normalizing factor. Since Γ_i and Γ_j
have the same set of eigenvectors,

    Γ_i Γ_j = Γ_j Γ_i .

Because of this decomposition, step 3 could be performed by computing

    x̃_i = Z V_i^{-1} Z^T ỹ_i ,

where the same Z serves for each Γ_i. This, however, requires of the order of
2pq² multiplications, which approximately doubles the computing time for the
algorithm. Thus performing the fast Fourier transform in step 3 as well as in
steps 2 and 4 is not advisable.
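The four steps of Algorithm 1 can be sketched in modern terms as follows. This is an illustrative NumPy translation, not part of the original notes: it assumes A has distinct eigenvalues (so the eigenvector matrix Q of A also diagonalizes T, as in the text), and it uses a dense solve in place of the tridiagonal Gaussian elimination of step 3.

```python
import numpy as np

def matrix_decomposition_solve(A, T, Y):
    """Sketch of Algorithm 1 for the block tridiagonal system (2.2).

    A, T : commuting symmetric p x p matrices (diagonal and off-diagonal blocks).
    Y    : p x q array whose columns are the right-hand-side blocks y_1, ..., y_q.
    Returns the p x q array whose columns are the solution blocks x_1, ..., x_q.
    """
    p, q = Y.shape
    # Step 1: eigensystem of A; since AT = TA and the eigenvalues of A are
    # assumed distinct, Q also diagonalizes T.
    lam, Q = np.linalg.eigh(A)
    omega = np.diag(Q.T @ T @ Q).copy()
    # Step 2: transform the right-hand side, ybar_j = Q^T y_j.
    Ybar = Q.T @ Y
    # Step 3: for each i solve the q x q tridiagonal system
    # Gamma_i xtilde_i = ytilde_i, Gamma_i = tridiag(omega_i; lambda_i; omega_i).
    Xbar = np.empty_like(Ybar)
    for i in range(p):
        Gamma = (np.diag(np.full(q, lam[i]))
                 + np.diag(np.full(q - 1, omega[i]), 1)
                 + np.diag(np.full(q - 1, omega[i]), -1))
        Xbar[i, :] = np.linalg.solve(Gamma, Ybar[i, :])
    # Step 4: back-transform, x_j = Q xbar_j.
    return Q @ Xbar
```

A quick consistency check is to assemble the full matrix M of (2.2) as a Kronecker sum and compare with a direct dense solve.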
3. Block Cyclic Reduction

In Section 2, we gave a method for which one had to know the eigenvalues and
eigenvectors of some matrix. We now give a more direct method for solving the
system of Eq. (2.1).

We assume again that A and T are symmetric and that A and T commute. Further-
more, we assume that q = m-1 and

    m = 2^(k+1) ,

where k is some positive integer. Let us rewrite Eq. (2.5b) as follows:

    T x_{j-2} + A x_{j-1} + T x_j     = y_{j-1} ,
    T x_{j-1} + A x_j     + T x_{j+1} = y_j ,
    T x_j     + A x_{j+1} + T x_{j+2} = y_{j+1} .

Multiplying the first and third equations by T, the second equation by -A, and
adding, we have

    T² x_{j-2} + (2T² - A²) x_j + T² x_{j+2} = T y_{j-1} - A y_j + T y_{j+1} .

Thus if j is even, the new system of equations involves x_j's with even indices.
Similar equations hold for x_2 and x_{m-2}. The process of reducing the equations
in this fashion is known as cyclic reduction. Then Eq. (2.1) may be written as
the following equivalent system:

    | (2T²-A²)    T²                      | | x_2     |   | T y_1 - A y_2 + T y_3         |
    |    T²    (2T²-A²)    T²             | | x_4     |   | T y_3 - A y_4 + T y_5         |
    |             .         .        .    | |  .      | = |              .                |   (3.1)
    |             T²    (2T²-A²)     T²   | |  .      |   |              .                |
    |                       T²   (2T²-A²) | | x_{m-2} |   | T y_{m-3} - A y_{m-2} + T y_{m-1} |

and

    A x_j = y_j - T ( x_{j-1} + x_{j+1} ) ,   j = 1, 3, 5, ..., m-1 ,   (3.2)

with the convention x_0 = x_m = 0. Since m = 2^(k+1) and the new system of
Eq. (3.1) involves x_j's with even indices, the block dimension of the new system
of equations is 2^k - 1. Note that once Eq. (3.1) is solved, it is easy to solve
for the x_j's with odd indices as evidenced by Eq. (3.2). We shall refer to the
system of Eq. (3.2) as the eliminated equations.

Also, note that Algorithm 1 may be applied to System (3.1). Since A and T
commute, the matrix (2T² - A²) has the same set of eigenvectors as A and T. Also,
if λ_i(A) = λ_i and λ_i(T) = ω_i for i = 1, 2, ..., p, then

    λ_i(2T² - A²) = 2ω_i² - λ_i² .

Hockney [8] has advocated this procedure.

Since System (3.1) is block tridiagonal and of the form of Eq. (2.2), we can
apply the reduction repeatedly until we have one block. However, as noted above,
we can stop the process after any step and use the method of Section 2 to solve
the resulting equations.
To define the procedure recursively, let

    A^(0) = A ,   T^(0) = T ;   y_j^(0) = y_j   ( j = 1, 2, ..., m-1 ) .   (3.3)

Then for r = 0, 1, ..., k-1,

    A^(r+1) = 2 (T^(r))² - (A^(r))² ,
    T^(r+1) = (T^(r))² ,                                                   (3.4)
    y_j^(r+1) = T^(r) ( y_{j-2^r}^(r) + y_{j+2^r}^(r) ) - A^(r) y_j^(r) .

The eliminated equations at each stage are the solution of the block diagonal
system

    A^(r-1) x_{j·2^r - 2^(r-1)}
        = y_{j·2^r - 2^(r-1)}^(r-1) - T^(r-1) ( x_{(j-1)·2^r} + x_{j·2^r} ) ,   (3.5)

    j = 1, 2, ..., 2^(k+1-r) ,

with the convention x_0 = x_{2^(k+1)} = 0. After all of the k steps, we must
solve the system of equations

    A^(k) x_{2^k} = y_{2^k}^(k) .                                          (3.6)

In either case, we must solve Eq. (3.5) to find the eliminated unknowns, just as
in Eq. (3.2). If it is done by direct solution, an ill-conditioned system may
arise. Furthermore, A = A^(0) is tridiagonal, A^(1) is quindiagonal, and so on,
destroying the simple structure of the original system. Alternatively, polynomial
factorization retains the simple structure of A.

From Eq. (3.4), we note that A^(1) is a polynomial of degree 2 in A and T. By
induction, it is easy to show that A^(r) is a polynomial of degree 2^r in the
matrices A and T, so that

    A^(r) = Σ_{j=0}^{2^(r-1)} c_{2j}^(r) A^{2j} T^{2^r - 2j} ≡ p_{2^r}(A, T) .

We shall proceed to determine the linear factors of p_{2^r}(A, T). Let

    p_{2^r}(a, t) = Σ_{j=0}^{2^(r-1)} c_{2j}^(r) a^{2j} t^{2^r - 2j} .

For t ≠ 0, we make the substitution

    a/t = -2 cos θ .                                                       (3.7)

From Eq. (3.4), we note that

    p_{2^(r+1)}(a, t) = 2 t^{2^(r+1)} - ( p_{2^r}(a, t) )² .               (3.8)

It is then easy to verify, using Eqs. (3.7) and (3.8), that

    p_{2^r}(a, t) = -2 t^{2^r} cos 2^r θ ,

and, consequently,

    p_{2^r}(a, t) = - Π_{j=1}^{2^r} ( a + 2 t cos θ_j^(r) ) ,

and, hence,

    A^(r) = - Π_{j=1}^{2^r} ( A + 2 cos θ_j^(r) · T ) ,                    (3.9)

where θ_j^(r) = (2j-1)π / 2^(r+1). Thus to solve the original system it is only
necessary to solve the factored system recursively. For example, when r = 1, we
obtain

    A^(1) = 2T² - A² = ( √2 T - A )( √2 T + A ) ,

whence the simple tridiagonal systems

    ( √2 T - A ) z = w ,
    ( √2 T + A ) x = z ,

are used to solve the system

    A^(1) x = w .

We call this method the cyclic odd-even reduction and factorization (CORF)
algorithm.
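The reduction (3.3)-(3.4), the top-level solve (3.6), and the back-substitution through the eliminated equations (3.5) can be sketched as follows. This is an illustrative NumPy version, not from the original notes: for clarity it forms each A^(r) explicitly and uses dense solves, so it does not avoid the bandwidth growth (or employ the factorization (3.9)) that the text discusses.

```python
import numpy as np

def block_cyclic_reduction(A, T, Y):
    """Sketch of the cyclic reduction of Section 3 (dense blocks).

    Solves the block tridiagonal system (2.2) with diagonal block A and
    off-diagonal block T, where Y holds the q = 2**(k+1) - 1 right-hand-side
    blocks as columns.
    """
    p, q = Y.shape
    m = q + 1                                  # m = 2**(k+1)
    # pad with zero blocks x_0 = x_m = 0 for uniform indexing
    y = [np.zeros(p)] + [Y[:, j].copy() for j in range(q)] + [np.zeros(p)]
    Ar, Tr = A.copy(), T.copy()
    levels = []                                # A^(r), T^(r) for back-substitution
    step = 1                                   # step = 2**r
    while step < m // 2:                       # reduce until one block remains
        levels.append((Ar.copy(), Tr.copy()))
        for j in range(2 * step, m, 2 * step): # update even-indexed rhs, Eq. (3.4)
            y[j] = Tr @ (y[j - step] + y[j + step]) - Ar @ y[j]
        Ar, Tr = 2 * Tr @ Tr - Ar @ Ar, Tr @ Tr
        step *= 2
    x = [np.zeros(p) for _ in range(m + 1)]
    x[m // 2] = np.linalg.solve(Ar, y[m // 2])            # Eq. (3.6)
    for Ar, Tr in reversed(levels):            # eliminated equations, Eq. (3.5)
        step //= 2
        for j in range(step, m, 2 * step):     # odd multiples of 2**r
            x[j] = np.linalg.solve(Ar, y[j] - Tr @ (x[j - step] + x[j + step]))
    return np.column_stack(x[1:m])
```

Because an index j that is an odd multiple of 2^r is never modified at levels r or later, the in-place right-hand-side array still holds y_j^(r) when the back-substitution for level r needs it.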
4. Applications
Example I. Poisson's Equation with Dirichlet Boundary Conditions.

It is instructive to apply the results of Section 3 to the solution of the
finite-difference approximation to Poisson's equation on a rectangle, R, with
specified boundary values. Consider the equation

    u_xx + u_yy = f(x, y)   for (x, y) ∈ R ,                        (4.1)
    u(x, y) = g(x, y)       for (x, y) ∈ ∂R .

(Here ∂R indicates the boundary of R.) We assume that the reader is familiar
with the general technique of imposing a mesh of discrete points onto R and
approximating Eq. (4.1). The equation u_xx + u_yy = f(x, y) is approximated at
(x_i, y_j) by

    ( v_{i-1,j} - 2 v_{i,j} + v_{i+1,j} ) / (Δx)²
        + ( v_{i,j-1} - 2 v_{i,j} + v_{i,j+1} ) / (Δy)² = f_{i,j}
                                  ( 1 ≤ i ≤ n-1 ,  1 ≤ j ≤ m-1 ) ,

with appropriate values taken on the boundary,

    v_{0,j} = g_{0,j} ,   v_{n,j} = g_{n,j}   ( 1 ≤ j ≤ m-1 ) ,

and

    v_{i,0} = g_{i,0} ,   v_{i,m} = g_{i,m}   ( 1 ≤ i ≤ n-1 ) .

Then v_{i,j} is an approximation to u(x_i, y_j), and f_{i,j} = f(x_i, y_j),
g_{i,j} = g(x_i, y_j). Hereafter, we assume that

    m = 2^(k+1) .

When u(x, y) is specified on the boundary, we have the Dirichlet boundary
condition. For simplicity, we shall assume hereafter that Δx = Δy. Then

        | -4   1         |
        |  1  -4   1     |
    A = |      .   .   . |    and    T = I_{n-1} ,
        |      1  -4   1 |
        |          1  -4 |

both of order (n-1), where I_{n-1} indicates the identity matrix of order (n-1).
A and T are symmetric and commute, and thus the results of Sections 2 and 3 are
applicable. In addition, since A is tridiagonal, the use of the factorization
(3.9) is greatly simplified.

The nine-point difference formula for the same Poisson equation can be treated
similarly; when Δx = Δy it leads to

        | -20   4         |               | 4  1        |
        |   4 -20   4     |               | 1  4  1     |
    A = |       .   .   . |    and    T = |    .  .  .  | ,
        |   4 -20   4     |               |    1  4  1  |
        |       4 -20     |               |       1  4  |

both of order (n-1).
Example II

The method can also be used for Poisson's equation in rectangular regions
under natural boundary conditions, provided one uses the central difference
approximation

    ∂u/∂x ≈ ( u(x + h, y) - u(x - h, y) ) / 2h ,

and similarly for ∂u/∂y, at the boundaries.

Example III

Poisson's equation in a rectangle with doubly periodic boundary conditions is
an additional example where the algorithm can be applied.

Example IV

The method can be extended successfully to three dimensions for Poisson's
equation.

For all the above examples the eigensystems are known, and the fast Fourier
transform can be applied.

Example V

An equation of the form

    ( K(x) u_x )_x + ( K(y) u_y )_y + u(x, y) = q(x, y)

on a rectangular region can be solved by the CORF algorithm provided the
eigensystem is calculated, since it is not generally known.

The counterparts in cylindrical polar co-ordinates can also be solved using
CORF on the rectangle in the appropriate co-ordinates.
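For Example I, the blocks and their eigensystem can be written down directly; the closed-form eigenvalues of the symmetric tridiagonal Toeplitz block A are exactly the special property that makes the FFT variants possible. The following small check (illustrative, not from the original notes) verifies the standard formula λ_r = -4 + 2 cos(rπ/n) numerically.

```python
import numpy as np

def poisson_blocks(n):
    """Blocks of the five-point discretization of Example I (Dirichlet, dx = dy).

    Returns the (n-1) x (n-1) matrices A = tridiag(1, -4, 1) and T = I.
    """
    A = (np.diag(np.full(n - 1, -4.0))
         + np.diag(np.ones(n - 2), 1)
         + np.diag(np.ones(n - 2), -1))
    return A, np.eye(n - 1)

# The eigensystem of A is known in closed form:
#   lambda_r = -4 + 2*cos(r*pi/n),  eigenvector components ~ sin(r*s*pi/n).
n = 10
A, T = poisson_blocks(n)
lam = np.sort(np.linalg.eigvalsh(A))
lam_closed = np.sort(-4.0 + 2.0 * np.cos(np.arange(1, n) * np.pi / n))
assert np.allclose(lam, lam_closed)
```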
5. The Buneman Algorithm and Variants

In this section, we shall describe in detail the Buneman algorithm [2] and a
variation of it. The difference between the Buneman algorithm and the CORF
algorithm lies in the way the right hand side is calculated at each stage of the
reduction. Henceforth, we shall assume that in the system of Eqs. (2.5), T = I_p,
the identity matrix of order p.

Again consider the system of equations as given by Eqs. (2.5) with
q = 2^(k+1) - 1. After one stage of cyclic reduction, we have

    x_{j-2} + ( 2I_p - A² ) x_j + x_{j+2} = y_{j-1} + y_{j+1} - A y_j   (5.1)

for j = 2, 4, ..., q-1, with x_0 = x_{q+1} = 0, the null vector. Note that the
right hand side of Eq. (5.1) may be written as

    y_j^(1) = y_{j-1} + y_{j+1} - A y_j
            = A^(1) A^{-1} y_j + y_{j-1} + y_{j+1} - 2 A^{-1} y_j ,     (5.2)

where A^(1) = 2I_p - A². Let us define

    p_j^(1) = A^{-1} y_j ;   q_j^(1) = y_{j-1} + y_{j+1} - 2 p_j^(1) .

(These are easily calculated since A is a tridiagonal matrix.) Then

    y_j^(1) = A^(1) p_j^(1) + q_j^(1) .                                 (5.3)

After r reductions, we have by Eq. (3.4)

    y_j^(r+1) = y_{j-2^r}^(r) + y_{j+2^r}^(r) - A^(r) y_j^(r) .         (5.4)

Let us write

    y_j^(r) = A^(r) p_j^(r) + q_j^(r)                                   (5.5)

in a fashion similar to Eq. (5.3). Substituting Eq. (5.5) into Eq. (5.4) and
making use of the identity (A^(r))² = 2I_p - A^(r+1) from Eq. (3.4), we have the
following relationships:

    p_j^(r+1) = p_j^(r) - (A^(r))^{-1} ( p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ) ,   (5.6a)
    q_j^(r+1) = q_{j-2^r}^(r) + q_{j+2^r}^(r) - 2 p_j^(r+1) ,                          (5.6b)

for j = i·2^(r+1) ( i = 1, 2, ..., 2^(k-r) - 1 ) with

    p_0^(r) = p_{2^(k+1)}^(r) = q_0^(r) = q_{2^(k+1)}^(r) = 0 .

Because the number of vectors q_j^(r) is reduced by a factor of two for each
successive r, the computer storage requirement becomes equal to almost twice the
number of data points.

To compute (A^(r))^{-1} ( p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ) in
Eq. (5.6a), we solve the system of equations

    A^(r) ( p_j^(r) - p_j^(r+1) ) = p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ,

where A^(r) is given by the factorization Eq. (3.9); namely,

    A^(r) = - Π_{j=1}^{2^r} ( A + 2 cos θ_j^(r) · I_p ) ,
    θ_j^(r) = (2j-1)π / 2^(r+1) .

After k reductions, one has the equation

    A^(k) x_{2^k} = y_{2^k}^(k) = A^(k) p_{2^k}^(k) + q_{2^k}^(k) ,

and hence we solve the system

    x_{2^k} = p_{2^k}^(k) + (A^(k))^{-1} q_{2^k}^(k) .

Again one uses the factorization of A^(k) for computing (A^(k))^{-1} q_{2^k}^(k).
To back-solve, we use the relationship

    x_{j-2^r} + A^(r) x_j + x_{j+2^r} = A^(r) p_j^(r) + q_j^(r)

for j = i·2^r ( i = 1, 2, ..., 2^(k+1-r) - 1 ) with x_0 = x_{2^(k+1)} = 0. For
j = 2^r, 3·2^r, ..., 2^(k+1) - 2^r, we solve the system of equations

    A^(r) ( x_j - p_j^(r) ) = q_j^(r) - ( x_{j-2^r} + x_{j+2^r} ) ,     (5.7)

using the factorization of A^(r); hence

    x_j = p_j^(r) + ( x_j - p_j^(r) ) .                                 (5.8)

Thus, to summarise, the Buneman algorithm proceeds as follows:

1. Compute the sequence { p_j^(r), q_j^(r) } by Eq. (5.6) for r = 1, ..., k, with
   p_j^(0) = 0 for j = 0, ..., 2^(k+1), and q_j^(0) = y_j for j = 1, 2, ..., 2^(k+1) - 1.

2. Back-solve for x_j using Eqs. (5.7) and (5.8).

The use of the p_j^(r) and q_j^(r) produces a stable algorithm. Numerical experi-
ments by the author and his colleagues have shown that computationally the Buneman
algorithm requires approximately 30% less time than the fast Fourier transform
method of Hockney.
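The two phases above can be sketched as follows. This NumPy version is illustrative only (not from the original notes): it forms each A^(r) explicitly and uses dense solves instead of the factorization (3.9), and it stores all the p_j^(r), q_j^(r) in place.

```python
import numpy as np

def buneman_solve(A, Y):
    """Sketch of the Buneman algorithm of Section 5, with T = I_p.

    Solves the block tridiagonal system with diagonal block A and identity
    off-diagonal blocks; Y holds the q = 2**(k+1) - 1 right-hand-side blocks
    as columns.
    """
    p_dim, q = Y.shape
    m = q + 1                                     # m = 2**(k+1)
    I = np.eye(p_dim)
    # p_j^(0) = 0 and q_j^(0) = y_j, padded so that index 0 and m are null
    P = [np.zeros(p_dim) for _ in range(m + 1)]
    Q = [np.zeros(p_dim)] + [Y[:, j].copy() for j in range(q)] + [np.zeros(p_dim)]
    Ar = A.copy()
    levels = []
    step = 1                                      # step = 2**r
    while step < m // 2:                          # reduction phase, Eq. (5.6)
        levels.append(Ar.copy())
        for j in range(2 * step, m, 2 * step):
            P[j] = P[j] - np.linalg.solve(Ar, P[j - step] + P[j + step] - Q[j])
            Q[j] = Q[j - step] + Q[j + step] - 2 * P[j]
        Ar = 2 * I - Ar @ Ar                      # A^(r+1) = 2I - (A^(r))^2
        step *= 2
    x = [np.zeros(p_dim) for _ in range(m + 1)]
    x[m // 2] = P[m // 2] + np.linalg.solve(Ar, Q[m // 2])   # top-level solve
    for Ar in reversed(levels):                   # back-substitution, (5.7)-(5.8)
        step //= 2
        for j in range(step, m, 2 * step):
            x[j] = P[j] + np.linalg.solve(Ar, Q[j] - x[j - step] - x[j + step])
    return np.column_stack(x[1:m])
```

Note how the right-hand-side vectors y_j^(r) never appear explicitly: only the pairs (p_j^(r), q_j^(r)) of (5.5) are carried, which is precisely the source of the algorithm's stability.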
6. Accuracy of the Buneman Algorithms

As was shown in Section 5, the Buneman algorithms consist of generating the
sequence of vectors { p_j^(r), q_j^(r) }. Using Eqs. (5.6a) and (5.6b), one may
write

    p_j^(r) = x_j + S^(r) w_j^(r) ,                                  (6.1a)
    q_j^(r) = x_{j-2^r} + x_{j+2^r} - S^(r) A^(r) w_j^(r) ,          (6.1b)

where

    w_j^(r) = x_{j-2^(r-1)} + x_{j+2^(r-1)}                          (6.2)

and

    S^(r) = ( A^(r-1) ··· A^(0) )^{-1} .                             (6.3)

Then

    ‖ p_j^(r) - x_j ‖_2 ≤ ‖ S^(r) ‖_2 ‖x‖'                           (6.4)

and

    ‖ q_j^(r) - ( x_{j-2^r} + x_{j+2^r} ) ‖_2 ≤ ‖ S^(r) A^(r) ‖_2 ‖x‖' ,   (6.5)

where ‖v‖_2 indicates the Euclidean norm of a vector v, ‖C‖_2 indicates the
spectral norm of a matrix C, and

    ‖x‖' = Σ_{j=1}^{q} ‖ x_j ‖_2 .

Thus for A = A^T,

    ‖ S^(r) ‖_2 ≤ Π_{j=0}^{r-1} ‖ (A^(j))^{-1} ‖_2 ,

and since the A^(j) are polynomials of degree 2^j in A, we have

    ‖ S^(r) ‖_2 ≤ Π_{j=0}^{r-1} max_{λ_i} | p_{2^j}(λ_i) |^{-1} ,    (6.6)

where the p_{2^j}(λ_i) are polynomials in the λ_i, the eigenvalues of A.

For Poisson's equation it may be shown that

    ‖ S^(r) ‖_2 < e^{-c σ_r} ,

where σ_r = 2^(r-1) and c > 0. Thus ‖S^(r)‖_2 → 0, and hence

    ‖ p_j^(r) - x_j ‖_2 → 0 .

That is, p_j^(r) tends to the exact solution with increasing r. Since it can be
shown that ‖ q_j^(r) ‖_2 remains bounded throughout the calculation, the Buneman
algorithm leads to numerically stable results.
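The decay of ‖S^(r)‖_2 can be checked numerically. Since every A^(r) is a polynomial in A, the product A^(r-1)···A^(0) is diagonalized by the eigenvectors of A, so its smallest magnitude is obtained from the scalar recurrence b ← 2 - b² applied to each eigenvalue; this avoids ever inverting the (extremely ill-conditioned) product matrix. The snippet below is an illustration for the Poisson block, not part of the original notes.

```python
import numpy as np

# ||S^(r)||_2 = max_i 1 / |prod_{j<r} p_{2^j}(lambda_i)| for A = tridiag(1,-4,1).
# The norms decay roughly like exp(-c 2^(r-1)), which is why the Buneman
# quantities p_j^(r) approach the exact solution blocks.
n = 9
lam = -4.0 + 2.0 * np.cos(np.arange(1, n + 1) * np.pi / (n + 1))  # eigenvalues of A
b = lam.copy()            # b holds the eigenvalues of A^(r)
prod = np.ones(n)
norms = []
for r in range(1, 7):
    prod *= b             # eigenvalues of A^(r-1) ... A^(0)
    norms.append(np.max(1.0 / np.abs(prod)))
    b = 2.0 - b * b       # eigenvalue recurrence for A^(r+1) = 2I - (A^(r))^2
assert all(later < earlier for earlier, later in zip(norms, norms[1:]))
assert norms[-1] < 1e-6
```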
7. Non-Rectangular Regions

In many situations, one wishes to solve an elliptic equation over a region

[Figure: a region R formed by the union of two overlapping rectangles R_1 and R_2]

where there are n_1 data points in R_1, n_2 data points in R_2, and n_0 data
points in R_1 ∩ R_2. We shall assume that Dirichlet boundary conditions are
given. When Δx is the same throughout the region, one has a matrix equation of
the form

    |  G    E  | | x^(1) |   | y^(1) |
    |          | |       | = |       | ,                             (7.1)
    | E^T   H  | | x^(2) |   | y^(2) |

where

        | A  T          |                | B  S          |
        | T  A  T       |                | S  B  S       |
    G = |    .  .  .    |    and    H =  |    .  .  .    |           (7.2)
        |       T  A  T |                |       S  B  S |
        |          T  A |                |          S  B |

are of orders n_1 and n_2, respectively, and the only nonzero entries of the
coupling block E form an n_0 × n_0 matrix P.

Also, we write the partitioned vectors

    x^(1) = ( x_1^(1), x_2^(1), ..., x_r^(1) )^T ,
    x^(2) = ( x_1^(2), x_2^(2), ..., x_s^(2) )^T .                   (7.3)

We assume again that AT = TA and BS = SB.

From Eq. (7.1), we see that

    x^(1) = G^{-1} y^(1) - G^{-1} E x^(2)                            (7.4)

and

    x^(2) = H^{-1} y^(2) - H^{-1} E^T x^(1) .                        (7.5)

Now let us write

    G z^(1) = y^(1) ,   H z^(2) = y^(2) ,                            (7.6)

and

    G W^(1) = E ,   H W^(2) = E^T .                                  (7.7)

Then, as we partition the vectors z^(1), z^(2) and the matrices W^(1), W^(2) as
in Eq. (7.3), Eqs. (7.4) and (7.5) become

    x_j^(1) = z_j^(1) - W_j^(1) x_1^(2)   ( j = 1, 2, ..., r ) ,
                                                                     (7.8)
    x_j^(2) = z_j^(2) - W_j^(2) x_r^(1)   ( j = 1, 2, ..., s ) .

From Eq. (7.8), we have

    |  I       W_r^(1) | | x_r^(1) |   | z_r^(1) |
    |                  | |         | = |         | .                 (7.9)
    | W_1^(2)     I    | | x_1^(2) |   | z_1^(2) |

It can be noted that W^(1) and W^(2) are dependent only on the given region, and
hence the algorithm becomes useful if many problems on the same region are to be
considered.

Thus, the algorithm proceeds as follows.

1. Solve for z^(1) and z^(2) using the methods of Section 2 or 3.

2. Solve for W^(1) and W^(2) using the methods of Section 2 or 3.

3. Solve Eq. (7.9) using Gaussian elimination. Save the LU decomposition of
   Eq. (7.9).

4. Solve for the unknown components of x^(1) and x^(2) using Eq. (7.8).
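The partitioned solve (7.4)-(7.9) can be sketched as follows. This NumPy stand-in is illustrative only: plain dense solves replace the fast solvers of Sections 2-3 that would be used for G and H in practice, and for simplicity the coupled system is solved in full rather than restricted to the n_0 interface unknowns.

```python
import numpy as np

def coupled_region_solve(G, H, E, y1, y2):
    """Sketch of the partitioned solve of Section 7 for the block system (7.1):

        [ G   E ] [x1]   [y1]
        [ E^T H ] [x2] = [y2]
    """
    z1 = np.linalg.solve(G, y1)          # Eq. (7.6)
    z2 = np.linalg.solve(H, y2)
    W1 = np.linalg.solve(G, E)           # Eq. (7.7): region-dependent only,
    W2 = np.linalg.solve(H, E.T)         # reusable for many right-hand sides
    n1, n2 = G.shape[0], H.shape[0]
    # Eqs. (7.8)-(7.9): x1 = z1 - W1 x2 and x2 = z2 - W2 x1, i.e. a linear
    # system in (x1, x2) with identity diagonal blocks.
    K = np.block([[np.eye(n1), W1], [W2, np.eye(n2)]])
    sol = np.linalg.solve(K, np.concatenate([z1, z2]))   # Gaussian elimination
    return sol[:n1], sol[n1:]
```

Since W1 and W2 depend only on the region, they (and the factorization of the coupled system) can be saved and reused when many problems are posed on the same region, exactly as the text notes.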
8. Conclusion
Numerous applications require the repeated solution of a Poisson equation.
The operation counts given by Dorr [5] indicate that the methods we have discussed
should offer significant economies over older techniques, and this has been veri-
fied in practice by many users. Computational experiments comparing the Buneman
algorithm, the MD (matrix decomposition) algorithm, the Peaceman-Rachford
alternating direction algorithm, and the point successive over-relaxation
algorithm are given by Buzbee et al. [3]. We conclude that the method of matrix
decomposition, the Buneman algorithm, and Hockney's algorithm (when used with
care) are valuable methods.

This paper has benefited greatly from the comments of Dr. F. Dorr,
Mr. J. Alan George, Dr. R. Hockney and Professor O. Widlund.
9. References

1. Richard Bellman, Introduction to Matrix Analysis, McGraw-Hill, New York, 1960.

2. Oscar Buneman, Stanford University Institute for Plasma Research, Report
   No. 294, 1969.

3. B.L. Buzbee, G.H. Golub and C.W. Nielson, "The Method of Odd/Even Reduction
   and Factorization with Application to Poisson's Equation, Part II," LA-4288,
   Los Alamos Scientific Laboratory. (To appear in SIAM J. Numer. Anal.)

4. J.W. Cooley and J.W. Tukey, "An algorithm for the machine calculation of
   complex Fourier series," Math. Comp., Vol. 19, No. 90 (1965), pp. 297-301.

5. F.W. Dorr, "The direct solution of the discrete Poisson equation on a
   rectangle," to appear in SIAM Review.

6. J.A. George, "An Embedding Approach to the Solution of Poisson's Equation on
   an Arbitrary Bounded Region," to appear as a Stanford Report.

7. G.H. Golub, R. Underwood and J. Wilkinson, "Solution of Ax = λBx when B is
   positive definite," (to be published).

8. R.W. Hockney, "A fast direct solution of Poisson's equation using Fourier
   analysis," J. ACM, Vol. 12, No. 1 (1965), pp. 95-113.

9. R.W. Hockney, in Methods in Computational Physics (B. Alder, S. Fernbach and
   M. Rotenberg, Eds.), Vol. 9, Academic Press, New York and London, 1969.

10. R.E. Lynch, J.R. Rice and D.H. Thomas, "Direct solution of partial difference
    equations by tensor product methods," Numer. Math., Vol. 6 (1964), pp. 185-199.

11. R.S. Varga, Matrix Iterative Analysis, Prentice-Hall, New York, 1962.
Matrix Methods in Mathematical Programming
GENE GOLUB
Stanford University
I. Introduction
With the advent of modern computers, there has been a great development in
matrix algorithms. A major contributor to this advance is J. H. Wilkinson [30].
Simultaneously, a considerable growth has occurred in the field of mathematical
programming. However, in this field, until recently, very little analysis has been
carried out for the matrix algorithms involved.
In the following lectures, matrix algorithms will be developed which can be
efficiently applied in certain areas of mathematical programming and which give
rise to stable processes.
We consider problems of the following types:

    maximize φ(x) ,   where x = ( x_1, x_2, ..., x_n )^T ,
    subject to A x = b ,
               G x ≥ h ,

where the objective function φ(x) is linear or quadratic.
2. Linear Programming
The linear programming problem can be posed as follows:

    maximize φ(x) = c^T x
    subject to A x = b ,     (2.1)
               x ≥ 0 .       (2.2)

We assume that A is an m × n matrix, with m < n, which satisfies the Haar
condition (that is, every m × m submatrix of A is non-singular). The vector x is
said to be feasible if it satisfies the constraints (2.1) and (2.2).

Let I = { i_1, i_2, ..., i_m } be a set of m indices such that, on setting
x_j = 0 for j ∉ I, we can solve the remaining m equations in (2.1) and obtain a
solution such that

    x_{i_j} > 0 ,   j = 1, 2, ..., m .

This vector x is said to be a basic feasible solution. It is well known that
the vector x which maximizes φ(x) = c^T x is a basic feasible solution, and this
suggests a possible algorithm for obtaining the optimum solution, namely, examine
all possible basic feasible solutions.
Such a process is generally inefficient. A more systematic procedure, due to
Dantzig, is the Simplex Algorithm. In this algorithm, a series of basic feasible
solutions is generated by changing one variable at a time in such a way that the
value of the objective function is increased at each step. There seems to be no
way of determining the rate of convergence of the simplex method; however, it
works well in practice.

The steps involved may be given as follows:

(i) Assume that we can determine a set of m indices I = { i_1, i_2, ..., i_m }
such that the corresponding x_{i_j} are the non-zero variables in a basic
feasible solution. Define the basis matrix

    B = [ a_{i_1}, a_{i_2}, ..., a_{i_m} ] ,

where the a_{i_j} are the columns of A corresponding to the basic variables.

(ii) Solve the system of equations

    B x̂ = b ,   where x̂^T = [ x_{i_1}, x_{i_2}, ..., x_{i_m} ] .

(iii) Solve the system of equations

    B^T w = ĉ ,

where ĉ^T = [ c_{i_1}, c_{i_2}, ..., c_{i_m} ] holds the coefficients of the
basic variables in the objective function.

(iv) Calculate

    max_{j ∉ I} ( c_j - a_j^T w ) = c_r - a_r^T w , say.

If c_r - a_r^T w ≤ 0, then the optimum solution has been reached. Otherwise, a_r
is to be introduced into the basis.

(v) Solve the system of equations

    B t = -a_r .

If t_k ≥ 0 for k = 1, 2, ..., m, then this indicates that the optimum solution is
unbounded. Otherwise determine the component s for which

    - x_{i_s} / t_s = min_{ 1 ≤ k ≤ m , t_k < 0 } ( - x_{i_k} / t_k ) .

Eliminate the column a_{i_s} from the basis matrix and introduce column a_r.
This process is continued from step (ii) until an optimum solution is obtained
(or shown to be unbounded).

We have defined the complete algorithm explicitly, provided a termination rule,
and indicated how to detect an unbounded solution. We now show how the simplex
algorithm can be implemented in a stable numerical fashion.
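Steps (i)-(v) can be sketched as follows. This is an illustrative NumPy version, not from the original notes: it assumes a starting basic feasible basis is known and that the problem is non-degenerate, and it performs a fresh dense solve of each system rather than updating triangular factors of B (the updating is precisely the subject of the next section).

```python
import numpy as np

def simplex(A, b, c, basis, max_iter=100):
    """Sketch of the simplex steps (i)-(v): maximize c^T x s.t. A x = b, x >= 0.

    `basis` is a list of m column indices of a basic feasible solution.
    Returns (x, basis) at optimality; raises ValueError if unbounded.
    """
    m, n = A.shape
    for _ in range(max_iter):
        B = A[:, basis]                        # (i)  basis matrix
        xb = np.linalg.solve(B, b)             # (ii) basic variables
        w = np.linalg.solve(B.T, c[basis])     # (iii) simplex multipliers
        nonbasic = [j for j in range(n) if j not in basis]
        reduced = [c[j] - A[:, j] @ w for j in nonbasic]   # (iv) c_j - a_j^T w
        if max(reduced) <= 1e-12:              # optimum reached
            x = np.zeros(n)
            x[basis] = xb
            return x, basis
        r = nonbasic[int(np.argmax(reduced))]
        t = np.linalg.solve(B, -A[:, r])       # (v) direction
        if np.all(t >= -1e-12):
            raise ValueError("objective is unbounded")
        # ratio test: -x_{i_s}/t_s = min over t_k < 0 of -x_{i_k}/t_k
        ratios = [(-xb[k] / t[k], k) for k in range(m) if t[k] < -1e-12]
        _, s = min(ratios)
        basis[s] = r                           # exchange a_{i_s} for a_r
    raise RuntimeError("iteration limit reached")
```

For example, maximizing x_1 + x_2 subject to x_1 + 2x_2 ≤ 4 and 3x_1 + x_2 ≤ 6 (with slacks appended) starts from the all-slack basis and reaches the vertex (8/5, 6/5).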
3. A stable implementation of the simplex algorithm

Throughout the algorithm, there are three systems of linear equations to be
solved at each iteration. These are:

    B x̂ = b ,
    B^T w = ĉ ,
    B t = -a_r .

Assuming Gaussian elimination is used, this requires about m³/3 multiplica-
tions for each system. However, if it is assumed that the triangular factors of B
are available, then only O(m²) multiplications are needed. An important considera-
tion is that only one column of B is changed in one iteration, and it seems
reasonable to assume that the number of multiplications can be reduced if use is
made of this. We would hope to reduce the m³/3 multiplications to O(m²)
multiplications per step. This is the basis of the classical simplex method. The
disadvantage of this method is that the pivoting strategy which is generally used
does not take numerical stability into consideration. We now show that it is
possible to implement the simplex algorithm in a more stable manner, the cost
being that more storage is required.
Consider methods for the solution of a set of linear equations. It is well-
known that there exists a permutation matrix n such that
HB = LU
where L is a lower triangular matrix, and U is an upper triangular matrix.
If Gaussian elimination with partial (row) pivoting is used, then we proceed
as follows :
Choose a permutation matrix H, such that the maximum modulus element of the
25
first column of B becomes the (I, 1) - element of 1"] 1 B.
Define an elementary lower triangular matrix F k as
k ~ | -
r k = I ' ! - ! f
" i |
". ~ I
'LL I ' l , I ' | " ~ , J ".
Now Γ_1 can be chosen so that

Γ_1 Π_1 B

has all elements below the diagonal in the first column equal to zero.
Now choose Π_2 so that

Π_2 Γ_1 Π_1 B

has the maximum modulus element of the second column in position (2, 2), and
choose Γ_2 so that

Γ_2 Π_2 Γ_1 Π_1 B

has all elements below the diagonal in the second column equal to zero. This
can be done without affecting the zeros already computed in the first column.
Continuing in this way we obtain

Γ_{m-1} Π_{m-1} ... Γ_2 Π_2 Γ_1 Π_1 B = U ,
where U is an upper triangular matrix.
Note that permuting the rows of the matrix B merely implies a re-ordering of
the right-hand-side elements. Thus, no actual permutation need be performed;
merely a record is kept. Further, any product of elementary lower triangular matrices
is itself a lower triangular matrix, as may easily be shown. Thus on the left-hand side
we have essentially a lower triangular matrix, and hence the required factorization.
The relevant elements of the successive matrices Γ_k can be stored in the
lower triangle of B, in the space where zeros have been introduced. Thus the
method is economical in storage.
To return to the linear programming problem, we require to solve a system of
equations of the form

B^(i) x = v ,          (3.1)

where B^(i) and B^(i-1) differ in only one column (although the columns may be re-
ordered).
Consider the first iteration of the algorithm. Suppose that we have obtained
the factorization

B^(0) = L^(0) U^(0) ,

where the right-hand-side vector has been re-ordered to take account of the permuta-
tions. The solution to (3.1) with i = 0 is obtained by computing

y = (L^(0))^{-1} v

and solving the triangular system

U^(0) x = y ,

each of which requires m^2/2 + O(m) multiplications.
Suppose that the column b^(0)_{s_0} is eliminated from B^(0) and the column g^(0) is
introduced as the last column; then

B^(1) = [ b^(0)_1, ..., b^(0)_{s_0 - 1}, b^(0)_{s_0 + 1}, ..., b^(0)_m, g^(0) ] .

Therefore,

(L^(0))^{-1} B^(1) = H^(1) ,

where H^(1) is upper triangular except for nonzero subdiagonal elements in columns
s_0 through m - 1.
Such a matrix is called an upper Hessenberg matrix. Only the last column need be
computed, as all others are available from the previous step. We require to apply
a sequence of transformations to restore the upper triangular form. It is clear
that we have a particularly simple case of the LU factorization procedure as
previously described, where Γ^(1)_i is the identity matrix with a single nonzero
subdiagonal element γ^(1)_i in position (i + 1, i),
only one element requiring to be calculated. On applying a sequence of transformation
matrices and permutation matrices as before, we obtain

Γ^(1)_{m-1} Π^(1)_{m-1} ... Γ^(1)_{s_0} Π^(1)_{s_0} H^(1) = U^(1) ,

where U^(1) is upper triangular.

Note that in this case, to obtain Π^(1)_j it is only necessary to compare two
elements. Thus the storage required is very small: (m - s_0) multipliers γ^(1)_i and
(m - s_0) bits to indicate whether or not interchanges are necessary.
All elements in the computation are bounded, and so we have good numerical
accuracy throughout. The whole procedure compares favourably with standard forms,
for example the product form of the inverse, where no account of numerical accuracy
is taken. Further, this procedure requires fewer operations than the method which
uses the product form of the inverse. If we consider the steps involved, forward
and backward substitution with L^(0) and U^(i) require a total of m^2 multiplications,
and the application of the remaining transformations in (L^(i))^{-1} requires at most
i(m - 1) multiplications. (If we assume that on the average the middle column of
the basis matrix is eliminated, then this will be closer to i(m - 1)/2.) Thus
a total of m^2 + i(m - 1) multiplications is required to solve the system at each
stage, assuming an initial factorization is available. Note that if the matrix A
is sparse, then the algorithm can make use of this structure, as is done in the
method using the product form of the inverse.
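The restoration of triangular form described above can be sketched as follows. This is an illustrative implementation, not the lecturers' code: it eliminates the subdiagonal of a Hessenberg matrix H in place, interchanging the two candidate rows whenever that yields the larger pivot (the "compare two elements" rule above), and records one multiplier and one interchange bit per column.

```python
def restore_triangular(H):
    # Reduce an upper Hessenberg matrix H (in place) to upper triangular form
    # with stabilized pairwise elimination: O(m^2) multiplications in all.
    m = len(H)
    mult, swaps = [], []             # the multipliers and interchange bits
    for k in range(m - 1):
        if H[k + 1][k] == 0.0:
            mult.append(0.0)
            swaps.append(False)
            continue
        swap = abs(H[k + 1][k]) > abs(H[k][k])   # compare just two elements
        if swap:
            H[k], H[k + 1] = H[k + 1], H[k]
        g = H[k + 1][k] / H[k][k]    # single multiplier per column
        for j in range(k, m):
            H[k + 1][j] -= g * H[k][j]
        mult.append(g)
        swaps.append(swap)
    return H, mult, swaps
```

Because every multiplier satisfies |g| ≤ 1, all elements stay bounded, which is the source of the stability claimed in the text.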
4. Iterative refinement of the solution
Consider the set of equations

B x = v ,

and suppose that x̃ is a computed approximation to x. Let

x = x̃ + c .

Therefore,

B(x̃ + c) = v ,

that is,

B c = v - B x̃ .

We can now solve for c very efficiently, since the LU decomposition of B is
available. This process can be repeated until x is obtained to the required accuracy.
The algorithm can be outlined as follows:

(i) Compute r_j = v - B x_j
(ii) Solve B c_j = r_j
(iii) Compute x_{j+1} = x_j + c_j
It is necessary for r_j to be computed in double precision and then rounded to
single precision. Note that step (ii) requires O(m^2) operations, since the LU de-
composition of B is available. This procedure can be used in the following sections.
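Steps (i)-(iii) can be sketched as follows. In the text the residual is to be accumulated in higher precision than the working precision; Python floats are already double, so the sketch merely uses `math.fsum` to accumulate the residual without intermediate rounding. The function names are illustrative assumptions, and `solve` stands for a solver that applies the already-computed LU factors of B.

```python
import math

def refine(solve, B, b, x, steps=3):
    # Iterative refinement: `solve` applies B^{-1} via existing LU factors,
    # so each pass costs only O(m^2) operations.
    m = len(b)
    for _ in range(steps):
        # (i) r_j = v - B x_j, summed without intermediate rounding
        r = [math.fsum([b[i]] + [-B[i][j] * x[j] for j in range(m)])
             for i in range(m)]
        c = solve(r)                            # (ii) solve B c_j = r_j
        x = [x[i] + c[i] for i in range(m)]     # (iii) x_{j+1} = x_j + c_j
    return x
```

Each pass reduces the error by roughly the condition-dependent factor of the original solve, so a few steps usually suffice.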
5. Householder Triangularization
Householder transformations have been widely discussed in the literature. In
this section we are concerned with their use in reducing a matrix A to upper-
triangular form, and in particular we wish to show how to update the decomposition
of A when its columns are changed one by one. This will open the way to implemen-
tation of efficient and stable algorithms for solving problems involving linear
constraints.
Householder transformations are symmetric orthogonal matrices of the form
P_k = I - β_k u_k u_k^T, where u_k is a vector and β_k = 2/(u_k^T u_k). Their utility in this
context is due to the fact that for any non-zero vector a it is possible to choose
u_k in such a way that the transformed vector P_k a is zero except for its first
element. Householder [15] used this property to construct a sequence of transfor-
mations to reduce a matrix to upper-triangular form. In [29], Wilkinson describes
the process, and his error analysis shows it to be very stable.
Given any A, we can construct a sequence of transformations such that A is
reduced to upper triangular form. Premultiplying by P_0 annihilates (m - 1)
elements in the first column. Similarly, premultiplying by P_1 eliminates (m - 2)
elements in the second column, and so on. Therefore,

P_{n-1} P_{n-2} ... P_1 P_0 A = [ R ; 0 ] ,          (5.1)

where R is an upper triangular matrix and [ R ; 0 ] denotes R stacked above a
zero block. Since the product of orthogonal matrices is an orthogonal matrix, we can
write (5.1) as

Q A = [ R ; 0 ] ,     that is,     A = Q^T [ R ; 0 ] .
The above process is close to the Gram-Schmidt process in that it produces
a set of orthogonal vectors spanning the column space of A. In addition, the
Householder transformation produces a complementary set of vectors which is often
useful. Since this process has been shown to be numerically stable, it does produce
an orthogonal matrix, in contrast to the Gram-Schmidt process.
If A = (a_1, ..., a_n) is an m x n matrix of rank r, then at the k-th stage of the
triangularization (k < r) we have

A^(k) = P_{k-1} P_{k-2} ... P_0 A = [ R_k  S_k ; 0  T_k ] ,

where R_k is an upper-triangular matrix of order k.

The next step is to compute A^(k+1) = P_k A^(k), where P_k is chosen to reduce the
first column of T_k to zero except for the first component. This component becomes
the last diagonal element of R_{k+1}, and since its modulus is equal to the Euclidean
length of the first column of T_k, it should in general be maximized by a suitable
interchange of the columns of [ S_k ; T_k ]. After r steps, T_r will be effectively
zero (the length of each of its columns will be smaller than some tolerance) and
the process stops.
Hence we conclude that if rank(A) = r, then for some permutation matrix Π the
Householder decomposition (or "QR decomposition") of A is

Q A Π = P_{r-1} P_{r-2} ... P_0 A Π = [ R  S ; 0  0 ] ,

where Q = P_{r-1} P_{r-2} ... P_0 is an m x m orthogonal matrix and R is upper-triangular
and non-singular.
We are now concerned with the manner in which Q should be stored and the
means by which Q, R, S may be updated if the columns of A are changed. We will
suppose that a column a_p is deleted from A and that a column a_q is added. It will
be clear what is to be done if only one or the other takes place.

Since the Householder transformations P_k are defined by the vectors u_k, the
usual method is to store the u_k's in the area beneath R, with a few extra words of
memory being used to store the β_k's and the diagonal elements of R. The product
Q x for some vector x is then easily computed in the form P_{r-1} P_{r-2} ... P_0 x where,
for example, P_0 x = (I - β_0 u_0 u_0^T) x = x - β_0 (u_0^T x) u_0. The updating is best
accomplished as follows. The first p - 1 columns of the new R are the same as before;
the other columns p through n are simply overwritten by columns a_{p+1}, ..., a_n, a_q
and transformed by the product P_{p-1} P_{p-2} ... P_0 to obtain a new

[ S_{p-1} ; T_{p-1} ] ;

then T_{p-1} is triangularized as usual.
This method allows Q to be kept in product form always, and there is no accumula-
tion of errors. Of course, if p = 1 the complete decomposition must be re-done,
and since with m ≥ n the work is roughly proportional to (m - n/3) n^2, this can mean
a lot of work. But if p ≈ n/2 on the average, then only about 1/8 of the original
work must be repeated at each updating.
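The triangularization and the product-form storage of Q can be sketched as follows. This is illustrative pure Python, not the lecturers' code: `qr` returns R together with the pairs (u_k, β_k) that define Q = P_{n-1} ... P_0, and `apply_Q` forms Qx without ever assembling Q. Columns are assumed non-zero at each stage; the column interchanges used above for rank detection are omitted.

```python
import math

def householder_vector(a):
    # Choose u, beta so that (I - beta u u^T) a = (alpha, 0, ..., 0)^T.
    alpha = -math.copysign(math.sqrt(sum(t * t for t in a)), a[0])
    u = a[:]
    u[0] -= alpha
    beta = 2.0 / sum(t * t for t in u)
    return u, beta

def qr(A):
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]
    factors = []                         # the (k, u_k, beta_k) that define Q
    for k in range(n):
        u, beta = householder_vector([R[i][k] for i in range(k, m)])
        factors.append((k, u, beta))
        for j in range(k, n):            # apply P_k to the trailing columns
            s = sum(u[i - k] * R[i][j] for i in range(k, m))
            for i in range(k, m):
                R[i][j] -= beta * s * u[i - k]
    return R, factors

def apply_Q(factors, x):
    # Q x = P_{n-1} ... P_0 x, each P_k applied as x - beta (u^T x) u.
    y = x[:]
    for k, u, beta in factors:
        s = sum(u[i - k] * y[i] for i in range(k, len(y)))
        for i in range(k, len(y)):
            y[i] -= beta * s * u[i - k]
    return y
```

Only the u_k's and β_k's are stored, exactly the "area beneath R plus a few extra words" scheme described above.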
Assume that we have a matrix A which is to be replaced by a matrix Ā formed
from A by eliminating a column a_p and inserting a new vector g as the last column.
As in the simplex method, we can produce an updating procedure using Householder
transformations. If Ā is premultiplied by Q, the resulting matrix Q Ā has upper
Hessenberg form, as before, and can again be reduced to an upper triangular matrix
in O(m^2) multiplications.
6. Projections
In optimization problems involving linear constraints it is often necessary
to compute the projections of some vector either into or orthogonal to the space
defined by a subset of the constraints (usually the current "basis"). In this
section we show how Householder transformations may be used to compute such pro-
jections. As we have shown, it is possible to update the Householder decomposi-
tion of a matrix when the number of columns in the matrix is changed, and thus we
will have an efficient and stable means of orthogonalizing vectors with respect to
basis sets whose component vectors are changing one by one.
Let the basis set of vectors a_1, a_2, ..., a_n form the columns of an m x n
matrix A, and let S_r be the sub-space spanned by {a_i}. We shall assume that the
first r vectors are linearly independent and that rank(A) = r. In general,
m ≥ n ≥ r, although the following is true even if m < n.
Given an arbitrary vector z we wish to compute the projections

u = P z ,     v = (I - P) z

for some projection matrix P, such that

(a) z = u + v
(b) u^T v = 0
(c) u ∈ S_r (i.e., there exists x such that u = A x)
(d) v is orthogonal to S_r (i.e., A^T v = 0).
One method is to write P as A A^+, where A^+ is the n x m generalized inverse of A,
and in [7] Fletcher shows how A^+ may be updated upon changes of basis. In contrast,
the method based on Householder transformations does not deal with A^+ explicitly
but instead keeps A A^+ in factorized form and simply updates the orthogonal matrix
required to produce this form. Apart from being more stable and just as efficient,
the method has the added advantage that there are always two orthonormal sets of
vectors available, one spanning S_r and the other spanning its complement.
As already shown, we can construct an m x m orthogonal matrix Q such that

Q A = [ R  S ; 0  0 ] ,

where R is an r x r upper-triangular matrix. Let

w = Q z = [ w_1 ; w_2 ] ,          (6.1)

where w_1 has r components and w_2 has m - r components, and define

u = Q^T [ w_1 ; 0 ] ,     v = Q^T [ 0 ; w_2 ] .          (6.2)

Then it is easily verified that u, v are the required projections of z, which is to
say they satisfy the above four properties. Also, the x in (c) is readily shown
to be x = ( R^{-1} w_1 ; 0 ).
In effect, we are representing the projection matrices in the form

P = Q^T [ I_r  0 ; 0  0 ] Q          (6.3)

and

I - P = Q^T [ 0  0 ; 0  I_{m-r} ] Q ,          (6.4)

and we are computing u = P z, v = (I - P) z by means of (6.1), (6.2). The first r
columns of Q^T span S_r and the remaining m - r span its complement. Since Q and R may
be updated accurately and efficiently if they are computed using Householder
transformations, we have as claimed the means of orthogonalizing vectors with re-
spect to varying bases.
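The projection formulae (6.1), (6.2) can be sketched directly. For illustration Q is supplied here as an explicit orthogonal matrix; in practice it would be held in Householder product form as described above. The function name is an assumption.

```python
def project(Q, z, r):
    # Given an m x m orthogonal Q with Q A = [R S; 0 0] and rank r, return
    # u = P z (in S_r) and v = (I - P) z (orthogonal to S_r) via (6.1)-(6.2).
    m = len(Q)
    w = [sum(Q[i][j] * z[j] for j in range(m)) for i in range(m)]  # w = Q z
    w1 = w[:r] + [0.0] * (m - r)          # (w_1 ; 0)
    w2 = [0.0] * r + w[r:]                # (0 ; w_2)
    u = [sum(Q[j][i] * w1[j] for j in range(m)) for i in range(m)]  # Q^T (w_1; 0)
    v = [sum(Q[j][i] * w2[j] for j in range(m)) for i in range(m)]  # Q^T (0; w_2)
    return u, v
```

Properties (a)-(d) follow immediately: u + v = Q^T w = z, and u^T v = 0 because (w_1; 0) and (0; w_2) are orthogonal.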
As an example of the use of the projection (6.4), consider the problem of
finding the stationary values of x^T A x subject to x^T x = 1 and C^T x = 0, where A is a
real symmetric matrix of order n and C is an n x p matrix of rank r, with r ≤ p < n.
It is shown in [12] that if the usual Householder decomposition of C is

Q C = [ R  S ; 0  0 ] ,

then the problem is equivalent to that of finding the eigenvalues and eigenvectors
of the matrix P A, where

P = Q^T [ 0  0 ; 0  I_{n-r} ] Q

is the projection matrix of the form (6.4). Note that, although P A is not symmetric,
since P^2 = P we have

P A = P^2 A ,

and further the eigenvalues of P^2 A are equal to the eigenvalues of the symmetric
matrix P A P. The dimensionality of the problem is not reduced; some of the eigen-
values will be zero.
7. Linear least-squares problem

The least-squares problem to be considered here is:

min_x || b - A x ||_2 ,

where we assume that the rank of A is n.
Since length is invariant under an orthogonal transformation we have

|| b - A x ||_2^2 = || Q b - Q A x ||_2^2 ,

where Q A = [ R ; 0 ]. Let

Q b = [ c_1 ; c_2 ] ,

where c_1 has n components and c_2 has m - n components. Then

|| b - A x ||_2^2 = || c_1 - R x ||_2^2 + || c_2 ||_2^2 ,

and the solution to the least-squares problem is given by

x = R^{-1} c_1 .

Thus it is easy to solve the least-squares problem using orthogonal transformations.
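Once Q A = [R; 0] and Q b = (c_1, c_2) are available, only the back substitution R x = c_1 remains; the residual norm is ||c_2||. A minimal sketch (the function name is ours):

```python
def back_substitute(R, c1):
    # Solve the upper triangular system R x = c1 (R nonsingular),
    # n^2/2 + O(n) multiplications.
    n = len(c1)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (c1[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))) / R[i][i]
    return x
```

Combined with a Householder factorization of A, this solves the full-rank least-squares problem without ever forming A^T A.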
Alternatively, the least-squares problem can be solved by constructing the
normal equations

A^T A x = A^T b .

However, these are well known to be ill-conditioned.
Nevertheless the normal equations can be used in the following way.
Let the residual vector r be defined by

r = b - A x .

Then

A^T r = A^T b - A^T A x = 0 .

These equations can be written as the augmented system:
[ I  A ; A^T  0 ] [ r ; x ] = [ b ; 0 ] .          (7.1)

Multiplying out by the orthogonal transformation, with Q A = [ R ; 0 ], the system
becomes

r̃_1 + R x = c_1 ,     r̃_2 = c_2 ,     R^T r̃_1 = 0 ,

where r̃ = Q r = ( r̃_1 ; r̃_2 ) and c = Q b = ( c_1 ; c_2 ).
This system can easily be solved for x and r. The method of iterative refine-
ment may be applied to obtain a very accurate solution.
This method has been analysed by Björck [2].
8. Least-squares problem with linear constraints
Here we consider the problem

minimize || b - A x ||_2^2
subject to G x = h .

Using Lagrange multipliers z, we may incorporate the constraints into
equation (7.1) and obtain

( 0    0    G  ) ( z )   ( h )
( 0    I    A  ) ( r ) = ( b )
( G^T  A^T  0  ) ( x )   ( 0 )

The methods of the previous sections can be applied to obtain the solution of this
system of equations, without actually constructing the above matrix. The problem
simplifies and a very accurate solution may be obtained.
Now we consider the problem

minimize || b - A x ||_2^2
subject to G x ≥ h .

Such a problem might arise in the following manner. Suppose we wish to approximate
given data by the polynomial

y(t) = α t^3 + β t^2 + γ t + δ

such that y(t) is convex. This implies

y''(t) = 6 α t + 2 β ≥ 0 .

Thus, we require

6 α t_i + 2 β ≥ 0 ,

where the t_i are the data points. (This does not necessarily guarantee that the poly-
nomial will be convex throughout the interval.) Introduce slack variables w such
that

G x - w = h ,

where w ≥ 0.
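Building G and h for the convexity constraints 6αt_i + 2β ≥ 0 is mechanical; a sketch with the unknowns ordered as x = (α, β, γ, δ) (the function name is an illustrative assumption):

```python
def convexity_constraints(ts):
    # One row 6*t_i*alpha + 2*beta >= 0 per data point t_i, i.e. G x >= h.
    G = [[6.0 * t, 2.0, 0.0, 0.0] for t in ts]
    h = [0.0] * len(ts)
    return G, h
```

As the text notes, enforcing convexity only at the data points does not guarantee convexity over the whole interval.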
Introducing Lagrange multipliers as before, we may write the system as:

( 0    0    G   -I ) ( z )   ( h )
( 0    I    A    0 ) ( r ) = ( b )
( G^T  A^T  0    0 ) ( x )   ( 0 )
                     ( w )

At the solution, we must have

z ≥ 0 ,   w ≥ 0 ,   z^T w = 0 .
This implies that when a Lagrange multiplier is non-zero, the corresponding
constraint holds with equality.
Conversely, corresponding to a non-zero w_i the Lagrange multiplier must be
zero. Therefore, if we knew which constraints held with equality at the solution,
we could treat the problem as a linear least-squares problem with linear equality
constraints. A technique, due to Cottle and Dantzig [5], exists for solving the
problem in this way.
Bibliography
[1] Beale, E.M.L., "Numerical Methods", in Nonlinear Programming, J. Abadie (ed.),
John Wiley, New York, 1967; pp. 133-205.
[2] Björck, Å., "Iterative Refinement of Linear Least Squares Solutions II", BIT 8
(1968), pp. 8-30.
[3] and G. H. Golub, "Iterative Refinement of Linear Least Squares
Solutions by Householder Transformations", BIT 7 (1967), pp. 322-37.
[4] and V. Pereyra, "Solution of Vandermonde Systems of Equations",
Publication 70-02, Universidad Central de Venezuela, Caracas, Venezuela, 1970.
[5] Cottle, R. W., and G. B. Dantzig, "Complementary Pivot Theory of Mathematical
Programming", Mathematics of the Decision Sciences, Part 1, G. B. Dantzig and
A. F. Veinott (eds.), American Mathematical Society (1968), pp. 115-136.
[6] Dantzig, G. B., R. P. Harvey, R. D. McKnight, and S. S. Smith, "Sparse Matrix
Techniques in Two Mathematical Programming Codes", Proceedings of the Symposium
on Sparse Matrices and Their Applications, T. J. Watson Research Publications
RAI, no. 11707, 1969.
[7] Fletcher, R., "A Technique for Orthogonalization", J. Inst. Maths. Applics. 5
(1969), pp. 162-66.
[8] Forsythe, G. E., and G. H. Golub, "On the Stationary Values of a Second-Degree
Polynomial on the Unit Sphere", J. SIAM, 13 (1965), pp. 1050-68.
[9] and C. B. Moler, Computer Solution of Linear Algebraic Systems,
Prentice-Hall, Englewood Cliffs, New Jersey, 1967.
[10] Francis, J., "The QR Transformation. A Unitary Analogue to the LR Transforma-
tion," Comput. J. 4 (1961-62), pp. 265-71.
[11] Golub, G. H., and C. Reinsch, "Singular Value Decomposition and Least Squares
Solutions", Numer. Math., 14(1970), pp. 403-20.
[12] and R. Underwood, "Stationary Values of the Ratio of Quadratic
Forms Subject to Linear Constraints", Technical Report No. CS 142, Computer
Science Department, Stanford University, 1969.
[13] Hanson, R. J., "Computing Quadratic Programming Problems: Linear Inequality
and Equality Constraints", Technical Memorandum No. 240, Jet Propulsion
Laboratory, Pasadena, California, 1970.
[14] and C. L. Lawson, "Extensions and Applications of the House-
holder Algorithm for Solving Linear Least Squares Problems", Math. Comp., 23
(1969), pp. 787-812.
[15] Householder, A.S., "Unitary Triangularization of a Nonsymmetric Matrix",
J. Assoc. Comp. Mach., 5 (1958), pp. 339-42.
[16] Lanczos, C., Linear Differential Operators, Van Nostrand, London, 1961,
Chapter 3.
[17] Leringe, Ö., and P. Wedin, "A Comparison Between Different Methods to Compute
a Vector x Which Minimizes ||Ax - b||_2 When Gx = h", Technical Report, Department
of Computer Sciences, Lund University, Sweden.
[18] Levenberg, K., "A Method for the solution of Certain Non-Linear Problems in
Least Squares", Quart. Appl. Math., 2 (1944), pp. 164-68.
[19] Marquardt, D. W., "An Algorithm for Least-Squares Estimation of Non-Linear
Parameters", J. SIAM, 11 (1963), pp. 431-41.
[20] Meyer, R. R., "Theoretical and Computational Aspects of Nonlinear Regression",
P-181 9, Shell Development Company, Emeryville, California.
[21] Penrose, R., "A Generalized Inverse for Matrices", Proceedings of the
Cambridge Philosophical Society, 51 (1955), pp. 406-13.
[22] Peters, G., and J. H. Wilkinson, "Eigenvalues of Ax = λBx with Band Symmetric
A and B", Comput. J., 12 (1969), pp. 398-404.
[23] Powell, M.J.D., "Rank One Methods for Unconstrained Optimization", T. P. 372,
Atomic Energy Research Establishment, Harwell, England, (1969).
[24] Rosen, J. B., "Gradient Projection Method for Non-linear Programming. Part
I. Linear Constraints", J. SIAM, 8 (1960), pp. 181-217.
[25] Shanno, D. C. "Parameter Selection for Modified Newton Methods for Function
Minimization", J. SIAM, Numer. Anal., Ser. B,7 (1970).
[26] Stoer, J., "On the Numerical Solution of Constrained Least Squares Problems",
(private communication), 1970.
[27] Tewarson, R. P., "The Gaussian Elimination and Sparse Systems", Proceedings
of the Symposium on Sparse Matrices and Their Applications, T. J. Watson
Research Publication RA1, no. 11707, 1969.
[28] Wilkinson, J. H., "Error Analysis of Direct Methods of Matrix Inversion",
J. Assoc. Comp. Mach., 8 (1961), pp. 281-330.
[29] "Error Analysis of Transformations Based on the Use of
Matrices of the Form I - 2ww^H", in Error in Digital Computation, Vol. II, L.
B. Rall (ed.), John Wiley and Sons, Inc., New York, 1965, pp. 77-101.
[30] The Algebraic Eigenvalue Problem, Clarendon Press, Oxford,
1965.
[31] Zoutendijk, G., Methods of Feasible Directions, Elsevier Publishing Company,
Amsterdam (1960), pp. 80-90.
Topics in Stability Theory for Partial Difference Operators
VIDAR THOMÉE
University of Gothenburg
PREFACE
The purpose of these lectures is to present a short introduction to some aspects
of the theory of difference schemes for the solution of initial value problems for
linear systems of partial differential equations. In particular, we shall discuss
various stability concepts for finite difference operators and the related question
of convergence of the solution of the discrete problem to the solution of the con-
tinuous problem. Special emphasis will be given to the strong relationship between
stability of difference schemes and correctness of initial value problems.
In practice, most important applications deal with mixed initial boundary value
problems for non-linear equations. It will not be possible in this short course to
develop the theory in such a general context. However, the results in the particular
cases we shall treat have intuitive implications for the more complicated situations.
The two most important methods in stability theory for difference operators have been
the Fourier method and the energy method. The former applies in its pure form only
to equations with constant coefficients whereas the latter is more directly appli-
cable to variable coefficients and even to non-linear situations. Often different
methods have to be combined so that for instance Fourier methods are first used to
analyse the linearized equations with coefficients fixed at some point and then the
energy method, or some other method, is applied to appraise the error committed by
treating the simplified case. We have elected in these lectures to concentrate on
Fourier techniques.
These notes were developed from material used previously by the author for a
similar course held in the summer of 1968 in a University of Michigan engineering
summer conference on numerical analysis, and also used for the author's survey paper
[36]. Some of the relevant literature is collected in the list of references. A
thorough account of the theory can be obtained by combining the book by Richtmyer
and Morton [28] with the above mentioned survey paper [36]. Both these sources
contain extensive lists of further references.
1. Introduction

Let C be the set of uniformly continuous, bounded functions of x, and let C^k
be the set of functions v with (d/dx)^j v in C for j ≤ k. For v ∈ C set

|| v || = sup_x | v(x) | .

For any v ∈ C, any k, and ε > 0 we can find ṽ ∈ C^k such that

|| v - ṽ || < ε ;

C^k is dense in C.
Consider the initial-value problem

∂u/∂t = ∂²u/∂x² ,   t ≥ 0 ,          (1)
u(x,0) = v(x) .                      (2)

If v ∈ C² this problem admits one and only one solution in C, namely

u(x,t) = (4πt)^{-1/2} ∫ exp( -(x-y)²/4t ) v(y) dy ,   t > 0 .          (3)

It is clear that the solution u depends for fixed t linearly on v; we define a
linear operator E_0(t) by

E_0(t) v = u(·,t) ,

where u is defined by (3) and where v ∈ C². The solution operator E_0(t) has the
properties

E_0(0) = I

and

|| E_0(t) v || ≤ || v || .

In particular, the inequality means that a small change in v only causes a small
change in the solution u.

Although for v ∉ C² the function u defined by (3) is not a "genuine" or
classical solution, the integral still converges if v ∈ C, and it is
natural to define a "generalized" solution operator by taking E(t)v to be the
integral in (3) for t > 0, and E(0)v = v.
The operator E(t) still has the properties

E(0) = I ,                        (4)
|| E(t) v || ≤ || v || ,          (5)

and is continuous in t for t ≥ 0. For this particular equation we actually get a
classical solution for t > 0 even if v is only in C; we have E(t) v ∈ ∩_{k=0}^∞ C^k
for t > 0.
Consider now the initial-value problem

∂u/∂t = ∂u/∂x ,   t ≥ 0 ,          (6)
u(x,0) = v(x) .                    (7)

For v ∈ C¹ this problem admits one and only one genuine solution, namely

u(x,t) = v(x + t) .

Clearly || u(·,t) || ≤ || v || (actually we have equality), and it is again natural
to define a generalized solution operator, continuous in t, by

E(t) v(x) = v(x + t) .

This has again the properties (4), (5). In this case, the solution is as irregular
for t > 0 as it is for t = 0.
Both these problems are thus "correctly posed" in C; they can be uniquely
solved for a dense subset of C and the solution operator is bounded.

We could instead of C also have considered other basic classes of functions.
Thus let L² be the set of square integrable functions with

|| v || = ( ∫ | v(x) |² dx )^{1/2} .

Consider again the initial-value problem (1), (2) and assume that u(x,t) is a classi-
cal solution and that u(x,t) tends to zero as fast as necessary when |x| → ∞ for
the following to hold. Assume for simplicity that u is real-valued. We then have

d/dt ∫ u² dx = 2 ∫ u u_t dx = 2 ∫ u u_xx dx = -2 ∫ (u_x)² dx ≤ 0 ,          (8)
so that for t ≥ 0,

|| u(·,t) || ≤ || v || .          (9)

Relative to the present framework it is also possible to define genuine and gene-
ralized solution operators; the latter is defined on the whole of L² and satisfies
(4), (5).

For the problem (6), (7) the calculation corresponding to (8) goes similarly:

d/dt ∫ u² dx = 2 ∫ u u_x dx = ∫ (u²)_x dx = 0 .
One other way of looking at this is to introduce the Fourier transform; for
integrable v, set

v̂(ξ) = ∫ e^{-iξx} v(x) dx .          (10)

Notice the Parseval relation: for v in addition in L² we have v̂ ∈ L² and

|| v̂ || = √(2π) || v || .

For the Fourier transform û(ξ,t) with respect to x of the solution u(x,t) we then
get initial-value problems for ordinary differential equations, namely

dû/dt = -ξ² û ,   û(ξ,0) = v̂(ξ)

for (1), (2), and

dû/dt = iξ û ,   û(ξ,0) = v̂(ξ)

for (6), (7). These have the solutions

û(ξ,t) = e^{-ξ²t} v̂(ξ)          (11)

and

û(ξ,t) = e^{iξt} v̂(ξ) ,         (12)

respectively, and the actual solutions can be obtained, under certain conditions,
by the inverse Fourier transform. Also, by Parseval's formula we have for both (11)
and (12),

|| u(·,t) || ≤ || v || ,

which is again (9).
For the purpose of approximate solution of the initial-value problem (1), (2),
we replace the derivatives by difference quotients:

( u(x,t+k) - u(x,t) ) / k = ( u(x+h,t) - 2u(x,t) + u(x-h,t) ) / h² ,

where h, k are small positive numbers which we shall later make tend to zero in such
a fashion that λ = k/h² is kept constant. Solving for u(x,t+k), we get

u(x,t+k) = λ u(x-h,t) + (1 - 2λ) u(x,t) + λ u(x+h,t) ≡ (E_k u(·,t))(x) .     (13)

This suggests that for the exact (generalized) solution to (1), (2),

E(t+k) v ≈ E_k E(t) v ,

or after n steps,

E(nk) v ≈ E_k^n v .

We shall prove that this is essentially correct for any v ∈ C if, but only if, λ ≤ ½.
Thus, let us first notice that if λ ≤ ½, then the coefficients of E_k are all non-
negative and add up to 1, so that (the norm is again the sup-norm)

|| E_k v || ≤ || v || ,

or generally

|| E_k^n v || ≤ || v || .

The boundedness of the powers of E_k is referred to as stability of E_k.
Assume now that v ∈ C⁴. We then know that the classical solution of (1), (2)
exists, and if u(x,t) = E(t)v = E_0(t)v, then u(·,t) ∈ C⁴ for t ≥ 0 and
|| ∂⁴u/∂x⁴(·,t) || ≤ || v⁽⁴⁾ ||. We shall prove that, if nk = t, then

|| E_k^n v - E(t) v || ≤ C t k || v⁽⁴⁾ || .

To see this, let us consider the local truncation error: by Taylor expansion about
(x,t), using u_t = u_xx and h² = k/λ,

|| E_k u(·,t) - u(·,t+k) || ≤ C k² || v⁽⁴⁾ || .

Notice now that we can write

E_k^n v - E(nk) v = Σ_{j=0}^{n-1} E_k^{n-1-j} ( E_k - E(k) ) E(jk) v .

Therefore, by the stability inequality,

|| E_k^n v - E(t) v || ≤ n C k² || v⁽⁴⁾ || = C t k || v⁽⁴⁾ || ,

which we wanted to prove.
We shall now prove that, for v not necessarily in C⁴ but only in C, we still
have, for nk = t,

|| E_k^n v - E(t) v || → 0     when k → 0 .

To see this, let ε > 0 be arbitrary, and choose ṽ ∈ C⁴ such that

|| v - ṽ || < ε / 3 .

We then have

|| E_k^n v - E(t) v || ≤ || E_k^n (v - ṽ) || + || E_k^n ṽ - E(t) ṽ || + || E(t) (ṽ - v) ||
                       ≤ 2 || v - ṽ || + C t k || ṽ⁽⁴⁾ || .

Therefore, choosing k ≤ ε ( 3 C t || ṽ⁽⁴⁾ || )^{-1}, we have || E_k^n v - E(t) v || ≤ ε,
which concludes the proof.
Consider now the case λ > ½. The middle coefficient in E_k is then negative.
Taking

v(x) = cos(πx/h) ,

we get, since cos(π(x ± h)/h) = -cos(πx/h),

(E_k v)(x) = ( (1 - 2λ) - 2λ ) v(x) = (1 - 4λ) v(x) ,

so that the effect of E_k is multiplication by (1 - 4λ). We generally get

E_k^n v = (1 - 4λ)^n v .

Since λ > ½ we have 1 - 4λ < -1, and it follows that it is not possible to have
an inequality of the form

|| E_k^n v || ≤ C || v || .

This can also be interpreted to mean that small errors in the initial data are blown
up to an extent where they overshadow the real solution. This phenomenon is called
instability.
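The stability condition can be observed numerically. The sketch below applies (13) on a periodic grid (an assumption made for simplicity; the lecture works on the whole line) to the most oscillatory data u_j = (-1)^j, for which each step multiplies u by 1 - 4λ:

```python
def step(u, lam):
    # One application of E_k from (13) on a periodic grid.
    m = len(u)
    return [lam * u[(j - 1) % m] + (1 - 2 * lam) * u[j] + lam * u[(j + 1) % m]
            for j in range(m)]

def sup_norm_after(lam, steps=20, m=16):
    # Start from u_j = (-1)^j and report the sup-norm after `steps` steps.
    u = [(-1.0) ** j for j in range(m)]
    for _ in range(steps):
        u = step(u, lam)
    return max(abs(t) for t in u)
```

For λ ≤ ½ the sup-norm never exceeds its initial value 1, while for λ > ½ it grows like |1 - 4λ|^n, exactly the blow-up described above.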
Instead of the simple difference scheme (13) we could study a more general
type of operator, e.g.

(E_k v)(x) = Σ_j a_j v(x + jh) .          (14)

If we want this to be "consistent" with the equation (1) we have to demand that E_k
approximates E(k), or, if u(x,t) is a solution, then

(E_k u(·,t))(x) - u(x,t+k) = o(k) .

Taylor series development gives, for smooth u, the consistency conditions

Σ_j a_j = 1 ,     Σ_j j a_j = 0 ,     Σ_j j² a_j = 2λ .

Assuming these consistency relations to hold, and assuming that all the a_j are ≥ 0,
we get as above

|| E_k^n v || ≤ || v || ,          (15)

and the convergence analysis above can be carried over to this more general case
with few changes.
However, the reason for choosing an operator of the form (14) which is not our
old operator (13) would be to obtain higher accuracy in the approximation, and it
will then turn out that in general not all the coefficients are non-negative. We
cannot have (15) then, but we may still have

|| E_k^n v || ≤ C || v || ,     n k ≤ T ,

for some C depending on T.
When we work with the L²-norm rather than the maximum norm, Fourier transforms
are again helpful; indeed, in most of the subsequent lectures Fourier analysis will
be the foremost tool.

Thus, let v̂ be the Fourier transform of v defined by (10). We then have

(E_k v)^(ξ) = Σ_j a_j e^{ijhξ} v̂(ξ) ,

or, introducing the characteristic (trigonometric) polynomial of the operator E_k,

a(ξ) = Σ_j a_j e^{ijξ} ,

we find that the effect of E_k on the Fourier transform side is multiplication by
a(hξ). One easily finds that, similarly, the effect of E_k^n is multiplication by
a(hξ)^n. Using Parseval's relation, one then easily finds (the norm is now the L²-
norm)

|| E_k^n v || ≤ sup_ξ | a(ξ) |^n || v || ,

and that this inequality is the best possible. It follows that we have stability if
and only if | a(ξ) | ≤ 1 for all real ξ. We then actually have (15) in the L²-norm.
Consider again the special operator (13). We have in this case

a(ξ) = 1 - 2λ + 2λ cos ξ = 1 - 4λ sin²(ξ/2) ,

and a(ξ) takes all values in the interval [1 - 4λ, 1]. We therefore find that also in
L² we have stability if and only if 1 - 4λ ≥ -1, that is λ ≤ ½.

Difference approximations to the initial-value problem (6), (7) can be analysed
similarly.
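The L² criterion max_ξ |a(ξ)| ≤ 1 for (13) can be checked directly by sampling the symbol a(ξ) = 1 - 4λ sin²(ξ/2) over a period (an illustrative sketch):

```python
import math

def max_symbol(lam, samples=2001):
    # Sample |a(xi)| = |1 - 4*lam*sin(xi/2)**2| over one period [0, 2*pi].
    xs = [2.0 * math.pi * i / (samples - 1) for i in range(samples)]
    return max(abs(1.0 - 4.0 * lam * math.sin(x / 2.0) ** 2) for x in xs)
```

At λ = ½ the symbol just touches -1 at ξ = π, the borderline case; for larger λ its modulus exceeds 1 there and the powers a(ξ)^n blow up.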
We shall put the above considerations in a more general setting and discuss an
initial-value problem in a Banach space B. Thus let A be a linear operator with
domain D(A) and let v ∈ B. Consider then the problem of finding u(t) ∈ B, t ≥ 0,
such that

du/dt = A u(t) ,   t ≥ 0 ,          (16)
u(0) = v .                          (17)

More precisely, we shall say that u(t), t ≥ 0, is a genuine solution of (16), (17)
if (17) holds and

(i) u(t) ∈ D(A) for t ≥ 0 ,
(ii) || ( u(t+k) - u(t) ) / k - A u(t) || → 0 when k → 0,
uniformly for 0 ≤ t ≤ T, for any T > 0.
Let D_0 be a subspace of B such that for v ∈ D_0 the problem (16), (17) has a
unique genuine solution. Then u(t) can be seen to depend linearly on v, so that
u(t) = E_0(t)v defines a linear operator with D(E_0(t)) = D_0. We say that the problem
(16), (17) is correctly posed if D_0 can be chosen to be dense in B so that E_0(t) is
a bounded operator for any t ≥ 0, and for any T > 0 there is a C with

|| E_0(t) v || ≤ C || v || ,   0 ≤ t ≤ T .          (18)

Clearly E_0(t) then has a uniquely defined bounded linear extension E(t) with
D(E(t)) = B such that (18) still holds. We call E(t) the (generalized) solution
operator. Thus for v ∈ B, E(t)v = u(t) is a generalized solution of (16), (17),
and this solution depends continuously on v.
One can show that the solution operator has the semi-group property

E(t+s) = E(t) E(s) ,   s, t ≥ 0 ;

in the terminology of semi-group theory one can define (16), (17) to be correctly
posed if the densely defined operator A generates a strongly continuous semi-group
for t ≥ 0.
We shall now study the approximation of a solution u(t) = E(t)v of a correctly
posed initial-value problem (16), (17). We will then, for small k, k ≤ k_0, consider
an approximation E_k of E(k), where E_k is a bounded linear operator with D(E_k) = B
which depends continuously on k for 0 < k ≤ k_0. The thought is then that E_k^n v
is going to approximate E(nk)v = E(k)^n v.

We say that the operator E_k is consistent with the initial-value problem (16),
(17) if there is a set U of genuine solutions of (16), (17) such that

(i) the set { u(0) : u ∈ U } is dense in B ;
(ii) for u ∈ U, k^{-1} || E_k u(t) - u(t+k) || → 0 when k → 0, uniformly for
0 ≤ t ≤ T, for any T > 0.

If the operator E_k is consistent with (16), (17), we say that it is convergent
(in B) if for any v ∈ B, any t ≥ 0, and any pair of sequences {k_j}, {n_j}
with k_j → 0, n_j k_j → t for j → ∞, we have

|| E_{k_j}^{n_j} v - E(t) v || → 0     when j → ∞ .

We say that the operator E_k is stable (in B) if for any T > 0 there is a con-
stant C such that

|| E_k^n v || ≤ C || v || ,     n k ≤ T ,   k ≤ k_0 .
It turns out that consistency alone does not guarantee convergence; we have the
following theorem, which is referred to as Lax's equivalence theorem [22].

Theorem. Assume that (16), (17) is correctly posed and that E_k is a consistent
approximation operator. Then stability is necessary and sufficient for convergence.

The proof of the sufficiency of stability for convergence is similar to the
proof in the particular case treated above; the proof of the necessity depends on
the Banach-Steinhaus theorem.
2. Initial-value problems in L² with constant coefficients

We begin with some notation. We shall work here with the Banach space
L² = L²(R^d) with the norm

|| v || = ( ∫_{R^d} | v(x) |² dx )^{1/2} .

For a multi-index α = (α_1, ..., α_d) with α_j non-negative integers, setting
D_j = ∂/∂x_j, we then have the following notation for general derivatives of order
|α| = Σ_j α_j, namely

D^α = D_1^{α_1} ... D_d^{α_d} .
We denote by C^∞ the set of infinitely differentiable complex-valued functions in
R^d, and by C_0^∞ the subset of functions with compact support. We also introduce the
set S of u ∈ C^∞ such that for any multi-indices α, β,

sup_x | x^β D^α u(x) | < ∞ .

Clearly C_0^∞ ⊂ S ⊂ C^∞, and it is well known that C_0^∞ and S are dense in L².
For u integrable on R^d we define the Fourier transform

û(ξ) = ∫ e^{-i⟨ξ,x⟩} u(x) dx .

We recall that if u ∈ S then û ∈ S. Further, for u ∈ S we have Fourier's
inversion formula

u(x) = (2π)^{-d} ∫ e^{i⟨ξ,x⟩} û(ξ) dξ

and Parseval's relation

|| û || = (2π)^{d/2} || u || ,

and, as a consequence of the latter, the set Ĉ_0^∞ of functions in S with Fourier
transforms in C_0^∞ is dense in L².
In the sequel we shall consider N-vector valued functions u(x) = (u₁(x),…,u_N(x)). It is clearly natural to define u(x) ∈ L², S, C₀^∞, etc. by demanding that this holds for each component u_j, j = 1,…,N. Single bars will denote norms with respect to N-vectors, e.g.

    |v|² = Σ_{j=1}^N |v_j|²,

and for N×N matrices,

    |A| = sup_{v≠0} |Av| / |v|,

and double bars will indicate norms with respect to L², so that for the N-vector u(x) ∈ L²,

    ‖u‖² = ∫_{Rᵈ} |u(x)|² dx.
For later use we need the following

Lemma 1 Let D be a dense subset of L² and let A(ξ) be a continuous N×N matrix. Then

    sup { ‖A(·)v̂(·)‖ / ‖v‖ : v ∈ D, v ≠ 0 } = sup_ξ |A(ξ)|.
Let u(x,t) be an N-vector-function defined for x ∈ Rᵈ and t ≥ 0. Consider the initial-value problem

    ∂u/∂t = P(D)u = Σ_{|α|≤M} P_α D^α u,   t > 0,        (1)

    u(x,0) = v(x),        (2)

where the P_α are constant N×N matrices and where we can consider Pu to be defined for u ∈ S. Let

    P(iξ) = Σ_{|α|≤M} P_α (iξ)^α.

We have:
Theorem 1 The initial-value problem (1), (2) is correctly posed in L² if and only if, for any T ≥ 0, there is a C such that

    |e^{tP(iξ)}| ≤ C,   ξ ∈ Rᵈ, 0 ≤ t ≤ T.        (3)

Proof
Assume that (3) holds. Let v ∈ S and consider

    u(x,t) = (2π)^{−d/2} ∫_{Rᵈ} e^{tP(iξ)} v̂(ξ) e^{i⟨x,ξ⟩} dξ.        (4)

By differentiation under the integral sign we find that u(x,t) satisfies (1), and so is a solution to (1), (2). Since u(x,t) ∈ S for t ≥ 0, it is a genuine solution in the sense of Lecture 1 and is also unique. Thus E₀(t)v = u(x,t), with D = S. By Fourier's inversion formula and Parseval's theorem

    ‖E₀(t)v‖ = ‖e^{tP(i·)} v̂‖ ≤ C ‖v‖.

Since S is dense in L² it follows that the initial-value problem is correctly posed.
We now want to prove the necessity of (3) for correctness. Let now v ∈ Ĉ₀^∞ and define u(x,t) by (4). We find at once that u(x,t) satisfies the initial-value problem (1), (2), and so u(x,t) = E(t)v. Again, by Fourier's inversion formula and Parseval's theorem,

    ‖E(t)v‖ = ‖e^{tP(i·)} v̂‖ ≤ C ‖v‖,

so that by Lemma 1,

    sup_ξ |e^{tP(iξ)}| ≤ C,

which proves the necessity of (3), since Ĉ₀^∞ is dense in L².
Ex. 1 Consider the symmetric hyperbolic system

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j,   A_j hermitian.        (5)

Then the initial-value problem for (5) is correctly posed in L², for

    e^{tP(iξ)} = exp( it Σ_{j=1}^d ξ_j A_j ),

since this is a unitary matrix.
Before proceeding to the next example we state a lemma. For an arbitrary N×N matrix A with eigenvalues λ_j, j = 1,…,N, we introduce

    Λ(A) = max_j Re λ_j.

We then have

Lemma 2 If A is an N×N matrix we have for t ≥ 0

    |e^{tA}| ≤ e^{tΛ(A)} Σ_{j=0}^{N−1} (2t|A|)^j / j!.        (6)

Proof See [9].
Ex. 2 Consider the system (1) and consider also the principal part P₀ of P, which corresponds to the polynomial

    P₀(iξ) = Σ_{|α|=M} P_α (iξ)^α.

We say that the system (1) is parabolic in Petrovskii's sense if there is a δ > 0 such that

    Λ(P₀(iξ)) ≤ −δ|ξ|^M,   ξ ∈ Rᵈ.

By homogeneity this is equivalent to the existence of a δ > 0 and a C such that

    Λ(P(iξ)) ≤ −δ|ξ|^M + C,   ξ ∈ Rᵈ.

We then have that if (1) is parabolic in Petrovskii's sense, the corresponding initial-value problem is correctly posed in L². For by Lemma 2 we have, for 0 ≤ t ≤ T,

    |e^{tP(iξ)}| ≤ C e^{Ct − δt|ξ|^M} ( 1 + (t|ξ|^M)^{N−1} ),

which is clearly bounded. In particular, the heat equation

    ∂u/∂t = Δu = Σ_{j=1}^d ∂²u/∂x_j²

clearly falls into this category.
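The Petrovskii condition for the heat equation can be checked mechanically. The following is a minimal numerical illustration (not from the lectures; the grid of test frequencies is an assumption): here N = 1, P(iξ) = −|ξ|², so Λ(P(iξ)) = −|ξ|² and the symbol e^{tP(iξ)} of the solution operator never exceeds 1.

```python
# Check (illustration) that u_t = Δu is parabolic in Petrovskii's sense
# with M = 2, delta = 1, and that |e^{tP(i xi)}| stays bounded.
import numpy as np

def P_heat(xi):
    # symbol of the Laplacian in d dimensions: P(i xi) = -(xi_1^2 + ... + xi_d^2)
    return -np.sum(np.asarray(xi, dtype=float) ** 2)

xis = [np.array([x1, x2]) for x1 in np.linspace(-10, 10, 21)
                          for x2 in np.linspace(-10, 10, 21)]
delta = 1.0
petrovskii = all(P_heat(xi) <= -delta * np.dot(xi, xi) for xi in xis)
bounded = all(abs(np.exp(t * P_heat(xi))) <= 1.0 + 1e-15
              for xi in xis for t in (0.1, 1.0, 10.0))
print(petrovskii, bounded)   # True True
```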
Solutions of parabolic systems are smooth for t > 0; we have

Theorem 2 Assume that (1) is parabolic in Petrovskii's sense. Then for t > 0, D^α E(t)v ∈ L² for any α, and for any T > 0 and any α there is a C such that

    ‖D^α E(t)v‖ ≤ C t^{−|α|/M} ‖v‖,   0 < t ≤ T.

Proof Via the Fourier transform and Parseval's relation this reduces to

    |ξ^α e^{tP(iξ)}| ≤ C t^{−|α|/M},   0 < t ≤ T.

But this follows at once by (6).
Ex. 3 Consider the Schrödinger equation (N = 1)

    ∂u/∂t = iΔu.

The initial-value problem for this equation is also correctly posed in L², for

    |e^{tP(iξ)}| = |e^{−it|ξ|²}| = 1.
Ex. 4 The Cauchy-Riemann equations can be written (d = 1, N = 2)

    ∂u/∂t = A ∂u/∂x,   A = [0 −1; 1 0].

For these we shall prove the negative result that the corresponding initial-value problem is not correctly posed in L². For here

    P(iξ) = iξ [0 −1; 1 0],

whose eigenvalues are ±ξ, and a simple calculation yields

    |e^{tP(iξ)}| = e^{t|ξ|},

which is not bounded for any t > 0 when ξ → ∞.
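The exponential growth of the symbol can be seen numerically. The sketch below is an illustration (the sign convention for A is the one assumed above): it computes e^{tP(iξ)} by eigendecomposition and shows that its spectral norm grows like e^{t|ξ|}.

```python
# Illustration: the Cauchy-Riemann symbol exp(t P(i xi)) has spectral norm
# exp(t |xi|), unbounded in xi for every fixed t > 0.
import numpy as np

def expm(M):
    # matrix exponential of a diagonalizable matrix via eigendecomposition
    w, V = np.linalg.eig(M)
    return V @ np.diag(np.exp(w)) @ np.linalg.inv(V)

A = np.array([[0.0, -1.0], [1.0, 0.0]])
t = 0.5
for xi in (1.0, 5.0, 20.0):
    E = expm(t * 1j * xi * A)
    norm = np.linalg.norm(E, 2)          # spectral norm |e^{tP(i xi)}|
    print(xi, norm, np.exp(t * xi))      # norm grows like e^{t |xi|}
```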
Ex. 5 Although our theory only deals with systems which are first-order with respect to t, it is actually possible to consider also higher-order systems by reducing them to first-order systems. We shall only exemplify this in one particular case. Consider the initial-value problem (d = 1)

    ∂²w/∂t² = ∂²w/∂x²,   t > 0.        (7)

Introducing

    u = (u₁, u₂),   u₁ = ∂w/∂x,   u₂ = ∂w/∂t,        (8)

we have for u the initial-value problem

    ∂u/∂t = [0 1; 1 0] ∂u/∂x,   t > 0,        (9)

    u(x,0) = v(x). Here

    e^{tP(iξ)} = [cos tξ   i sin tξ; i sin tξ   cos tξ],

which is unitary, so that the initial-value problem (9) obtained by the transformation (8) from (7) is correctly posed in L².
In order that an initial-value problem of the type (1), (2) be correctly posed in L², it is necessary that it be correctly posed in the sense of Petrovskii; more precisely:

Theorem 3 If (1), (2) is correctly posed in L² then there is a constant C such that

    Λ(P(iξ)) ≤ C,   ξ ∈ Rᵈ.        (10)

Proof Follows at once by

    e^{tΛ(P(iξ))} ≤ |e^{tP(iξ)}| ≤ C,   0 ≤ t ≤ 1.

We shall see at once by the following example that (10) is not sufficient for correctness in L².
Ex. 6 Take the initial-value problem corresponding to (d = 1)

    ∂u/∂t = [1 1; 0 1] ∂u/∂x,   i.e.   P(iξ) = iξ [1 1; 0 1].

We then get Λ(P(iξ)) = 0, so that (10) holds. However, a simple calculation yields

    e^{tP(iξ)} = e^{itξ} [1  itξ; 0  1],

which is easily seen to be unbounded for 0 < t ≤ 1 when ξ → ∞.
Necessary and sufficient conditions for correctness have been given by Kreiss [19]. The main contents of Kreiss' result are concentrated in the following lemma. Here, for an N×N matrix A we denote by Re A the matrix

    Re A = ½(A + A*).

Also recall that for hermitian matrices A and B, A ≤ B means

    (Av, v) ≤ (Bv, v)

for all N-vectors v. We denote the resolvent of A by R(A; z),

    R(A; z) = (zI − A)⁻¹.

It will be implicitly assumed, when we write down R(A; z), that z is not an eigenvalue of A.
Lemma 3 Let F be a family of N×N matrices. Then the following four conditions are equivalent.

(i) There is a constant C such that for A ∈ F and t ≥ 0,

    |e^{tA}| ≤ C.

(ii) There is a constant C such that for A ∈ F and Re z > 0, R(A; z) exists and

    |R(A; z)| ≤ C / Re z.

(iii) For A ∈ F, Λ(A) ≤ 0, and there are two constants C₁ and C₂ and for each A a matrix S = S(A) such that

    max( |S|, |S⁻¹| ) ≤ C₁,

and such that

    B = SAS⁻¹ = (b_{jl})

is an upper triangular matrix with

    |b_{jl}| ≤ C₂ min( |Re b_{jj}|, |Re b_{ll}| ),   j < l.

(iv) There is a constant C > 0 such that for each A ∈ F there is a hermitian matrix H = H(A) with

    C⁻¹I ≤ H ≤ CI   and   Re(HA) ≤ 0.

Proof See [19].
To be able to apply this lemma to our problem we need:

Lemma 4 Assume that (1), (2) is correctly posed in L². Then there exist constants γ and C such that for t ≥ 0, ξ ∈ Rᵈ,

    |e^{t(P(iξ) − γI)}| ≤ C.

Proof Let C be such that |e^{tP(iξ)}| ≤ C for 0 ≤ t ≤ 1, and set γ = log C. For arbitrary t > 0 let [t] be its integral part. We have

    |e^{tP(iξ)}| ≤ |e^{(t−[t])P(iξ)}| · |e^{P(iξ)}|^{[t]} ≤ C^{[t]+1} ≤ C e^{γt},

which proves the lemma.
Combining Lemmas 3 and 4 we have at once:

Theorem 4 If (1), (2) is correctly posed in L² then there is a constant γ such that the family

    F = { P(iξ) − γI : ξ ∈ Rᵈ }        (11)

satisfies the conditions of Lemma 3. On the other hand, if there is a constant γ such that F satisfies at least one of the conditions of Lemma 3, then (1), (2) is correctly posed in L².
One commonly used criterion is:

Theorem 5 Let P(iξ) be a normal matrix for each ξ. Then (1), (2) is correctly posed if and only if (10) holds.

Proof By Theorem 3 we only have to prove the sufficiency. Since P(iξ) is normal we can find a unitary U(ξ) such that

    U(ξ) P(iξ) U(ξ)*

is diagonal. Hence

    |e^{tP(iξ)}| = e^{tΛ(P(iξ))} ≤ e^{tC},

which proves the result.
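The normal-matrix criterion is easy to check in a concrete case. The sketch below is an assumed example, not from the lectures: for the symmetric hyperbolic symbol P(iξ) = iξA with A hermitian, P(iξ) is normal, Λ(P(iξ)) = 0, and |e^{tP(iξ)}| = 1 for all t and ξ.

```python
# Illustration of Theorem 5 for a normal symbol (assumed example).
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])   # hermitian

def spectral_norm_exp(M):
    # |e^M| for a diagonalizable matrix M, via eigendecomposition
    w, V = np.linalg.eig(M)
    return np.linalg.norm(V @ np.diag(np.exp(w)) @ np.linalg.inv(V), 2)

for xi in (-7.0, 0.5, 13.0):
    P = 1j * xi * A
    assert np.allclose(P @ P.conj().T, P.conj().T @ P)   # P(i xi) is normal
    Lam = max(np.linalg.eigvals(P).real)                  # Lambda(P(i xi))
    nrm = spectral_norm_exp(3.0 * P)                      # |e^{3 P(i xi)}|
    print(xi, Lam, nrm)   # Lambda = 0 and the norm equals 1
```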
For later use we state:

Theorem 6 If (1), (2) is correctly posed in L² then (10) holds and there are positive constants C₁ and C₂ and for each ξ ∈ Rᵈ a positive definite hermitian matrix H(ξ) such that

    C₁⁻¹ I ≤ H(ξ) ≤ C₁ I        (12)

and

    Re( H(ξ) P(iξ) ) ≤ C₂ I.        (13)

Proof By Theorem 4 there is a constant γ such that the family F in (11) satisfies condition (iv) of Lemma 3 with C = C₁. Thus for each ξ ∈ Rᵈ there is a positive definite H(ξ) satisfying (12) such that

    Re( H(ξ)(P(iξ) − γI) ) ≤ 0,   i.e.   Re( H(ξ) P(iξ) ) ≤ γ H(ξ).

But by (12) this implies (13).
3. Difference approximations in L² to initial-value problems with constant coefficients

Consider again the initial-value problem

    ∂u/∂t = P(D)u = Σ_{|α|≤M} P_α D^α u,   t > 0,        (1)

    u(x,0) = v(x).        (2)
For the approximate solution of (1), (2) we consider explicit difference operators of the form

    E_k v(x) = Σ_β a_β(h) v(x − βh),

where h is a small positive parameter, β = (β₁,…,β_d) with the β_j integers, the a_β(h) are N×N matrices which are polynomials in h, and the summation is over a finite set of β. We introduce the symbol of the operator E_k,

    Ê_k(ξ) = Σ_β a_β(h) e^{−i⟨βh,ξ⟩},

which is periodic with period 2π/h in each ξ_j, and notice that for v ∈ L² the Fourier transform of E_k v is

    (E_k v)^(ξ) = Ê_k(ξ) v̂(ξ).
Assume that the initial-value problem (1), (2) is correctly posed. We then want to choose E_k so that it approximates the solution operator E(k) when k is a positive parameter tied to h by the relation

    k / h^M = λ = constant;

we actually want to approximate u(x,nk) = E(nk)v = E(k)^n v by E_k^n v. In the future we shall emphasise the dependence on k rather than h and write E_k as in Lecture 1.

To accomplish this, we shall assume that E_k satisfies the condition in the following definition. We say that E_k is consistent with (1) if for any sufficiently smooth solution u of (1),

    E_k u(·, t) = u(·, t+k) + o(k)   in L², as k → 0.

If o(k) can be replaced by O(k h^μ), we say that E_k is accurate of order μ. Clearly any consistent scheme is accurate of order at least 1.
We can express consistency and accuracy in terms of the symbol (cf. [35]):

Lemma 1 The operator E_k is consistent with (1) if and only if

    Ê_k(h⁻¹ξ) = e^{kP(ih⁻¹ξ)} + o(h^M),        (3)

uniformly for ξ in a compact set. The operator E_k is accurate of order μ if and only if

    Ê_k(h⁻¹ξ) = e^{kP(ih⁻¹ξ)} + O(h^{M+μ}),

uniformly for ξ in a compact set.

The proof of (3), say, consists in proving, as in the special case in Lecture 1, that consistency is equivalent to a number of algebraic conditions for the coefficients, which turn out to be equivalent to the analytic functions exp(kP(ih⁻¹ξ)) and Ê_k(h⁻¹ξ) having the same coefficients for the powers h^j up to a certain order.
Using Lemma 1 it is easy to deduce that if E_k is consistent with (1) in the present sense then we also have consistency in the sense of Lecture 1. For the set D of genuine solutions in the previous definition we can for instance take the ones corresponding to v ∈ S. From Lax's equivalence theorem it is clear that we want to discuss the stability of operators E_k of the form described. We have

Theorem 1 The operator E_k is stable if and only if for any T > 0 there is a C such that

    |Ê_k(ξ)^n| ≤ C,   ξ ∈ Rᵈ, 0 ≤ nk ≤ T.

Proof We notice that Ê_k(ξ)^n is the symbol of E_k^n. It follows in the same way as in Lecture 2 that

    ‖E_k^n‖ = sup_ξ |Ê_k(ξ)^n|,

which proves the theorem.

We now turn to the algebraic characterization of stability. We first prove the necessity of the von Neumann condition. For any N×N matrix A we denote by ρ(A) its spectral radius, the maximum of the moduli of the eigenvalues of A.

Theorem 2 If E_k is stable in L², there exists a constant γ such that

    ρ(Ê_k(ξ)) ≤ 1 + γk,   ξ ∈ Rᵈ, 0 < k ≤ 1.        (4)

Proof We have for nk ≤ 1,

    ρ(Ê_k(ξ))^n = ρ(Ê_k(ξ)^n) ≤ |Ê_k(ξ)^n| ≤ C,

and so

    ρ(Ê_k(ξ)) ≤ C^{1/n} ≤ 1 + γk.

The condition (4) is referred to as the von Neumann condition. It is easy to prove by counter-examples that (4) is not sufficient for stability. Necessary and sufficient conditions for stability have been given by Kreiss [18] and Buchanan [5]; we quote here Kreiss' result. The main content of Kreiss' theorem is concentrated in the following lemma. Here we have introduced the following notation: for H hermitian and positive definite, we set

    |u|_H² = (Hu, u).

Recall again that for hermitian matrices, A ≤ B means (Au, u) ≤ (Bu, u).
Lemma 2 Let F be a family of N×N matrices. Then the following four conditions are equivalent.

(i) There is a constant C such that for A ∈ F and n = 1, 2, …,

    |A^n| ≤ C.

(ii) There is a constant C such that for A ∈ F and |z| > 1, R(A; z) exists and

    |R(A; z)| ≤ C / (|z| − 1).

(iii) For A ∈ F, ρ(A) ≤ 1, and there are two constants C₁ and C₂ and for each A ∈ F a matrix S = S(A) with

    max( |S|, |S⁻¹| ) ≤ C₁,

and such that

    B = SAS⁻¹ = (b_{jl})

is an upper triangular matrix with

    |b_{jl}| ≤ C₂ min( 1 − |b_{jj}|, 1 − |b_{ll}| ),   j < l.

(iv) There is a constant C > 0 such that for each A ∈ F there is a hermitian matrix H = H(A) with

    C⁻¹I ≤ H ≤ CI

and

    A* H A ≤ H.

Proof See [28].
To be able to apply this lemma to our problem we need the following analogue of Lemma 2.4.

Lemma 3 Assume that E_k is stable in L². Then there exists a constant γ such that for F_k(ξ) = e^{−γk} Ê_k(ξ) one has

    |F_k(ξ)^n| ≤ C,   ξ ∈ Rᵈ, 0 < k ≤ 1, n = 1, 2, ….

An alternative way of expressing this result is that for some γ, any k ≤ 1, and any n we have

    |Ê_k(ξ)^n| ≤ C e^{γnk}.
Combining Lemmas 2 and 3 we have at once:

Theorem 3 If the operator E_k is stable in L², then there is a γ such that the family

    F = { e^{−γk} Ê_k(ξ) : ξ ∈ Rᵈ, 0 < k ≤ 1 }

satisfies the conditions of Lemma 2. On the other hand, if there is a constant γ such that F satisfies at least one of the conditions of Lemma 2, then E_k is stable in L².
One commonly used criterion is:

Theorem 4 Let E_k be such that Ê_k(ξ) is a normal matrix. Then von Neumann's condition is necessary and sufficient for stability.

Proof By Theorem 2 we only have to prove the sufficiency. Since Ê_k(ξ) is normal there is for each k ≤ 1 and ξ ∈ Rᵈ a unitary matrix U_k(ξ) such that

    U_k(ξ) Ê_k(ξ) U_k(ξ)*

is diagonal. Hence

    |Ê_k(ξ)^n| = ρ(Ê_k(ξ))^n ≤ (1 + γk)^n ≤ e^{γnk},

which proves the result. To see the relation with Lemmas 2 and 3, we could also have formulated this as follows. We have, with the same γ as in (4), for F_k(ξ) = e^{−γk} Ê_k(ξ), that

    U_k(ξ) F_k(ξ) U_k(ξ)*

is diagonal with eigenvalues of modulus ≤ 1. Thus, a fortiori, it is triangular, and the estimates in condition (iii) of Lemma 2 hold.
As for the existence of stable operators, we have (cf. [17]):

Theorem 5 There exist L²-stable operators consistent with (1), (2) if and only if (1), (2) is correctly posed in L².

Proof We first prove that the correctness is necessary. It follows by Lemma 1 and the stability that, for fixed ξ and nk → t,

    |e^{tP(iξ)}| = lim_{n→∞} |Ê_k(ξ)^n| ≤ C e^{γt},

which implies correctness.

On the other hand, if (1), (2) is correctly posed one can construct a consistent difference operator, or, which is equivalent, its symbol, by setting the symbol equal to a truncated Taylor expansion of e^{kP(iξ)} together with an added dissipative term (5). Using Kreiss' stability theorems one can prove that this E_k is stable for small λ = k/h^M. The part of this operator corresponding to the second term in (5) is referred to as an artificial viscosity.
We shall consider some examples. Consider the initial-value problem for a symmetric hyperbolic system

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j,   A_j hermitian.        (6)

We know from Lecture 2 that this problem is correctly posed in L². Consider as before a difference operator

    E_k v(x) = Σ_β a_β v(x − βh),        (7)

where for simplicity we assume the a_β independent of h. We have the following result by Friedrichs [8].

Theorem 6 If the a_β are hermitian, positive semi-definite and Σ_β a_β = I, then

    |Ê_k(ξ)| ≤ 1,

and thus E_k is stable.

Proof We have the generalized Cauchy-Schwarz inequality: for hermitian positive semi-definite a_β,

    | Σ_β (a_β u_β, w) | ≤ ( Σ_β (a_β u_β, u_β) )^{1/2} ( Σ_β (a_β w, w) )^{1/2},

where (u, v) = Σ_j u_j v̄_j. Therefore, with u_β = e^{−i⟨βh,ξ⟩} v and w arbitrary,

    |(Ê_k(ξ)v, w)| = | Σ_β (a_β u_β, w) | ≤ ( Σ_β (a_β v, v) )^{1/2} ( Σ_β (a_β w, w) )^{1/2} = |v| |w|,

since (a_β u_β, u_β) = (a_β v, v) and Σ_β a_β = I. Hence |Ê_k(ξ)v| ≤ |v|, which proves the theorem.
As an application, take the Friedrichs scheme

    E_k v(x) = Σ_{j=1}^d [ (1/(2d))I + (λ/2)A_j ] v(x + he_j) + Σ_{j=1}^d [ (1/(2d))I − (λ/2)A_j ] v(x − he_j),

with k = λh. This operator is consistent with (6) and accurate of order 1. It is clear that if

    0 < λ ≤ ( d max_j |A_j| )⁻¹,

the coefficients are hermitian, positive semi-definite, and sum to I, and so by Theorem 6 the operator E_k is stable.
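Friedrichs' condition is easy to verify numerically. The following sketch is an illustrative check with an assumed one-dimensional example (d = 1, a particular hermitian A and λ): the coefficients I/2 ± (λ/2)A are positive semi-definite when λ|A| ≤ 1, and the symbol then has norm at most 1.

```python
# Illustration of Theorem 6 for the 1-d Friedrichs scheme (assumed example).
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])       # hermitian, |A| = 1
lam = 0.9                                     # lam * |A| <= 1
I = np.eye(2)
a_plus  = 0.5 * I + 0.5 * lam * A
a_minus = 0.5 * I - 0.5 * lam * A

# both coefficients are hermitian positive semi-definite and sum to I
assert min(np.linalg.eigvalsh(a_plus))  >= -1e-12
assert min(np.linalg.eigvalsh(a_minus)) >= -1e-12

norms = []
for xi in np.linspace(-np.pi, np.pi, 181):
    E = a_plus * np.exp(1j * xi) + a_minus * np.exp(-1j * xi)
    norms.append(np.linalg.norm(E, 2))
print(max(norms))   # at most 1, so the scheme is stable in L2
```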
The operator E_k can be considered as obtained from replacing (6) by

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j + (h/(2λd)) Σ_{j=1}^d ∂²u/∂x_j²

and discretizing by forward and central differences. Consider for a moment the perhaps more natural approximation in which the added term is dropped, which gives the consistent operator

    E_k v(x) = v(x) + (λ/2) Σ_{j=1}^d A_j [ v(x + he_j) − v(x − he_j) ],        (8)

with symbol

    Ê_k(ξ) = I + iλ Σ_{j=1}^d A_j sin(hξ_j).
We shall prove that this operator is not stable in L² if any of the A_j is non-zero. Assume e.g. A₁ ≠ 0, and set ξ_j = 0 for j ≠ 1, ξ₁h = π/2. With this choice,

    Ê_k(ξ) = I + iλA₁,

which has the eigenvalues

    1 + iλμ_l,

where the real numbers μ_l are the eigenvalues of A₁, so that

    ρ(Ê_k(ξ)) = max_l (1 + λ²μ_l²)^{1/2} > 1,

independently of h. Thus the von Neumann condition is not satisfied and the operator is unstable for any λ.
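The violation of the von Neumann condition is immediate to check. The sketch below is an illustration with an assumed hermitian A₁ and λ: at ξ₁h = π/2 the spectral radius of the symbol is √(1 + λ²μ²) > 1, independent of h.

```python
# Illustration: the centered operator (8) violates the von Neumann condition.
import numpy as np

A1 = np.array([[1.0, 0.0], [0.0, -2.0]])     # hermitian, non-zero (assumed)
lam = 0.5
E_hat = np.eye(2) + 1j * lam * A1 * np.sin(np.pi / 2)
rho = max(abs(np.linalg.eigvals(E_hat)))      # spectral radius
print(rho)   # sqrt(1 + lam^2 * 4) = sqrt(2) > 1, independent of h
```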
It can be shown that in general the operator E_k defined in (8) is accurate of order exactly 1. We shall now look at an operator which is accurate of order 2 in the case of one space dimension (d = 1). We thus have the system

    ∂u/∂t = A ∂u/∂x,   A hermitian.        (9)

Consider the difference operator

    E_k v(x) = v(x) + (λ/2) A [ v(x+h) − v(x−h) ] + (λ²/2) A² [ v(x+h) − 2v(x) + v(x−h) ],        (10)

with symbol

    Ê_k(ξ) = I + iλA sin(hξ) − λ²A²(1 − cos(hξ)).

This operator is often referred to as the Lax-Wendroff operator. We have

    Ê_k(h⁻¹ξ) = I + iλAξ − ½λ²A²ξ² + O(|ξ|³) = e^{iλAξ} + O(|ξ|³),

and so E_k is consistent with (9), and in general accurate of order 2. We shall prove:

Theorem 7 Let μ_j, j = 1,…,N, be the eigenvalues of A. Then the operator E_k in (10) is stable in L² if and only if

    λ max_j |μ_j| ≤ 1.        (11)

Proof It is easy to see that the eigenvalues of Ê_k(h⁻¹ξ) are

    1 + iλμ_j sin ξ − λ²μ_j²(1 − cos ξ),

and we obtain after a simple calculation

    |1 + iλμ_j sin ξ − λ²μ_j²(1 − cos ξ)|² = 1 − λ²μ_j²(1 − λ²μ_j²)(1 − cos ξ)² ≤ 1

for all ξ if and only if (11) holds. Since Ê_k is clearly normal, this proves the theorem.
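The identity in the proof can be confirmed numerically. The following is an illustrative check (the sample values of λμ are assumptions): the computed modulus of the amplification factor matches the closed form, is at most 1 when λ|μ| ≤ 1, and exceeds 1 otherwise.

```python
# Check of the Lax-Wendroff amplification-factor identity and the CFL bound.
import numpy as np

xi = np.linspace(-np.pi, np.pi, 400)
ok = True
for lam_mu in (0.3, 0.9, 1.0):
    g = 1 + 1j * lam_mu * np.sin(xi) - lam_mu**2 * (1 - np.cos(xi))
    lhs = np.abs(g) ** 2
    rhs = 1 - lam_mu**2 * (1 - lam_mu**2) * (1 - np.cos(xi)) ** 2
    ok = ok and np.allclose(lhs, rhs)

stable = np.max(np.abs(1 + 1j * 0.9 * np.sin(xi)
                       - 0.81 * (1 - np.cos(xi)))) <= 1 + 1e-12
unstable = np.max(np.abs(1 + 1j * 1.2 * np.sin(xi)
                         - 1.44 * (1 - np.cos(xi)))) > 1
print(ok, stable, unstable)   # True True True
```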
For an N×N matrix A consider the numerical range

    W(A) = { (Av, v) : |v| = 1 }.

We have:

Theorem 8 If F is a family of N×N matrices such that

    W(A) ⊂ { z : |z| ≤ 1 },   A ∈ F,

then F is a stable family, that is, there is a constant C such that

    |A^n| ≤ C,   A ∈ F, n = 1, 2, ….

Proof We shall prove that condition (ii) in Kreiss' theorem is satisfied. Clearly we have ρ(A) ≤ 1, so that R(A; z) exists for |z| > 1. Since for w arbitrary and v = R(A; z)w we have

    (|z| − 1)|v|² ≤ |z||v|² − |(Av, v)| ≤ |(zv − Av, v)| = |(w, v)| ≤ |w||v|,

we obtain

    |R(A; z)w| = |v| ≤ |w| / (|z| − 1).

Therefore, since w is arbitrary,

    |R(A; z)| ≤ 1 / (|z| − 1),

which proves the result.

Remark One can actually prove that |A^n| ≤ 2 for A ∈ F, n = 1, 2, ….
This result can be used to prove the stability of certain generalizations of
the Lax-Wendroff operator to two dimensions (see [2~]).
Consider again the symmetric hyperbolic system (6) and a difference operator of the form (7), consistent with (6). Then A(ξ) = Ê_k(h⁻¹ξ) is independent of h. We say with Kreiss that E_k is dissipative of order σ (σ even) if there is a δ > 0 such that

    ρ(A(ξ)) ≤ 1 − δ|ξ|^σ   for |ξ_j| ≤ π, j = 1,…,d.

We shall prove

Theorem 9 Under the above assumptions, if E_k is accurate of order σ − 1 and dissipative of order σ, it is stable in L².

Proof By the definition of accuracy, we have

    A(ξ) = exp( iλ Σ_{j=1}^d ξ_j A_j ) + O(|ξ|^σ)   as ξ → 0.

Let U = U(ξ) be a unitary matrix which triangulates A(ξ), so that

    B(ξ) = U A(ξ) U*

is upper triangular. Since B(ξ) is upper triangular it follows that the below-diagonal elements in U exp(iλ Σ_j ξ_j A_j) U* are O(|ξ|^σ). Since this matrix is unitary, the same can easily be proved to hold for its above-diagonal terms, and thus the same holds for the above-diagonal terms in B(ξ), so that

    B(ξ) = D(ξ) + O(|ξ|^σ),   D(ξ) diagonal.

The diagonal elements have moduli bounded by ρ(A(ξ)) ≤ 1 − δ|ξ|^σ, so that the off-diagonal elements b_{jl} satisfy

    |b_{jl}| ≤ C|ξ|^σ ≤ (C/δ)(1 − |b_{jj}|),

and the stability follows by condition (iii) in Kreiss' theorem.
Consider now the initial-value problem for a Petrovskii parabolic system

    ∂u/∂t = P(D)u = Σ_{|α|≤M} P_α D^α u,   t > 0,        (12)

so that

    Λ(P(iξ)) ≤ −δ|ξ|^M + C,   δ > 0.

We know from Lecture 2 that this problem is correctly posed in L². Consider a difference operator

    E_k v(x) = Σ_β a_β(h) v(x − βh),   k/h^M = λ.

We say, following John [15] and Widlund [38], that E_k is a parabolic difference operator if there are constants C and δ > 0 such that

    ρ(Ê_k(h⁻¹ξ)) ≤ e^{Ck − δ|ξ|^M},   |ξ_j| ≤ π.        (13)

Notice the close analogy with the concept of a dissipative operator.
Theorem 10 Let E_k be consistent with (12) and parabolic. Then it is stable in L².

We shall base a proof on the following lemma, which we shall also need later for other purposes.

Lemma 4 There exists a constant C_N depending only on N such that for any N×N matrix A with spectral radius ρ we have for n ≥ N,

    |A^n| ≤ C_N n^{N−1} |A|^{N−1} ρ^{n−N+1}.

Proof See [35].
Proof of Theorem 10 By consistency we have, for some ε > 0 and |ξ| ≤ ε,

    |Ê_k(h⁻¹ξ)| ≤ |e^{kP(ih⁻¹ξ)}| + o(k) ≤ e^{Ck − δ'|ξ|^M},

so that |Ê_k(h⁻¹ξ)^n| is bounded for nk ≤ T in this range. For ε ≤ |ξ|, |ξ_j| ≤ π, we therefore have, for n ≥ N, nk ≤ T, by Lemma 4 and (13),

    |Ê_k(h⁻¹ξ)^n| ≤ C_N n^{N−1} |Ê_k(h⁻¹ξ)|^{N−1} e^{(n−N+1)(Ck − δε^M)} ≤ C,

which proves the stability.
Consider forward difference quotients

    ∂_j v(x) = h⁻¹ ( v(x + he_j) − v(x) ),

and, for a general α,

    ∂^α = ∂₁^{α₁} ⋯ ∂_d^{α_d}.

We then easily have the following discrete analogue of Theorem 2.2.

Theorem 11 Assume that E_k is parabolic. Then for any α and T > 0 there is a C such that

    ‖∂^α E_k^n v‖ ≤ C (nk)^{−|α|/M} ‖v‖,   0 < nk ≤ T.

Proof By Fourier transformation this reduces to proving

    Π_j | h⁻¹(e^{ihξ_j} − 1) |^{α_j} · |Ê_k(ξ)^n| ≤ C (nk)^{−|α|/M},

and the result therefore easily follows by (13).
We know by Lax's equivalence theorem that the stability of the parabolic difference operators considered above implies convergence. We shall now see that the difference quotients also converge to the corresponding derivatives, which we know to exist for t > 0 since the systems are parabolic.

Theorem 12 Assume that (12) is parabolic and that E_k is consistent with (12) and parabolic. Then for any t > 0, any α, and any v ∈ L² we have, for nk = t,

    ‖∂^α E_k^n v − D^α E(t)v‖ → 0   as k → 0.        (14)

Proof By Theorems 2.2 and 11 one finds that it is sufficient to prove (14) for v in the dense subset Ĉ₀^∞. But then, by Parseval's relation,

    ‖∂^α E_k^n v − D^α E(t)v‖ = ‖ ( Π_j (h⁻¹(e^{ihξ_j} − 1))^{α_j} Ê_k(ξ)^n − (iξ)^α e^{tP(iξ)} ) v̂ ‖.

The result therefore follows by the following lemma, which is a simple consequence of Lemma 1.

Lemma 5 If E_k is consistent with (12) then

    Ê_k(ξ)^n → e^{tP(iξ)}   as k → 0, nk = t,

uniformly for ξ in a compact set.
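Lemma 5 can be watched happening for the standard explicit scheme for the heat equation. The sketch below is an assumed example, not from the lectures: the scalar symbol 1 − 4λ sin²(hξ/2) (with k = λh²) raised to the power n = t/k approaches e^{−tξ²} on a compact ξ-interval as the mesh is refined.

```python
# Illustration of Lemma 5 for u_t = u_xx (assumed example): symbol powers
# of the explicit scheme converge to exp(-t xi^2) uniformly on compacts.
import numpy as np

lam, t = 0.4, 1.0
xi = np.linspace(-3.0, 3.0, 61)
errs = []
for m in (10, 40, 160):
    h = 1.0 / m
    k = lam * h * h
    n = int(round(t / k))
    symbol = 1 - 4 * lam * np.sin(h * xi / 2) ** 2
    errs.append(np.max(np.abs(symbol ** n - np.exp(-(n * k) * xi**2))))
print(errs)   # decreasing toward 0 as h -> 0
```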
4. Estimates in the maximum-norm

Consider the initial-value problem for a symmetric hyperbolic system with constant coefficients,

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j,   A_j hermitian,        (1)

    u(x,0) = v(x).        (2)

As we recall from Lecture 2, this problem is correctly posed in L². However, this is not necessarily the case in other natural Banach spaces.

In this lecture we shall consider the Banach space C of bounded, uniformly continuous functions in Rᵈ with norm

    ‖v‖_C = sup_x |v(x)|.

In C one has the somewhat surprising result by Brenner [2]:

Theorem 1 The initial-value problem (1), (2) is correctly posed in C if and only if the matrices A_j commute,

    A_j A_l = A_l A_j,   j, l = 1,…,d.        (3)
Let us comment that it is well known that the condition (3) is equivalent to the simultaneous diagonalizability of the A_j; that is, (3) is satisfied if and only if there exists a unitary matrix U such that

    U A_j U*

is a real diagonal matrix for all j = 1,…,d. This means that if we introduce ũ = Uu as a new variable in (1) we can write (1) in the form

    ∂ũ/∂t = Σ_{j=1}^d Λ_j ∂ũ/∂x_j,   Λ_j = U A_j U* diagonal.        (4)

But this is a system of N uncoupled first-order differential equations. Thus, only in the case that (1) can be transformed into a system of uncoupled equations is (1), (2) correctly posed in C.
It can be shown that in the case of non-correctness, that is, when (3) is not satisfied, there are no consistent difference operators which are stable in the maximum-norm.
We shall now consider a very special case of a system of the form (4), namely one single equation with d = 1,

    ∂u/∂t = a ∂u/∂x,   a real.        (5)

We then want to discuss the stability in the maximum-norm of consistent explicit operators of the form

    E_k v(x) = Σ_j a_j v(x − jh),   k = λh,

where the a_j are constants and only a finite number of terms occur. Introducing the characteristic polynomial

    a(ξ) = Σ_j a_j e^{−ijξ},

we have stability in L² if and only if |a(ξ)| ≤ 1 for real ξ.
We have

Lemma 1 The norm of the operator E_k in C is

    ‖E_k‖_C = Σ_j |a_j|.

Proof We clearly have

    |E_k v(x)| ≤ Σ_j |a_j| ‖v‖_C,

so that

    ‖E_k‖_C ≤ Σ_j |a_j|.

On the other hand, let v(x) ∈ C be a function with |v(x)| ≤ 1 such that

    v(−jh) = ā_j / |a_j|   whenever a_j ≠ 0.

Then

    E_k v(0) = Σ_j a_j v(−jh) = Σ_j |a_j|,

so that

    ‖E_k‖_C ≥ Σ_j |a_j|.

This proves the lemma.
We have earlier observed that E_k^n has the symbol a(hξ)^n, that is, the characteristic polynomial a(ξ)^n. If

    a(ξ)^n = Σ_j a_{nj} e^{−ijξ},        (6)

we therefore have

    E_k^n v(x) = Σ_j a_{nj} v(x − jh).

It follows from Lemma 1 above that

    ‖E_k^n‖_C = Σ_j |a_{nj}|,

and the discussion of the stability will depend on estimates for the a_{nj}.

We now state the main result for this problem.
Theorem 2 The operator E_k is stable in the maximum-norm if and only if one of the following two conditions is satisfied.

(α) |a(ξ)| ≡ 1, in which case a(ξ) = c e^{−imξ} with |c| = 1 and m an integer, so that E_k is a multiple of a translation operator.

(β) |a(ξ)| < 1 except for at most a finite number of points ξ_q, q = 1,…,Q, in |ξ| ≤ π, where |a(ξ_q)| = 1. For q = 1,…,Q there are constants α_q, β_q, ν_q, where α_q is real, Re β_q > 0, and ν_q is an even natural number, such that

    a(ξ_q + ξ) = a(ξ_q) exp( iα_q ξ − β_q ξ^{ν_q} + o(ξ^{ν_q}) )   as ξ → 0.        (7)

We shall sketch a proof of the theorem in the case that E_k satisfies the additional assumption

    a(0) = 1,   |a(ξ)| < 1 for 0 < |ξ| ≤ π.        (8)
We have

Lemma 2 Assume that a(ξ) is a trigonometric polynomial such that (8) is satisfied and such that

    a(ξ) = exp( iαξ − βξ^ν + o(ξ^ν) )   as ξ → 0,        (9)

where α is real, Re β > 0, and ν is even. Then, if a_{nj} is defined by (6), there is a positive constant C independent of n and j such that

    |a_{nj}| ≤ C n^{−1/ν} ( 1 + n^{−1/ν} |j + nα| )^{−2}.

Proof By (8) and (9) there is a c > 0 such that

    |a(ξ)| ≤ e^{−cξ^ν},   |ξ| ≤ π.

We therefore get

    |a_{nj}| = | (2π)⁻¹ ∫_{−π}^{π} a(ξ)^n e^{ijξ} dξ | ≤ C ∫_0^π e^{−cnξ^ν} dξ ≤ C n^{−1/ν},

which proves the first half of the lemma. To prove the second half, we write

    a(ξ)^n e^{ijξ} = g(ξ)^n e^{i(j+nα)ξ},   g(ξ) = a(ξ) e^{−iαξ},

so that by (9), g(ξ) = exp(−βξ^ν + o(ξ^ν)) as ξ → 0. After two integrations by parts, using the periodicity of a(ξ) and the fact that the boundary terms are exponentially small by (8), we get

    |a_{nj}| ≤ C (j + nα)^{−2} ∫_{−π}^{π} | (g^n)''(ξ) | dξ + O(e^{−cn}).

According to (9) we have

    |g'(ξ)| ≤ C|ξ|^{ν−1},   |g''(ξ)| ≤ C|ξ|^{ν−2},

and it follows that

    | (g^n)'' | ≤ C ( n|ξ|^{ν−2} + n²|ξ|^{2ν−2} ) e^{−cnξ^ν}.

We thus get

    |a_{nj}| ≤ C (j + nα)^{−2} n^{1/ν},

and since

    n^{1/ν} (j + nα)^{−2} = n^{−1/ν} ( n^{−1/ν} |j + nα| )^{−2},

the result follows.
We then have

Corollary Assume that E_k has a characteristic polynomial a(ξ) which satisfies the assumptions of Lemma 2. Then E_k is stable in C.

Proof We have, by Lemmas 1 and 2,

    ‖E_k^n‖_C = Σ_j |a_{nj}| ≤ C n^{−1/ν} Σ_j ( 1 + n^{−1/ν} |j + nα| )^{−2} ≤ C n^{−1/ν} ( 1 + n^{1/ν} ) ≤ C,

which proves the corollary.
Consider now the necessity of the condition (7), which under the assumption (8) reduces to (9). Assume that (9) is not satisfied. We then must have

    a(ξ) = exp( iαξ + i ξ^r q(ξ) − βξ^ν + o(ξ^ν) )   as ξ → 0,        (10)

with α real, q(ξ) a real polynomial with q(0) ≠ 0, 1 < r < ν, ν even, Re β > 0. By Parseval's relation for periodic functions we have

    Σ_j |a_{nj}|² = (2π)⁻¹ ∫_{−π}^{π} |a(ξ)|^{2n} dξ,

and using (10) it is easy to deduce from this that

    Σ_j |a_{nj}|² ≥ c n^{−1/ν},   c > 0.

Using a lemma by van der Corput it is also possible to prove

    sup_j |a_{nj}| ≤ C n^{−1/r},

and so

    ‖E_k^n‖_C = Σ_j |a_{nj}| ≥ ( Σ_j |a_{nj}|² ) / sup_j |a_{nj}| ≥ c n^{1/r − 1/ν},

which tends to infinity with n since r < ν.
As an application, consider for the solution of the equation (5) the Lax-Wendroff operator, which in this case reduces to

    E_k v(x) = v(x) + (λa/2)[ v(x+h) − v(x−h) ] + (λ²a²/2)[ v(x+h) − 2v(x) + v(x−h) ].

We have

    a(ξ) = 1 + iλa sin ξ − λ²a²(1 − cos ξ),

and

    |a(ξ)|² = 1 − λ²a²(1 − λ²a²)(1 − cos ξ)²,

and so E_k is stable in L² if and only if |λa| ≤ 1. On the other hand, if 0 < |λa| < 1 we have |a(ξ)| < 1 for 0 < |ξ| ≤ π and

    a(ξ) = exp( iλaξ − (iλa/6)(1 − λ²a²)ξ³ − (λ²a²/8)(1 − λ²a²)ξ⁴ + o(ξ⁴) )   as ξ → 0,

which has the form (10) with r = 3, ν = 4. It follows from Theorem 2 that E_k is unstable in C. By the above proof we have

    ‖E_k^n‖_C ≥ c n^{1/3 − 1/4} = c n^{1/12}.
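The slow growth of Σ_j |a_{nj}| can be computed directly. The sketch below is a numerical illustration (the value of λa and the FFT grid size are assumptions, the grid being large enough that the coefficients of a(ξ)^n are recovered exactly for these n): the maximum-norm of E_k^n grows, but only very slowly.

```python
# Maximum-norm of Lax-Wendroff powers: ||E_k^n||_C = sum_j |a_nj|, computed
# from the Fourier coefficients of a(xi)^n by an FFT (illustration).
import numpy as np

mu = 0.5                       # lambda * a, with 0 < |lambda a| < 1 (assumed)
m = 2 ** 14                    # grid size, exceeds the polynomial degree n
xi = 2 * np.pi * np.arange(m) / m
a = 1 + 1j * mu * np.sin(xi) - mu**2 * (1 - np.cos(xi))

norms = []
for n in (10, 100, 1000):
    coeffs = np.fft.fft(a ** n) / m          # Fourier coefficients a_nj
    norms.append(np.sum(np.abs(coeffs)))     # the maximum-norm of E_k^n
print(norms)   # grows slowly with n, consistent with n^(1/12)
```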
Serdjukova [30], [31] and Hedstrom [11], [12] have, by using more refined techniques of estimating the a_{nj} above, been able to give more precise estimates of the growth of ‖E_k^n‖_C for the case when E_k is stable in L² but unstable in C. In the particular case of the Lax-Wendroff operator, the exact result is

    c₁ n^{1/12} ≤ ‖E_k^n‖_C ≤ c₂ n^{1/12};

more generally, when a(ξ) has the form (10) one has

    c₁ n^{(1/r)(1 − r/ν)} ≤ ‖E_k^n‖_C ≤ c₂ n^{(1/r)(1 − r/ν)}.        (11)

The instability present here is of course quite weak.

The proof of the sufficiency part of Theorem 2 is due to John [15] and Strang [32]. The proof of the necessity part can be found in [33]. Theorem 2 has also been generalized to variable coefficients in Thomée [34] and to L^p, 1 ≤ p ≤ ∞, p ≠ 2, in Brenner and Thomée [3]. The analogue of (11) then reads

    c₁ n^{(1/r)(1 − r/ν)|1 − 2/p|} ≤ ‖E_k^n‖_{L^p} ≤ c₂ n^{(1/r)(1 − r/ν)|1 − 2/p|}.

Consider now the initial-value problem for a system with constant coefficients

    ∂u/∂t = P(D)u = Σ_{|α|≤M} P_α D^α u,   t > 0,        (12)

    u(x,0) = v(x),        (13)

which is parabolic in Petrovskii's sense, that is,

    Λ(P(iξ)) ≤ −δ|ξ|^M + C,   δ > 0.        (14)

We recall from Lecture 2 that this problem is correctly posed in L² and also that derivatives can be estimated in L².
We shall now see that (12), (13) is actually correctly posed in C and that again also the derivatives can be estimated in the maximum-norm.

Theorem 3 Assume that (12) is parabolic in Petrovskii's sense. Then for t > 0 the solution operator has the form

    E(t)v(x) = ∫_{Rᵈ} G(x − y, t) v(y) dy,        (15)

where

    G(x, t) = (2π)^{−d} ∫_{Rᵈ} e^{tP(iξ)} e^{i⟨x,ξ⟩} dξ,        (16)

and there are constants c > 0 and C such that

    |D^α_x G(x, t)| ≤ C t^{−(d+|α|)/M} exp( −c ( |x| t^{−1/M} )^{M/(M−1)} ).        (17)

The problem is correctly posed in C, and for any T ≥ 0 and α there is a C such that

    ‖D^α E(t)v‖_C ≤ C t^{−|α|/M} ‖v‖_C,   0 < t ≤ T.        (18)
Proof To give a hint of the proof, we recall that if v ∈ S then after Fourier transformation the initial-value problem becomes a problem in ordinary differential equations with solution

    û(ξ, t) = e^{tP(iξ)} v̂(ξ),

which is the Fourier transform of (15) if G is given by (16). That this defines a bounded operator in C follows from (17), which still remains to be proved. On the other hand, once this is proved, the estimate (18) is a trivial consequence. To hint at the proof of (17) we need the following extension of (14) to complex arguments:

Lemma 3 If (12) is parabolic in Petrovskii's sense there are positive constants δ and C such that for any ζ = ξ + iη, ξ, η ∈ Rᵈ,

    Λ(P(iζ)) ≤ −δ|ξ|^M + C|η|^M + C.

Proof See [7].
To complete the proof of (17) we first notice that by Lemma 2.2 we have for 0 < t ≤ T,

    |e^{tP(iζ)}| ≤ C exp( −δ't|ξ|^M + C't|η|^M ),   ζ = ξ + iη.        (19)

Using this estimate one can see that the domain of integration in (16) may be moved into the complex, so that for any η ∈ Rᵈ,

    G(x, t) = (2π)^{−d} ∫_{Rᵈ} e^{tP(i(ξ+iη))} e^{i⟨x, ξ+iη⟩} dξ,

or, after differentiation,

    D^α_x G(x, t) = (2π)^{−d} ∫_{Rᵈ} ( i(ξ+iη) )^α e^{tP(i(ξ+iη))} e^{i⟨x, ξ+iη⟩} dξ.

Thus, by (19) we get for 0 < t ≤ T,

    |D^α_x G(x, t)| ≤ C t^{−(d+|α|)/M} ( 1 + t^{1/M}|η| )^{|α|} e^{−⟨x,η⟩ + C't|η|^M},

where the constants are independent of η. Now choose

    η = c₀ (x/|x|) ( |x|/t )^{1/(M−1)},

with c₀ small. We then have

    −⟨x, η⟩ + C't|η|^M ≤ −c ( |x| t^{−1/M} )^{M/(M−1)} + C,

and the estimate (17) now easily follows.
Consider now explicit difference operators

    E_k v(x) = Σ_β a_β(h) v(x − βh),   k/h^M = λ,

consistent with (12). We recall from Lecture 3 that E_k is called parabolic if

    ρ(Ê_k(h⁻¹ξ)) ≤ e^{Ck − δ|ξ|^M},   |ξ_j| ≤ π, δ > 0,

that such an operator is stable in L², and also that difference quotients may be estimated in L².
We shall now see that a corresponding result holds also in the maximum-norm.

Theorem 4 Assume that E_k is parabolic, and let

    a^n_β(h) = (2π)^{−d} ∫_Q Ê_k(h⁻¹ξ)^n e^{i⟨β,ξ⟩} dξ,   so that   E_k^n v(x) = Σ_β a^n_β(h) v(x − βh),

where Ê_k(ξ) is the symbol of E_k and Q = { ξ : |ξ_j| ≤ π }. Then the a^n_β(h) satisfy the following estimates (where difference quotients are taken with respect to β):

    |∂^α_β a^n_β(h)| ≤ C n^{−(d+|α|)/M} exp( −c ( |β| n^{−1/M} )^{M/(M−1)} ).        (20)

The operator E_k is stable in C, and for any α and T > 0 there is a C such that

    ‖∂^α E_k^n v‖_C ≤ C (nk)^{−|α|/M} ‖v‖_C,   0 < nk ≤ T.        (21)

Proof For details see [39]. The fact that (21) follows from (20) is due to the fact that

    Σ_β n^{−d/M} exp( −c ( |β| n^{−1/M} )^{M/(M−1)} )

can be considered as a Riemann sum for the integral

    ∫_{Rᵈ} exp( −c |y|^{M/(M−1)} ) dy,

and therefore this sum is bounded independently of n.
To prove (20) one goes through essentially the same steps as in the proof of (17) in Theorem 3, utilizing Lemma 3.4 instead of Lemma 2.2 and the following lemma instead of Lemma 3.

Lemma 4 Assume that E_k is parabolic, and let η₀ be given. Then there are positive constants δ and C such that for ζ = ξ + iη with |η| ≤ η₀,

    ρ( Ê_k(h⁻¹ζ) ) ≤ exp( Ck − δ|ξ|^M + C|η|^M ).

Again, the estimates for the difference quotients can be used to prove their convergence to the corresponding derivatives. We state the following result without proof.
Theorem 5 Assume that (12) is parabolic and that E_k is consistent with (12) and parabolic. Then for any t > 0, any α, and any v ∈ C we have, for nk = t,

    ‖∂^α E_k^n v − D^α E(t)v‖_C → 0   as k → 0.

The choice of ∂^α as the operator approximating D^α is again rather arbitrary. One can indeed show that the same results as in Theorems 4 and 5 hold for any difference operators consistent with D^α.

Theorems 3, 4 and 5 generalize to variable coefficients.
5. On the rate of convergence of difference schemes

In this lecture we shall work in a slightly more general setting than before. Let L^p = L^p(Rᵈ), 1 ≤ p < ∞, denote the set of measurable functions (or vector-functions) v such that

    ‖v‖_{L^p} = ( ∫_{Rᵈ} |v(x)|^p dx )^{1/p} < ∞,

and consider the family of Banach spaces

    W_p = L^p for 1 ≤ p < ∞,   W_∞ = C.

Consider also the Sobolev spaces W_p^m of distributions v such that D^α v ∈ W_p for |α| ≤ m. This is also a Banach space with norm

    ‖v‖_{W_p^m} = Σ_{|α|≤m} ‖D^α v‖_{W_p}.        (1)

For 1 ≤ p < ∞ this can be thought of as the closure with respect to the norm (1) of S or C₀^∞. Let finally W_p^∞ be the set of v which are in W_p^m for all m. This means that D^α v is continuous for any α and D^α v ∈ W_p. The set W_p^∞ is dense in W_p.
Consider again an initial-value problem

    ∂u/∂t = P(x, D)u = Σ_{|α|≤M} P_α(x) D^α u,   t > 0,        (2)

    u(x,0) = v(x),        (3)

where, as before, the coefficients P_α(x) have all derivatives bounded.

In the sequel we shall demand not only that the initial-value problem be correctly posed in W_p, but that it satisfy the stronger requirement of the following definition. We say that the initial-value problem is strongly correctly posed in W_p if for any positive m and T, v ∈ W_p^m implies E(t)v ∈ W_p^m, and there is a constant C such that for all v ∈ W_p^m,

    ‖E(t)v‖_{W_p^m} ≤ C ‖v‖_{W_p^m},   0 ≤ t ≤ T.

In particular, this definition implies that E(t) W_p^∞ ⊂ W_p^∞.

It can be proved that if P(x,D) has constant coefficients, or if it is of first order, then strong correctness in W_p is an automatic consequence of correctness. Further, systems which are parabolic in Petrovskii's sense are strongly correctly posed in W_p for any p with 1 ≤ p ≤ ∞.

Consider difference operators of the same form as before, namely

    E_k v(x) = Σ_β a_β(x, h) v(x − βh),   k/h^M = λ.

We have previously defined consistency of E_k with (2) to mean that for any sufficiently smooth solution u(x,t) of (2),

    E_k u(·, t) = u(·, t+k) + o(k)   in W_p;

more precisely, E_k is said to be accurate of order μ if for such u,

    E_k u(·, t) = u(·, t+k) + O(k h^μ),   k → 0.        (4)

When (2), (3) is strongly correctly posed in W_p, it is sufficient to assume this local condition to obtain the following global estimate:
Theorem 1 Assume that the initial-value problem (2), (3) is strongly correctly posed in W_p and that E_k is consistent with (2) and accurate of order μ. Then there exists a constant C such that for any v ∈ W_p^{M+μ},

    ‖E_k v − E(k)v‖_{W_p} ≤ C k h^μ ‖v‖_{W_p^{M+μ}}.

Proof See [27]. The proof consists in expanding E(k)v = u(x,k) and E_k v in Taylor series around the point (x,0), using (4), and estimating the remainder terms in integral form. In doing so it is sufficient to consider v in the dense subset W_p^∞ of W_p^{M+μ}.
We now easily obtain the following estimate for the rate of convergence:

Theorem 2 Assume that the initial-value problem (2), (3) is strongly correctly posed in W_p and that E_k is stable in W_p, consistent with (2), and accurate of order μ. Then for any T > 0 there is a constant C such that for v ∈ W_p^{M+μ}, nk ≤ T,

    ‖E_k^n v − E(nk)v‖_{W_p} ≤ C n k h^μ ‖v‖_{W_p^{M+μ}} ≤ C' h^μ ‖v‖_{W_p^{M+μ}}.

Proof We have

    E_k^n − E(nk) = Σ_{j=0}^{n−1} E_k^{n−1−j} ( E_k − E(k) ) E(jk),        (5)

and so, by the stability of E_k, Theorem 1, and the strong correctness,

    ‖E_k^n v − E(nk)v‖_{W_p} ≤ Σ_{j=0}^{n−1} C k h^μ ‖E(jk)v‖_{W_p^{M+μ}} ≤ C n k h^μ ‖v‖_{W_p^{M+μ}},

which proves the theorem.
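The O(h^μ) rate for smooth data is easy to observe experimentally. The sketch below is an assumed example, not from the lectures: the upwind scheme for u_t = u_x is accurate of order μ = 1, and halving h roughly halves the error at a fixed time.

```python
# Illustration of Theorem 2: first-order convergence of the upwind scheme
# for u_t = u_x with smooth periodic data (assumed example and parameters).
import numpy as np

def solve_upwind(m, lam, t):
    h = 2 * np.pi / m
    k = lam * h
    n = int(round(t / k))
    x = h * np.arange(m)
    u = np.sin(x)
    for _ in range(n):
        u = u + lam * (np.roll(u, -1) - u)    # E_k v = v + lam (T_h v - v)
    return np.max(np.abs(u - np.sin(x + n * k)))

e1, e2 = solve_upwind(100, 0.5, 1.0), solve_upwind(200, 0.5, 1.0)
rate = np.log2(e1 / e2)
print(e1, e2, rate)   # observed order of accuracy close to 1
```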
Thus, the situation is that for initial-values in W_p we have (by Lax's equivalence theorem) convergence without any added information on its rate, and if the initial-values are known to be in W_p^{M+μ} we can conclude that the rate of convergence is O(h^μ) when h → 0. It is natural to ask what one can say if the initial-values belong to a space "intermediate" to W_p and W_p^{M+μ}. To answer this question we shall introduce some spaces of functions which are interpolation spaces between W_p and W_p^m in the sense of the theory of interpolation of Banach spaces (cf. [27] and references).

Let s be a positive real number and write s = S + σ, S integer, 0 < σ ≤ 1. Set T_τ v(x) = v(x + τ). We then denote by B_p^s the space of v ∈ W_p^S such that the following norm is finite, namely

    ‖v‖_{B_p^s} = ‖v‖_{W_p^S} + Σ_{|α|=S} sup_{τ≠0} |τ|^{−σ} ‖ (T_τ − 2I + T_{−τ}) D^α v ‖_{W_p}.

Thus, B_p^s is defined by a Lipschitz type condition for the derivatives of order S;
82
these spaces are sometimes called Lipschitz spaces.
L we have for 1 ~ p < co
For the Heavysi&e function (4=1)
we h a v e S_
' -~ ~ = ~ • (7) ~< CC, c ~ ~I~II
Theorem 2 and (7) with A = ~-E(nk) prove immediately the following result:
Theorem ~ Assume that the initlal-value problem (2), (3) is strongly correctly
posed in W and that ~ is stable in W , consistent with (2), and accurate of order P p
• Then for O~ s < M+ ~ and T >0 there is a constant C such that for any V~Bp,
nk~T,
' E " ,< C k ~ ' ; - '~ - ~ . II k ~. -~(."~'Y~ v//',,'at, / / v l / ~ ~ X-r,, . .t . , (~)
t,. l", ',;."t " Notice that ~ = grows with~ and lim ~ = 1. This means that the
estimate (8) becomes increasingly better for fixed s when ~ grows. In other weras,
if for a given strongly correctly posed initial-value problem one can construct
stable difference schemes of arbitrarily high order of accuracy, then given a~ s ~ 0
one can obtain rates of convergence arbitrarily close to O(h s) when h *0, for all
initial-values in B s p"
As an application, consider an L_2 stable operator E_k with order of accuracy μ for the hyperbolic equation
  ∂u/∂t = ∂u/∂x;
it follows that if v ∈ B_2^s, 0 < s ≤ μ + 1, then
  ||E_k^n v − E(nk) v||_{L_2} ≤ C h^{sμ/(μ+1)} ||v||_{B_2^s}.
One can prove that B_p^{s_1} ⊂ B_p^{s_2} if s_1 ≥ s_2, and that for integer s and ε > 0 arbitrary, B_p^{s+ε} ⊂ W_p^s ⊂ B_p^s. The main property of these spaces that we will need is then the following interpolation property: Assume that 1 ≤ p ≤ ∞, m is a natural number, and s is a real number with 0 < s ≤ m. Then there is a constant C such that any bounded linear operator A in W_p with
  ||A v||_{L_p} ≤ M_0 ||v||_{L_p}  and  ||A v||_{L_p} ≤ M_1 ||v||_{W_p^m}
satisfies the estimate (7). In particular, taking v to be the Heaviside function (d = 1), which lies in B_p^{1/p}, one obtains in this case a rate of convergence O(h^{μ/((M+μ)p)}).
For dissipative operators E_k, stronger results have been obtained in Apelkrans [1] and in Brenner and Thomée [4], where also the spreading of discontinuities is discussed.
It is natural to ask if for a parabolic system, the smoothing property of the
solution operator can be used to reduce the regularity demands on the initial data
in Theorems 2 and 3. This is indeed the case. Before we state it we give the following auxiliary result, which follows easily from properties of fundamental solutions.
Theorem 4. Assume that (1) is parabolic in Petrovskii's sense. Then for any p with 1 ≤ p ≤ ∞, any m > 0 and T > 0 there is a constant C such that for 0 < t ≤ T,
  ||E(t) v||_{W_p^m} ≤ C t^{−m/M} ||v||_{L_p}.
We can now state and prove the result about the rate of convergence in the parabolic case.

Theorem 5. Assume that (2) is parabolic in Petrovskii's sense and that E_k is stable in W_p, consistent with (2), and accurate of order μ. Then for any s > 0 and T > 0 there is a constant C such that for v ∈ B_p^s, nk ≤ T,
  ||E_k^n v − E(nk) v||_{L_p} ≤ C h^{min(s,μ)} log(1/h) ||v||_{B_p^s}.  (9)
Proof. For details, see [27]. Here we shall only sketch the proof for the case v ∈ B_p^s with 0 < s < μ; the other cases can be treated similarly. We shall use (5). The term with j = 0 is estimated by the stability and Theorem 1, and then by (7); the terms with j > 0 are estimated by Theorems 1 and 4 together with (7). Adding over j, the sum contributes at most a factor log(1/h), which proves (9) in the case considered.
Investigations by Hedstrom [13], [14], Löfström [25], and Widlund [40] have shown that in special cases the factor log(1/h) can be removed from the estimate (9). In particular the following result has been proved by Widlund [40].
Theorem 6. Assume that (2) and E_k are parabolic in the sense of Petrovskii and John, respectively, and that E_k is accurate of order μ. Then for any T > 0 there is a constant C such that for v ∈ B_p^μ, nk ≤ T,
  ||E_k^n v − E(nk) v||_{L_p} ≤ C h^μ ||v||_{B_p^μ}.
The proof of this fact is considerably more complicated than the above and depends on estimates for the discrete fundamental solution. Using these estimates it is also possible to get estimates for the rate of convergence of difference quotients to the corresponding derivatives. We have thus the following more precise version of Theorem 4.5.
Theorem 7. Assume that (2) and E_k are parabolic in the sense of Petrovskii and John, respectively, and that E_k is accurate of order μ. Let Q_k be a finite difference operator which is consistent with the differential operator Q of order q and also accurate of order μ. Then for 0 < nk = t ≤ T, [the concluding estimate is illegible in the source].
In view of the fact that unsmooth initial-data give rise to lower rates of convergence it is natural to ask if the convergence can be made faster by first smoothing the initial-data. This is indeed the case for parabolic systems and we shall describe a result to this effect (Kreiss, Thomée and Widlund [21]).
We shall consider operators of convolution form,
  M_k v(x) = (φ_k * v)(x) = ∫ φ_k(x − y) v(y) dy,
where φ_k is a function whose Fourier transform satisfies conditions near ξ = 0 and at infinity involving functions b_0(ξ) and b_1(ξ) [the displayed conditions are illegible in the source]. Here b_j(ξ), j = 0, 1, are such that for some ρ ≥ 0, b_0(ξ) and b_1(ξ) coincide with multipliers on W_p for |ξ| ≤ 1 and |ξ| ≥ 1, respectively. Such an operator is said to be a smoothing operator of order ρ in W_p. Since the multipliers on L_2 are simply the functions in L_∞, the above condition can be seen to be satisfied for p = 2 if the b_j are bounded; for general p it suffices in addition that for any multi-index β ≥ 0,
  |D^β b_j(ξ)| ≤ C |ξ|^{−|β|}  uniformly in ξ.
Special examples of smoothing operators of orders 1 and 2, respectively, in the case d = 1 are given by mesh averages [the explicit formulas are illegible in the source], and for general ρ a smoothing operator of order ρ can easily be constructed in the form
  M_k v(x) = (1/k) ∫ ψ(y/k) v(x − y) dy,
where ψ is a function which is piecewise a polynomial of degree ρ − 1 and which vanishes outside an interval of length ρ [the precise supports for ρ odd and ρ even are illegible in the source]. For p = 2, the operator M_k corresponding to a sharp Fourier cutoff,
  φ̂_k(ξ) = 1 for |ξ| ≤ δ/k,  φ̂_k(ξ) = 0 for |ξ| > δ/k,  0 < δ < π,
is a smoothing operator of arbitrarily high order. Smoothing operators in higher dimensions can be obtained by taking products of one-dimensional operators,
  M_k = Π_{j=1}^d M_{k,j},
where M_{k,j} is a smoothing operator with respect to x_j.
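For illustration only (the averaging width is an arbitrary choice, and this is merely a discrete analogue of the convolution operators above): a cell average applied to Heaviside data reproduces the data away from the jump and replaces the jump by a monotone linear ramp a few mesh widths wide.

```python
import numpy as np

h = 0.01
x = np.arange(-1.0, 1.0, h)
v = (x >= 0.0).astype(float)                 # Heaviside initial data

# Cell average over a few mesh widths (discrete box-kernel convolution);
# the window of 5 cells is an illustrative choice, not from the text.
win = 5
v_smooth = np.convolve(v, np.ones(win) / win, mode="same")
# Away from the jump the data is reproduced exactly; across the jump it
# becomes a monotone linear ramp of width win*h.
```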
The result on the rate of convergence is then the following.

Theorem 8. Assume that (2) and E_k are parabolic in the sense of Petrovskii and John, respectively, and that E_k is accurate of order μ and M_k is a smoothing operator of order μ. Then there is a constant C = C_{p,T} such that for 0 < t = nk ≤ T, [the estimate is illegible in the source].
We shall complete this lecture by considering the case of the simple hyperbolic equation
  ∂u/∂t = a ∂u/∂x,  a a real constant,
and a consistent difference operator of the form
  E_k v(x) = Σ_j a_j v(x + jh),
with characteristic polynomial
  r(ξ) = Σ_j a_j e^{ijξ}.
We shall examine the case when E_k is stable in L_2 but unstable in L_p for p ≠ 2; more precisely we shall assume that
  |r(ξ)| ≤ 1 for all real ξ
[the remaining assumptions on r are illegible in the source]. The following result on the rate of convergence is due to Hedstrom [13] and Brenner and Thomée [3].

Theorem 9. Under the above assumptions on the operator E_k, for s > 0, s ≤ μ + 1, and nk ≤ T,
  ||E_k^n v − E(nk) v||_{L_p} ≤ C h^{g(s)} ||v||_{B_p^s},  (10)
[the exponent g(s) is illegible in the source].

It can also be proved that this result is best possible in the sense that the function g(s) above is the largest for which an estimate of the form (10) holds for all v ∈ B_p^s. In the stable cases, in particular p = 2, the order of convergence is sμ/(μ+1) when 0 < s < μ + 1, in agreement with Theorem 3. In the opposite case the error is larger for small s; g(s) is then negative, and for s = 0 we recognize the exponent in e.g. (4.11) (with some difference in notation).

It is interesting to note that if the irregularity of the initial-function stems from the behaviour at isolated points then the result above may be improved so that for p = ∞ we obtain the same result as if E_k were stable in L_∞. We shall formulate this result in terms of the Banach space B of functions v with support in [−M, 0] such that [the defining norm is illegible in the source] is finite. Using the L_2 convergence result, Sobolev's inequality, and the fact that B is continuously embedded in B_2^{s+½}, one can prove the following result.

Theorem 10. Consider an L_2 stable operator E_k for the equation (5). Then for given positive M and s ≤ μ + 1 and for nk ≤ T, [the estimate is illegible in the source].

This result also holds when a depends upon x.
REFERENCES
[i] M.Y.T. Apelkrans. On difference schemes for hyperbolic equations with dis-
continuous initial values. Math. Comp. 22 (1968), 525-539.
[2] Ph. Brenner. The Cauchy problem for symmetric hyperbolic systems in L_p. Math. Scand. 19 (1966), 27-37.
[3] Ph. Brenner and V. Thomée. Stability and convergence rates in L_p for certain difference schemes. Math. Scand. To appear.
[4] Ph. Brenner and V. Thomée. Estimates near discontinuities for some difference schemes. To appear.
[5] M.L. Buchanan. A necessary and sufficient condition for stability of difference schemes for initial-value problems. J. Soc. Indust. Appl. Math. 11 (1963), 919-935.
[6] R. Courant, K. Friedrichs and H. Lewy. Über die partiellen Differenzengleichungen der mathematischen Physik. Math. Ann. 100 (1928), 32-74.
[7] A. Friedman. Partial differential equations of parabolic type.
Prentice-Hall. Englewood Cliffs, New Jersey, 1964.
[8] K. Friedrichs. Symmetric hyperbolic linear differential equations.
Comm. Pure Appl. Math. 7 (1954), 345-392.
[9] I.M. Gelfand and G.E. Schilow. Verallgemeinerte Funktionen III. Deutscher Verlag der Wissenschaften, Berlin, 1964.
[10] S.K. Godunov and V.S. Ryabenkii. Introduction to the theory of difference schemes. Interscience, New York, 1964.
[11] G.W. Hedstrom. The near-stability of the Lax-Wendroff method. Numer. Math. 7 (1965), 73-77.
[12] G.W. Hedstrom. Norms of powers of absolutely convergent Fourier series. Michigan Math. J. 13 (1966), 393-416.
[13] G.W. Hedstrom. The rate of convergence of some difference schemes. SIAM J. Numer. Anal. 5 (1968), 363-406.
[14] G.W. Hedstrom. The rate of convergence of difference schemes with constant coefficients. BIT 9 (1969), 1-17.
[15] F. John. On integration of parabolic equations by difference methods. Comm. Pure Appl. Math. 5 (1952), 155-211.
[16] H.O. Kreiss. Über Matrizen die beschränkte Halbgruppen erzeugen. Math. Scand. 7 (1959), 71-80.
[17] H.O. Kreiss. Über die Lösung des Cauchyproblems für lineare partielle Differentialgleichungen mit Hilfe von Differenzengleichungen. Acta Math. 101 (1959), 179-199.
[18] H.O. Kreiss. Über die Stabilitätsdefinition für Differenzengleichungen die partielle Differentialgleichungen approximieren. BIT 2 (1962), 153-181.
[19] H.O. Kreiss. Über sachgemässe Cauchyprobleme. Math. Scand. 13 (1963), 109-128.
[20] H.O. Kreiss. On difference approximations of dissipative type for hyper-
bolic differential equations. Comm. Pure Appl. Math. 17(1964), 335-353.
[21] H.O. Kreiss, V. Thomée and O.B. Widlund. Smoothing of initial data and
rates of convergence for parabolic difference equations.
Comm. Pure Appl. Math. To appear.
[22] P.D. Lax and R.D. Richtmyer. Survey of the stability of linear finite
difference equations. Comm. Pure Appl. Math. 9 (1956), 267-293.
[23] P.D. Lax and B. Wendroff. Systems of conservation laws. Comm. Pure Appl.
Math. 13 (1960), 217-237.
[24] P.D. Lax and B. Wendroff. Difference schemes for hyperbolic equations with
high order of accuracy. Comm. Pure Appl. Math. 17 (1964), 381-398.
[25] J. Löfström. Besov spaces in theory of approximation. Ann. Mat. Pura Appl. 85 (1970), 93-184.
[26] G.G. O'Brien, M.A. Hyman and S. Kaplan. A study of the numerical solution
of partial differential equations. J. Math. and Phys. 29(1951), 223-251.
[27] J. Peetre and V. Thomée. On the rate of convergence for discrete initial-
value problems. Math. Scand. 21 (1967), 159-176.
[28] R.D. Richtmyer and K.W. Morton. Difference methods for initial-value
problems. 2nd ed., Interscience, New York, 1967.
[29] V.S. Ryabenkii and A.F. Filippov. Über die Stabilität von Differenzengleichungen. Deutscher Verlag der Wissenschaften, Berlin, 1960.
[30] S.I. Serdjukova. A study of stability of explicit schemes with constant real coefficients. Ž. Vyčisl. Mat. i Mat. Fiz. 3 (1963), 365-370.
[31] S.I. Serdjukova. On the stability in C of linear difference schemes with constant real coefficients. Ž. Vyčisl. Mat. i Mat. Fiz. 6 (1966), 477-486.
[32] W.G. Strang. Polynomial approximation of Bernstein type.
Trans. Amer. Math. Soc. 105 (1962), 525-535.
[33] V. Thomée. Stability of difference schemes in the maximum-norm.
J. Differential Equations 1 (1965), 273-292.
[34] V. Thomée. On maximum-norm stable difference operators. Numerical Solution of Partial Differential Equations (Proc. Sympos. Univ. Maryland, 1965), pp. 125-151. Academic Press, New York.
[35] V. Thomée. Parabolic difference operators. Math. Scand. 19 (1966), 77-107.
[36] V. Thomée. Stability theory for partial difference operators. SIAM Rev. 11 (1969), 152-195.
[37] V. Thomée. On the rate of convergence of difference schemes for hyperbolic equations. Numerical Solution of Partial Differential Equations (Proc. Sympos. Univ. Maryland, 1970). To appear.
[38] O.B. Widlund. On the stability of parabolic difference schemes. Math. Comp. 19 (1965), 1-13.
[39] O.B. Widlund. Stability of parabolic difference schemes in the maximum-norm. Numer. Math. 8 (1966), 186-202.
[40] O.B. Widlund. On the rate of convergence for parabolic difference schemes, II. Comm. Pure Appl. Math. 23 (1970), 79-96.
Iteration Parameters in the
Numerical Solution of Elliptic Problems
EUGENE L. WACHSPRESS
General Electric Company Schenectady, New York
These notes are intended to serve as a guide to a deeper study of material presented in a series of lectures delivered in September, 1970 at the University of Dundee as a part of the special one year symposium on The Theory of Numerical Analysis.
Lecture   Subject
1         A Concise Review of the General Topic and Background Theory
2         Successive Overrelaxation: Theory
3         Successive Overrelaxation: Practice
4         Residual Polynomials and Chebyshev Extrapolation: Theory
5         Residual Polynomials: Practice
6         Alternating-Direction-Implicit Iteration: Theory
7         Parameters for the Peaceman-Rachford Variant of ADI
Reference text: "Iterative Solution of Elliptic Systems," by Wachspress (Prentice Hall, 1966).
1. A CONCISE REVIEW OF THE GENERAL TOPIC AND BACKGROUND THEORY
We are concerned with iterative approximation to the vector x which satisfies the system of linear equations
  A x = k,  (1)
where: k is a known m-vector,
  A is a given nonsingular m×m matrix, and
  x is an m-vector which is to be found.
An approximation y to x is acceptable when
  ||y − x|| / ||x|| < E,  (2)
where E is some prescribed error bound and || · || a designated norm.
We shall first categorize various iteration procedures. We shall then des-
cribe a measure of efficiency or rate of convergence. Having done this, we will
indicate a rather general approach to demonstrating convergence for a wide class of
methods for iterative solution of linear systems. Finally, an example of each of
three major classes of linear iteration procedures will be given.
It is convenient to restrict matrix A in (1) to be real, symmetric, and positive definite. (Our definition of positive definite is such that a real matrix which is p.d. must be symmetric.) Although less restrictive conditions are subject to analysis, the discussion is greatly simplified in this manner.
A STATIONARY LINEAR ITERATION procedure is characterized by:
  x_0 = a known "trial" vector,  (3)
  x_n = T x_{n−1} + R k,  n = 1,2,…
This procedure is convergent if x_n → x = A^{−1}k for any x_0 and k. In order that x be a stationary point, we require:
  x = T x + R k.  (4)
Defining the error vector e_n = x_n − x, and subtracting (4) from (3), we get
  e_n = T e_{n−1} = T^n e_0,  (5)
e_n → 0 for arbitrary e_0 iff T^n → 0.
The spectral radius of T is r(T) = max_i |g_i(T)| where the g_i are eigenvalues of T. Thus, r(T) must be less than unity for convergence. From (4), (I − T)A^{−1}k = R k for any k, so that (I − T)A^{−1} = R, and (3) can be written
  x_n = T x_{n−1} + (I − T)A^{−1}k.  (6)
If we could compute A^{−1}k on the right hand side of (6), we would have no need for iteration. Thus, T must be such that the right hand side of (6) does not require computation of A^{−1} or A^{−1}k. To clarify this, suppose we can solve the system Bx = k for x, where B approximates A in some sense. We may attempt to iterate by:
  B x_n = (B − A) x_{n−1} + k,  or
  x_n = (I − B^{−1}A) x_{n−1} + B^{−1}k.
Here, T = I − B^{−1}A and (I − T)A^{−1} = B^{−1} = R. We note that B is a "good" approximation to A for this iteration when the spectral radius of (I − B^{−1}A) is much less than unity.
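A minimal numerical sketch of this splitting idea, with the classical Jacobi choice B = diag(A); the matrix and right-hand side are illustrative, not from the notes:

```python
import numpy as np

# Splitting iteration B x_n = (B - A) x_{n-1} + k with B = diag(A)
# (the classical Jacobi choice of an easily solvable approximation to A).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
k = np.array([1.0, 2.0, 3.0])
B = np.diag(np.diag(A))

T = np.eye(3) - np.linalg.solve(B, A)    # iteration matrix I - B^{-1}A
r = max(abs(np.linalg.eigvals(T)))       # spectral radius; here about 0.35

x = np.zeros(3)
for _ in range(100):
    x = np.linalg.solve(B, (B - A) @ x + k)
```

Since r < 1, the iterates converge to A^{−1}k.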
We will now derive a condition sufficient to assure convergence which has
application to many iteration techniques.
A is positive definite and hence has a unique positive-definite square root A^{½}. Thus, (I − B^{−1}A) is similar to (I − A^{½}B^{−1}A^{½}), whose spectral radius is bounded above by the square root of the largest eigenvalue of
  (I − A^{½}B^{−1}A^{½})(I − A^{½}B^{−T}A^{½}) = I − A^{½}B^{−1}(B + B^T − A)B^{−T}A^{½}.  (7)
The condition sufficient for convergence is that B + B^T − A be positive definite. For then we can define K = A^{½}B^{−1}(B + B^T − A)^{½} and M = (I − A^{½}B^{−1}A^{½}) and rewrite (7) as MM^T = I − KK^T; we note here that K is nonsingular, being the product of nonsingular matrices. Thus 0 ≤ g(MM^T) = g(I − KK^T) = 1 − g(KK^T) < 1, and we have proved that r(I − B^{−1}A) < 1 if B + B^T − A is positive definite. To convince ourselves that this is not a necessary condition, we need only find one counterexample: one can exhibit a pair A, B which yield r(T) < 1 while B + B^T − A is not positive definite.
If we choose our norm so that ||T|| ≈ r(T) and if x_0 = 0, then e_0 = −x and
  ||e_n|| = ||T^n e_0|| ≤ ||T||^n ||e_0|| ≈ r^n ||x||,
so that ||e_n|| / ||x|| < E for n > log E / log r. We note that when r is close to unity the number of iterations required to satisfy a prescribed convergence criterion varies as 1/(1 − r).
A PARTIALLY STATIONARY LINEAR procedure is one for which the approximation is obtained as a linear combination of vectors which could be obtained by a stationary procedure. Let x_j (j = 1,2,…,n) be iterates generated by (6). Then the n-th iterate of a partially stationary procedure based upon this stationary iteration would be
  y_n = Σ_j a_{jn} x_j.
If we define f_n = y_n − x, then f_n = P_n(T) e_0, where P_n(T) is a polynomial of degree n in T, normalized to unity when T = I. (I is the identity matrix of order the same as T.) Convergence is established by showing that for a given P_n and T:
  r = lim_{n→∞} [ sup_i |P_n(g_i)| ]^{1/n} < 1,
where the g_i are the eigenvalues of matrix T.
A NONSTATIONARY LINEAR iterative procedure is one for which the iteration matrix is a function of parameters which may change from iteration to iteration:
  x_n = T_n x_{n−1} + R_n k,  or  x_n = (Π_{j=1}^n T_j) x_0 + (I − Π_{j=1}^n T_j) A^{−1}k.
If r_n is the spectral radius of Π_{j=1}^n T_j, then the asymptotic convergence rate is lim_{n→∞} r_n^{1/n}. When the T_j all commute, analysis is quite similar to that applied to partially stationary schemes. When the T_j do not commute convergence theory is often less definitive.
In examining relative merits of iteration procedures, we endeavor first to
establish convergence for a range of iteration parameters, second to determine the
spectral radius as a function of these parameters, and third to ascertain how these
parameters may be chosen to minimise this spectral radius. Thus, a minimax problem
arises in the analysis of each iterative procedure. Three commonly used techniques
will be considered in subsequent lectures.
These are:
I. Successive Overrelaxation (stationary)
II. Chebyshev Extrapolation (partially stationary)
III. Alternating-Direction-Implicit Iteration (nonstationary).
Each of these is well documented in the literature and these notes are intended only
as an introduction to the subject rather than a detailed analysis.
2. SUCCESSIVE OVERRELAXATION: THEORY
We may "improve" the value of a component of the vector x by computing as a new value during iteration n that number which yields satisfaction of the p-th equation with values from iteration n−1 substituted in the equation for the remaining components of x:
  Σ_{j=1}^m a_{pj} x_j = k_p,  with x_j taken from iteration n−1 for j ≠ p and from iteration n for j = p.
An iteration consists in improvement of all components of x. We may call this
"Relaxation" or "simultaneous relaxation". On an array computer with many arithmetic
units working in parallel, it would be possible to improve all components simultan-
eously.
We may, alternatively, use new neighbour values as soon as they are computed. Then
  Σ_{j=1}^p a_{pj} x_j^n + Σ_{j=p+1}^m a_{pj} x_j^{n−1} = k_p,  p = 1,2,…,m.
Components are now improved in some order, and we call this "successive relaxation".
Although this latter approach is often better than simultaneous relaxation, a more
significant gain in efficiency is usually achieved by extrapolation. If we denote
the unextrapolated result of successive relaxation at point p by x̃_p^n, then before proceeding to the next point during the n-th iteration, we compute for a prescribed extrapolation parameter w:
  x_p^n = x_p^{n−1} + w ( x̃_p^n − x_p^{n−1} ).  (8)
Numerical solution of elliptic type difference equations is accomplished with w in (1,2), and since w is greater than unity this is called "successive overrelaxation." Among factors responsible for the extensive literature on SOR are its simplicity, wide applicability, and firm theoretical foundation.
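A minimal sketch of SOR as described by (8); the test matrix and the value of w are illustrative choices, not from the notes:

```python
import numpy as np

def sor(A, k, w, n_iter=200):
    """Successive overrelaxation: sweep the unknowns in natural order,
    forming the successive-relaxation value at each point and then
    extrapolating by w as in (8)."""
    m = len(k)
    x = np.zeros(m)
    for _ in range(n_iter):
        for p in range(m):
            # successive relaxation uses new values for j < p, old for j > p
            xp = (k[p] - A[p, :p] @ x[:p] - A[p, p + 1:] @ x[p + 1:]) / A[p, p]
            x[p] = x[p] + w * (xp - x[p])
    return x

# Illustrative model problem: 1-D Laplacian with 10 unknowns.
m = 10
A = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
k_vec = np.ones(m)
x = sor(A, k_vec, w=1.5)
```

For this model problem any w in (0,2) converges, in line with the theory below.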
D.M. Young's analysis provided a basis for efficient utilization of SOR. He
introduced the concept of a "consistent ordering" which is related to what is now
known as Young's Property A. The equations of a system having this property may be
ordered as follows:
An index s(p) is assigned to unknown x_p associated with the p-th equation. Then for every nonzero a_{pj}, the ordering s(p) is said to be consistent if
  s(j) = s(p) − 1 for j < p,
  s(j) = s(p) + 1 for j > p.
Components of x are improved in order of increasing s. Points (equations) with a
common s-value which are coupled to one another directly must be updated simultane-
ously. Many systems arising in practice may be consistently ordered without a need
for simultaneous improvement even though several components may have common s-values.
Five-point difference stars are an example.
Young established an intimate relationship between eigensolutions of the simultaneous relaxation and consistently ordered SOR iteration matrices derived therefrom. If g is an eigenvalue of the SR matrix, then there is a corresponding eigenvalue h of the SOR matrix satisfying:
  ( h + w − 1 )² = h w² g².  (9)
The optimum extrapolation parameter w_b is obtained by solving the minimax problem:
  H(w, G) = max_g |h(w, g)|,
  H(w_b, G) = min_w H(w, G),
where G denotes the spectral radius of the SR matrix. The solution to this minimax problem (which is a rather simple minimax problem) is
  w_b = 2 / ( 1 + √(1 − G²) )  and  H(w_b, G) = w_b − 1.  (10)
The remarkable gain in efficiency of SOR over SR is evidenced for the case G = 1 − r with r << 1 by comparing the relative number of iterations required by the two methods for an error reduction of E:
  n_SR = −ln E / r  while  n_SOR = −ln E / ( 2√(2r) ).  (11)
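Plugging numbers into (10) and (11) makes the gain concrete; the value of G below is an assumed example:

```python
import math

G = 0.99                                   # assumed SR spectral radius, G = 1 - r
r = 1.0 - G
wb = 2.0 / (1.0 + math.sqrt(1.0 - G * G))  # optimum parameter, eq. (10)
H = wb - 1.0                               # SOR spectral radius at w = wb

E = 1e-6                                   # required error reduction
n_sr = -math.log(E) / r                    # eq. (11): about 1.4e3 iterations
n_sor = -math.log(E) / (2.0 * math.sqrt(2.0 * r))   # about 49 iterations
```

Even for this modest G the iteration count drops by more than an order of magnitude.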
The potency of the convergence theorem given in the first lecture is illustrated by applying it to the SOR iteration matrix. Let the coefficient matrix be A = D − R − R^T where D is diagonal (and positive since A is positive-definite by hypothesis) and R is strictly lower triangular. If SOR is applied with the natural ordering (successive updating of components 1,2,…) then the SOR iteration matrix is
  L_w = (I − wD^{−1}R)^{−1} ( (1 − w)I + wD^{−1}R^T ).
This may be written in the alternative form
  L_w = I − ( D/w − R )^{−1} A.
Thus, B in equation (7) is equal to D/w − R in this case, and
  B + B^T − A = 2D/w − R − R^T − D + R + R^T = (2/w − 1) D.
The spectral radius of L_w is less than unity when 2/w − 1 is greater than zero, or when w is in (0,2).
It can also be shown by this approach that S0R converges even when the order-
ing and the extrapolation parameter are changed each iteration.
When one digs deeper into the theory, one finds that the S0R iteration matrix
with optimum parameter and consistent ordering does not have a diagonal Jordan form.
The resulting eigenvector deficiency has an adverse effect on convergence.
3. SUCCESSIVE OVERRELAXATION: PRACTICE
For efficient implementation of SOR, one must choose an appropriate ordering
and estimate the optimum parameter w b. One must also choose a strategy consistent
with the characteristics of the computer for which the iteration program is designed.
This latter point is sometimes overlooked. One illustration is that on the CDC-6600
there is a stack feature which leads to a gain in speed by a factor of ten in the
basic arithmetic when one programs the "inner arithmetic loop" in machine language,
taking full advantage of the stack feature rather than relying on FORTRAN. Another
consideration is relative efficiency of getting data in and out of fast memory and
of computation once the data is in memory. On some machines several iterations can
be performed in the time it takes to read the data in and out of memory. The method
of concurrent iterations enables one to perform several iterations with one pass over
the equations. This is particularly important when solving large problems where all the data cannot be contained in memory.
Periodic boundary conditions present minor difficulties. Line relaxation is
beneficial, and the proper choice of lines enables retention of consistent ordering.
(I often think in terms of problems arising from discretization of partial differential equations to obtain five-point or seven-point difference stars.)
A major consideration in any event is choice of w. As the spectral radius of
SR approaches unity, it becomes increasingly more important to choose a good w.
Several elaborate techniques have been described for estimating the extrapolation
parameter. However, a reasonably effective procedure which I have had success with for many years does not require any sophisticated additional programming.
One starts with a parameter w_0, chosen deliberately smaller than w_b, and iterates until an asymptotic convergence rate is established. This convergence rate is measured by comparing successive changes in the vector:
  ||x_n − x_{n−1}|| / ||x_{n−1} − x_{n−2}|| → H(w_0, G).
For w_0 < w_b:
  H(w_0, G) = ( w_0 G/2 + √( w_0²G²/4 + 1 − w_0 ) )².
To estimate w_b we estimate G, using the above equation:
  G̃ = ( H(w_0, G) + w_0 − 1 ) / ( w_0 √H(w_0, G) ).  (12)
We thus estimate after n iterations with w_0:
  w_1 = 2 / ( 1 + √(1 − G̃²) ).
The presence of higher modes decaying at the rate w_0 − 1 is such that in practice the asymptotic rate is approached from below and w_1 is less than w_b. Thus, the extrapolation parameter may be updated periodically in this fashion.
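The updating procedure can be sketched as follows; the model matrix, the starting parameter w_0, and the iteration count are illustrative assumptions, and Young's relation is used to recover G from the observed convergence rate:

```python
import math
import numpy as np

def sor_sweep(A, k, x, w):
    # one SOR sweep in the natural ordering
    for p in range(len(k)):
        xp = (k[p] - A[p, :p] @ x[:p] - A[p, p + 1:] @ x[p + 1:]) / A[p, p]
        x[p] = x[p] + w * (xp - x[p])
    return x

# Model problem (illustrative): 1-D Laplacian, m = 20 unknowns.
m = 20
A = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
k_vec = np.ones(m)

w0 = 1.2                      # deliberately smaller than w_b
x = np.zeros(m)
x_prev = x.copy()
ratio = None
for n in range(400):
    x_new = sor_sweep(A, k_vec, x.copy(), w0)
    if n > 0:                 # ratio of successive changes -> H(w0, G)
        ratio = np.linalg.norm(x_new - x) / np.linalg.norm(x - x_prev)
    x_prev, x = x, x_new

H = ratio                                    # observed H(w0, G)
G = (H + w0 - 1.0) / (w0 * math.sqrt(H))     # invert Young's relation, eq. (12)
w1 = 2.0 / (1.0 + math.sqrt(1.0 - G * G))    # updated estimate of w_b
```

For this Laplacian the Jacobi spectral radius is cos(π/21), so w1 should land very close to the true optimum near 1.74.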
If one encounters a class of problems for which w b is quite close to two (say,
greater than 1.95), then one should seek alternative procedures to either replace or
supplement the SOR iteration.
4. RESIDUAL POLYNOMIALS AND CHEBYSHEV EXTRAPOLATION: THEORY
We may append to the stationary procedure
  x*_n = T x_{n−1} + (I − T)A^{−1} k  (13)
the extrapolation
  x_n = x_{n−1} + w_n ( x*_n − x_{n−1} ).  (14)
This differs from SOR in that improved values do not appear on the right hand side of (13) and the extrapolation parameter varies with n. It is easily shown that
  e_n = [ Π_{j=1}^n ( (1 − w_j)I + w_j T ) ] e_0 = P_n(T) e_0  (15)
where P_n(T) is a polynomial of degree n in T (w_j ≠ 0) and P_n(1) = 1. To obtain a least upper bound on the error norm reduction for arbitrary e_0 and spectrum g of T, we solve the minimax problem
  μ_{P_n} = sup_{g_i} |P_n(g_i)|,
  μ_{C_n} = min_{P_n} μ_{P_n}  subject to  P_n(1) = 1.  (16)
This classic Chebyshev problem has as its solution for g_i arbitrary in the interval (a,b) with b less than unity:
  C_n(g) = cos( n cos^{−1}( (2g − (a+b)) / (b−a) ) ) / cosh( n cosh^{−1}( (2 − (a+b)) / (b−a) ) ).  (17)
The roots of this polynomial are
  z_j = [ (b−a) cos( (2j−1)π / 2n ) + (a+b) ] / 2,  j = 1,2,…,n.  (18)
By choosing w_j = 1/(1 − z_j) for use in (14), we generate the Chebyshev polynomial of degree n after n iterations.
This approach has some undesirable features. We must decide upon n in advance to compute the set of w_j. When b is close to unity, some of the w_j become quite large. In this case, large values of n are often required to yield an acceptable approximation. Roundoff error can become detrimental.
By utilizing the recursion formulas for Chebyshev polynomials, we can refine
the iteration. A three-term extrapolation formula is now used:
  x_j = x_{j−1} + p_j ( x*_j − x_{j−1} ) + q_j ( x_{j−1} − x_{j−2} ),  (20)
where the p_j and q_j are generated by:
  u = ( 2 − (a+b) ) / ( b − a ),
  p_1 = 2 / ( 2 − a − b ),  q_1 = 0,  (21)
  p_j = ( 4 / (b−a) ) C_{j−1}(u) / C_j(u)  and  q_j = C_{j−2}(u) / C_j(u)  for j > 1,  (22)
where C_j denotes the Chebyshev polynomial of degree j.
The p_j and q_j approach asymptotic values of order of magnitude unity as b approaches unity; roundoff error is no longer a serious problem, and we need no longer decide in advance upon the number of iterations.
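The three-term scheme (20)-(22) can be sketched as below; the model matrix is an illustrative choice, and the Chebyshev values C_j(u) are generated by their own three-term recurrence:

```python
import numpy as np

# Basic iteration x* = T x + c with T = I - A, so that c = (I - T)A^{-1}k = k;
# the model matrix is an illustrative choice with eigenvalues of T in (-1, 1).
m = 12
A = np.eye(m) - 0.45 * np.eye(m, k=1) - 0.45 * np.eye(m, k=-1)
k_vec = np.ones(m)
T = np.eye(m) - A

eig = np.linalg.eigvalsh(T)
a, b = eig.min(), eig.max()          # eigenvalue interval [a, b] of T, b < 1

u = (2.0 - (a + b)) / (b - a)
C2, C1 = 1.0, u                      # C_0(u), C_1(u)

x_old = np.zeros(m)
# first step: p_1 = 2/(2 - a - b), q_1 = 0, as in (21)
x = x_old + (2.0 / (2.0 - a - b)) * (T @ x_old + k_vec - x_old)
for j in range(2, 60):
    Cj = 2.0 * u * C1 - C2           # Chebyshev recurrence for C_j(u)
    p = 4.0 * C1 / ((b - a) * Cj)    # (22)
    q = C2 / Cj                      # (22)
    x_star = T @ x + k_vec
    x, x_old = x + p * (x_star - x) + q * (x - x_old), x
    C2, C1 = C1, Cj
```

After n steps the error is reduced by 1/C_n(u), the Chebyshev bound of (17).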
There are other polynomial extrapolation procedures in common use. One family
of iteration schemes based on Lanczos' work involves computation of the pj and qj
from certain inner products of the x_j and x*_j vectors. These include the steepest descent and conjugate gradient methods.
5. RESIDUAL POLYNOMIALS: PRACTICE
The eigenvalue interval (a,b) upon which the extrapolation parameters are based may be estimated from observed convergence with assumed bounds. Oscillatory error behavior indicates that the assumed lower bound is too large, while a uniformly signed error indicates the upper bound is too low (for the case where the assumed bounds lie within the true bounds). By noting the sign of the error and the rate of change, one can update the eigenvalue interval.
Although it is possible to start a new parameter cycle after each updating,
one may use the asymptotic values for pj and qj immediately. This has proven to be
quite satisfactory for many problems.
Systems which satisfy Young's Property A should be treated by the Golub-Varga method which reduces the arithmetic by a factor of two. (See pp. 155-6 in the reference text.)
The Chebyshev and S0R methods are comparable in efficiency. The Chebyshev
method does often yield a greater reduction in error norm for a given number of
iterations, but other factors often outweigh this. Computer and problem character-
istics often dictate which approach is better.
We may examine more precisely means by which the interval [a,b] may be esti-
mated. Although I have had no occasion to use the procedure which will now be
described, thus making this discussion more "theory" than "practice", the means by
which parameters are updated is one of the more practical aspects of iteration and
thus falls appropriately in this section.
When Chebyshev extrapolation is based upon an assumed eigenvalue interval [a',b'] which contains the true interval [a,b], the asymptotic convergence rate is
  r = lim_{n→∞} [ C_n(u') ]^{−1/n} = ( u' + √(u'² − 1) )^{−1},  where u' = ( 2 − (a'+b') ) / ( b' − a' ).
When [a,b] is not contained in [a',b'], error components associated with eigenvalues outside [a',b'] will eventually predominate. If it is known that either a' ≤ a or b' ≥ b, we seek only one bound and the procedure is analogous to that already described for SOR. We shall describe a more sophisticated approach for estimating both a and b.
After sufficiently many iterations, s, we suppose that successive iterates satisfy
  x_{s+t} ≈ x + λ_1^t z_1(s) + λ_2^t z_2(s),  t = 1,2,3,…
and define for k = 1,2,3,…:
  e_k = x_{s+k−1} − x_{s+k} = λ_1^{k−1}(1 − λ_1) z_1 + λ_2^{k−1}(1 − λ_2) z_2.
It is easily shown that
  e_3 − (λ_1 + λ_2) e_2 + λ_1 λ_2 e_1 = 0.
Let w(α,β) = e_3 − α e_2 + β e_1.
We may ascertain values for α and β which minimize ||w(α,β)||²:
  α_0 = [ (e_3,e_2)(e_1,e_1) − (e_1,e_2)(e_3,e_1) ] / [ (e_2,e_2)(e_1,e_1) − (e_1,e_2)² ],
  β_0 = [ (e_2,e_1)(e_3,e_2) − (e_2,e_2)(e_3,e_1) ] / [ (e_2,e_2)(e_1,e_1) − (e_1,e_2)² ].
We may, for example, compute α_0 and β_0 after every ten iterations (s = 10,20,30,…) until values are obtained which do not change appreciably with s. Having determined α_0 and β_0 we may estimate λ_1 and λ_2 by using the relationships λ_1 + λ_2 = α_0 and λ_1 λ_2 = β_0. Thus
  λ_1 = ( α_0 + √(α_0² − 4β_0) ) / 2  and  λ_2 = ( α_0 − √(α_0² − 4β_0) ) / 2
are the estimated values for λ.
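A sketch of this least-squares estimate on synthetic two-mode data; the modes and decay factors below are fabricated purely to check the algebra:

```python
import numpy as np

# Synthetic two-mode differences e_k with known decay factors (in practice
# the e_k come from successive iterate differences as defined above).
rng = np.random.default_rng(0)
z1 = rng.standard_normal(50)
z2 = rng.standard_normal(50)
lam1, lam2 = 0.9, -0.5
e1, e2, e3 = (lam1**(k - 1) * (1 - lam1) * z1 +
              lam2**(k - 1) * (1 - lam2) * z2 for k in (1, 2, 3))

# Normal equations for minimizing ||e3 - a*e2 + b*e1||^2:
D = (e2 @ e2) * (e1 @ e1) - (e1 @ e2) ** 2
a0 = ((e3 @ e2) * (e1 @ e1) - (e1 @ e2) * (e3 @ e1)) / D
b0 = ((e2 @ e1) * (e3 @ e2) - (e2 @ e2) * (e3 @ e1)) / D

# lam1 + lam2 = a0 and lam1*lam2 = b0, so the lam's are the quadratic roots:
disc = np.sqrt(a0 * a0 - 4.0 * b0)
l1, l2 = (a0 + disc) / 2.0, (a0 - disc) / 2.0
```

On this exact two-mode data the procedure recovers λ_1 = 0.9 and λ_2 = −0.5 to rounding error.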
Referring to the Chebyshev polynomials with eigenvalue Z of the basic iteration (SR) matrix, let
  x = ( 2Z − (a'+b') ) / ( b' − a' ).
The corresponding error component is damped at each iteration by the factor
  λ ≈ r · cosh[ (s+t) cosh^{−1} x ] / cosh[ (s+t−1) cosh^{−1} x ] ≈ r ( x ± √(x² − 1) ),
and in either case we have
  x = ( λ/r + r/λ ) / 2.
The estimates for a and b are obtained by substitution in the above equation:
  b ≈ ½ [ (a'+b') + (b'−a')( λ_1/r + r/λ_1 )/2 ]  if λ_1 ≥ r,
  a ≈ ½ [ (a'+b') + (b'−a')( λ_2/r + r/λ_2 )/2 ]  if λ_2 ≤ −r.
This analysis is intended primarily as a guide to further study of methods for
estimating extrapolation parameters. Numerical procedures should be molded to suit
specific problems. We have indicated how a comparison of observed convergence with
theoretical convergence provides a means for updating parameters.
6. ALTERNATING-DIRECTION-IMPLICIT ITERATION
A family of non-stationary procedures is generated by splitting A into the sum of two matrices so that a two step procedure of the following type is formed:
  A = H + V,
  (H + w_j I) x_{j−½} = −(V − w_j I) x_{j−1} + k,  (23)
  (V + z_j I) x_j = −(H − z_j I) x_{j−½} + k,
  j = 1,2,…
The matrices H and V are chosen so that this iteration is numerically con-
venient. For example, five-point difference equations can be handled when H
includes horizontal (along lines of constant y-value) coupling while V includes all
vertical (along lines of constant x-value) coupling. Then (23) involves a correction
of each horizontal line treated as a block followed by correction of each vertical
line. When H and V are both positive-definite and all the $w_j$ and $z_j$ are
equal to a single positive constant, say w, convergence is easily demonstrated:
$$\underline{e}_j = T^j \underline{e}_0\,, \quad\text{where}\quad T = (V+wI)^{-1}(H-wI)(H+wI)^{-1}(V-wI)\,.$$
$T$ is similar to $T' = K(H)\,K(V)$, where $K(X) = (X-wI)(X+wI)^{-1}$ has a spectral norm
less than unity for any positive-definite $X$. Hence, the spectral radius of $T'$ is
less than unity.
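A quick numerical check of this norm bound (the eigenvalue intervals of H and V below are illustrative assumptions):

```python
# For a symmetric positive-definite X, K(X) = (X - wI)(X + wI)^{-1} has
# spectral norm max over the eigenvalues l of X of |(l - w)/(l + w)|,
# which is below one for any w > 0.  Eigenvalues here are illustrative.
def k_norm(eigs, w):
    return max(abs((l - w) / (l + w)) for l in eigs)

eigs_H, eigs_V, w = [1.0, 3.0], [2.0, 5.0], 2.0
bound = k_norm(eigs_H, w) * k_norm(eigs_V, w)  # bounds the spectral radius of T'
```

Since each factor is below one, the product bounding the spectral radius of $T'$ is below one, which is exactly the convergence argument above.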
Convergence is not assured without further conditions when $w_j$ and $z_j$ vary
with j. Pearcy's theorem asserts that by using a large enough number of parameters
in a monotonically nonincreasing sequence (within the interval of the eigenvalues
of H and V) one can obtain a spectral radius less than unity. Repetitive application
of such a parameter cycle is convergent.
The theory for this procedure is not as useful in application as theory for
SOR and polynomial extrapolation.
The analysis of the model problem, wherein H and V commute, is quite elegant.
Optimum parameters are found by solving the minimax problem:
$$g(x, \underline{w}, \underline{z}) = \prod_{j=1}^{t} \frac{w_j - x}{z_j + x} \qquad (25)$$
$$H = \max_{\substack{a \le x \le b \\ c \le y \le d}} \left| g(x, \underline{w}, \underline{z})\, g(y, \underline{z}, \underline{w}) \right| \qquad (26)$$
$$H_{00} = \min_{\underline{w}, \underline{z}} H\,.$$
The existence of a unique solution to this minimax problem was recently
established in my thesis (see the IFIP '68 proceedings for a concise summary). Several
means are available for choosing nearly optimum parameters.
It is interesting to review the literature on this problem and note how the
theory has been developed during the past fifteen years. An analytic solution for
w and z found by W.B. Jordan culminated the search for optimum parameters. Nevertheless,
this minimax problem was actually solved about 100 years earlier (as observed
by J. Todd)! Jordan first devised a bilinear transformation of variables to
reduce the problem to an analogous one with identical eigenvalue intervals for both
variables. My thesis could then be used to establish that the set of $w_j$ is identical
to the $z_j$ (except for order) for the transformed problem.
7. PARAMETERS FOR THE PEACEMAN-RACHFORD VARIANT OF ADI
The optimum parameters are obtained by theory involving modular transformations
of elliptic functions. Numerical evaluation turns out to be quite easy. An approximation
valid over a wide range is:
$$w_j \doteq \frac{2b\,(a/4b)^{r_j}\left[1 + (a/4b)^{2(1-r_j)}\right]}{1 + (a/4b)^{2 r_j}}\,, \qquad r_j = \frac{2j-1}{2t}\,, \quad j = 1, 2, \ldots, t\,. \qquad (27)$$
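A simpler, widely used relative of such formulas is the classical geometric parameter sequence $w_j = b(a/b)^{(2j-1)/2t}$ for an eigenvalue interval $[a, b]$; the sketch below uses that sequence (with arbitrary illustrative values of a, b and t) and is not necessarily identical to (27):

```python
# Nearly optimum Peaceman-Rachford parameters as a geometric sequence
# w_j = b*(a/b)**((2j-1)/(2t)), a classical choice for the eigenvalue
# interval [a, b]; the values of a, b and t below are illustrative.
def adi_parameters(a, b, t):
    return [b * (a / b) ** ((2 * j - 1) / (2.0 * t)) for j in range(1, t + 1)]

params = adi_parameters(0.1, 10.0, 4)  # decreases from near b toward a
```

The parameters interlace the interval $[a, b]$ geometrically, which is what equal-ripple reasoning on the factors $(w_j - x)/(w_j + x)$ suggests.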
To illustrate the mathematical elegance of analysis of convergence rates of
(23), we will derive the equations for generating parameters $a_j$ which solve the
minimax problem:
$$g(x, \underline{a}) = \prod_{j=1}^{t} \frac{a_j - x}{a_j + x} \qquad (28)$$
$$H = \max_{a \le x \le b} |g(x, \underline{a})|\,, \qquad H_0 = \min_{\underline{a}} H\,. \qquad (29)$$
Multiplying numerator and denominator of each factor in the product on the right
hand side of (28) by $ab/(a_j x)$, we obtain
$$g(x, \underline{a}) = \prod_{j=1}^{t} \frac{ab/x - ab/a_j}{ab/x + ab/a_j}\,. \qquad (30)$$
As x varies from a to b, $ab/x$ varies from b to a. Hence the set $\{ab/a_j\}$ is the same
as the set $\{a_j\}$ by virtue of the uniqueness of the parameters for any given eigenvalue
interval. Combining the factors with $a_j$ and $ab/a_j$, we get
$$\frac{(a_j - x)(ab/a_j - x)}{(a_j + x)(ab/a_j + x)} = \frac{(x^2 + ab) - (a_j + ab/a_j)x}{(x^2 + ab) + (a_j + ab/a_j)x} = \frac{(x + ab/x) - (a_j + ab/a_j)}{(x + ab/x) + (a_j + ab/a_j)}\,.$$
Now let $x' = \tfrac{1}{2}(x + ab/x)$ and $a_j' = \tfrac{1}{2}(a_j + ab/a_j)$. Then
$$|g(x, \underline{a})| = \prod_{j=1}^{t/2} \left|\frac{a_j' - x'}{a_j' + x'}\right| \qquad (31)$$
where $(ab)^{1/2} = a' \le x' \le b' = (a+b)/2$.
Continuing in this fashion, we successively reduce the number of factors in the
product until we arrive at the one parameter problem:
$$g(x^{(n)}, a) = \frac{a_1^{(n)} - x^{(n)}}{a_1^{(n)} + x^{(n)}}\,, \qquad a^{(n)} \le x^{(n)} \le b^{(n)}\,.$$
This is solved by noting that
$$\frac{a_1^{(n)} - a^{(n)}}{a_1^{(n)} + a^{(n)}} = \frac{b^{(n)} - a_1^{(n)}}{b^{(n)} + a_1^{(n)}}\,, \quad\text{or}\quad a_1^{(n)} = \left(a^{(n)}\, b^{(n)}\right)^{1/2}\,. \qquad (32)$$
We may work backwards to obtain a parameter "tree" by successive solution of
quadratics:
$$a_{j'}^{(s-1)} = a_j^{(s)} \pm \sqrt{\left(a_j^{(s)}\right)^2 - \left(a^{(s)}\right)^2}\,. \qquad (33)$$
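A sketch of this backward construction for $t = 2^n$ parameters (pure Python; it combines the endpoint reduction $a' = \sqrt{ab}$, $b' = (a+b)/2$ with the single-parameter solution (32) and the splitting quadratic (33)):

```python
import math

# Build the parameter "tree" for t = 2**n parameters on [a, b]: reduce
# the interval n times via (a, b) -> (sqrt(a*b), (a+b)/2), start from the
# single optimum parameter sqrt(a_n * b_n) of eq. (32), and split each
# parameter by solving the quadratic of eq. (33) on the way back down.
def parameter_tree(a, b, n):
    intervals = [(a, b)]
    for _ in range(n):
        lo, hi = intervals[-1]
        intervals.append((math.sqrt(lo * hi), (lo + hi) / 2.0))
    lo_n, hi_n = intervals[-1]
    params = [math.sqrt(lo_n * hi_n)]
    for s in range(n, 0, -1):
        prod = intervals[s - 1][0] * intervals[s - 1][1]  # = (a^(s))**2
        params = [p + sign * math.sqrt(p * p - prod)
                  for p in params for sign in (-1.0, 1.0)]
    return sorted(params)

params = parameter_tree(1.0, 4.0, 2)  # four parameters inside [1, 4]
```

Each split produces a pair $a_{j'}$, $ab/a_{j'}$ whose product is the level product, so the parameters stay inside the eigenvalue interval.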
Although (27) looks a lot simpler, this technique was developed before the
elliptic function solution was known. There is an intimate connection between this
process and Landen transformations for evaluation of elliptic functions.
Introduction to Finite Difference Approximations
to Initial Value Problems for
Partial Differential Equations
OLOF WIDLUND
New York University
This work was in part supported by the U.S. Atomic Energy Commission, Contract AT(30-1)-1480 at the
Courant Institute of Mathematical Sciences, New York University
i. Introduction
The study of partial differential equations and methods for their exact and
approximate solution is a most important part of applied mathematics, mathematical
physics and numerical analysis. One of the reasons for this is of course that very
many mathematical models of continuum physics have the form of partial differential
equations. We can mention problems of heat transfer, diffusion, wave motion and
elasticity in this context. This field of study also seems to provide a virtually
inexhaustible source of research problems of widely varying difficulty. If in
particular we consider finite difference approximations for initial value problems we
find a rapidly growing body of knowledge and a theory which frequently is as
sophisticated as the theory for partial differential equations. The work in this
field, as in all of numerical analysis, has of course been greatly influenced by the
development of the electronic computers but also very much by the recent progress in
the development of mathematical tools for problems in the theory of partial diffe-
rential equations and other parts of mathematical analysis.
Much of this progress has centered around the development of sophisticated
Fourier techniques. A typical question is the extension of a result for equations
with constant coefficients, to problems with variable coefficients. In the constant
coefficient case exponential functions are eigenfunctions and such a problem can
therefore, via a Fourier-Laplace transform, be turned into a, frequently quite
difficult, algebraic one. Much recent work in the theory of finite difference
schemes, including much of that of the author, has been greatly influenced by this
development. These techniques are usually referred to as the Fourier method and will
be the topic of several of Thomée's lectures here in Dundee. The emphasis of these
lectures will be different. We will concentrate on explaining what is known as the
energy method after a discussion of the proper choice of norm, stability definition
etc. We will also try to make some effort in relating the mathematics to the under-
lying physics and attempt to explain a philosophy of constructing classes of useful
difference approximations.
We have decided to use as simple technical tools as possible, frequently con-
centrating on simple model problems, to illustrate our points. Some generality will
undoubtedly be lost but it will hopefully make things easier to understand and
simplify the notations. A considerable amount of time will be spent on analysing
the differential equations we are approximating. Experience has shown that this is
the most convenient way to teach and work with the material. The properties of the
differential equation are almost always easier to study and a preliminary analysis
of the differential equations can frequently be translated into finite difference
form. This is particularly useful when it comes to choosing proper boundary condi-
tions for our difference schemes.
The objective of our study is essentially to develop error bounds for finite
difference schemes, methods to tell useful from less useful schemes and to give
guidelines as to how reliable classes of schemes can be found. On the simplest
level finite difference methods are generated by replacing derivatives by divided
differences, just as in the definition of a derivative, discretizing coefficient
functions and data by evaluating them at particular points or as averages over small
neighbourhoods. As we will see there are many choices involved in such discretization
processes and the quality of the approximate solutions can vary most drastically.
The finite difference approach has some definite advantages as well as disadvantages.
Thus the most one can hope, using a finite difference scheme, is to be able
to get a computer program which for any given set of data will give an accurate
answer at a reasonable cost. The detailed structure of the mapping which transforms
the data into the solution will of course in general be much too complicated to
understand. Thus the classical approach giving closed form solutions to differential
equations frequently gives much more information about the influence on the solution
of changes in data or the model. The same is true perhaps to a somewhat lesser
extent, of methods of applied mathematics such as asymptotic and series expansions.
However finite difference schemes and the closely related finite element methods
have proved most useful in many problems where exact or asymptotic solutions are
unknown or prohibitively expensive as a computational tool.
The main reference in this field is a book by Richtmyer and Morton [1967]. It
is a second edition of a book by Richtmyer [1957] which, in its theoretical part, is
based to a great extent on work by Lax and Richtmyer. The new edition is heavily
influenced by the work of Kreiss. A second part of the book discusses many specific
applications of finite difference schemes to problems of continuum physics. There
is also a survey article by Kreiss and the author [1967], with few proofs, based on
lectures by Kreiss which still awaits publication by Springer Verlag. It may still
be available from the Computer Science Department in Uppsala, Sweden. Also to be
mentioned is a classical paper by Courant, Friedrichs and Lewy [1928] which has
appeared in English translation together with three survey articles containing
useful bibliographies [1967]. Another classical paper, by John [1952], is also
very much worth a study. Among recent survey articles we mention one by Thomée
[1969]. That paper essentially discusses the Fourier method.
2. The form of the partial differential equations
We will consider partial differential equations of the form,
$$\partial_t u = P(x, t, \partial_x)\, u\,, \qquad x \in \Omega\,, \quad t \in [0, T]\,, \quad T < \infty\,,$$
where u is a vector valued function of x and t. The variable $x = (x_1, \ldots, x_s)$
varies in a region $\Omega$ which is the whole or part of the real Euclidean space $R^s$.
When $\Omega$ is all of $R^s$ we speak of a pure initial or Cauchy problem; in the opposite
case we have a mixed initial boundary value problem. The differential operator P is
defined by
$$P(x, t, \partial_x) = \sum_{|\nu| \le m} A_\nu(x, t)\, \partial_{x_1}^{\nu_1} \cdots \partial_{x_s}^{\nu_s}$$
where $|\nu| = \sum_i \nu_i$ and the matrices $A_\nu(x,t)$ have sufficiently smooth elements. The
degree of the highest derivative present, m, is called the order of the equation.
If we let the coefficients depend on u and the derivatives of u as well we say that
the problem is nonlinear.
We will restrict our attention almost exclusively to linear problems and to the
approximate calculation of classical solutions, i.e. solutions u(x,t) which are
smooth enough to satisfy our equation in the obvious sense.
In order to turn our problem into one with a possible unique solution we provide
initial values u(x,O) = f(x). It is thus quite obvious that for the heat equation
$$\partial_t u = \partial_x^2 u\,, \qquad -\infty < x < \infty\,,$$
a specification of the temperature distribution at some given time is necessary in
order to single out one solution. Frequently when $\Omega$ is not the whole space we have
to provide boundary conditions on at least part of the boundary $\partial\Omega$ of $\Omega$. Sometimes
we also have extra conditions such as in the case of the Navier-Stokes equation
where conservation of mass requires the solution to be divergence free. The
boundary conditions, the form of which might vary between different parts of the
boundary, have the form of linear (or nonlinear) relations between the different
components of the solution and their derivatives. The essential mathematical
questions are of course, whether it is possible to find a unique continuation of the
data and to study the properties of such a solution.
As in the case of systems of ordinary differential equations, we can hope to
assure the existence of at least a local solution by providing initial data, etc.
It is however, often not immediately clear how many boundary conditions should be
supplied. Clearly the addition of a linearly independent boundary condition in a
situation where we already have a unique solution will introduce a contradiction
which in general leads to nonexistence of a solution. Similarly, the removal of
a boundary condition in the same situation will in general lead to a loss of
uniqueness. Existence is clearly necessary for the problem to make sense; similarly,
to require uniqueness is just to ask for a deterministic mathematical model. The
correct number of boundary conditions as well as their form is often suggested by
physical arguments (Cf. §3).
We now mention some simple examples which will be used in illustrating the
theory. The equation
$$\partial_t u = \partial_x u$$
is the simplest possible system of first order, a class of problems of great
importance. We have already mentioned the heat equation
$$\partial_t u = \partial_x^2 u\,.$$
The equation
$$\partial_t u = i\, \partial_x^2 u$$
is a simple model for equations of Schrodinger type. It has several features in
common with
$$\partial_t^2 u = -\partial_x^4 u$$
which arises in simplified time dependent elasticity theory. Finally we list the
wave equation with one and two space variables:
$$\partial_t^2 u = \partial_x^2 u\,, \qquad \partial_t^2 u = \partial_x^2 u + \partial_y^2 u\,.$$
The last three equations do not have the form we have considered until now being
second order in t. This can however be easily remedied by the introduction of new
variables. Thus let $v = \partial_t u$ and $w = \partial_x^2 u$. Then
$$\partial_t \begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} 0 & -\partial_x^2 \\ \partial_x^2 & 0 \end{pmatrix} \begin{pmatrix} v \\ w \end{pmatrix}$$
will be equivalent to the equation $\partial_t^2 u = -\partial_x^4 u$. Initial conditions have to be
provided. We first note that u and $\partial_t u$ both have to be given, as in the case of
ordinary differential equations of second order. This is also clear from the
analogy with a mechanical system with finitely many degrees of freedom. The initial
conditions for w can be formed by taking a second derivative of u(x,0).
The wave equation, in two space variables, can be transformed into the required
form in several ways. We will mention two of these because of their importance in
the following discussion. Let us first just introduce the new variable $v = \partial_t u$ and
rewrite the wave equation as
$$\partial_t \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ \partial_x^2 + \partial_y^2 & 0 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}.$$
As we will see later on this is not a convenient form and we will instead introduce
three new variables
$$u_1 = \partial_t u\,, \quad u_2 = \partial_x u\,, \quad u_3 = \partial_y u$$
which gives the equation the form
$$\partial_t \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \partial_x \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} \partial_y \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}.$$
Initial conditions for this new system are provided as above.
In order to illustrate our discussion on the number of boundary conditions we
consider
$$\partial_t u = \partial_x u\,, \qquad \Omega = \{x;\ x \ge 0\}\,, \quad t \ge 0\,, \qquad u(x, 0) = f(x)\,.$$
If f has a continuous derivative then f(x+t) will be a solution of the equation for
$x \ge 0$, $t \ge 0$ and it can be shown that this solution is unique. The boundary consists
of the origin only. If we introduce a boundary condition at this point say u(0,t) =
g(t), a given function, we will most likely get a contradiction and thus no solution.
The situation is quite different if $\Omega = \{x;\ x \le 0\}$ (or, what is essentially the same,
$\partial_t u = -\partial_x u$ and $\Omega = \{x;\ x \ge 0\}$). The solution is still $f(x+t)$ for $x + t \le 0$, $t \ge 0$,
but in order to determine it uniquely for other points on the left halfline a boundary
condition is required. It can be given in the form $u(0,t) = g(t)$, $f(0) = g(0)$, $g(t)$
once continuously differentiable. The solution for $0 \le x + t \le t$, $t \ge 0$, will be
$g(x+t)$. Thus different $g(t)$ will give different solutions and the specification of
a boundary condition is necessary for uniqueness. The condition $f(0) = g(0)$ assures
us that no jump occurs across the line $x + t = 0$.
3. The form of the finite difference schemes.
We begin by introducing some notations. We will be dealing with functions
defined on lattices of mesh points only. For simplicity we will consider uniform
meshes: $R_h = \{x;\ x_i = n_i h,\ n_i = 0, \pm 1, \pm 2, \ldots\}$. The mesh parameter h is a measure
of the fineness of our mesh. We also discretize in the t-direction: $R_k = \{t;\ t = nk,$
$n = 0, 1, 2, \ldots\}$. When we study the convergence of our difference schemes we will
let both h and k go to zero. It is then convenient to introduce a relationship
between the timestep k and the meshwidth h of the form $k = k(h)$, $k(h)$ monotonically
decreasing when $h \to 0$, $k(0) = 0$. Often this relationship is given in the form
$k = \lambda h^m$, m = the order of the differential equation and $\lambda$ a positive constant.
The divided differences, which replace the derivatives, can be written in
terms of translation operators $T_{h,i}$ defined by
$$T_{h,i}\, v(x) = v(x + h e_i)\,,$$
where $e_i$ is the unit vector in the direction of the positive $x_i$-axis. Forward,
backward and central divided differences are now defined by
$$D_{+i}\, v(x) = \frac{1}{h}\,(T_{h,i} - I)\, v(x)$$
$$D_{-i}\, v(x) = \frac{1}{h}\,(I - T_{h,i}^{-1})\, v(x)$$
$$D_{0i}\, v(x) = \frac{1}{2h}\,(T_{h,i} - T_{h,i}^{-1})\, v(x) = \tfrac{1}{2}(D_{+i} + D_{-i})\, v(x)\,.$$
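As a one-dimensional sketch (the test function and mesh width are illustrative choices), these operators and the identity $D_0 = \tfrac{1}{2}(D_+ + D_-)$ can be checked directly:

```python
# One-dimensional divided-difference operators acting on a function v;
# the test function and mesh width below are illustrative choices.
def D_plus(v, h):
    return lambda x: (v(x + h) - v(x)) / h

def D_minus(v, h):
    return lambda x: (v(x) - v(x - h)) / h

def D_zero(v, h):
    return lambda x: (v(x + h) - v(x - h)) / (2.0 * h)

v = lambda x: x ** 3          # smooth test function, v'(1) = 3
h = 1e-3
central = D_zero(v, h)(1.0)   # second order accurate: error is exactly h**2 here
average = 0.5 * (D_plus(v, h)(1.0) + D_minus(v, h)(1.0))
```

For smooth v the central difference is second order accurate, while the one-sided differences are only first order; this gap reappears below in the accuracy of the schemes built from them.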
These difference operators serve as building blocks for our finite difference
schemes. The form of the complete schemes will become apparent as we go along.
We will now look into the mathematical derivation of the heat equation in
order to illustrate a very useful technique for generating finite difference schemes.
Let us consider heat flow in a one dimensional medium. Denote the absolute
temperature by u(x,t). The law governing the heat flow involves physical quantities,
the specific heat per unit volume K(x,t) and the heat flow constant Q(x,t). The
heat energy per unit volume is K(x,t)u(x,t) at the point x at time t. The quantity
$Q(x,t)\,\partial_x u$ is the amount of heat energy that flows per unit time across a cross
section of unit area.
Consider a segment between x and $x + \Delta x$. The amount that flows into this
segment per unit area per unit time is
$$Q(x + \Delta x, t)\, \partial_x u(x + \Delta x, t) - Q(x, t)\, \partial_x u(x, t)$$
and it must in the absence of heat sources be balanced by
$$\partial_t \int_x^{x + \Delta x} K(x', t)\, u(x', t)\, dx'\,.$$
A simple passage to the limit, after a division by $\Delta x$, gives
$$\partial_t (K u) = \partial_x (Q\, \partial_x u)\,.$$
If the slab is of finite extent, lying between x = 0 and 1, physical considerations
lead to boundary conditions. The heat flow out of the slab at x = 0 is
proportional to the difference between the inside and outside temperature $u_e$. With
an appropriate heat flow constant $Q_e$ we have a flow of heat energy at x = 0 per unit
area which is $Q_e(u - u_e)$ and the balance condition is therefore
$$Q\, \partial_x u + Q_e (u - u_e) = 0 \quad\text{at } x = 0\,.$$
If $Q_e$ is very large we get the Dirichlet condition $u = u_e$ at x = 0. Similar considerations
give a boundary condition for x = 1.
This derivation already contained certain discrete features. In order to turn
it into a strict finite difference model we have to replace the derivatives and
integral in the balance conditions by difference quotients and a numerical quadrature
formula respectively. We can get essentially the same kind of schemes by starting
off with a discrete model, dividing the medium into cells of length $\Delta x$ and giving the
discretely defined variable u the interpretation of an average temperature of a cell.
The relation between the values of the discrete variable, i.e. the difference scheme,
is then derived by the use of the basic physical balance conditions.
It is clear that we can get many different discrete schemes this way. In particular
we do not have very much guidance when it comes to a choice of a good discretization
of the t derivative. We will now examine a few possible finite difference
schemes, specializing to the case K = Q = 1 and Dirichlet boundary conditions. The
first two schemes are
$$u(x, t+k) = u(x, t) + k D_- D_+ u(x, t)$$
and
$$u(x, t+k) = u(x, t-k) + 2k D_- D_+ u(x, t)$$
for $x = h, 2h, \ldots, 1-h$, $t \in R_k$. We assume that 1/h is an integer and we provide the
schemes with the obvious initial and boundary conditions.
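A minimal sketch of the first scheme (the mesh sizes and initial function are illustrative assumptions, not taken from the text):

```python
# Forward (Euler) scheme for u_t = u_xx on (0, 1) with homogeneous
# Dirichlet data; h, k and the initial function are illustrative.
h = 0.05
k = h * h / 4.0                   # respects the restriction k/h**2 <= 1/2
n = round(1.0 / h)                # 1/h is assumed to be an integer
u = [i * h * (1.0 - i * h) for i in range(n + 1)]  # u(x,0) = x(1-x)

for _ in range(200):              # march 200 time steps
    new = u[:]
    for i in range(1, n):
        new[i] = u[i] + (k / (h * h)) * (u[i + 1] - 2.0 * u[i] + u[i - 1])
    u = new
```

With $k/h^2 \le \tfrac{1}{2}$ each new value is a mean of three old values with positive weights, so the computed maximum cannot grow; this is the discrete maximum principle discussed in §4.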
The first scheme, known as Euler's method or the forward scheme, can immediately
be used in a successive calculation of the values of u(x,t) for t = k, 2k, etc. The
second scheme requires the knowledge of at least approximate values of u(x,k) before
we can start marching. The latter scheme is an example of a multistep scheme. The
extra initial values can be provided easily by the use of a one step scheme such as
Euler's method in a first step. We could also use the first few terms of a Taylor
expansion in t about t = 0 using the differential equation and the initial value
function f(x) to compute the derivatives with respect to t. Thus atu(x,o) = a~f(x),
8~u(x,O) : 8~f(x), eta. The possible advantage in introducing this extra complication
is that the replacement of the t derivative by a centered instead of a forward diffe-
rence quotient should help to ~ke the discrete model closer to the original one.
Such considerations frequently make a great deal of sense. We will, however, see
later that our second scheme is completely useless for computations.
The difference between a finite difference scheme and the corresponding diffe-
rential equation is expressed in terms of the local truncation error which is the
inhomogeneous term which appears when we put the exact solution of the differential
equation into the difference scheme. If the solution is sufficiently smooth we can
compute an expression for this error by Taylor series expansions. We will later see
that a small local truncation error will assure us of an accurate numerical procedure
provided the difference scheme is stable. Stability is essentially a requirement of
a uniformly continuous dependence of the discrete solution on its data and it is the
lack of stability which makes our second scheme useless. We will discuss stability
at some length in §6.
These two schemes are explicit, i.e. schemes for which the value at any given
point can be calculated with the help of a few values of the solution at the
immediately preceding time levels.
Our next scheme is implicit:
$$(I - k D_- D_+)\, u(x, t+k) = u(x, t)\,.$$
It is known as the backward scheme. Each time step requires the solution of a linear
system of equations. However this system is tridiagonal and positive definite and
can therefore be solved by Cholesky decomposition or some other factorization method
at an expense which is only a constant factor greater than taking a step with an
explicit scheme. We will see that the backward scheme has a considerable advantage
over the forward scheme by being unconditionally stable, which means that its
solution will vary continuously with the data for any relation between k and h. For
the forward scheme a restriction $k/h^2 \le \tfrac{1}{2}$ is necessary in order to assure stability.
This forces us to take very many time steps per unit time for small values of h.
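One backward step then amounts to a tridiagonal solve; a sketch using simple Gaussian elimination (the grid sizes and data are illustrative assumptions):

```python
# One step of the backward (implicit Euler) scheme for u_t = u_xx with
# homogeneous Dirichlet data: solve (I - k*D_-D_+) u_new = u_old, a
# tridiagonal positive definite system, by elimination and back substitution.
def backward_step(u, h, k):
    n = len(u) - 1                   # unknowns are u[1], ..., u[n-1]
    r = k / (h * h)
    b = [1.0 + 2.0 * r] * (n - 1)    # diagonal; both off-diagonals are -r
    d = u[1:n]                       # right-hand side (boundary values are 0)
    for i in range(1, n - 1):        # forward elimination
        m = -r / b[i - 1]
        b[i] -= m * (-r)
        d[i] -= m * d[i - 1]
    x = [0.0] * (n - 1)              # back substitution
    x[-1] = d[-1] / b[-1]
    for i in range(n - 3, -1, -1):
        x[i] = (d[i] + r * x[i + 1]) / b[i]
    return [0.0] + x + [0.0]

h = 0.1
u = [i * h * (1.0 - i * h) for i in range(11)]
u = backward_step(u, h, 0.1)         # k = h is fine: unconditionally stable
```

The cost per step is a fixed small multiple of an explicit step, and the deliberately large step k = h (far beyond the forward scheme's restriction) causes no difficulty, illustrating the unconditional stability claimed above.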
Our fourth scheme can be considered as a refined version of the backward scheme:
$$(I - \tfrac{k}{2}\, D_- D_+)\, u(x, t+k) = (I + \tfrac{k}{2}\, D_- D_+)\, u(x, t)\,.$$
This scheme, known as the Crank-Nicolson scheme, is also implicit and unconditionally
stable. It treats the two time levels more equally and this is reflected in a
smaller local truncation error.
We have already come across almost all of the basic schemes which are most
useful in practice. We complement the list with the well known Dufort-Frankel scheme:
$$(1 + 2k/h^2)\, u(x, t+k) = (2k/h^2)\big(u(x+h, t) + u(x-h, t)\big) + (1 - 2k/h^2)\, u(x, t-k)\,.$$
In order to see that this two step scheme is consistent, which means formally convergent,
to the heat equation we rewrite it as
$$\big(u(x, t+k) - u(x, t-k)\big)/2k = \big(u(x+h, t) - u(x, t+k) - u(x, t-k) + u(x-h, t)\big)/h^2$$
and find by the use of Taylor expansions that it is consistent if $k/h \to 0$ when $h \to 0$.
The scheme is unconditionally stable, explicit, and suffers from low accuracy, but it is
still quite useful because of its simplicity.
Another feature of the Dufort-Frankel scheme, worth pointing out, is that the
value of u(x,t+k) does not depend on u(x+2nh,t) or u(x+(2n+1)h,t-k), n = 0, ±1, ±2, ....
We therefore have two independent calculations and we can make a 50% saving by
carrying out only one of these using a so called staggered net.
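A sketch of the Dufort-Frankel scheme run with the deliberately large step k = h (mesh sizes and data are illustrative); the forward scheme would blow up here, while this solution stays bounded:

```python
# Dufort-Frankel scheme for u_t = u_xx, run with k = h, far beyond the
# forward scheme's restriction k/h**2 <= 1/2; h and the data below are
# illustrative.  The solution remains bounded (unconditional stability),
# although with k/h held fixed the scheme is no longer consistent.
h = 0.05
k = h
n = round(1.0 / h)
r = 2.0 * k / (h * h)                       # = 2k/h**2
u_old = [i * h * (1.0 - i * h) for i in range(n + 1)]
u = u_old[:]                                # start the two-step scheme

for _ in range(100):
    new = [0.0] * (n + 1)
    for i in range(1, n):
        new[i] = (r * (u[i + 1] + u[i - 1]) + (1.0 - r) * u_old[i]) / (1.0 + r)
    u_old, u = u, new
```

Boundedness here does not mean accuracy: with k/h fixed the scheme formally approximates a damped wave equation rather than the heat equation, which is the consistency caveat noted above.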
4. An example of divergence. The maximum principle.
We will now show that consistency is not enough to ensure useful answers. In
fact we will show by a simple general argument that the error can be arbitrarily
large for any explicit scheme, consistent with the heat equation, if we allow k to go
to zero at a rate not faster than h.
Consider a pure initial value problem. The fact that our schemes are explicit
and that k/h is bounded away from zero implies that only the data on a finite subset
of the line t = 0 will influence the solution at any given point. If now, for a
fixed point (x,t), we choose an initial value function which is infinitely many times
differentiable, not identically zero but equal to zero in the finite subset mentioned
above then the solution of the difference scheme will be zero at the point for all
mesh sizes. On the other hand the solution of the differential equation equals
$$u(x, t) = \frac{1}{\sqrt{4\pi t}} \int_{-\infty}^{\infty} e^{-(x-y)^2/4t}\, f(y)\, dy\,,$$
and thus for any non negative f it is different from zero for all x and t > 0.
Using this solution formula we can prove a maximum principle,
$$\max_x |u(x, t)| \le \max_x |f(x)| \quad\text{for all } t \ge 0\,.$$
Thus, after a simple change of variables,
$$|u(x, t)| \le \frac{1}{\sqrt{4\pi t}} \int_{-\infty}^{+\infty} e^{-s^2/4t}\, |f(x-s)|\, ds \le \max_x |f(x)| \cdot \frac{1}{\sqrt{4\pi t}} \int_{-\infty}^{+\infty} e^{-s^2/4t}\, ds = \max_x |f(x)|\,.$$
This shows that the solution varies continuously with the initial values in the
maximum norm sense. This property is most essential and has a natural physical
interpretation. It means, of course, that in the absence of heat sources the maximum
temperature cannot increase with time. Similar inequalities hold for a wide
class of problems known as parabolic in Petrowskii's sense, for Cauchy as well as
mixed initial value problems. Cf. Friedman [1964].
We will now show, by simple means, that our first and third difference schemes
satisfy similar inequalities, a fact which will be most essential in deriving useful
error bounds, etc. First consider Euler's method with the restriction that $k \le h^2/2$.
The value of the solution at any point is a linear combination of the three values
at the previous time level, the weights are all positive and add up to one. Thus
the maximum cannot increase. For the third scheme we can express the value of the
solution at any point as a similar mean value of one value at the previous time
level and at those of its two neighbours. Therefore a strict maximum is possible
only on the initial line or at a boundary point. This technique can be used for
problems with variable coefficients and also in some nonlinear cases. Unfortunately
it cannot be extended to very many other schemes because it requires a positivity
of coefficients which does not hold in general.
For the finite difference schemes discussed so far we have had no problems with
the boundary conditions. They were inherited in a natural way from the differential
equation and in our computation we were never interested in using more than the next
neighbours to any given point. We could however be interested in decreasing the
local truncation error by replacing $\partial_x^2 u$ by a difference formula which uses not three
but five or even more points. This creates problems next to the boundaries where
some extra conditions have to be supplied in order for us to be able to proceed with
the calculation. It is not obvious what these extra conditions should be like.
Perhaps the most natural approach, not always successful, is to require that a
divided difference of some order of the discrete solution should be equal to zero.
This problem is similar to that which arises by the introduction of extra initial
values for multistep schemes but frequently causes much more serious complications.
If we go back to our simple first order problem $\partial_t u = \partial_x u$, we see that there
are two possibilities. Either we use a one sided difference such as
$$u(x, t+k) = u(x, t) + k D_+ u(x, t)$$
for which no extra boundary condition is needed or we try a scheme like Euler's
$$u(x, t+k) = u(x, t) + k D_0 u(x, t)$$
for which a boundary condition has to be introduced at x = O. We leave to the
reader the simple proof that, for $k/h \le 1$, the solution of our first scheme depends
continuously on its data in the sense of the maximum norm. The second scheme is,
as we will later show, unstable even for the Cauchy case. The problem to provide
extra boundary data however still remains even if we start out with a scheme which
is stable for the Cauchy case.
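A quick sketch of the one-sided scheme: for $k/h \le 1$ the new value $u(x,t) + kD_+u(x,t) = (1 - k/h)\,u(x,t) + (k/h)\,u(x+h,t)$ is a convex combination of old values, so the maximum norm cannot grow (the grid, data, and the closing of the right end of the finite array are illustrative assumptions):

```python
# One-sided scheme for u_t = u_x: with lam = k/h <= 1 each new value is
# a convex combination of two old values, so the maximum norm cannot
# increase.  Grid size, lam and the initial hat function are illustrative;
# the finite array is closed off by repeating its last value.
h, lam, n = 0.01, 0.8, 200
u = [max(0.0, 1.0 - abs(i * h - 1.0)) for i in range(n + 1)]  # hat data
m0 = max(abs(t) for t in u)

for _ in range(100):
    u = [(1.0 - lam) * u[i] + lam * u[min(i + 1, n)] for i in range(n + 1)]

m1 = max(abs(t) for t in u)   # m1 <= m0
```

The same convex-combination argument fails for the centered scheme, whose weights are not all positive, which is one way to anticipate its instability.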
We also mention another method which has some very interesting features:
$$u(x, t+k) + u(x-h, t+k) + k D_- u(x, t+k) = u(x, t) + u(x-h, t) - k D_- u(x, t)\,,$$
$$u(0, t) = 0\,, \qquad u(x, 0) = f(x)\,, \quad 0 \le x < \infty\,.$$
This difference scheme approximates $\partial_t u = -\partial_x u$ on the right half line. It has been
studied by Thomée [1962], and is also discussed in Richtmyer and Morton [1967].
It is implicit but can be solved by marching in the x-direction and could therefore
be characterized as an effectively explicit method. It is unconditionally stable.
Finally we would like to point out a class of problems for which the boundary
conditions create no difficulties namely those which have periodic solutions. This
allows us to treat every point on the mesh as though it were an interior point. In
the constant coefficient case such problems can be studied successfully by Fourier
series. The analysis of a periodic case is frequently the simplest way to get the
first information about the usefulness of a particular difference scheme.
5. The choice of norms and stability definitions
In the systematic development of a theory for partial differential equations
questions about existence and uniqueness of solutions for equations with analytic
coefficients and data play an important role. Cf. Garabedian [1964]. The well
known Cauchy-Kowaleski theorem establishes the existence of unique local solutions
for a wide class of problems of this kind. As was pointed out by Hadamard [1921],
in a famous series of lectures, such a theory is not however precise enough when
we are interested in mathematical models for physics. We also have to require, among
other things, that the solution will be continuously influenced by changes of the
data, which we of course can never hope to measure exactly. The class of analytic
functions is too narrow for our purposes and we have to work with some wider class
of functions and make a choice of norm. In most cases the maximum norm must be
considered the ideal one. We have already seen that for the heat equation such a
choice is quite convenient and that the result on continuous dependence in this norm
has a nice physical interpretation. For other types of problems we also have to be
guided by physical considerations or by the study of simplified model problems.
Hadamard essentially discussed hyperbolic equations and much of our work in
this section will be concentrated on such problems. A study of available closed
form solutions of the wave equation naturally leads to the following definition.
Definition. An initial value problem for a system of partial differential
equations is well posed in Hadamard's sense if,
(i) there exists a unique classical solution for any sufficiently smooth initial
value function,
(ii) there exists a constant q and for every finite T > 0 a constant $C_T$ such that
$$\max_x |u(x, t)| \le C_T \max_{x,\ |\nu| \le q} |\partial_x^\nu u(x, 0)|\,, \qquad t \in [0, T]\,.$$
One can ask if it is always possible to choose q equal to zero. A study of our
simplest first order hyperbolic equation $\partial_t u = \partial_x u$ gives us hope that this might be
possible, and so does an examination of the wave equation in one space variable
$$\partial_t^2 u = c^2 \partial_x^2 u\,, \qquad u(x, 0) = f(x)\,, \quad \partial_t u(x, 0) = g(x)$$
which, after integration along the rays $x = x_0 \pm ct$, is seen to have the solution
$$u(x, t) = \frac{f(x+ct) + f(x-ct)}{2} + \frac{1}{2c} \int_{x-ct}^{x+ct} g(s)\, ds\,.$$
In fact these two equations are well posed in $L_p$, $1 \le p \le \infty$, in a sense we will
soon specify.
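The solution formula can be checked numerically; here with the illustrative choices $f(x) = \sin x$, $g = 0$, $c = 2$, comparing second divided differences of the d'Alembert solution:

```python
import math

# d'Alembert's solution for f(x) = sin(x), g(x) = 0 and c = 2 (all
# illustrative choices): u(x,t) = (f(x+ct) + f(x-ct))/2.  Second divided
# differences should nearly satisfy u_tt = c**2 * u_xx.
c = 2.0
def u(x, t):
    return 0.5 * (math.sin(x + c * t) + math.sin(x - c * t))

eps = 1e-4
x, t = 0.3, 0.7
u_tt = (u(x, t + eps) - 2.0 * u(x, t) + u(x, t - eps)) / eps ** 2
u_xx = (u(x + eps, t) - 2.0 * u(x, t) + u(x - eps, t)) / eps ** 2
residual = abs(u_tt - c * c * u_xx)   # small: O(eps**2) truncation error
```

The residual is governed by the truncation error of the centered second differences, the same quantity that enters the error analysis of the difference schemes of §3.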
We will soon see that a choice of q = 0 is not possible for the wave equation
in several space variables. Before we explain this further we introduce some
concepts which we will need repeatedly.
Because of the linearity of our equations there is, for any well posed initial
value problem, a linear solution operator $E(t, t_1)$, $0 \le t_1 \le t$, which maps the solution
at time $t_1$ into the one at time t. In particular
$$u(x, t) = E(t, 0) f(x) \quad\text{if}\quad u(x, 0) = f(x)\,.$$
0he of Hagam~ra's requirements for a proper mathematical model for physics is that
the solution operator forms a semigroup i.e.
E(t,r) E(r,tl) = E(t,t) for 0 < t, ~ T ~ t .
When we deal with completely reversible physical processes the semigroup is in fact
a group. Such is, for instance, the case for wave propagation without dissipation.
We now introduce the definition of well posedness with which we will finally
choose to work.
Definition An initial value problem is well posed in $L_p$ if,
(i) there exists a unique classical solution for any sufficiently smooth
initial value function,
(ii) there exist constants C and $\alpha$ such that
$$\|E(t,t_1)f\|_p \le C \exp(\alpha(t-t_1))\, \|f\|_p .$$
By the $L_p$ norm of a vector valued function we mean the $L_p$ norm, with respect
to x, of the $l_2$ norm of the vector.
Littman [1963] has shown, by a detailed study of the solution formulas for the
wave equations, that except for one space dimension they are well posed only for
p = 2. His result has been extended to all first order systems with symmetric
constant coefficient matrices by Brenner [1966].
Theorem (Brenner [1966]). Consider equations of the form
$$\partial_t u = \sum_{\nu=1}^{s} A_\nu \partial_{x_\nu} u ,$$
$A_\nu$ constant and symmetric matrices. This system is well posed in $L_p$ for a
p ≠ 2 if and only if the matrices $A_\nu$ commute.
This leaves us with only two possibilities. Either we only have one space
variable or there is a common set of eigenvectors for the matrices $A_\nu$. In the
latter case we can introduce new dependent variables so that the system becomes
entirely uncoupled, consisting of a number of scalar equations.
Brenner's proof is quite interesting but too technical to be explained here.
Instead we will first show the well posedness in $L_2$ of symmetric first order systems
and then proceed to show that the wave equation is not well posed in $L_\infty$ for several
space variables. This will of course answer the question about the possibility of
a choice of q = 0.
We note that most hyperbolic equations of physical interest can be written as
first order systems with symmetric coefficient matrices.
Introducing the standard $L_2$ inner product we see that for any possible solution
to the equation, which disappears for large values of x,
$$\partial_t(u,u) = \sum_{\nu=1}^{s} \left[ (A_\nu \partial_{x_\nu} u, u) + (u, A_\nu \partial_{x_\nu} u) \right] .$$
Therefore, after an integration by parts,
$$\partial_t(u(t), u(t)) = -\sum_{\nu=1}^{s} (u, (\partial_{x_\nu} A_\nu) u) \le \text{const.}\,(u(t), u(t))$$
if the elements of $A_\nu(x)$ have bounded first derivatives.
From this immediately follows
$$\|u(t)\|_2 \le \exp((\text{const.}/2)t)\,\|u(0)\|_2 .$$
In particular we see that the $L_2$ norm of u(x,t) is unchanged with t if the
coefficients are constant.
The restriction to solutions which disappear at infinity is not a serious one.
Any L 2 function can be approximated arbitrarily closely by a sequence of smooth
functions which are zero outside bounded, closed sets. A generalized solution can
therefore, for any initial value in $L_2$, be defined as a limit of the sequence of
solutions generated by the smooth data. This is of course just an application of a
very standard procedure in functional analysis; for details cf. Richtmyer and Morton.
Examining the solution formula for the wave equation in one space variable we
find that information is transmitted with a finite speed less than or equal to c.
This finite speed of propagation is a characteristic of all first order hyperbolic
equations. Thus the solution at a point (x,t) is influenced solely by the initial
values on a bounded subset of the plane t = O. This subset is known as the domain
of dependence of the point. Similarly any point on the initial plane will, for a
fixed t, only influence points in a bounded subset.
We also see, from the same solution formula, that the boundary of the domain
of dependence is of particular importance. This property is shared by other hyper-
bolic equations. In particular for the wave equation in 3, 5, 7, ... space variables
the value at any particular point on the initial plane will, for a fixed t, only
influence the solution on the surface of a certain sphere. This result, known as
Huygens' principle, can be proved by a careful study of the solution formula of the
wave equation. It is of course also well known from physics. Cf. Garabedian [1964].
We have now carried out the necessary preparations for our proof that the wave
equation in three space dimensions cannot be well posed in $L_\infty$. We write the
equation in the form of a symmetric first order system. We choose for all compo-
nents of the solution the same spherically symmetric class of initial values, namely
$C^\infty$ functions which are equal to one for $r \le \varepsilon/2$, zero for $r \ge \varepsilon$, and having values
between 0 and 1 for other values of r. The spherical symmetry of the initial values
will lead to solutions the values of which depend only on the distance r from the
origin and on the parameter $\varepsilon$. It is easy to show that $\partial_t u,\ \partial_{x_i}u$, i = 1,2,3, are
solutions of the wave equation and that they therefore satisfy Huygens' principle.
Therefore, the solution at t = 1/c will be zero except for values of r between $1 - \varepsilon$
and $1 + \varepsilon$. We know that the $L_2$ norm of the solution is unchanged and it is easy to
see that it is proportional to $\varepsilon^{3/2}$. Now suppose that the equation is well posed
in $L_\infty$. This means not only that the maximum of all components of the solution at
t = 1/c is bounded from above by a constant independent of $\varepsilon$ but also that the norm
of the solution is bounded away from zero uniformly. If this were not the case we
could solve our wave equation backwards and we could not have both well posedness
in $L_\infty$ and a solution of the order one at t = 0.
Denote by $C_1$ the maximum of the component of largest absolute value at t = 1/c.
This point has to have a neighbour at a distance no larger than constant $\times\ \varepsilon^3$ where
the value is less than $C_1/2$. In the opposite case the $L_2$ norm of the solution could
not be of the order $\varepsilon^{3/2}$ because, by the spherical symmetry, the volume for which the
solution is larger than $C_1/2$ would exceed constant $\times\ \varepsilon^3$. Our argument thus shows
that the signals have to become sharper; in other words, the gradient of the solution
increases. At t = 0 it is of the order $1/\varepsilon$ and it has to be proportional to $\varepsilon^{-3}$
at t = 1/c. This however contradicts our assumptions, because first derivatives of a
solution are also solutions of the wave equation and their maximum cannot grow by more
than a constant factor. Thus the wave equation in three space variables is not well
posed in $L_\infty$ and a choice of q > 0 sometimes has to be made.
The well posedness of the wave equation in $L_2$ has a nice interpretation in terms
of physics. In one dimension, for example, the kinetic energy is $(\rho/2)\int(\partial_t u)^2\,dx$ and
the potential energy is $(T/2)\int(\partial_x u)^2\,dx$, where $T/\rho = c^2$ = the square of the speed of
propagation. The total energy is therefore $(\rho/2)\int\big((\partial_t u)^2 + c^2(\partial_x u)^2\big)\,dx$ and it
remains unchanged in time in the absence of energy sources, i.e. inhomogeneous or
boundary terms. In fact we note that our proof of the well posedness in $L_2$ of
symmetric first order systems is our first application of what is known as the energy
method.
The fact that all first order hyperbolic equations have finite speeds of propa-
gation has immediate implications for explicit finite difference schemes. Thus,
just as for explicit methods for parabolic problems, we have to impose certain
restrictions on the relation between k and h in order to avoid divergence. The
appropriate condition has the form k/h sufficiently small. It is thus less restric-
tive than the condition in the parabolic case. This is known as the Courant-Friedrichs-
Lewy condition and simply means that, for sufficiently small values of h, any point
of the domain of dependence of the differential equation is arbitrarily close to
points belonging to the domain of dependence of the difference scheme. It is easy
to understand how in the opposite case we can construct initial value functions
which will give us arbitrarily large errors at certain points.
Experience also shows that it is advisable to use schemes and values of k/h which
allow us to have the domains of dependence coincide as much as possible while satisfying
the Courant-Friedrichs-Lewy condition. This is related to the particular importance
of the boundary of the domain of dependence mentioned above. However it should be
pointed out that this is hard to achieve to any great extent when several propagation
speeds are involved and when they vary from point to point.
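The Courant-Friedrichs-Lewy restriction can be observed experimentally with the one-sided scheme $v(x,t+k) = v(x,t) + \lambda(v(x+h,t) - v(x,t))$, $\lambda = k/h$, for $\partial_t u = \partial_x u$ on a periodic mesh (a sketch; the grid size, step count, and point-source data are our own arbitrary choices):

```python
# One-sided ("upwind") scheme for u_t = u_x on a periodic grid.  The domain
# of dependence argument requires lam = k/h <= 1; for lam > 1 a point source
# is amplified violently, for lam <= 1 the scheme is a convex combination of
# old values and the maximum cannot grow.
def upwind_max(lam, m=50, steps=60):
    v = [0.0] * m
    v[0] = 1.0  # mesh "point source" containing all frequencies
    for _ in range(steps):
        v = [(1 - lam) * v[j] + lam * v[(j + 1) % m] for j in range(m)]
    return max(abs(x) for x in v)
```

For lam = 0.9 the maximum stays bounded by one; for lam = 1.5 it reaches astronomical size within a few dozen steps.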
It should be mentioned that one can show that any first order problem which is
well posed in $L_2$ (or any $L_p$ space) is well posed in Hadamard's sense. Therefore
Hadamard's definition is less restrictive than the other one. The proof is by
showing that derivatives of solutions also satisfy well posed first order problems
and the use of a so called Sobolev inequality.
A result by Lax [1957] gives an interesting sidelight on the close relation
between the questions of well posedness, existence and uniqueness. We describe it
without a proof. Thus if a first order system with analytic coefficients, and not
necessarily hyperbolic, has a unique solution for any infinitely differentiable
initial value then it must be properly posed in Hadamard's sense. A corollary is
that Cauchy's problem for Laplace's equation, which can be rewritten as the Cauchy-
Riemann equations and which is the most common example of a problem which is ill
posed, cannot be solved for all smooth initial data.
Another interesting fact is that it can be shown that homogeneous wave motion,
satisfying obvious physical conditions such as finite speed of propagation etc.,
has to satisfy a hyperbolic differential equation of first order. This gives added
insight into the importance of partial differential equations in the description of
nature. For details we refer to Lax [1963].
We still face a choice between the two definitions of well posedness. Hadamard's
choice has the advantage of being equivalent to the Petrowskii condition in the case of
constant coefficients. The Petrowskii condition states that the real part of the
eigenvalues of the symbol $\hat{P}$ of our differential operator P should be bounded from
above. The symbol is defined by
$$\hat{P}(\xi) = \exp(-i\langle \xi, x\rangle)\, P\, \exp(i\langle \xi, x\rangle) , \qquad \xi \in R^s , \quad \langle \xi, x\rangle = \sum_{i=1}^{s} \xi_i x_i ,$$
and is thus a matrix valued polynomial in $\xi$. This algebraic condition is most
natural for the constant coefficient case. We are immediately led to it if we start
looking for special solutions of the form
$$\exp(\lambda t)\, \exp(i\langle \xi, x\rangle)\, \varphi , \qquad \varphi \text{ some vector} .$$
Thomée will probably discuss these matters in much more detail. For a proof of the
equivalence between the Hadamard and Petrowskii conditions cf. Gelfand and Shilov
[1964].
Due to the efforts of Kreiss [1959], [1963] four algebraic conditions which are
equivalent to well posedness in $L_2$ are known in the constant coefficient case. The
full story is quite involved and subtle. We only mention one of these conditions.
Thus a constant coefficient problem is well posed in $L_2$ if for some constants $\alpha$ and
K and for all s such that $\mathrm{Re}\, s > \alpha$
$$\|(sI - \hat{P}(\xi))^{-1}\| \le K/(\mathrm{Re}\, s - \alpha) .$$
We leave it to the reader to verify, using this and the Petrowskii conditions, that our
first attempt to rewrite the wave equation is well posed in Hadamard's sense but not
in $L_2$. The intuitive reason why our second attempt was more successful is that the
new variables naturally defined a norm which defines the energy of the system while
no similar physical interpretation can be made in the first case.
The algebraic conditions just introduced are about the simplest possible criteria
we can hope to find to test whether or not a differential equation is well posed.
Analogous criteria have been developed for finite difference schemes. We will now
try to find out if they could be used for problems with variable coefficients as well.
It is known from computational experience that instabilities tend to develop
locally and it is therefore natural to hope that a detailed knowledge of problems
with constant coefficients, obtained by freezing the coefficients at fixed points,
should provide a useful guide to problems with variable or even nonlinear coefficients.
The constant coefficient problems can be treated conclusively by the Fourier trans-
form.
This idea is quite sound for first order and parabolic problems provided our
theory is based on the second definition of well posedness. It is however not true
for all problems. This is illustrated by the following example, due to Strang [1966],
$$\partial_t u = i\partial_x(\sin x\, \partial_x u) = i \sin x\, \partial_x^2 u + i \cos x\, \partial_x u .$$
This is well posed in $L_2$ because, using the scalar product $(u,v) = \int u \bar{v}\, dx$ and
integration by parts,
$$\partial_t(u(t), u(t)) = 0$$
for any possible solution. However if we freeze the coefficients at x = 0 we get
$$\partial_t u = i\partial_x u$$
which violates the Petrowskii condition.
For a more detailed discussion we refer to Strang's paper.
The main criticism of the Petrowskii condition is that it is not stable against
perturbations or a change of variables. This is illustrated by the following example,
due to Kreiss [1963],
$$\partial_t u = U(t) \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} U^{-1}(t)\, \partial_x u , \qquad U(t) = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix} .$$
It is easy to see that the eigenvalues of the symbol, for all t, lie on the imaginary
axis. The equation is however far from well posed. To see this we change the
dependent variables by introducing $v(t) = U^{-1}(t)u(t)$. This gives us a system with
constant coefficients
$$\partial_t v = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \partial_x v - \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} v ,$$
after some calculations. The eigenvalues of its symbol equal $i\xi \pm i\sqrt{1 - i\xi}$
and the Petrowskii condition is therefore violated.
In itself there is nothing wrong with Hadamard's definition. It is however
much more convenient to base the theory on the other definition. We will soon see
that an addition of a zero order term, which is essentially what happens in our
example above, will not change a problem well posed in $L_p$ into an ill posed one.
It is possible, by present day techniques, to answer questions on admissible
perturbations for certain problems, even with variable coefficients, for which a
loss of derivatives as in Hadamard's definition is unavoidable. A class of problems
of this nature is the so called weakly hyperbolic equations. Some of them are of
physical interest. These questions are very difficult and we therefore conclude
that if there is a chance, possibly by a change of variables, to get a problem which
is well posed in $L_p$ we should take it.
One of the main conclusions of this long story is that we have to live with $L_2$
norms in the first order case. It is well known that an $L_2$ function might be
unbounded in $L_\infty$ and the error bounds in $L_2$ which we will derive shortly might there-
fore look quite pointless. At the end of this series of talks we will however see
that an assumption of some extra smoothness of the solution of the differential
equation will enable us to get quite satisfactory bounds in the maximum norm as well.
In this section we have seen examples of the use of conservation laws; the
energy was conserved for the wave equation. Similar considerations went into the
derivation of the heat equation. There is an ongoing controversy whether the discrete
models necessarily should be made to satisfy one or more laws of this kind. First
of all, it is of course not always possible to build all conservation laws into a
difference scheme because the differential equation might have an infinite number of
them. Secondly a distinction should be made between problems which have sufficiently
smooth solutions and those which do not. In the latter case the fulfilment of the
most important conservation laws often seems an almost necessary requirement,
especially in nonlinear problems. When we have smooth solutions we are however
frequently better off choosing from a wider class of schemes. The error bounds,
soon to be developed, give quite good information on convergence, etc., and it
might even be argued that the accuracy of a scheme, not designed to fulfil a certain
conservation law, might conveniently be checked during the course of a computation
by calculating the appropriate quantity.
6. Stability, error bounds and a perturbation theorem
As in the case of a linear differential equation we can introduce a solution
operator $E_h(nk, n_1 k)$, $0 \le n_1 \le n$, for any finite difference scheme. It is the
mapping of the approximate solution at $t = n_1 k$ into the one at $t = nk$. For explicit
schemes the solution operator is just a product of the particular difference
operators on the various time levels.
Let us write a one step implicit scheme symbolically as
$$(I + kQ_{-1})\, u(x,t+k) = (I + kQ_0)\, u(x,t)$$
where $Q_0$ and $Q_{-1}$ are difference operators, and assume that $(I + kQ_{-1})^{-1}$ exists and
is uniformly bounded in the norm to be considered. Then
$$E_h(t+k,t) = (I + kQ_{-1})^{-1} (I + kQ_0)$$
and there is no difficulty in writing up a formula for $E_h(nk, n_1 k)$.
A simple device enables us to write multistep schemes as one step systems. We
illustrate this by changing the second difference scheme of Section 3 into this form.
Introducing the vector variable
$$v(x,t) = \begin{pmatrix} u(x,t+k) \\ u(x,t) \end{pmatrix}$$
the difference scheme takes the form
$$v(x,t+k) = \begin{pmatrix} 2kD_-D_+ & I \\ I & 0 \end{pmatrix} v(x,t) .$$
The same device works for any multistep scheme and also when we have a system of
difference equations. When we speak about the solution operator for a multistep
scheme we will always mean the solution operator of the corresponding one step system.
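The block step above can be sketched in a few lines on a periodic mesh (a sketch; the function name and grid parameters are our own choices):

```python
# One-step (companion) form of the two-step scheme
#   u(x,t+2k) = u(x,t) + 2k D-D+ u(x,t+k)
# on a periodic grid; v stacks the two most recent time levels (u_new, u_old).
def step_once(u_new, u_old, k, h):
    """One application of the block scheme to v = (u_new, u_old)."""
    m = len(u_new)
    # D-D+ applied to the newest level: the standard second difference.
    d2 = [(u_new[(j + 1) % m] - 2 * u_new[j] + u_new[(j - 1) % m]) / h**2
          for j in range(m)]
    return [u_old[j] + 2 * k * d2[j] for j in range(m)], u_new
```

Constant data is a steady state of the scheme, which gives an immediate sanity check.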
We will now introduce our stability definitions. Stability is nothing but the
proper finite difference analogue of well posedness.
Definition A finite difference scheme is stable in $L_p$ if there exist constants $\alpha$
and C such that
$$\|E_h(nk, n_1 k)f\|_{p,h} \le C \exp(\alpha(nk - n_1 k))\, \|f\|_{p,h} .$$
The finite difference schemes are defined at mesh points only. Therefore we
use a discrete $L_p$ norm in this context, defined by
$$\|u\|_{p,h} = \Big( \sum_{x \in R_h} h^s\, |u(x)|^p \Big)^{1/p} .$$
It should be stressed that, for each individual mesh size, all our operators
are bounded. Therefore the non trivial feature of the definitions is that the
constants C and $\alpha$ are independent of the mesh sizes as well as the initial values.
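The discrete norm just defined is trivial to compute (a sketch; the function name and the flat list representation of the mesh function are our own):

```python
def lp_h_norm(u, h, p=2, s=1):
    """Discrete L_p norm ||u||_{p,h} = (sum_x h^s |u(x)|^p)^(1/p)
    for a mesh function u given as a list of values on a mesh in R^s."""
    return sum(h**s * abs(x)**p for x in u) ** (1.0 / p)
```

For example, with h = 1 the values (3, 4) have discrete $L_2$ norm 5.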
Frequently we will use another stability definition.
Definition A scheme is strongly stable with respect to a norm $|||\cdot|||_{p,h}$,
uniformly equivalent to $\|\cdot\|_{p,h}$, if there exists a constant $\alpha$ such that
$$|||E_h((n+1)k, nk)f|||_{p,h} \le (1 + \alpha k)\, |||f|||_{p,h} .$$
We recall that $\|\cdot\|_{p,h}$ and $|||\cdot|||_{p,h}$ are uniformly equivalent norms if there
exists a constant C > 0, independent of h, such that
$$(1/C)\,\|f\|_{p,h} \le |||f|||_{p,h} \le C\,\|f\|_{p,h}$$
for all $f \in L_{p,h}$.
It is easy to verify that a scheme strongly stable with respect to some norm is
stable. The strong stability reflects an effort to control the growth of the solution
on a local level. Note that we have already established that certain finite diffe-
rence approximations to the heat equation are strongly stable with respect to the
maximum norm. Our proof that the $L_2$ norm of any solution of a symmetric first order
system has a limited growth rate gives hope that certain difference schemes for such
problems will turn out to be strongly stable with respect to the $L_{2,h}$ norm.
In many cases we will however be forced to choose a norm different from $\|\cdot\|_{p,h}$
in order to assure strong stability. For a discussion of this difficult subject we
refer to Kreiss [1962] and Richtmyer and Morton [1967].
For any stable scheme, the coefficients of which do not depend on n, there exists
a norm with respect to which the scheme is strongly stable. This can be shown by
the following trick which the author learned from Vidar Thomée.
The fact that the coefficients of the difference schemes do not depend on time
makes $E_h(nk, n_1 k)$ a function of $n - n_1$ only. We can therefore write it as $E_h(nk - n_1 k)$.
Introduce
$$|||f|||_{p,h} = \sup_{l \ge 0} \|e^{-\alpha lk} E_h(lk)f\|_{p,h} .$$
It is easy to show that this is a norm. It is equivalent to $\|\cdot\|_{p,h}$ because, by
stability and a choice of l = 0,
$$\|f\|_{p,h} \le |||f|||_{p,h} \le C\,\|f\|_{p,h} .$$
Our difference scheme is clearly strongly stable because
$$|||E_h(k)f|||_{p,h} = \sup_{l \ge 0} \|e^{-\alpha lk} E_h(lk)E_h(k)f\|_{p,h}
= e^{\alpha k} \sup_{l \ge 0} \|e^{-\alpha(l+1)k} E_h((l+1)k)f\|_{p,h} \le e^{\alpha k}\, |||f|||_{p,h} .$$
We could consider using a weaker stability definition. A closer study gives the
following analogue of the Hadamard condition.
Definition A finite difference scheme is weakly stable in $L_p$ if there exist
constants $\alpha$, C and q such that
$$\|E_h(nk, n_1 k)f\|_{p,h} \le C(n - n_1 + 1)^q \exp(\alpha(nk - n_1 k))\, \|f\|_{p,h} .$$
A theory based on this definition would however suffer from the same weakness
as one based on the Hadamard definition of well posedness. For a detailed discussion
cf. Kreiss [1962] or Richtmyer and Morton [1967]. It is also clear that, in general,
we will stay closer to the laws of physics if we choose to work with the stronger
stability definitions.
This far we have only dealt with homogeneous problems. Going over to the inhomo-
geneous case is however quite simple. We demonstrate this for an explicit scheme
$$u(x,t+k) = u(x,t) + kQ_0 u(x,t) + kF(t) , \qquad u(x,0) = f(x) .$$
Using the solution operator we get
$$u(x,nk) = E_h(nk,0)f(x) + k\sum_{\nu=1}^{n} E_h(nk,\nu k)\, F((\nu-1)k) .$$
If the scheme is stable in $L_p$ we get
$$\|u(nk)\|_{p,h} \le C\Big(\exp(\alpha nk)\,\|f\|_{p,h} + k\sum_{\nu=1}^{n} e^{\alpha(nk-\nu k)} \max_{t\in[0,nk]} \|F(t)\|_{p,h}\Big) .$$
Now
$$k\sum_{\nu=1}^{n} e^{\alpha(nk-\nu k)} = \begin{cases} k(1 - e^{\alpha nk})/(1 - e^{\alpha k}) , & \text{if } \alpha \ne 0 , \\ nk , & \text{if } \alpha = 0 . \end{cases}$$
Notice that this is really essentially a matter of computing compound interest
on an original capital $\|f\|$ and periodic savings $\|F(t)\|$. The formalism is known
as Duhamel's principle.
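The closed form of the geometric sum just used can be checked against direct summation (a sketch; the function name and the sample parameters are our own):

```python
import math

def duhamel_weight(alpha, k, n):
    """k * sum_{nu=1}^{n} exp(alpha*(nk - nu*k)) in closed form:
    k(1 - e^{alpha n k})/(1 - e^{alpha k}) for alpha != 0, and nk for alpha = 0."""
    if alpha == 0.0:
        return n * k
    return k * (1 - math.exp(alpha * n * k)) / (1 - math.exp(alpha * k))
```

Direct summation of the twenty terms for, say, alpha = 0.3 and k = 0.1 agrees with the closed form to rounding error.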
Its most common application is to error bounds for finite difference schemes.
Put the solution of the differential equation into the difference scheme. As was
pointed out before we will then get an extra inhomogeneous term, the local truncation
error, of the form $k\tau(x,nk,h)$. Introduce the error, which is the difference between
the approximate and the exact solution. Subtract the two difference equations.
The error will then satisfy the same difference equation with the truncation error
as an inhomogeneous term. It is easy to see from our estimate that we have conver-
gence in $L_p$ for $0 \le t \le T$ if $\max_{0\le nk\le T} \|\tau(nk,h)\|_{p,h}$ goes uniformly to zero with the
mesh size and that we have a rate of convergence $h^r$ if $\tau(nk,h) = O(h^r)$ uniformly.
We recall that $\tau$ can be computed using just Taylor series.
We now turn to the theorem on perturbation mentioned in the previous section.
For simplicity we give a proof only for time independent coefficients.
Theorem Consider a finite difference scheme
$$u(x,(n+1)k) = Q\, u(x,nk) ,$$
stable in $L_p$, satisfying
$$\|u(nk)\|_{p,h} \le C \exp(\alpha nk)\, \|u(0)\|_{p,h} .$$
Let $Q'$ be an operator, uniformly bounded in $L_{p,h}$. Then solutions of
$$v(x,(n+1)k) = (Q + kQ')\, v(x,nk)$$
satisfy
$$\|v(\nu k)\|_{p,h} \le C \exp(\gamma \nu k)\, \|v(0)\|_{p,h}$$
with
$$\gamma = \alpha + C e^{-\alpha k}\, \|Q'\|_{p,h} .$$
Proof Consider the $2^\nu$ terms in the development of $e^{-\alpha\nu k}(Q + kQ')^\nu = (e^{-\alpha k}Q + ke^{-\alpha k}Q')^\nu$.
A term containing j factors $ke^{-\alpha k}Q'$ contains at most (j+1) factors of the form
$(e^{-\alpha k}Q)^m$, m some natural number.
The norm of such a term is by our assumption bounded by $k^j C^{j+1} \|Q'\|^j e^{-\alpha kj}$.
Thus
$$e^{-\alpha\nu k}\,\|(Q + kQ')^\nu\| \le \sum_{j=0}^{\nu} \binom{\nu}{j} k^j C^{j+1} \|Q'\|^j e^{-\alpha kj}
= C(1 + kCe^{-\alpha k}\|Q'\|)^\nu \le C e^{\tilde{\gamma}\nu k} , \quad \tilde{\gamma} = Ce^{-\alpha k}\|Q'\| ,$$
which is the desired estimate with $\gamma = \alpha + \tilde{\gamma}$.
Thus we have verified a finite difference version of our perturbation theorem.
A differential equation theorem can now be derived by taking, for any particular
differential equation, a stable difference scheme, applying the theorem just proved
and taking a limit by letting h and k go to zero.
We have not shown that a stable scheme can always be found for any well posed
differential equation but that is in fact the case. Also notice that there is
never any chance to derive stronger inequalities for a finite difference scheme than
those which hold for the corresponding differential equation. The most we can hope
is to get exact analogues of what is true in the continuous case.
We leave it to the reader to give another proof of our latest theorem using
the norm $|||\cdot|||_{p,h}$ which we constructed in our proof that any stable scheme is
strongly stable with respect to at least one norm.
7. The von Neumann condition, dissipative and multistep schemes.
In this section we will use Fourier series to derive stability conditions and
also introduce a number of useful schemes for hyperbolic equations.
We first consider the periodic problem $\partial_t u = \partial_x u$, $t \ge 0$, $u(x,0) = f(x) = f(x+2\pi)$.
If f(x) is sufficiently smooth it can be developed in a convergent Fourier
series
$$f(x) = \sum_{\nu=-\infty}^{+\infty} a_\nu e^{i\nu x} .$$
The solution of the equation takes the form
$$u(x,t) = \sum_{\nu=-\infty}^{+\infty} a_\nu e^{i\nu(x+t)} .$$
The Euler difference scheme
$$v(x,t+k) = v(x,t) + kD_0 v(x,t) , \qquad v(x,0) = f(x)$$
can be studied in the same way. Its solution is
$$v(x,nk) = \sum_{\nu=-\infty}^{+\infty} a_\nu (1 + i\lambda \sin \nu h)^n e^{i\nu x}$$
where $\lambda = k/h$.
In contrast to the differential equation case the amplitude of the Fourier com-
ponents will grow. This is so because $|1 + i\lambda \sin \nu h| = \sqrt{1 + \lambda^2 \sin^2 \nu h} > 1$ for
$\sin \nu h \ne 0$. From our discussion of the Courant-Friedrichs-Lewy condition we know that
$\lambda = 1$ would be ideal and we see that choosing $\lambda$ equal to a constant will lead to
very rapidly growing high frequency components. It is easy to show that for many
very smooth initial value functions this very strong amplification of high frequency
components will lead to arbitrarily large and wildly divergent approximate solutions.
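The growth of a single Fourier mode under the Euler scheme is easily computed (a sketch; the function name and sample frequencies are our own choices):

```python
import cmath, math

def euler_growth(lam, nu_h, n):
    """Modulus of (1 + i*lam*sin(nu*h))**n, the amplification after n steps
    of the Fourier mode with phase angle nu_h = nu*h."""
    return abs((1 + 1j * lam * math.sin(nu_h)) ** n)
```

The constant mode is untouched, while for lam = 1 the mode with $\nu h = \pi/2$ grows by $\sqrt{2}$ per step and is astronomically large after a hundred steps.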
The amplification of the lowest frequency modes is however not very large.
This might lead us to the following idea. Replace the initial value function by a
fixed partial sum of its Fourier series. If the initial data is sufficiently smooth
we can do this, changing the values of the initial value function and the solution
by an arbitrarily small amount. If we use the new initial value for the finite
difference scheme we can see, from the explicit solution formula, that the discrete
solution will converge to the correct one when h goes to zero. In fact the same
argument shows that we could proceed with a $\lambda$ larger than 1 because for any constant $\lambda$,
$(1 + i\lambda \sin \nu h)^n$ will converge to $e^{i\nu nk}$ for any fixed value of $\nu$.
This approach however suffers from the same weakness as a theory for differen-
tial equations based on analytic functions only. In fact a finite Fourier series
represents an analytic function and much of Hadamard's criticism of the Cauchy-
Kowaleski type theory carries over to the finite difference case.
To see that something is drastically wrong with our argument above we urge the
reader to carry out a few steps with the Euler method using $\lambda = 10$ and an initial
value function which is $\varepsilon$ at one mesh point and zero elsewhere. The rapid growth
of this special solution will assure us of a totally unacceptable growth of round
off errors. One could say that the errors of measurement which played an important
part in Hadamard's argument are replaced by the round off errors. From the error
bound in Section 6 we see that we will not be seriously affected by round off errors
if a difference scheme is stable. This is thus a reason, perhaps the most impor-
tant one, why we insist on using only stable schemes for computations.
There is a simple remedy for the lack of stability of the Euler scheme namely
the addition of a so called dissipation term. The term corresponds to a finite
difference approximation of yet another term in the Taylor expansion of u(x,t+k)
with respect to t. This way we get the Lax-Wendroff scheme.
$$v(x,t+k) = v(x,t) + kD_0 v(x,t) + (k^2/2) D_-D_+ v(x,t) , \qquad v(x,0) = f(x) .$$
The coefficient of $e^{i\nu x}$ is now amplified by a factor
$$1 + i\lambda \sin \nu h - 2\lambda^2 \sin^2(\nu h/2)$$
in each step.
It is elementary to verify that this factor is less than or equal to one in absolute
value for $\lambda \le 1$. This clearly ensures strong $L_2$ stability. From this, convergence
follows as well as a relative insensitivity to round off errors.
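The claim about the Lax-Wendroff amplification factor can be checked by scanning the frequencies (a sketch; the function name and sampling resolution are our own):

```python
import math

def lw_max_amplification(lam, samples=1000):
    """Maximum over sampled frequencies th = nu*h of
    |1 + i*lam*sin(th) - 2*lam^2*sin^2(th/2)|, the Lax-Wendroff factor."""
    best = 0.0
    for j in range(samples + 1):
        th = 2 * math.pi * j / samples
        g = complex(1 - 2 * lam**2 * math.sin(th / 2) ** 2,
                    lam * math.sin(th))
        best = max(best, abs(g))
    return best
```

For lam below one the maximum never exceeds one, while already at lam = 1.4 the mode at $\nu h = \pi$ is amplified by almost three per step.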
We have now developed a simple technical tool which allows us to decide the
qualities of all the schemes suggested for the heat equation. The results will be
revealed shortly.
The dissipation of the Lax-Wendroff scheme acts to damp out the higher frequency
modes of the solution and this is frequently just as well because they must contain
rather serious phase errors. For sufficiently smooth solutions, which means quickly
decreasing Fourier coefficients with increasing $\nu$, these modes play a very unimpor-
tant part in the representation of the solution. Similar considerations frequently
make very much sense in cases when we do not have constant coefficients. Heuristi-
cally we can argue that the variability of the coefficients will make various
Fourier modes interact in a way which is very hard to analyse. However we can
expect serious phase errors for high modes. Not only will these components be in
error but they will interact with other components in a totally erroneous way. In
such a case it seems advisable to damp out such modes in the discrete model.
Sometimes however we are quite anxious to have an energy preserving discrete
model. This is for instance the case when we have to calculate over long periods of
time and with only very weak forcing functions. One simple scheme which preserves
the energy for the case $\partial_t u = \partial_x u$ is the leap frog scheme, also known as the mid-point
rule when it is used for ordinary differential equations,
$$v(x,t+k) = v(x,t-k) + 2kD_0\, v(x,t) .$$
Another one is the Crank-Nicolson scheme
$$(I - (k/2)D_0)\, v(x,t+k) = (I + (k/2)D_0)\, v(x,t) .$$
Fourier analysis shows that the amplification per step for the Crank-Nicolson scheme
is $(1 + i(\lambda/2)\sin \nu h)/(1 - i(\lambda/2)\sin \nu h)$ and thus that the amplitude is preserved.
For the leap frog scheme we look for solutions of the form $v(x,nk) = \mu^n e^{i\nu x}$ and
get, by the solution of a quadratic equation, the two roots
$$\mu_{1,2} = i\lambda \sin \nu h \pm \sqrt{1 - \lambda^2 \sin^2 \nu h}$$
which, for $\lambda \le 1$, both lie on the unit circle. The multistep
character is reflected in the existence of two independent solutions.
A similar analysis for the backward scheme shows that $|1 - i\lambda \sin \nu h|^{-1} \le 1$, which
again implies strong stability.
We now turn to a Fourier analysis of the schemes suggested for the heat equation.
Using $\lambda$ for $k/h^2$ we find that the amplification factor for Euler's method is
$1 - 4\lambda \sin^2(\nu h/2)$; thus it is stable for $\lambda \le \tfrac{1}{2}$. For the Crank-Nicolson scheme it is
$(1 - 2\lambda \sin^2(\nu h/2))/(1 + 2\lambda \sin^2(\nu h/2))$; unconditionally stable. For the backward
scheme $1/(1 + 4\lambda \sin^2(\nu h/2))$; unconditionally stable. For the mid-point rule an
Ansatz of the form $\mu^n e^{i\nu x}$ leads to $\mu_{1,2} = -4\lambda \sin^2(\nu h/2) \pm \sqrt{1 + (4\lambda \sin^2(\nu h/2))^2}$,
which shows that the scheme is unstable for any constant $\lambda$. In a similar way we
could also show that the Dufort-Frankel scheme is unconditionally stable.
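The three one step amplification factors just listed can be tabulated side by side (a sketch; the function name is our own):

```python
import math

def heat_factors(lam, th):
    """Amplification factors (lam = k/h^2, th = nu*h) of Euler's method,
    Crank-Nicolson and the backward scheme for the heat equation."""
    s2 = math.sin(th / 2) ** 2
    euler = 1 - 4 * lam * s2
    crank_nicolson = (1 - 2 * lam * s2) / (1 + 2 * lam * s2)
    backward = 1 / (1 + 4 * lam * s2)
    return euler, crank_nicolson, backward
```

At the worst frequency $\nu h = \pi$, Euler's factor leaves [-1, 1] as soon as lam exceeds one half, while the other two factors stay inside for every lam.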
Another interesting method, of fourth order accuracy in time, is Milne's method
$$(I - \tfrac{k}{3}Q)\, v(x,t+k) = (I + \tfrac{k}{3}Q)\, v(x,t-k) + \tfrac{4k}{3}Q\, v(x,t)$$
where Q stands for $D_0$ or $D_+D_-$. The roots of the corresponding quadratic equation
are
$$\mu_{1,2} = \Big(\tfrac{2}{3}kq \pm \sqrt{1 + \tfrac{1}{3}(kq)^2}\,\Big)\Big/\Big(1 - \tfrac{1}{3}kq\Big)$$
where kq stands for $i\lambda \sin \nu h$ or $-4\lambda \sin^2(\nu h/2)$.
Thus the method preserves energy for the hyperbolic case, provided $\lambda \le \sqrt{3}$, and is
violently unstable for the parabolic case.
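The two behaviours can be seen directly from the roots of the quadratic (a sketch; the function name and sample values of kq are our own choices):

```python
import cmath

def milne_roots(z):
    """Roots of (1 - z/3) mu^2 - (4z/3) mu - (1 + z/3) = 0, z = k*q,
    the characteristic equation of Milne's method."""
    disc = cmath.sqrt(1 + z * z / 3)
    return ((2 * z / 3 + disc) / (1 - z / 3),
            (2 * z / 3 - disc) / (1 - z / 3))
```

For purely imaginary z (the hyperbolic case) with $|z| \le \sqrt{3}$ both roots lie exactly on the unit circle; for real negative z of moderate size (the parabolic case) one root already has modulus well above one.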
The instability of the Milne and mid-point methods is closely related to the
well known weak stability of these methods when applied to ordinary differential
equations. Cf. Dahlquist [1956], [1963]. In fact parabolic equations are very stiff
equations and weakly stable schemes are therefore quite useless.
We are now well prepared for the following definition.
Definition Consider a linear finite difference scheme
$$u_{n+1} = Q u_n$$
and define its symbol by
$$\hat{Q}(\xi) = \exp(-i\langle \xi, x\rangle)\, Q\, \exp(i\langle \xi, x\rangle) .$$
The difference scheme satisfies the von Neumann condition if the spectral radius of
its symbol is bounded by $e^{\alpha k}$ for some constant $\alpha$.
The von Neumann condition is the finite difference analogue of the Petrowskii
condition. It is a necessary condition for stability. It was shown by Kreiss [1962]
to be equivalent to weak stability in $L_2$ in the case of constant coefficients. It
is not a sufficient stability condition for problems with variable coefficients, for
it suffers from the same inadequacies as the Petrowskii condition. Cf. Kreiss [1962].
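For a multistep scheme written in one step form the symbol is a matrix, and the von Neumann condition amounts to a bound on its spectral radius. For the leap frog scheme for $\partial_t u = \partial_x u$ the one step symbol is the 2 x 2 matrix with rows $(2i\lambda \sin \nu h,\ 1)$ and $(1,\ 0)$; its spectral radius can be computed by hand (a sketch; the function names are our own, and the companion-matrix form is the one derived earlier in these notes):

```python
import cmath, math

def spectral_radius_2x2(a, b, c, d):
    """Spectral radius of [[a, b], [c, d]] via the quadratic formula."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr / 4 - det)
    return max(abs(tr / 2 + disc), abs(tr / 2 - disc))

def leapfrog_symbol_radius(lam, th):
    """Spectral radius of the one-step leap frog symbol at th = nu*h."""
    return spectral_radius_2x2(2j * lam * math.sin(th), 1, 1, 0)
```

For $\lambda \le 1$ the radius equals one at every frequency (the von Neumann condition holds with $\alpha = 0$); for $\lambda > 1$ it exceeds one at $\nu h = \pi/2$.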
8. Semibounded operators
We now formulate an abstract condition on the differential operator P and its
boundary conditions in order to assure well posedness. We will also see that we
can use the same argument to prove stability for finite difference schemes. We
begin by forming, using the $L_2$ inner product $\int u \bar{v}\, dx$,
$$\partial_t(u,u) = (u, Pu) + (Pu, u) = 2\,\mathrm{Re}(Pu, u) .$$
Definition An operator P, with its boundary conditions, is semibounded if
$$\mathrm{Re}(Pu, u) \le \text{Const.}\, \|u\|^2$$
for all sufficiently smooth functions u satisfying the boundary conditions.
For a semibounded operator we clearly get the a priori inequality
    ‖u(t)‖ ≤ exp(Const. t) ‖u(0)‖
and the problem is well posed in L₂ if solutions exist. Uniqueness immediately
follows from the inequality. In order to assure existence we have to be sure that
we do not have too many boundary conditions. Cf. the discussion in §2. For certain
types of equations it is known that a solution exists if the problem is minimally
semibounded.
Definition An operator P, with its boundary conditions, is minimally semibounded
if P is semibounded and this property is lost by removing any of the linearly
independent boundary conditions.
In periodic cases it is easy to verify that the following expressions are semi-
bounded:
    A∂_x + ∂_xA,    A hermitian (the constant = 0);
    ∂_xA∂_x,        A real, symmetric and positive definite;
    i∂_xA∂_x,       A hermitian (the constant = 0);
    ∂_xA∂_x,        A skew hermitian (the constant = 0).
A sum of such expressions is also semibounded. We note that we are in no way
restricted to only one space variable. We leave it to the reader to verify which of
our examples in §2 are semibounded for periodic cases.
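The first entry of this list can be checked numerically in the periodic case. The sketch below (the coefficient A(x) and the test function u are made up) uses spectral differentiation on a periodic grid and confirms that Re(Pu, u) vanishes for P = A∂_x + ∂_xA with A real, i.e. hermitian:

```python
import numpy as np

N = 64
x = np.arange(N) / N

def ddx(f):
    # Spectral derivative on the periodic unit interval; exact for
    # band-limited functions such as the ones used below.
    kfreq = 2j * np.pi * np.fft.fftfreq(N, d=1.0 / N)
    return np.fft.ifft(kfreq * np.fft.fft(f))

A = 2.0 + np.cos(2 * np.pi * x)                 # real (hermitian) coefficient
u = np.exp(2j * np.pi * x) + 0.5 * np.exp(-8j * np.pi * x)

Pu = A * ddx(u) + ddx(A * u)                    # P = A d/dx + d/dx A
inner = np.mean(Pu * np.conj(u))                # (Pu, u) over one period
print(abs(inner.real))                          # Re(Pu, u) = 0 up to rounding
```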
To illustrate the theory we consider the heat equation on 0 ≤ x ≤ 1 with the
boundary conditions α₀u(0) + β₀∂_xu(0) = 0 and α₁u(1) + β₁∂_xu(1) = 0, α_i, β_i real.
The corresponding periodic problem is easy to treat because
    ∂_t(u(t), u(t)) = 2Re(u, ∂_x²u) = -2‖∂_xu‖² ≤ 0.
No term comes from the boundary because of the periodicity.
For the actual case we get
    ∂_t(u(t), u(t)) = -2‖∂_xu‖² + 2Re ∂_xu(1)ū(1) - 2Re ∂_xu(0)ū(0).
This illustrates the necessity of boundary conditions. The first expression is
negative but the L₂ norm of ∂_xu cannot be used to give a bound for ∂_xu at a
particular point. Without the boundary conditions we could therefore not conclude
that the right hand side is bounded by Const. ‖u‖² for all u. Using the boundary
conditions we see that
    2Re ∂_xu(1)ū(1) - 2Re ∂_xu(0)ū(0) = -2(α₁/β₁)|u(1)|² + 2(α₀/β₀)|u(0)|²
if β₀ and β₁ ≠ 0. Our first observation is that we clearly have semiboundedness if
α₁β₁ ≥ 0 and α₀β₀ ≤ 0. We now ask whether we can do without such conditions.
In fact we can, because the pointwise value of a function can, in one dimension, be
estimated by the L₂ norms of the function and its first derivative. This is a
simple special case of a Sobolev inequality. To get this inequality carry out the
following calculation:
    max_x |u(x)|² - min_x |u(x)|² = 2Re ∫_{x_min}^{x_max} ū ∂_xu dx
where x_max (x_min) is a value for which the maximum (minimum) of |u(x)|² is
attained.
A standard inequality shows that
    |∫_{x_min}^{x_max} ū ∂_xu dx| ≤ ∫₀¹ |u ∂_xu| dx ≤ (ε/2)‖∂_xu‖² + (1/(2ε))‖u‖².
Therefore
    max_x |u(x)|² ≤ ε‖∂_xu‖² + (1/ε)‖u‖² + min_x |u(x)|²
but
    min_x |u(x)|² ≤ ∫₀¹ |u(x)|² dx = ‖u‖²
and thus
    max_x |u(x)|² ≤ ε‖∂_xu‖² + (1 + 1/ε)‖u‖².
All we have to do to use this result for our heat equation is to make a choice
of a sufficiently small ε.
The argument can also be carried out when β₀ or β₁, or both, are zero.
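A quick numerical sanity check of the inequality just derived, for one made-up smooth function on [0, 1] and several values of ε:

```python
import numpy as np

# Check  max|u|^2 <= eps*||u_x||^2 + (1 + 1/eps)*||u||^2  on [0, 1].
x = np.linspace(0.0, 1.0, 20001)
dx = x[1] - x[0]
u = np.sin(3 * np.pi * x) + x**2
ux = 3 * np.pi * np.cos(3 * np.pi * x) + 2 * x   # exact derivative

norm_u2 = np.sum(u * u) * dx                     # L2 norms (Riemann sums)
norm_ux2 = np.sum(ux * ux) * dx
max_u2 = np.max(u * u)

ok = all(max_u2 <= eps * norm_ux2 + (1.0 + 1.0 / eps) * norm_u2
         for eps in (0.05, 0.5, 5.0))
print(ok)
```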
We have used an L₂ norm instead of a maximum norm in this study of the heat
equation. At this point mathematical convenience has thus been allowed to dominate
over our physical intuition.
It is obvious that we can prove well posedness in L₂ if we can find any norm,
equivalent to the L₂ norm, with respect to which the operator is semibounded. One
of the big problems with the energy method is often the difficulty of finding an
appropriate norm for which an operator is semibounded. The failure to show that a
certain problem, which satisfies the necessary Petrowskii condition, is semibounded
may therefore be due to the problem being ill posed, but it may also reflect a lack
of expertise on our part in dealing with Sobolev inequalities, a wrong choice of
inner product, or even a wrong choice of dependent variables.
We will now indicate how we can use these ideas to construct difference
expressions which satisfy similar bounds. This in fact will provide us with a most
useful guideline for the construction of stable schemes. It is thus appropriate
to replace an expression of the form A(x)∂_x + ∂_xA(x), A hermitian, by A(x)D₀ +
D₀A(x). In the periodic case this will give us an antisymmetric operator just as
for the differential operator. Similarly ∂_xA(x)∂_x could be replaced by D₀A(x)D₀
or, even better, by D₋A(x+h/2)D₊. Higher order accuracy can also be obtained. Thus
D₀ - (h²/6)D₀D₋D₊ is a more accurate approximation to ∂_x but it still preserves the
crucial quality of being an antisymmetric operator. The proof that these difference
operators have the required properties follows easily from summation by parts. When
the boundary conditions are more involved it is advisable to start by carrying out a
detailed analysis of the boundary terms for the differential equation to assure
semiboundedness, and then try to pick sufficiently accurate difference analogues of
the boundary conditions to ensure a preservation of semiboundedness.
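These claims can be verified directly on a periodic mesh, where D₀, D₊ and D₋ become matrices (a sketch; the grid size and the coefficient A(x) are made up): AD₀ + D₀A and the higher order operator should be antisymmetric, and D₋A(x+h/2)D₊ symmetric and negative semidefinite.

```python
import numpy as np

N = 40
h = 1.0 / N
x = np.arange(N) * h
I = np.eye(N)
Ep = np.roll(I, 1, axis=1)              # (Ep u)_i = u_{i+1}, periodic shift
Em = Ep.T
D0, Dp, Dm = (Ep - Em) / (2 * h), (Ep - I) / h, (I - Em) / h

A = np.diag(2.0 + np.cos(2 * np.pi * x))               # A(x) > 0, symmetric
Ah = np.diag(2.0 + np.cos(2 * np.pi * (x + h / 2)))    # A(x + h/2)

Q1 = A @ D0 + D0 @ A                    # replaces A d/dx + d/dx A
Q2 = Dm @ Ah @ Dp                       # replaces d/dx (A d/dx)
Q4 = D0 - (h**2 / 6) * D0 @ Dm @ Dp     # higher order replacement for d/dx

anti1 = np.allclose(Q1.T, -Q1)          # antisymmetric: (Q1 u, u) = 0
anti4 = np.allclose(Q4.T, -Q4)
sym2 = np.allclose(Q2.T, Q2)
negdef2 = np.linalg.eigvalsh(Q2).max() <= 1e-8   # Re(Q2 u, u) <= 0
print(anti1, anti4, sym2, negdef2)
```

All four properties are exactly the discrete analogues of the summation-by-parts identities mentioned above.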
For a very good presentation of these matters we refer to Kreiss [1963]. Cf.
also Richtmyer and Morton [1967].
9. Some applications of the energy method
The purpose of this section is to use the building blocks from §7 and §8, the
semibounded difference operators and the stable multistep scheme, in order to con-
struct stable schemes for a wide class of problems.
We first consider the backward scheme
    (I - kQ)u_{n+1} = u_n
where we assume that Re(Qu, u) ≤ C‖u‖².
Take the scalar product of both sides with u_{n+1} and use a series of simple
inequalities:
    ‖u_{n+1}‖ ‖u_n‖ ≥ Re(u_{n+1}, u_n) = Re(u_{n+1}, u_{n+1} - kQu_{n+1})
        = ‖u_{n+1}‖² - k Re(Qu_{n+1}, u_{n+1}) ≥ ‖u_{n+1}‖² - Ck‖u_{n+1}‖².
Therefore, if Ck < 1,
    ‖u_{n+1}‖ ≤ (1 - Ck)⁻¹ ‖u_n‖
and the scheme is L₂ stable for small enough k.
This calculation also shows that (I - kQ) always has a bounded inverse in L₂
for k sufficiently small if Q is semibounded, a fact most important for all implicit
schemes.
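A small experiment (made-up grid and step sizes) illustrates the conclusion: with the semibounded Q = D₋D₊ we have C = 0, and the backward scheme keeps the L₂ norm from growing even for k/h² far beyond the explicit stability limit.

```python
import numpy as np

N = 32
h = 1.0 / N
I = np.eye(N)
Ep = np.roll(I, 1, axis=1)
Q = (Ep - 2.0 * I + Ep.T) / h**2      # periodic D-D+, (Qu, u) <= 0, so C = 0

k = 10.0 * h                          # k/h^2 = 320, far beyond the explicit limit
u = np.random.default_rng(0).standard_normal(N)
norms = [np.linalg.norm(u)]
for _ in range(50):
    u = np.linalg.solve(I - k * Q, u)     # backward step: solve (I - kQ) u_new = u
    norms.append(np.linalg.norm(u))

monotone = all(b <= a * (1.0 + 1e-12) for a, b in zip(norms, norms[1:]))
print(monotone)
```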
We now consider the Crank-Nicolson scheme
    (I - (k/2)Q)u_{n+1} = (I + (k/2)Q)u_n.
Rewrite it as
    u_{n+1} - u_n = (k/2)Q(u_{n+1} + u_n)
and take the scalar product with u_{n+1} + u_n. The left hand side comes out to be
‖u_{n+1}‖² - ‖u_n‖² and the right hand side is less than (k/2)C‖u_{n+1} + u_n‖², which
is less than or equal to kC(‖u_{n+1}‖² + ‖u_n‖²). Thus
    ‖u_{n+1}‖² ≤ (1 + kC)/(1 - kC) ‖u_n‖².
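The same experiment for Crank-Nicolson (a sketch, parameters made up): with C = 0 the bound says ‖u_{n+1}‖ ≤ ‖u_n‖, i.e. the L₂ operator norm of (I - (k/2)Q)⁻¹(I + (k/2)Q) is at most 1 for every k.

```python
import numpy as np

N = 32
h = 1.0 / N
I = np.eye(N)
Ep = np.roll(I, 1, axis=1)
Q = (Ep - 2.0 * I + Ep.T) / h**2      # periodic D-D+: Re(Qu, u) <= 0, C = 0
k = 5.0 * h                           # k/h^2 = 160: no step restriction needed

# One-step operator of the Crank-Nicolson scheme.
CN = np.linalg.solve(I - 0.5 * k * Q, I + 0.5 * k * Q)
growth = np.linalg.norm(CN, ord=2)    # spectral norm = worst-case growth per step
print(growth)
```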
Of great interest is the study of schemes of the form
    (I - kQ₁)u_{n+1} = 2kQ₀u_n + (I + kQ₁)u_{n-1}.
The leap frog and Milne's methods have this form and if Q₀ = 0 it reduces to the
Crank-Nicolson scheme. For Q₀ and Q₁ we want to choose semibounded difference
operators such as those arising from a proper discretization of the spatial operator
of a hyperbolic, parabolic or Schrödinger equation, or of any other problem which
is semibounded in L₂. Frequently we also find both first order antisymmetric
operators as well as even order ones with symmetric positive definite coefficients
in the same problem. Where should we put these different pieces?
We know from a previous discussion that the mid-point rule is bad for stiff
problems such as parabolic ones. We should therefore avoid making Q₀ elliptic. We
therefore require that Q₀ is antisymmetric, i.e. (Q₀u,v) = -(u,Q₀v) for all u and v.
This is equivalent, for an operator with real coefficients, to requiring that
(Q₀u, u) = 0. We note that Q₁ could also contain antisymmetric parts, as in
Milne's method for ∂_tu = ∂_xu. For the operator Q₁ we only require that it is
semibounded. To simplify our arguments we assume that (Q₁u, u) ≤ 0.
We want to establish the L₂ stability of this class of schemes. For a two
step scheme the norm (‖u_n‖² + ‖u_{n+1}‖²)^{1/2} suggests itself. However we will
find that
    L_n = ‖u_{n+1}‖² + ‖u_n‖² - 2k(Q₀u_n, u_{n+1})
will be the appropriate expression, in the sense that L_n is equivalent to the one
first suggested and is a non increasing function of n. We have to put a restriction
on Q₀, namely k‖Q₀‖ ≤ 1 - δ, δ a constant > 0, in order to make L_n strictly
positive. We will see by an example that this is a most natural restriction.
We first show that L_n ≤ L_{n-1}. Rewrite the equation as
    u_{n+1} - u_{n-1} = kQ₁(u_{n+1} + u_{n-1}) + 2kQ₀u_n
and take the scalar product with u_{n+1} + u_{n-1}. Then
    ‖u_{n+1}‖² - ‖u_{n-1}‖² = (u_{n+1} + u_{n-1}, kQ₁(u_{n+1} + u_{n-1})) + 2k(u_{n+1} + u_{n-1}, Q₀u_n).
The first term on the right hand side is less than or equal to zero because of one
of our assumptions. Rearranging, using the antisymmetry of Q₀, and adding ‖u_n‖² on
both sides we get L_n ≤ L_{n-1}.
To show that L_n is positive and equivalent to the natural L₂ norm we start by
observing that
    |2k(Q₀u_n, u_{n+1})| ≤ 2(1-δ)‖u_{n+1}‖ ‖u_n‖ ≤ (1-δ)(‖u_{n+1}‖² + ‖u_n‖²).
Therefore
    δ(‖u_{n+1}‖² + ‖u_n‖²) ≤ L_n ≤ (2-δ)(‖u_{n+1}‖² + ‖u_n‖²).
To see that k‖Q₀‖ ≤ 1 - δ is a natural condition, consider the case Q₀ = D₀ and
Q₁ = 0. This Q₀ has, as is easily verified, an L₂ norm equal to 1/h. Thus the
restriction just means k/h ≤ 1 - δ, essentially the Courant-Friedrichs-Lewy
condition. This is a natural condition in terms of Q₀ alone because in the case
Q₁ = 0 the method is explicit.
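For leap frog with Q₀ = D₀ and Q₁ = 0, the estimate L_n ≤ L_{n-1} actually holds with equality, since the Q₁ term vanishes. A computation (sketch with made-up data) confirms that L_n is conserved up to rounding when k‖Q₀‖ = 1 - δ:

```python
import numpy as np

# Leap frog for u_t = u_x: Q0 = D0 (antisymmetric, ||Q0|| = 1/h), Q1 = 0.
N = 64
h = 1.0 / N
I = np.eye(N)
Ep = np.roll(I, 1, axis=1)
Q0 = (Ep - Ep.T) / (2.0 * h)

delta = 0.2
k = (1.0 - delta) * h                  # k/h = 1 - delta: the CFL restriction
x = np.arange(N) * h
um = np.sin(2.0 * np.pi * x)           # u_0
un = np.sin(2.0 * np.pi * (x + k))     # u_1: exact solution one step later

def L(un, unp1):
    # The energy L_n = ||u_{n+1}||^2 + ||u_n||^2 - 2k (Q0 u_n, u_{n+1}).
    return (np.dot(unp1, unp1) + np.dot(un, un)
            - 2.0 * k * np.dot(Q0 @ un, unp1))

Ls = []
for _ in range(300):
    unp1 = um + 2.0 * k * (Q0 @ un)    # leap frog step
    Ls.append(L(un, unp1))
    um, un = un, unp1

drift = max(Ls) - min(Ls)
print(drift)
```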
For a more general discussion and a comparison of the growth rates of the exact
and approximate solutions we refer to Johansson and Kreiss [1963].
Schemes of Dufort-Frankel type can be discussed in very much the same way.
We will now show that these ideas can be used to design stable and efficient
schemes, so called alternating direction implicit schemes, for certain two
dimensional equations. We suppose that our problem has the form
    ∂_tu = P₁u + P₂u
and that the operators and the boundary conditions are such that P₁ and P₂ are
semibounded. For simplicity we assume that
    Re(u, P_ju) ≤ 0,  j = 1,2,
and that we have finite difference approximations Q_j to P_j, j = 1,2, such that
    Re(u, Q_ju) ≤ 0,  j = 1,2.
We will consider the following two schemes
    (I - kQ₁)(I - kQ₂)u_{n+1} = u_n
and
    (I - (k/2)Q₁)(I - (k/2)Q₂)u_{n+1} = (I + (k/2)Q₁)(I + (k/2)Q₂)u_n.
These schemes are particularly convenient if Q₁ and Q₂ are one dimensional finite
difference operators. In that case we only have to invert one dimensional operators
of the form (I - akQ_i) and this frequently leads to considerable savings. This
becomes clear if we compare the work involved in solving a two dimensional heat
equation using an alternating direction implicit method with Q₁ = D₋ₓD₊ₓ,
Q₂ = D₋ᵧD₊ᵧ, with the application of the standard backward or Crank-Nicolson scheme
with Q = D₋ₓD₊ₓ + D₋ᵧD₊ᵧ. The former approach only involves solutions of linear
systems of tridiagonal type while the other, in general, requires more work.
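A sketch of the first scheme for the two dimensional heat equation (here with homogeneous Dirichlet data rather than periodicity; the grid and step sizes are made up) shows the promised structure: each step is two sweeps of one dimensional tridiagonal solves, and the L₂ norm decreases monotonically.

```python
import numpy as np

N = 16
h = 1.0 / (N + 1)
k = 0.1                                             # large step: scheme is implicit

# 1-D second difference D-D+ with homogeneous Dirichlet data: tridiagonal.
A = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
     + np.diag(np.ones(N - 1), -1)) / h**2
B = np.eye(N) - k * A                               # I - k Q_i, one-dimensional

xs = np.arange(1, N + 1) * h
u = np.outer(np.sin(np.pi * xs), np.sin(2.0 * np.pi * xs))   # initial data

norms = [np.linalg.norm(u)]
for _ in range(10):
    # (I - kQ1)(I - kQ2) u_new = u_old: two sweeps of 1-D solves.
    u = np.linalg.solve(B, u)                       # x-sweep: one solve per column
    u = np.linalg.solve(B, u.T).T                   # y-sweep
    norms.append(np.linalg.norm(u))

monotone = all(b <= a for a, b in zip(norms, norms[1:]))
print(monotone)
```

In production code each sweep would of course use a tridiagonal solver rather than a dense one; the point here is only the factored, dimension-by-dimension structure.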
The L₂ stability of the first scheme is very simple to prove, for I - kQ_i,
i = 1,2, both have inverses the L₂ norms of which are bounded by 1. The proof of the
stability of the other scheme is more involved. Let
    y_{n+1} = (I - (k/2)Q₂)u_{n+1}
and
    z_n = (I + (k/2)Q₂)u_n.
Then
    (I - (k/2)Q₁)y_{n+1} = (I + (k/2)Q₁)z_n
or
    y_{n+1} - z_n = (k/2)Q₁(y_{n+1} + z_n).
Forming the inner product with y_{n+1} + z_n, just as in the proof of the stability
of the Crank-Nicolson method, we get
    ‖y_{n+1}‖² - ‖z_n‖² = (k/2)Re(Q₁(y_{n+1} + z_n), y_{n+1} + z_n) ≤ 0.
Now
    ‖y_{n+1}‖² = ‖u_{n+1}‖² - k Re(Q₂u_{n+1}, u_{n+1}) + (k²/4)‖Q₂u_{n+1}‖²
and
    ‖z_n‖² = ‖u_n‖² + k Re(Q₂u_n, u_n) + (k²/4)‖Q₂u_n‖².
Therefore, because Re(Q₂u, u) ≤ 0,
    ‖u_{n+1}‖² + (k²/4)‖Q₂u_{n+1}‖² ≤ ‖u_n‖² + (k²/4)‖Q₂u_n‖².
It is easy to see that this implies L₂ stability if kQ₂ is a bounded operator. If
kQ₂ is not bounded we instead get stability with respect to a stronger norm, a
result which serves our purpose equally well.
We refer to an interesting paper by Strang [1968] for the construction of other
stable, accurate methods based on one dimensional operators.
10. Maximum norm convergence for L₂ stable schemes
In this section we will explain a result by Strang [1960] which shows that
solutions of L₂ stable schemes of a certain accuracy converge in maximum norm with
the same rate of convergence as in L₂, provided the solution of the differential
equation is sufficiently smooth.
Let
    u_{n+1} = Qu_n,   u₀(x) = f(x)
be a finite difference approximation to a linear problem,
    ∂_tu = Pu,   u(x,0) = f(x),
well posed in L₂.
To simplify matters we assume that the two problems are periodic. We also
assume that we have an L₂ stable scheme. It is known that if f is a sufficiently
smooth function the solution will also be quite smooth. We now attempt to establish
an asymptotic expansion of the error. Make the Ansatz
    u_n(x) = u(x,nk) + h^r e_r(x,nk) + h^{r+1} e_{r+1}(x,nk) + ...
where we choose r as the rate of convergence in L₂. If we substitute this
expression into the difference equation we find that the appropriate choices for
e_r, e_{r+1}, ... are solutions of equations of the form
    ∂_t e_j = P e_j + L_j u,
    e_j(0) = 0,
where the L_j are differential operators which appear in the formal expansion of the
truncation error. The solutions of a finite number of these equations are, under
our assumptions, quite smooth.
To end our discussion we have to verify that
    ε_{n,N}(x,h) = u_n(x) - u(x,nk) - Σ_{j=r}^{r+N} h^j e_j(x,nk)
is O(h^r) in the maximum norm for some finite N, i.e. that h^r e_r is indeed the
leading error term. This is done by a slight modification of the error estimate of
§6. We derive a difference equation for ε_{n,N} and find that its L₂ norm is
O(h^{r+N+1}). By assumption we have a periodic problem. The maximum norm of a mesh
function is therefore bounded by h^{-s/2} times its L₂,h norm over a period, where s
is the number of space dimensions. This concludes our proof.
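The inverse estimate invoked in the last step is elementary and easy to confirm (sketch): for a mesh function v on a periodic grid in s dimensions, max|v| ≤ h^{-s/2}‖v‖_{L₂,h}, simply because max|v_j|² ≤ Σ|v_j|².

```python
import numpy as np

# max|v| <= h^(-s/2) * ||v||_{L2,h}, with ||v||_{L2,h}^2 = h^s * sum(v^2).
rng = np.random.default_rng(1)
ok = True
for s in (1, 2):                        # number of space dimensions
    for N in (8, 32):
        h = 1.0 / N
        v = rng.standard_normal((N,) * s)
        l2h = np.sqrt(h**s * np.sum(v * v))     # discrete L2 norm over one period
        ok &= np.max(np.abs(v)) <= h ** (-s / 2.0) * l2h * (1.0 + 1e-12)
print(ok)
```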
We remark that an almost identical argument shows that we can relax our
stability requirements, requiring only weak stability (cf. §6), and still get the
same results for sufficiently smooth solutions.
REFERENCES
Brenner, P.; 1966, Math. Scand., V.19, 27-37.
Courant, R., Friedrichs, K. and Lewy, H.; 1928, Math. Annal., V.100, 32-74; also
1967, IBM J. of Research and Development, V.11, 213-247.
Dahlquist, G.; 1956, Math. Scand., V.4, 33-53.
Dahlquist, G.; 1963, Proc. Sympos. Appl. Math., V.15, 147-158.
Friedman, A.; 1964, Partial Differential Equations of Parabolic Type. Prentice-Hall.
Garabedian, P.; 1964, Partial Differential Equations. Wiley.
Gelfand, I.M., Shilov, G.E.; 1967, Generalized Functions, V.3. Academic Press.
Hadamard, J.; 1921, Lectures on Cauchy's Problem in Linear Partial Differential
Equations. Yale University Press.
Johansson, O., Kreiss, H.O.; 1963, BIT, V.3, 97-107.
John, F.; 1952, Comm. Pure Appl. Math., V.5, 155-211.
Kreiss, H.O.; 1959, Math. Scand., V.7, 71-80.
Kreiss, H.O.; 1962, BIT, V.2, 153-181.
Kreiss, H.O.; 1963, Math. Scand., V.13, 109-128.
Kreiss, H.O.; 1963, Numer. Math., V.5, 27-77.
Kreiss, H.O., Widlund, O.; 1967, Report, Computer Science Department, Uppsala,
Sweden.
Lax, P.D.; 1957, Duke Math. J., V.24.
Lax, P.D.; 1963, Lectures on Hyperbolic Partial Differential Equations, Stanford
University (lecture notes).
Littman, W.; 1963, J. Math. Mech., V.12, 55-68.
Richtmyer, R.D.; 1957, Difference Methods for Initial-Value Problems. Wiley
Interscience.
Richtmyer, R.D., Morton, K.W.; 1967, Difference Methods for Initial-Value Problems,
2nd Edition. Wiley Interscience.
Strang, W.G.; 1960, Duke Math. J., V.27, 221-231.
Strang, W.G.; 1966, J. Diff. Eq., V.2, 107-114.
Strang, W.G.; 1968, SIAM J. Numer. Anal., V.5, 506-517.
Thomée, V.; 1962, J. SIAM, V.10, 229-245.
Thomée, V.; 1969, SIAM Review, V.11, 152-195.