TRANSCRIPT
-
8/12/2019 Numerical Linear Algebra Applications Jin
1/196
Numerical Linear AlgebraAnd Its Applications
Xiao-Qing JIN 1 Yi-Min WEI 2
August 29, 2008
1Department of Mathematics, University of Macau, Macau, P. R. China.2Department of Mathematics, Fudan University, Shanghai, P.R. China
To Our Families
CONTENTS
page
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Chapter 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Basic symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Basic problems in NLA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Why shall we study numerical methods? . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Matrix factorizations (decompositions) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5 Perturbation and error analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.6 Operation cost and convergence rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Chapter 2 Direct Methods for Linear Systems . . . . . . . . . . . . . . . . . 9
2.1 Triangular linear systems and LU factorization . . . . . . . . . . . . . . . . . . . . . 9
2.2 LU factorization with pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3 Cholesky factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Chapter 3 Perturbation and Error Analysis . . . . . . . . . . . . . . . . . . . 25
3.1 Vector and matrix norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.2 Perturbation analysis for linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.3 Error analysis on floating point arithmetic . . . . . . . . . . . . . . . . . . . . . . . . 37
3.4 Error analysis on partial pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Chapter 4 Least Squares Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.1 Least squares problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.2 Orthogonal transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.3 QR decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Chapter 5 Classical Iterative Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.1 Jacobi and Gauss-Seidel method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.2 Convergence analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.3 Convergence rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.4 SOR method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Chapter 6 Krylov Subspace Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.1 Steepest descent method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.2 Conjugate gradient method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.3 Practical CG method and convergence analysis . . . . . . . . . . . . . . . . . . . . 92
6.4 Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.5 GMRES method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Chapter 7 Nonsymmetric Eigenvalue Problems . . . . . . . . . . . . . . . 111
7.1 Basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.2 Power method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.3 Inverse power method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.4 QR method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
7.5 Real version of QR algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Chapter 8 Symmetric Eigenvalue Problems . . . . . . . . . . . . . . . . . . . 131
8.1 Basic spectral properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
8.2 Symmetric QR method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
8.3 Jacobi method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.4 Bisection method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
8.5 Divide-and-conquer method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Chapter 9 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
9.2 Background of BVMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
9.3 Strang-type preconditioner for ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
9.4 Strang-type preconditioner for DDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
9.5 Strang-type preconditioner for NDDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
9.6 Strang-type preconditioner for SPDDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
Preface
Numerical linear algebra, also called matrix computation, has been a center of scientific and engineering computing since 1946, when the first modern computer was born. Most problems in science and engineering finally become problems in matrix computation. Therefore, it is important for us to study numerical linear algebra. This book gives an elementary introduction to matrix computation and it also includes some new results obtained in recent years. In the beginning of this book, we first give an outline of numerical linear algebra in Chapter 1.
In Chapter 2, we introduce Gaussian elimination, a basic direct method, for solving general linear systems. Usually, Gaussian elimination is used for solving a dense linear system of medium size with no special structure. The operation cost of Gaussian elimination is O(n^3), where n is the size of the system. The pivoting technique is also studied.

In Chapter 3, in order to discuss the effects of perturbation and error on numerical solutions, we introduce vector and matrix norms and study their properties. The error analysis on floating point operations and on the partial pivoting technique is also given.
In Chapter 4, linear least squares problems are studied. We will concentrate on the problem of finding the least squares solution of an overdetermined linear system Ax = b where A has more rows than columns. Some orthogonal transformations and the QR decomposition are used to design efficient algorithms for solving least squares problems.
We study classical iterative methods for the solution of Ax = b in Chapter 5. Iterative methods are quite different from direct methods such as Gaussian elimination. Direct methods based on an LU factorization of the matrix A are prohibitive in terms of computing time and computer storage if A is quite large. Usually, in most large problems, the matrices are sparse. The sparsity may be lost during the LU factorization procedure, and then, at the end of the LU factorization, the storage becomes a crucial issue. For such problems, we can use a class of methods called iterative methods. We only consider some classical iterative methods in this chapter.
In Chapter 6, we introduce another, more recently proposed class of iterative methods called Krylov subspace methods. We will only study two of these Krylov subspace methods: the conjugate gradient (CG) method and the generalized minimum residual (GMRES) method. The CG method, proposed in 1952, is one of the best known iterative methods for solving symmetric positive definite linear systems. The GMRES method was proposed in 1986 for solving nonsymmetric linear systems. The preconditioning technique is also studied.
Eigenvalue problems are particularly interesting in scientific computing. In Chapter
7, nonsymmetric eigenvalue problems are studied. We introduce some well-known methods such as the power method, the inverse power method and the QR method.
The symmetric eigenvalue problem, with its nice properties and rich mathematical theory, is one of the most interesting topics in numerical linear algebra. In Chapter 8, we will study this topic. The symmetric QR iteration method, the Jacobi method, the bisection method and a divide-and-conquer technique will be discussed in this chapter.
In Chapter 9, we will briefly survey some of the latest developments in using boundary value methods for solving systems of ordinary differential equations with initial values. These methods require the solutions of one or more nonsymmetric, large and sparse linear systems. Therefore, we will use the GMRES method of Chapter 6 with some preconditioners for solving these linear systems. One of the main results is that if an A_{k1,k2}-stable boundary value method is used for an m-by-m system of ODEs, then the preconditioned matrix can be decomposed as I + L, where I is the identity matrix and the rank of L is at most 2m(k1 + k2). It follows that when the GMRES method is applied to the preconditioned system, the method will converge in at most 2m(k1 + k2) + 1 iterations. Applications to different delay differential equations are also given.
If any other mathematical topic is as fundamental to the mathematical sciences as calculus and differential equations, it is numerical linear algebra.

-- L. Trefethen and D. Bau III
Acknowledgments: We would like to thank Professor Raymond H. F. Chan of
the Department of Mathematics, Chinese University of Hong Kong, for his constant encouragement, long-standing friendship, and financial support; and Professor Z. H. Cao of the Department of Mathematics, Fudan University, for his many helpful discussions and useful suggestions. We also would like to thank our friend Professor Z. C. Shi for his encouraging support and valuable comments. Of course, special appreciation goes to two important institutions in the authors' lives: the University of Macau and Fudan University, for providing a wonderful intellectual atmosphere for writing this book. Most of the writing was done during evenings, weekends and holidays. Finally, thanks are also due to our families for their endless love, understanding, encouragement and support, all essential to the completion of this book. The most heartfelt thanks to all of them!
The publication of the book is supported in part by the research grants No. RG024/01-02S/JXQ/FST, No. RG031/02-03S/JXQ/FST and No. RG064/03-04S/JXQ/FST from the University of Macau; the research grant No. 10471027 from the National Natural Science Foundation of China; and some financial support from the Shanghai Education Committee and Fudan University.
Authors' words on the corrected and revised second printing: In its second printing, we corrected some minor mathematical and typographical mistakes in the first printing of the book. We would like to thank all those people who pointed these out to us. Additional comments and some revisions have been made in Chapter 7. The references have been updated. More exercises are also to be found in the book. The second printing of the book is supported by the research grant No. RG081/04-05S/JXQ/FST.
Chapter 1
Introduction
Numerical linear algebra (NLA) is also called matrix computation. It has been a center of scientific and engineering computing since the first modern computer came into the world around 1946. Most problems in science and engineering are finally transformed into problems in NLA. Thus, it is very important for us to study NLA. This book gives an elementary introduction to NLA and it also includes some new results obtained in recent years.
1.1 Basic symbols
We will use the following symbols throughout this book.
Let R denote the set of real numbers, C denote the set of complex numbers, and i = sqrt(-1). Let R^n denote the set of real n-vectors and C^n denote the set of complex n-vectors.
Vectors will almost always be column vectors.
Let R^{m×n} denote the linear vector space of m-by-n real matrices and C^{m×n} denote the linear vector space of m-by-n complex matrices.
We will use upper case letters such as A, B, C, etc., to denote matrices, and lower case letters such as x, y, z, etc., to denote vectors.
The symbol aij will denote the (i, j)-th entry of a matrix A.

The symbol A^T will denote the transpose of the matrix A and A^* will denote the conjugate transpose of the matrix A.

Let a1, ..., am ∈ R^n (or C^n). We will use span{a1, ..., am} to denote the linear vector space of all the linear combinations of a1, ..., am.
Let rank(A) denote the rank of the matrix A.
Let dim(S) denote the dimension of the vector space S.
We will use det(A) to denote the determinant of the matrix A and use diag(a11, ..., ann) to denote the n-by-n diagonal matrix:

    diag(a11, ..., ann) = [ a11  0    ...  0   ]
                          [ 0    a22  ...  0   ]
                          [ ...  ...  ...  ... ]
                          [ 0    0    ...  ann ].
For a matrix A = [aij], the symbol |A| will denote the matrix with entries (|A|)ij = |aij|.

The symbol I will denote the identity matrix, i.e.,

    I = [ 1    0    ...  0   ]
        [ 0    1    ...  0   ]
        [ ...  ...  ...  ... ]
        [ 0    0    ...  1   ],

and ei will denote the i-th unit vector, i.e., the i-th column vector of I.

We will use || · || to denote a norm of a matrix or vector. The symbols || · ||_1, || · ||_2 and || · ||_∞ will denote the p-norm with p = 1, 2, ∞, respectively.
As in MATLAB, in algorithms, A(i, j) will denote the (i, j)-th entry of the matrix A; A(i, :) and A(:, j) will denote the i-th row and the j-th column of A, respectively; A(i1 : i2, k) will denote the column vector constructed from the i1-th through the i2-th entries of the k-th column of A; A(k, j1 : j2) will denote the row vector constructed from the j1-th through the j2-th entries of the k-th row of A; A(k : l, p : q) will denote the (l - k + 1)-by-(q - p + 1) submatrix constructed from rows k through l and columns p through q of A.
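For readers following along in code, this MATLAB-style notation has a direct analogue in NumPy slicing; the sketch below is illustrative only, since NumPy indexing is 0-based and end-exclusive, unlike the 1-based, inclusive ranges in the text:

```python
import numpy as np

# The book's MATLAB-style notation next to NumPy's 0-based, end-exclusive
# slicing: A(i1:i2, k) in the text corresponds to A[i1-1:i2, k-1] here.
A = np.arange(1, 17).reshape(4, 4)   # 4-by-4 matrix with entries 1..16

row2 = A[1, :]        # A(2, :)   -- the 2nd row
col3 = A[:, 2]        # A(:, 3)   -- the 3rd column
piece = A[0:2, 3]     # A(1:2, 4) -- entries 1..2 of the 4th column
sub = A[1:3, 0:2]     # A(2:3, 1:2) -- a 2-by-2 submatrix
print(sub)
```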
1.2 Basic problems in NLA
NLA includes the following three important problems, which will be studied in this book:
(1) Find the solution of linear systems
    Ax = b

where A is an n-by-n nonsingular matrix and b is an n-vector.

(2) Linear least squares problems: For any m-by-n matrix A and any m-vector b, find an n-vector x such that

    ||Ax - b||_2 = min_{y ∈ R^n} ||Ay - b||_2.

(3) Eigenvalue problems: For any n-by-n matrix A, find a part (or all) of its eigenvalues and corresponding eigenvectors. We remark here that a complex number λ is called an eigenvalue of A if there exists a nonzero vector x ∈ C^n such that

    Ax = λx,

where x is called the eigenvector of A associated with λ.
Besides these main problems, there are many other fundamental problems in NLA, for instance, total least squares problems, matrix equations, generalized inverses, inverse eigenvalue problems, and singular value problems.
1.3 Why shall we study numerical methods?
To answer this question, let us consider the following linear system

    Ax = b

where A is an n-by-n nonsingular matrix and x = (x1, x2, ..., xn)^T. If we use the well-known Cramer rule, then we have the following solution:

    x1 = det(A1)/det(A),  x2 = det(A2)/det(A),  ...,  xn = det(An)/det(A),

where Ai, for i = 1, 2, ..., n, are matrices with the i-th column of A replaced by the vector b. Then we should compute n + 1 determinants: det(Ai), i = 1, 2, ..., n, and det(A).
There are

    [n!(n - 1)](n + 1) = (n - 1)(n + 1)!

multiplications. When n = 25, by using a computer performing 10 billion operations per second, we need

    24 · 26! / (10^10 · 3600 · 24 · 365) ≈ 30.6 billion years.
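The estimate above can be checked with a few lines of arithmetic; this back-of-the-envelope script is an illustration, not part of the original text:

```python
import math

# Back-of-the-envelope check of the Cramer-rule estimate:
# (n - 1)(n + 1)! multiplications at 10 billion operations per second.
n = 25
mults = (n - 1) * math.factorial(n + 1)      # 24 * 26!
seconds = mults / 1e10                       # 10^10 operations/sec
years = seconds / (3600 * 24 * 365)
print(f"{years:.3e}")                        # roughly 3.1e10 years, ~30.6 billion
```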
If one uses Gaussian elimination, it requires

    sum_{i=1}^{n} (i - 1)(i + 1) = sum_{i=1}^{n} i^2 - n = n(n + 1)(2n + 1)/6 - n = O(n^3)

multiplications. Then, in less than 1 second, we could solve a 25-by-25 linear system by using the same computer. From the above discussion, we note that when the same problem is solved by different numerical methods, the results can be much different. Therefore, it is essential for us to study the properties of numerical methods.
1.4 Matrix factorizations (decompositions)
For any linear system Ax = b, if we can factorize (decompose) A as A = LU, where L is a lower triangular matrix and U is an upper triangular matrix, then we have

    Ly = b,
    Ux = y.        (1.1)

By substituting, we can easily solve (1.1) and then Ax = b. Therefore, matrix factorizations (decompositions) are very important tools in NLA. The following theorem is basic and useful in linear algebra, see [17].
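The two-step solve in (1.1) can be sketched as follows; the factors L and U below are an arbitrary pair made up for the demonstration:

```python
import numpy as np

# Two-step solve of Ax = b via A = LU, as in (1.1).  L and U here are a
# made-up pair chosen only to illustrate the idea.
L = np.array([[1.0, 0.0],
              [2.0, 1.0]])          # lower triangular
U = np.array([[4.0, 3.0],
              [0.0, 5.0]])          # upper triangular
A = L @ U
b = np.array([7.0, 19.0])

y = np.linalg.solve(L, b)           # step 1: Ly = b  (forward)
x = np.linalg.solve(U, y)           # step 2: Ux = y  (backward)
assert np.allclose(A @ x, b)
```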
Theorem 1.1 (Jordan Decomposition Theorem) If A ∈ C^{n×n}, then there exists a nonsingular matrix X ∈ C^{n×n} such that

    X^{-1} A X = J ≡ diag(J1, J2, ..., Jp),

or A = X J X^{-1}, where J is called the Jordan canonical form of A and

    Ji = [ λi  1               ]
         [     λi  1           ]
         [         ...  ...    ]
         [              ...  1 ]
         [                  λi ]  ∈ C^{ni×ni},

for i = 1, 2, ..., p, are called Jordan blocks, with n1 + ... + np = n. The Jordan canonical form of A is unique up to the permutation of the diagonal Jordan blocks. If A ∈ R^{n×n} with only real eigenvalues, then the matrix X can be taken to be real.
1.5 Perturbation and error analysis
The solutions provided by numerical algorithms are seldom absolutely correct. Usually, there are two kinds of errors. First, errors appear in the input data, caused by prior computations or measurements. Second, there may be errors caused by the algorithms themselves, because of approximations made within the algorithms. Thus, we need to carry out a perturbation and error analysis.
(1) Perturbation.
For a given x, we want to compute the value of a function f(x). Suppose there is a perturbation δx of x and |δx|/|x| is very small. We want to find a positive number c(x), as small as possible, such that

    |f(x + δx) - f(x)| / |f(x)| ≤ c(x) |δx| / |x|.

Then c(x) is called the condition number of f(x) at x. If c(x) is large, we say that the function f is ill-conditioned at x; if c(x) is small, we say that the function f is well-conditioned at x.
Remark: Whether a computational problem is ill-conditioned or not has no relation to the numerical method that we use.
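The definition of c(x) can be explored numerically. The sketch below uses the made-up function f(x) = x - 1, which is ill-conditioned near x = 1 (where |f(x)| is tiny) and well-conditioned away from it:

```python
# Numerical estimate of the condition number c(x) straight from its
# definition.  f(x) = x - 1 is a made-up example: ill-conditioned near
# x = 1, well-conditioned far from 1.
def rel_condition(f, x, dx):
    return (abs(f(x + dx) - f(x)) / abs(f(x))) / (abs(dx) / abs(x))

f = lambda t: t - 1.0
c_far = rel_condition(f, 2.0, 1e-8)         # ~2   : well-conditioned
c_near = rel_condition(f, 1.000001, 1e-10)  # ~1e6 : ill-conditioned
print(c_far, c_near)
```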
(2) Error.
By using some numerical method, we calculate the value of a function f at a point x and we obtain y. Because of rounding errors (or chopping errors), usually

    y ≠ f(x).

If there exists a δx such that

    y = f(x + δx),   |δx| ≤ ε|x|,

where ε is a positive constant closely related to the numerical method and the computer used, then we say that the method is stable if ε is small, and the method is unstable if ε is large.
Remark: Whether a numerical method is stable or not has no relation to the computational problem that we face.
With the perturbation and error analysis, we obtain

    |y - f(x)| / |f(x)| = |f(x + δx) - f(x)| / |f(x)| ≤ c(x) |δx| / |x| ≤ c(x) ε.
Therefore, whether a numerical result is accurate depends on both the stability of thenumerical method and the condition number of the computational problem.
1.6 Operation cost and convergence rate
Usually, numerical algorithms are divided into two classes:

(i) direct methods;
(ii) iterative methods.
By using direct methods, one can obtain an accurate solution of computational prob-lems within finite steps in exact arithmetic. By using iterative methods, one can onlyobtain an approximate solution of computational problems within finite steps.
The operation cost is an important measurement of algorithms. The operation cost of an algorithm is the total number of operations +, -, ×, ÷ used in the algorithm. We remark that the speed of an algorithm depends only partially on the operation cost. In modern computers, the speed of operations is much faster than that of data transfer. Therefore, sometimes, the speed of an algorithm mainly depends on the total amount of data transfer.
For direct methods, usually, we use the operation cost as a main measurement ofthe speed of algorithms. For iterative methods, we need to consider
(i) operation cost in each iteration;
(ii) convergence rate of the method.
For a sequence {xk} provided by an iterative algorithm, if {xk} → x*, the exact solution, and if {xk} satisfies

    ||xk - x*|| ≤ c ||x_{k-1} - x*||,   k = 1, 2, ...,

where 0 < c < 1, then we say that {xk} converges linearly.
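A minimal illustration of linear convergence, using the made-up fixed-point iteration x_{k+1} = 0.5 x_k + 1, whose limit is x* = 2:

```python
# The fixed-point iteration x_{k+1} = 0.5*x_k + 1 (a made-up example with
# limit x* = 2) satisfies ||x_k - x*|| = 0.5 ||x_{k-1} - x*||, i.e. c = 0.5.
x, x_star = 0.0, 2.0
errors = []
for _ in range(10):
    errors.append(abs(x - x_star))
    x = 0.5 * x + 1.0
ratios = [errors[k] / errors[k - 1] for k in range(1, len(errors))]
print(ratios)   # every ratio is exactly 0.5
```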
3. Let

    A = [ A11  A12 ]
        [ A21  A22 ],

where the Aij, for i, j = 1, 2, are square matrices with det(A11) ≠ 0, and satisfy A11A21 = A21A11. Then

    det(A) = det(A11A22 - A21A12).
4. Show that det(I - uv^*) = 1 - v^*u, where u, v ∈ C^m are column vectors.

5. Prove Hadamard's inequality for A ∈ C^{n×n}:

    |det(A)| ≤ prod_{j=1}^{n} ||aj||_2,

where aj = A(:, j) and ||aj||_2 = ( sum_{i=1}^{n} |A(i, j)|^2 )^{1/2}. When does the equality hold?
6. Let B be nilpotent, i.e., there exists an integer k > 0 such that B^k = 0. Show that if AB = BA, then

    det(A + B) = det(A).
7. Let A be an m-by-n matrix and B be an n-by-m matrix. Show that the matrices

    [ AB  0 ]        [ 0  0  ]
    [ B   0 ]  and   [ B  BA ]

are similar. Conclude that the nonzero eigenvalues of AB are the same as those of BA, and

    det(Im + AB) = det(In + BA).
8. A matrix M ∈ C^{n×n} is Hermitian positive definite if it satisfies

    M^* = M,   x^* M x > 0 for all x ≠ 0, x ∈ C^n.

Let A and B be Hermitian positive definite matrices.

(1) Show that the matrix product AB has positive eigenvalues.

(2) Show that AB is Hermitian if and only if A and B commute.
9. Show that any matrix A ∈ C^{n×n} can be written uniquely in the form

    A = B + iC,

where B and C are Hermitian.
10. Show that if A is skew-Hermitian, i.e., A^* = -A, then all its eigenvalues lie on the imaginary axis.
11. Let

    A = [ A11  A12 ]
        [ A21  A22 ].

Assume that A11, A22 are square, and that A11 and A22 - A21A11^{-1}A12 are nonsingular. Let

    B = [ B11  B12 ]
        [ B21  B22 ]

be the inverse of A. Show that

    B22 = (A22 - A21A11^{-1}A12)^{-1},   B12 = -A11^{-1}A12B22,
    B21 = -B22A21A11^{-1},               B11 = A11^{-1} - B12A21A11^{-1}.
12. Suppose that A and B are Hermitian with A positive definite. Show that A + B is positive definite if and only if all the eigenvalues of A^{-1}B are greater than -1.
13. Let A be idempotent, i.e., A^2 = A. Show that each eigenvalue of A is either 0 or 1.
14. Let A be the matrix with all entries equal to one. Show that A can be written as A = ee^T, where e^T = (1, 1, ..., 1), and that A is positive semi-definite. Find the eigenvalues and eigenvectors of A.
15. Prove that any matrix A ∈ C^{n×n} has a polar decomposition A = HQ, where H is Hermitian positive semi-definite and Q is unitary. We recall that M ∈ C^{n×n} is a unitary matrix if M^{-1} = M^*. Moreover, if A is nonsingular, then H is Hermitian positive definite and the polar decomposition of A is unique.
Chapter 2
Direct Methods for Linear Systems
The problem of solving linear systems is central in NLA. For solving linear systems, ingeneral, we have two classes of methods. One is called the direct method and the otheris called the iterative method. By using direct methods, within finite steps, one canobtain an accurate solution of computational problems in exact arithmetic. By usingiterative methods, within finite steps, one can only obtain an approximate solution ofcomputational problems.
In this chapter, we will introduce a basic direct method called Gaussian eliminationfor solving general linear systems. Usually, Gaussian elimination is used for solving a
dense linear system of medium size with no special structure.
2.1 Triangular linear systems and LU factorization
We first study triangular linear systems.
2.1.1 Triangular linear systems
We consider the following nonsingular lower triangular linear system
Ly= b (2.1)
where b = (b1, b2, ..., bn)^T ∈ R^n is a known vector, y = (y1, y2, ..., yn)^T is an unknown vector, and L = [lij] ∈ R^{n×n} is given by

    L = [ l11                     ]
        [ l21  l22                ]
        [ l31  l32  l33           ]
        [ ...  ...  ...  ...      ]
        [ ln1  ln2  ln3  ...  lnn ]

with lii ≠ 0, i = 1, 2, ..., n. By the first equation in (2.1), we have
    l11 y1 = b1,

and then

    y1 = b1 / l11.

Similarly, by the second equation in (2.1), we have

    y2 = (b2 - l21 y1) / l22.

In general, if we have already obtained y1, y2, ..., y_{i-1}, then by using the i-th equation in (2.1), we have

    yi = ( bi - sum_{j=1}^{i-1} lij yj ) / lii.

This algorithm is called the forward substitution method, which needs O(n^2) operations.
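The forward substitution formula translates directly into code; the following sketch uses a 3-by-3 system made up for the demonstration:

```python
import numpy as np

# Direct transcription of forward substitution:
# y_i = (b_i - sum_{j<i} l_ij * y_j) / l_ii.
def forward_substitution(L, b):
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

L = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [4.0, 5.0, 6.0]])
b = np.array([2.0, 7.0, 32.0])
y = forward_substitution(L, b)
assert np.allclose(L @ y, b)   # y = (1, 2, 3)
```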
Now, we consider the following nonsingular upper triangular linear system
    Ux = y        (2.2)

where x = (x1, x2, ..., xn)^T is an unknown vector, and U ∈ R^{n×n} is given by

    U = [ u11  u12  u13  ...  u1n ]
        [      u22  u23  ...  ... ]
        [           u33  ...  ... ]
        [                ...  ... ]
        [                     unn ]

with uii ≠ 0, i = 1, 2, ..., n. Beginning from the last equation of (2.2), we can obtain xn, x_{n-1}, ..., x1 step by step. We have xn = yn / unn, and xi is given by
    xi = ( yi - sum_{j=i+1}^{n} uij xj ) / uii
for i = n - 1, ..., 1. This algorithm is called the backward substitution method, which also needs O(n^2) operations.
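Backward substitution can be sketched the same way, again with a made-up example system:

```python
import numpy as np

# Direct transcription of backward substitution:
# x_i = (y_i - sum_{j>i} u_ij * x_j) / u_ii, starting from x_n = y_n / u_nn.
def backward_substitution(U, y):
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[2.0, 1.0, 1.0],
              [0.0, 3.0, 2.0],
              [0.0, 0.0, 4.0]])
y = np.array([6.0, 10.0, 8.0])
x = backward_substitution(U, y)
assert np.allclose(U @ x, y)   # x = (1, 2, 2)
```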
Consider the general linear system

    Ax = b        (2.3)

where A ∈ R^{n×n} and b ∈ R^n are known. If we can factorize the matrix A into A = LU, where L is a lower triangular matrix and U is an upper triangular matrix, then we can find the solution of (2.3) by the following two steps:
(1) Use the forward substitution method to find the solution y of Ly = b.

(2) Use the backward substitution method to find the solution x of Ux = y.
Now the problem we face is how to factorize the matrix A into A = LU. We therefore introduce Gaussian transform matrices.
2.1.2 Gaussian transform matrix
Let

    Lk = I - lk ek^T

where I ∈ R^{n×n} is the identity matrix, lk = (0, ..., 0, l_{k+1,k}, ..., l_{nk})^T ∈ R^n, and ek ∈ R^n is the k-th unit vector. Then for any k,

    Lk = [ 1                            ]
         [    ...                       ]
         [         1                    ]
         [        -l_{k+1,k}  1         ]
         [         ...           ...    ]
         [        -l_{nk}             1 ]

is called the Gaussian transform matrix. Such a matrix is a unit lower triangular matrix. We remark that a unit triangular matrix is a triangular matrix with ones on its diagonal. For any given vector

    x = (x1, x2, ..., xn)^T ∈ R^n,

we have

    Lk x = (x1, ..., xk, x_{k+1} - xk l_{k+1,k}, ..., xn - xk l_{nk})^T
         = (x1, ..., xk, 0, ..., 0)^T
if we take

    l_{ik} = xi / xk,   i = k + 1, ..., n,

with xk ≠ 0. It is easy to check that

    Lk^{-1} = I + lk ek^T

by noting that ek^T lk = 0. For a given matrix A ∈ R^{n×n}, we have

    Lk A = (I - lk ek^T) A = A - lk (ek^T A)

and

    rank( lk (ek^T A) ) = 1.

Therefore, Lk A is a rank-one modification of the matrix A.
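A small numerical illustration of a Gaussian transform (the vector x is made up): applying Lk zeroes the entries below position k, and the inverse is I + lk ek^T because ek^T lk = 0:

```python
import numpy as np

# A Gaussian transform L_k = I - l_k e_k^T zeroes the entries of x below
# position k (k = 1 in the text's 1-based numbering; index 0 here).
x = np.array([2.0, 4.0, 6.0])
k = 0
l = np.zeros(3)
l[k + 1:] = x[k + 1:] / x[k]            # multipliers l_ik = x_i / x_k
e = np.zeros(3)
e[k] = 1.0
Lk = np.eye(3) - np.outer(l, e)

print(Lk @ x)                            # [2. 0. 0.]
# The inverse is I + l_k e_k^T, since e_k^T l_k = 0:
assert np.allclose(Lk @ (np.eye(3) + np.outer(l, e)), np.eye(3))
```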
2.1.3 Computation ofLU factorization
We consider the following simple example. Let

    A = [ 1  5  9  ]
        [ 2  4  7  ]
        [ 3  3  10 ].

By using the Gaussian transform matrix

    L1 = [  1  0  0 ]
         [ -2  1  0 ]
         [ -3  0  1 ],

we have

    L1 A = [ 1   5    9  ]
           [ 0  -6  -11  ]
           [ 0 -12  -17  ].

Followed by using the Gaussian transform matrix

    L2 = [ 1   0  0 ]
         [ 0   1  0 ]
         [ 0  -2  1 ],

we have

    L2 (L1 A) ≡ U = [ 1   5    9 ]
                    [ 0  -6  -11 ]
                    [ 0   0    5 ].
Therefore, we finally have

    A = LU,

where

    L ≡ (L2 L1)^{-1} = L1^{-1} L2^{-1} = [ 1  0  0 ]
                                         [ 2  1  0 ]
                                         [ 3  2  1 ].
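The worked example can be verified numerically; the check below reproduces the two elimination steps and confirms A = LU:

```python
import numpy as np

# Numerical check of the worked example: two Gaussian transforms reduce A
# to upper triangular U, and L = (L2 L1)^{-1} is unit lower triangular.
A  = np.array([[1.0, 5.0, 9.0],
               [2.0, 4.0, 7.0],
               [3.0, 3.0, 10.0]])
L1 = np.array([[1.0, 0.0, 0.0],
               [-2.0, 1.0, 0.0],
               [-3.0, 0.0, 1.0]])
L2 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, -2.0, 1.0]])

U = L2 @ (L1 @ A)
L = np.linalg.inv(L2 @ L1)
assert np.allclose(L @ U, A)
print(U)   # [[1, 5, 9], [0, -6, -11], [0, 0, 5]]
print(L)   # [[1, 0, 0], [2, 1, 0], [3, 2, 1]]
```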
For a general n-by-n matrix A, we can use n - 1 Gaussian transform matrices L1, L2, ..., L_{n-1} such that L_{n-1} ... L1 A is an upper triangular matrix. In fact, let A^{(0)} ≡ A and assume that we have already found k - 1 Gaussian transform matrices L1, ..., L_{k-1} ∈ R^{n×n} such that

    A^{(k-1)} = L_{k-1} ... L1 A = [ A11^{(k-1)}  A12^{(k-1)} ]
                                   [ 0            A22^{(k-1)} ]

where A11^{(k-1)} is a (k-1)-by-(k-1) upper triangular matrix and

    A22^{(k-1)} = [ a_{kk}^{(k-1)}  ...  a_{kn}^{(k-1)} ]
                  [ ...             ...  ...            ]
                  [ a_{nk}^{(k-1)}  ...  a_{nn}^{(k-1)} ].

If a_{kk}^{(k-1)} ≠ 0, then we can use the Gaussian transform matrix

    Lk = I - lk ek^T,

where

    lk = (0, ..., 0, l_{k+1,k}, ..., l_{nk})^T

with

    l_{ik} = a_{ik}^{(k-1)} / a_{kk}^{(k-1)},   i = k + 1, ..., n,

such that the last n - k entries in the k-th column of Lk A^{(k-1)} become zeros. We therefore have

    A^{(k)} ≡ Lk A^{(k-1)} = [ A11^{(k)}  A12^{(k)} ]
                             [ 0          A22^{(k)} ]

where A11^{(k)} is a k-by-k upper triangular matrix. After n - 1 steps, we obtain A^{(n-1)}, which is the upper triangular matrix that we need. Let

    L = (L_{n-1} ... L1)^{-1},   U = A^{(n-1)},
then A = LU. Now we want to show that L is a unit lower triangular matrix. By noting that ej^T li = 0 for j < i, we have

    L = L1^{-1} ... L_{n-1}^{-1}
      = (I + l1 e1^T)(I + l2 e2^T) ... (I + l_{n-1} e_{n-1}^T)
      = I + l1 e1^T + ... + l_{n-1} e_{n-1}^T
      = I + [l1, l2, ..., l_{n-1}, 0]

      = [ 1                        ]
        [ l21  1                   ]
        [ l31  l32  1              ]
        [ ...  ...  ...  ...       ]
        [ ln1  ln2  ln3  ...  1    ].

This computational process of the LU factorization is called Gaussian elimination. Thus, we have the following algorithm.
Algorithm 2.1 (Gaussian elimination)
for k = 1 : n-1
    A(k+1:n, k) = A(k+1:n, k) / A(k, k)
    A(k+1:n, k+1:n) = A(k+1:n, k+1:n) - A(k+1:n, k) * A(k, k+1:n)
end
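Algorithm 2.1 can be transcribed almost line by line into Python with NumPy; this sketch (our translation, assuming no zero pivots arise) returns L and U explicitly:

```python
import numpy as np

# Python transcription of Algorithm 2.1 (no pivoting): after the loop, the
# upper triangle of A holds U and the strict lower triangle holds the
# multipliers, from which the unit lower triangular L is recovered.
def lu_no_pivot(A):
    A = A.astype(float).copy()
    n = A.shape[0]
    for k in range(n - 1):
        A[k + 1:, k] = A[k + 1:, k] / A[k, k]                    # multipliers
        A[k + 1:, k + 1:] -= np.outer(A[k + 1:, k], A[k, k + 1:])
    L = np.tril(A, -1) + np.eye(n)
    U = np.triu(A)
    return L, U

A = np.array([[1.0, 5.0, 9.0],
              [2.0, 4.0, 7.0],
              [3.0, 3.0, 10.0]])
L, U = lu_no_pivot(A)
assert np.allclose(L @ U, A)
```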
The operation cost of Gaussian elimination is

    sum_{k=1}^{n-1} [ (n - k) + 2(n - k)^2 ] = n(n - 1)/2 + n(n - 1)(2n - 1)/3
                                             = (2/3) n^3 + O(n^2) = O(n^3).
We remark that in Gaussian elimination, the entries a_{kk}^{(k-1)}, k = 1, ..., n-1, are required to be nonzero. We have the following theorem.

Theorem 2.1 The entries a_{ii}^{(i-1)} ≠ 0, i = 1, ..., k, if and only if all the leading principal submatrices Ai of A, i = 1, ..., k, are nonsingular.
Proof: By induction. For k = 1, it is obviously true. Assume that the statement is true up to k - 1. We want to show that if A1, ..., A_{k-1} are nonsingular, then

    Ak is nonsingular  <==>  a_{kk}^{(k-1)} ≠ 0.

By assumption, we know that

    a_{ii}^{(i-1)} ≠ 0,   i = 1, ..., k - 1.

By using k - 1 Gaussian transform matrices L1, ..., L_{k-1}, we obtain

    A^{(k-1)} = L_{k-1} ... L1 A = [ A11^{(k-1)}  A12^{(k-1)} ]
                                   [ 0            A22^{(k-1)} ]        (2.4)

where A11^{(k-1)} is an upper triangular matrix with nonzero diagonal entries a_{ii}^{(i-1)}, i = 1, ..., k - 1. Therefore, the k-th leading principal submatrix of A^{(k-1)} has the following form:

    [ A11^{(k-1)}  *              ]
    [ 0            a_{kk}^{(k-1)} ].

Let (L1)k, ..., (L_{k-1})k denote the k-th leading principal submatrices of L1, ..., L_{k-1}, respectively. By using (2.4), we obtain

    (L_{k-1})k ... (L1)k Ak = [ A11^{(k-1)}  *              ]
                              [ 0            a_{kk}^{(k-1)} ].

By noting that the Li, i = 1, ..., k - 1, are unit lower triangular matrices, we immediately know that

    det(Ak) = a_{kk}^{(k-1)} det(A11^{(k-1)}) ≠ 0

if and only if a_{kk}^{(k-1)} ≠ 0.
Thus, we have
Theorem 2.2 If all the leading principal submatrices Ai of a matrix A Rnn arenonsingular fori= 1, , n 1, then there exists a uniqueLU factorization ofA.
2.2 LU factorization with pivoting

Before we study pivoting techniques, we first consider the following simple example:

$$\begin{pmatrix} 0.3 \times 10^{-11} & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0.7 \\ 0.9 \end{pmatrix}.$$
If we use Gaussian elimination with 10-decimal-digit floating point arithmetic, we have

$$L = \begin{pmatrix} 1 & 0 \\ 0.3333333333 \times 10^{12} & 1 \end{pmatrix} \quad \text{and} \quad U = \begin{pmatrix} 0.3 \times 10^{-11} & 1 \\ 0 & -0.3333333333 \times 10^{12} \end{pmatrix}.$$

Then the computed solution is

$$\tilde{x} = (0.0000000000,\ 0.7000000000)^T,$$

which is not good compared with the accurate solution

$$x = (0.2000000000006,\ 0.6999999999994)^T.$$
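This failure is easy to reproduce. The sketch below uses IEEE single precision in place of the book's 10-digit decimal arithmetic (our substitution), but the mechanism is identical: the tiny pivot $0.3 \times 10^{-11}$ produces a huge multiplier, the entries equal to 1 are swamped, and the digits of $x_1$ are lost, while a double precision solver with partial pivoting recovers the answer:

```python
import numpy as np

f32 = np.float32  # simulate short precision; the mechanism, not the digits, matters
a11, a12, a21, a22 = f32(0.3e-11), f32(1.0), f32(1.0), f32(1.0)
b1, b2 = f32(0.7), f32(0.9)

m   = a21 / a11            # huge multiplier, about 3.3e11
u22 = a22 - m * a12        # the entry 1 is completely swamped by m
y2  = b2 - m * b1
x2  = y2 / u22             # still comes out close to 0.7
x1  = (b1 - x2) / a11      # catastrophic: x1 bears no resemblance to 0.2

# The same system solved in double precision with partial pivoting (LAPACK):
x_piv = np.linalg.solve(np.array([[0.3e-11, 1.0], [1.0, 1.0]]),
                        np.array([0.7, 0.9]))
```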
If we just interchange the first equation and the second equation, we have

$$\begin{pmatrix} 1 & 1 \\ 0.3 \times 10^{-11} & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0.9 \\ 0.7 \end{pmatrix}.$$

By using Gaussian elimination with 10-decimal-digit floating point arithmetic again, we have

$$L = \begin{pmatrix} 1 & 0 \\ 0.3 \times 10^{-11} & 1 \end{pmatrix}, \qquad U = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

Then the computed solution is

$$\tilde{x} = (0.2000000000,\ 0.7000000000)^T,$$

which is very good. So we need to introduce permutations into Gaussian elimination. We first define a permutation matrix.
Definition 2.1 A permutation matrix $P$ is an identity matrix with permuted rows.

The important properties of permutation matrices are included in the following lemma. Its proof is straightforward.

Lemma 2.1 Let $P, P_1, P_2 \in \mathbb{R}^{n \times n}$ be permutation matrices and $X \in \mathbb{R}^{n \times n}$. Then:

(i) $PX$ is the same as $X$ with its rows permuted; $XP$ is the same as $X$ with its columns permuted.

(ii) $P^{-1} = P^T$.

(iii) $\det(P) = \pm 1$.

(iv) $P_1 P_2$ is also a permutation matrix.

Now we introduce the main theorem of this section.

Theorem 2.3 If $A$ is nonsingular, then there exist permutation matrices $P_1$ and $P_2$, a unit lower triangular matrix $L$, and a nonsingular upper triangular matrix $U$ such that

$$P_1 A P_2 = LU.$$

Only one of $P_1$ and $P_2$ is necessary.
Proof: We use induction on the dimension $n$. For $n = 1$, it is obviously true. Assume that the statement is true for $n-1$. If $A$ is nonsingular, then it has a nonzero entry. Choose permutation matrices $P_1$ and $P_2$ such that the $(1,1)$ entry of $P_1 A P_2$ is nonzero. Now we write a desired factorization and solve for the unknown components:

$$P_1 A P_2 = \begin{pmatrix} a_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ L_{21} & I \end{pmatrix} \begin{pmatrix} u_{11} & U_{12} \\ 0 & \tilde{A}_{22} \end{pmatrix} = \begin{pmatrix} u_{11} & U_{12} \\ L_{21} u_{11} & L_{21} U_{12} + \tilde{A}_{22} \end{pmatrix}, \qquad (2.5)$$

where $A_{22}$, $\tilde{A}_{22}$ are $(n-1)$-by-$(n-1)$ matrices, and $L_{21}$, $U_{12}^T$ are $(n-1)$-by-$1$ matrices. Solving for the components of this 2-by-2 block factorization, we get

$$u_{11} = a_{11} \neq 0, \qquad U_{12} = A_{12},$$

and

$$L_{21} u_{11} = A_{21}, \qquad A_{22} = L_{21} U_{12} + \tilde{A}_{22}.$$

Therefore, we obtain

$$L_{21} = \frac{A_{21}}{a_{11}}, \qquad \tilde{A}_{22} = A_{22} - L_{21} U_{12}.$$

We want to apply induction to $\tilde{A}_{22}$, but to do so we need to check that $\det(\tilde{A}_{22}) \neq 0$. Since

$$\det(P_1 A P_2) = \pm\det(A) \neq 0$$
Algorithm 2.2 (Gaussian elimination with complete pivoting)

for k = 1 : n-1
    choose p, q (k <= p, q <= n) such that
        |A(p, q)| = max{|A(i, j)| : i = k:n, j = k:n}
    A(k, 1:n) <-> A(p, 1:n)
    A(1:n, k) <-> A(1:n, q)
    if A(k, k) != 0
        A(k+1:n, k) = A(k+1:n, k)/A(k, k)
        A(k+1:n, k+1:n) = A(k+1:n, k+1:n) - A(k+1:n, k)A(k, k+1:n)
    else
        stop
    end
end
We remark that although the LU factorization with complete pivoting can overcome some shortcomings of the LU factorization without pivoting, the cost of complete pivoting is very high: it usually requires $O(n^3)$ comparisons of matrix entries for pivoting.

In order to reduce the cost of pivoting, the LU factorization with partial pivoting is proposed. In partial pivoting, at the $k$-th step, we choose $a_{pk}^{(k-1)}$ from the submatrix $A_{22}^{(k-1)}$ which satisfies

$$|a_{pk}^{(k-1)}| = \max\left\{|a_{ik}^{(k-1)}| : k \le i \le n\right\}.$$

When $A$ is nonsingular, the LU factorization with partial pivoting can be carried out until we finally obtain

$$PA = LU.$$

In this algorithm, the cost of comparisons of matrix entries for pivoting is $O(n^2)$. We have
Algorithm 2.3 (Gaussian elimination with partial pivoting)

for k = 1 : n-1
    choose p (k <= p <= n) such that
        |A(p, k)| = max{|A(i, k)| : i = k:n}
    A(k, 1:n) <-> A(p, 1:n)
    if A(k, k) != 0
        A(k+1:n, k) = A(k+1:n, k)/A(k, k)
        A(k+1:n, k+1:n) = A(k+1:n, k+1:n) - A(k+1:n, k)A(k, k+1:n)
    else
        stop
    end
end
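A compact NumPy version of Algorithm 2.3 (a sketch with our own function name; we return the permutation explicitly so that the identity $PA = LU$ from the discussion above can be verified):

```python
import numpy as np

def lu_partial_pivot(A):
    """Algorithm 2.3: Gaussian elimination with partial pivoting.

    Returns P, L, U with P @ A = L @ U; multipliers satisfy |l_ij| <= 1."""
    A = A.astype(float).copy()
    n = A.shape[0]
    perm = np.arange(n)
    for k in range(n - 1):
        p = k + int(np.argmax(np.abs(A[k:, k])))   # largest entry in column k
        if A[p, k] == 0.0:
            break                                   # the "stop" branch of the algorithm
        A[[k, p], :] = A[[p, k], :]                 # swap whole rows: A(k,1:n) <-> A(p,1:n)
        perm[[k, p]] = perm[[p, k]]
        A[k+1:, k] /= A[k, k]
        A[k+1:, k+1:] -= np.outer(A[k+1:, k], A[k, k+1:])
    P = np.eye(n)[perm]
    return P, np.tril(A, -1) + np.eye(n), np.triu(A)

B = np.array([[0.3e-11, 1.0], [1.0, 1.0]])
P, L, U = lu_partial_pivot(B)
```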
2.3 Cholesky factorization

Let $A \in \mathbb{R}^{n \times n}$ be symmetric positive definite, i.e., $A = A^T$ and $x^T A x > 0$ for all $x \neq 0$ in $\mathbb{R}^n$. We have

Theorem 2.4 Let $A \in \mathbb{R}^{n \times n}$ be symmetric positive definite. Then there exists a lower triangular matrix $L \in \mathbb{R}^{n \times n}$ with positive diagonal entries such that

$$A = LL^T.$$
This factorization is called the Cholesky factorization.
Proof: Since $A$ is positive definite, all the leading principal submatrices of $A$ are positive definite. By Theorem 2.2, there exist a unit lower triangular matrix $\tilde{L}$ and an upper triangular matrix $U$ such that

$$A = \tilde{L}U.$$

Let

$$D = \operatorname{diag}(u_{11}, \ldots, u_{nn}), \qquad \tilde{U} = D^{-1}U,$$

where $u_{ii} > 0$ for $i = 1, \ldots, n$. Then we have $\tilde{U}^T D \tilde{L}^T = A^T = A = \tilde{L} D \tilde{U}$. Therefore,

$$\tilde{L}^T \tilde{U}^{-1} = D^{-1} \tilde{U}^{-T} \tilde{L} D.$$

We note that $\tilde{L}^T \tilde{U}^{-1}$ is a unit upper triangular matrix and $D^{-1} \tilde{U}^{-T} \tilde{L} D$ is a lower triangular matrix. Hence

$$\tilde{L}^T \tilde{U}^{-1} = I = D^{-1} \tilde{U}^{-T} \tilde{L} D,$$

which implies $\tilde{U} = \tilde{L}^T$. Thus

$$A = \tilde{L} D \tilde{L}^T.$$

Let $L = \tilde{L}\operatorname{diag}(\sqrt{u_{11}}, \ldots, \sqrt{u_{nn}})$. We finally have

$$A = LL^T.$$
Thus, when a matrix $A$ is symmetric positive definite, we can find the solution of the system $Ax = b$ by the following three steps:

(1) Compute the Cholesky factorization $A = LL^T$.

(2) Solve $Ly = b$ for $y$.

(3) Solve $L^T x = y$ for $x$.
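Steps (2) and (3) are forward and back substitution, the triangular solvers of Section 2.1.1. A minimal sketch with our own helper names:

```python
import numpy as np

def forward_sub(L, b):
    """Solve L y = b for a nonsingular lower triangular L."""
    n = L.shape[0]
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_sub(U, c):
    """Solve U x = c for a nonsingular upper triangular U."""
    n = U.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

# The three steps for an SPD system A x = b, given the Cholesky factor L:
L = np.array([[2.0, 0.0],
              [1.0, 3.0]])          # so A = L L^T = [[4, 2], [2, 10]]
b = np.array([2.0, 7.0])
y = forward_sub(L, b)               # step (2)
x = back_sub(L.T, y)                # step (3)
```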
From Theorem 2.4, we know that no pivoting is needed in the Cholesky factorization. Also, we can calculate $L$ directly by comparing the corresponding entries on the two sides of $A = LL^T$. We have the following algorithm.
Algorithm 2.4 (Cholesky factorization)

for k = 1 : n
    A(k, k) = sqrt(A(k, k))
    A(k+1:n, k) = A(k+1:n, k)/A(k, k)
    for j = k+1 : n
        A(j:n, j) = A(j:n, j) - A(j:n, k)A(j, k)
    end
end
The operation cost of the Cholesky factorization is $n^3/3$.
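Algorithm 2.4 translates directly into NumPy (a sketch; the test matrix is built as $B^T B + 5I$, which is one standard way to manufacture a symmetric positive definite example):

```python
import numpy as np

def cholesky_lower(A):
    """Algorithm 2.4: overwrite the lower triangle of a copy of A with L, A = L L^T."""
    A = A.astype(float).copy()
    n = A.shape[0]
    for k in range(n):
        A[k, k] = np.sqrt(A[k, k])
        A[k+1:, k] /= A[k, k]
        for j in range(k + 1, n):
            A[j:, j] -= A[j:, k] * A[j, k]
    return np.tril(A)

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = B.T @ B + 5.0 * np.eye(5)   # symmetric positive definite
L = cholesky_lower(A)
```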
-
8/12/2019 Numerical Linear Algebra Applications Jin
32/196
22 CHAPTER 2. DIRECT METHODS FOR LINEAR SYSTEMS
Exercises:

1. Let $S, T \in \mathbb{R}^{n \times n}$ be upper triangular matrices such that $(ST - I)x = b$ is a nonsingular system. Find an algorithm of $O(n^2)$ operations for computing $x$.

2. Show that the $LDL^T$ factorization of a symmetric positive definite matrix $A$ is unique.

3. Let $A \in \mathbb{R}^{n \times n}$ be symmetric positive definite. Find an algorithm for computing an upper triangular matrix $U \in \mathbb{R}^{n \times n}$ such that $A = UU^T$.

4. Let $A = [a_{ij}] \in \mathbb{R}^{n \times n}$ be a strictly diagonally dominant matrix, i.e.,

$$|a_{kk}| > \sum_{j=1,\, j \neq k}^{n} |a_{kj}|, \qquad k = 1, 2, \ldots, n.$$

Prove that a strictly diagonally dominant matrix is nonsingular, and that a strictly diagonally dominant symmetric matrix with positive diagonal entries is positive definite.

5. Let

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$

with $A_{11}$ being a $k$-by-$k$ nonsingular matrix. Then

$$S = A_{22} - A_{21} A_{11}^{-1} A_{12}$$

is called the Schur complement of $A_{11}$ in $A$. Show that after $k$ steps of Gaussian elimination without pivoting, $A_{22}^{(k-1)} = S$.

6. Let $A$ be a symmetric positive definite matrix. At the end of the first step of Gaussian elimination, we have

$$\begin{pmatrix} a_{11} & a_1^T \\ 0 & \tilde{A}_{22} \end{pmatrix}.$$

Prove that $\tilde{A}_{22}$ is also symmetric positive definite.

7. Let $A = [a_{ij}] \in \mathbb{R}^{n \times n}$ be a strictly diagonally dominant matrix. After one step of Gaussian elimination, we have

$$\begin{pmatrix} a_{11} & a_1^T \\ 0 & \tilde{A}_{22} \end{pmatrix}.$$

Show that $\tilde{A}_{22}$ is also strictly diagonally dominant.

8. Show that if $PAQ = LU$ is obtained via Gaussian elimination with pivoting, then $|u_{ii}| \ge |u_{ij}|$ for $j = i+1, \ldots, n$.

9. Let $H = A + iB$ be a Hermitian positive definite matrix, where $A, B \in \mathbb{R}^{n \times n}$.
Chapter 3
Perturbation and Error Analysis
In this chapter, we will discuss the effects of perturbations and errors on numerical solutions. The error analysis of floating point operations and of the partial pivoting technique is also given. It is well known that the essential notions of distance and size in linear vector spaces are captured by norms. We therefore need to introduce vector and matrix norms and study their properties before we develop our perturbation and error analysis.
3.1 Vector and matrix norms
We first introduce vector norms.
3.1.1 Vector norms
Let $x = (x_1, x_2, \ldots, x_n)^T \in \mathbb{R}^n$.

Definition 3.1 A vector norm on $\mathbb{R}^n$ is a function that assigns to each $x \in \mathbb{R}^n$ a real number $\|x\|$, called the norm of $x$, such that the following three properties are satisfied for all $x, y \in \mathbb{R}^n$ and all $\alpha \in \mathbb{R}$:

(i) $\|x\| > 0$ if $x \neq 0$, and $\|x\| = 0$ if and only if $x = 0$;

(ii) $\|\alpha x\| = |\alpha| \, \|x\|$;

(iii) $\|x + y\| \le \|x\| + \|y\|$.

A useful class of vector norms is the $p$-norm defined by

$$\|x\|_p \equiv \left(\sum_{i=1}^n |x_i|^p\right)^{1/p},$$
where $1 \le p \le \infty$. The following $p$-norms are the most commonly used norms in practice:

$$\|x\|_1 = \sum_{i=1}^n |x_i|, \qquad \|x\|_2 = \left(\sum_{i=1}^n |x_i|^2\right)^{1/2}, \qquad \|x\|_\infty = \max_{1 \le i \le n} |x_i|.$$

The Cauchy–Schwarz inequality concerning $\|\cdot\|_2$ is given as follows:

$$|x^T y| \le \|x\|_2 \|y\|_2$$

for $x, y \in \mathbb{R}^n$, which is a special case of the Hölder inequality

$$|x^T y| \le \|x\|_p \|y\|_q, \qquad 1/p + 1/q = 1.$$

A very important property of vector norms on $\mathbb{R}^n$ is that all vector norms on $\mathbb{R}^n$ are equivalent, as the following theorem says; see [35].

Theorem 3.1 If $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$ are two norms on $\mathbb{R}^n$, then there exist two positive constants $c_1$ and $c_2$ such that

$$c_1 \|x\|_\alpha \le \|x\|_\beta \le c_2 \|x\|_\alpha$$

for all $x \in \mathbb{R}^n$.
For example, if $x \in \mathbb{R}^n$, then we have

$$\|x\|_2 \le \|x\|_1 \le \sqrt{n}\,\|x\|_2, \qquad \|x\|_\infty \le \|x\|_2 \le \sqrt{n}\,\|x\|_\infty, \qquad \|x\|_\infty \le \|x\|_1 \le n\,\|x\|_\infty.$$
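These equivalence constants are easy to confirm numerically (a sketch with an arbitrary random vector):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 7
x = rng.standard_normal(n)
n1, n2, ninf = (np.linalg.norm(x, 1),
                np.linalg.norm(x, 2),
                np.linalg.norm(x, np.inf))
```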
We remark that for any sequence of vectors $\{x_k\}$, where $x_k = (x_1^{(k)}, \ldots, x_n^{(k)})^T \in \mathbb{R}^n$, and $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$, by Theorem 3.1, one can prove that

$$\lim_{k \to \infty} \|x_k - x\| = 0 \iff \lim_{k \to \infty} |x_i^{(k)} - x_i| = 0, \quad i = 1, \ldots, n.$$
3.1.2 Matrix norms

Let $A = [a_{ij}]_{i,j=1}^n \in \mathbb{R}^{n \times n}$. We now turn our attention to matrix norms.

Definition 3.2 A matrix norm is a function that assigns to each $A \in \mathbb{R}^{n \times n}$ a real number $\|A\|$, called the norm of $A$, such that the following four properties are satisfied for all $A, B \in \mathbb{R}^{n \times n}$ and all $\alpha \in \mathbb{R}$:

(i) $\|A\| > 0$ if $A \neq 0$, and $\|A\| = 0$ if and only if $A = 0$;

(ii) $\|\alpha A\| = |\alpha| \, \|A\|$;

(iii) $\|A + B\| \le \|A\| + \|B\|$;

(iv) $\|AB\| \le \|A\| \, \|B\|$.

An important property of matrix norms on $\mathbb{R}^{n \times n}$ is that all matrix norms on $\mathbb{R}^{n \times n}$ are equivalent. For the relation between a vector norm and a matrix norm, we have

Definition 3.3 If a matrix norm $\|\cdot\|_M$ and a vector norm $\|\cdot\|_v$ satisfy

$$\|Ax\|_v \le \|A\|_M \|x\|_v$$

for $A \in \mathbb{R}^{n \times n}$ and $x \in \mathbb{R}^n$, then these norms are called mutually consistent.

For any vector norm $\|\cdot\|_v$, we can define a matrix norm in the following natural way:

$$\|A\|_M \equiv \max_{x \neq 0} \frac{\|Ax\|_v}{\|x\|_v} = \max_{\|x\|_v = 1} \|Ax\|_v.$$
The most important matrix norms are the matrix $p$-norms induced by the vector $p$-norms for $p = 1, 2, \infty$. We have the following theorem.

Theorem 3.2 Let $A = [a_{ij}]_{i,j=1}^n \in \mathbb{R}^{n \times n}$. Then we have:

(i) $\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^n |a_{ij}|$;

(ii) $\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}|$;

(iii) $\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)}$, where $\lambda_{\max}(A^T A)$ is the largest eigenvalue of $A^T A$.

Proof: We only give the proofs of (i) and (iii). In the following, we always assume that $A \neq 0$.
For (i), we partition the matrix $A$ by columns:

$$A = [a_1, \ldots, a_n].$$

Let

$$\nu = \|a_{j_0}\|_1 = \max_{1 \le j \le n} \|a_j\|_1.$$

Then for any vector $x \in \mathbb{R}^n$ which satisfies $\|x\|_1 = \sum_{i=1}^n |x_i| = 1$, we have

$$\|Ax\|_1 = \Big\|\sum_{j=1}^n x_j a_j\Big\|_1 \le \sum_{j=1}^n |x_j| \, \|a_j\|_1 \le \Big(\sum_{j=1}^n |x_j|\Big) \max_{1 \le j \le n} \|a_j\|_1 = \|a_{j_0}\|_1 = \nu.$$

Let $e_{j_0}$ denote the $j_0$-th unit vector; then

$$\|A e_{j_0}\|_1 = \|a_{j_0}\|_1 = \nu.$$

Therefore,

$$\|A\|_1 = \max_{\|x\|_1 = 1} \|Ax\|_1 = \nu = \max_{1 \le j \le n} \|a_j\|_1 = \max_{1 \le j \le n} \sum_{i=1}^n |a_{ij}|.$$
For (iii), we have

$$\|A\|_2 = \max_{\|x\|_2 = 1} \|Ax\|_2 = \max_{\|x\|_2 = 1} \big[(Ax)^T (Ax)\big]^{1/2} = \max_{\|x\|_2 = 1} \big[x^T (A^T A) x\big]^{1/2}.$$

Since $A^T A$ is positive semi-definite, its eigenvalues can be assumed to be in the following order:

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0.$$
Let $v_1, v_2, \ldots, v_n \in \mathbb{R}^n$ denote the orthonormal eigenvectors corresponding to $\lambda_1, \lambda_2, \ldots, \lambda_n$, respectively. Then for any vector $x \in \mathbb{R}^n$ with $\|x\|_2 = 1$, we have

$$x = \sum_{i=1}^n \alpha_i v_i, \qquad \sum_{i=1}^n \alpha_i^2 = 1.$$

Therefore,

$$x^T A^T A x = \sum_{i=1}^n \lambda_i \alpha_i^2 \le \lambda_1.$$

On the other hand, letting $x = v_1$, we have

$$x^T A^T A x = v_1^T A^T A v_1 = \lambda_1 v_1^T v_1 = \lambda_1.$$

Thus

$$\|A\|_2 = \max_{\|x\|_2 = 1} \|Ax\|_2 = \sqrt{\lambda_1} = \sqrt{\lambda_{\max}(A^T A)}.$$
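The three formulas of Theorem 3.2 can be checked against NumPy's built-in norms (a sketch; `np.linalg.norm(A, 1)`, `np.linalg.norm(A, np.inf)` and `np.linalg.norm(A, 2)` compute exactly these induced norms):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))

one_norm = np.abs(A).sum(axis=0).max()                  # (i): max column sum
inf_norm = np.abs(A).sum(axis=1).max()                  # (ii): max row sum
two_norm = np.sqrt(np.linalg.eigvalsh(A.T @ A).max())   # (iii): sqrt of lambda_max(A^T A)
```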
We have the following theorem for the norm $\|\cdot\|_2$.

Theorem 3.3 Let $A \in \mathbb{R}^{n \times n}$. Then we have:

(i) $\|A\|_2 = \max_{\|x\|_2 = 1} \max_{\|y\|_2 = 1} |y^* A x|$, where $x, y \in \mathbb{C}^n$;

(ii) $\|A^T\|_2 = \|A\|_2 = \sqrt{\|A^T A\|_2}$;

(iii) $\|A\|_2 = \|QAZ\|_2$ for any orthogonal matrices $Q$ and $Z$. We recall that a matrix $M \in \mathbb{R}^{n \times n}$ is called orthogonal if $M^{-1} = M^T$.

Proof: We only prove (i). We first introduce the dual norm $\|\cdot\|_D$ of a vector norm $\|\cdot\|$, defined as follows:

$$\|y\|_D = \max_{\|x\| = 1} |y^* x|.$$

For $\|\cdot\|_2$, we have by the Cauchy–Schwarz inequality

$$|y^* x| \le \|y\|_2 \|x\|_2,$$

with equality when $x = \frac{1}{\|y\|_2}\, y$. Therefore, the dual norm of $\|\cdot\|_2$ is given by

$$\|y\|_2^D = \max_{\|x\|_2 = 1} |y^* x| = \max_{\|x\|_2 = 1} \|y\|_2 \|x\|_2 = \|y\|_2.$$
So $\|\cdot\|_2$ is its own dual. Now, we consider

$$\|A\|_2 = \max_{\|x\|_2 = 1} \|Ax\|_2 = \max_{\|x\|_2 = 1} \|Ax\|_2^D = \max_{\|x\|_2 = 1} \max_{\|y\|_2 = 1} |(Ax)^* y| = \max_{\|x\|_2 = 1} \max_{\|y\|_2 = 1} |y^* A x|.$$

Another useful norm is the Frobenius norm, which is defined by

$$\|A\|_F \equiv \left(\sum_{j=1}^n \sum_{i=1}^n |a_{ij}|^2\right)^{1/2}.$$

One of the most important properties of $\|\cdot\|_F$ is that for any orthogonal matrices $Q$ and $Z$,

$$\|A\|_F = \|QAZ\|_F.$$

In the following, we will extend our discussion of norms to the field $\mathbb{C}$. We remark that from the viewpoint of norms, there is no essential difference between matrices or vectors over $\mathbb{R}$ and matrices or vectors over $\mathbb{C}$.
Definition 3.4 Let $A \in \mathbb{C}^{n \times n}$. Then the set of all the eigenvalues of $A$ is called the spectrum of $A$, and

$$\rho(A) = \max\{|\lambda| : \lambda \text{ belongs to the spectrum of } A\}$$

is called the spectral radius of $A$.

For the relation between the spectral radius and matrix norms, we have

Theorem 3.4 Let $A \in \mathbb{C}^{n \times n}$. Then:

(i) for any matrix norm $\|\cdot\|$, we have $\rho(A) \le \|A\|$;

(ii) for any $\epsilon > 0$, there exists a norm $\|\cdot\|$ defined on $\mathbb{C}^{n \times n}$ such that $\|A\| \le \rho(A) + \epsilon$.
Proof: For (i), let $x \in \mathbb{C}^n$ satisfy $x \neq 0$, $Ax = \lambda x$, $|\lambda| = \rho(A)$. Then we have

$$\rho(A) \|x e_1^T\| = \|\lambda x e_1^T\| = \|A x e_1^T\| \le \|A\| \, \|x e_1^T\|.$$

Hence $\rho(A) \le \|A\|$.

For (ii), by using Theorem 1.1 (Jordan Decomposition Theorem), we know that there is a nonsingular matrix $X \in \mathbb{C}^{n \times n}$ such that

$$X^{-1} A X = \begin{pmatrix}
\lambda_1 & \delta_1 & & \\
& \lambda_2 & \ddots & \\
& & \ddots & \delta_{n-1} \\
& & & \lambda_n
\end{pmatrix},$$

where $\delta_i = 1$ or $0$. For any given $\epsilon > 0$, let

$$D_\epsilon = \operatorname{diag}(1, \epsilon, \epsilon^2, \ldots, \epsilon^{n-1});$$

then

$$D_\epsilon^{-1} X^{-1} A X D_\epsilon = \begin{pmatrix}
\lambda_1 & \epsilon\delta_1 & & \\
& \lambda_2 & \ddots & \\
& & \ddots & \epsilon\delta_{n-1} \\
& & & \lambda_n
\end{pmatrix}.$$

Now, define

$$\|G\| = \|D_\epsilon^{-1} X^{-1} G X D_\epsilon\|_\infty, \qquad G \in \mathbb{C}^{n \times n}.$$

It is easy to see that this matrix norm actually is induced by the vector norm defined as follows:

$$\|x\|_{XD} = \|(X D_\epsilon)^{-1} x\|_\infty, \qquad x \in \mathbb{C}^n.$$

Therefore,

$$\|A\| = \|D_\epsilon^{-1} X^{-1} A X D_\epsilon\|_\infty = \max_{1 \le i \le n} \big(|\lambda_i| + \epsilon|\delta_i|\big) \le \rho(A) + \epsilon,$$

where $\delta_n = 0$.
We remark that for any sequence of matrices $\{A^{(k)}\}$, where $A^{(k)} = [a_{ij}^{(k)}] \in \mathbb{R}^{n \times n}$, and $A = [a_{ij}] \in \mathbb{R}^{n \times n}$,

$$\lim_{k \to \infty} \|A^{(k)} - A\| = 0 \iff \lim_{k \to \infty} a_{ij}^{(k)} = a_{ij}, \quad i, j = 1, \ldots, n.$$

Theorem 3.5 Let $A \in \mathbb{C}^{n \times n}$. Then

$$\lim_{k \to \infty} A^k = 0 \iff \rho(A) < 1.$$

Proof: We first assume that $\lim_{k \to \infty} A^k = 0$. Let $\lambda$ be an eigenvalue of $A$ such that $\rho(A) = |\lambda|$. Then $\lambda^k$ is an eigenvalue of $A^k$ for any $k$. By Theorem 3.4 (i), we know that for any $k$,

$$\rho(A)^k = |\lambda|^k = |\lambda^k| \le \rho(A^k) \le \|A^k\|.$$

Therefore,

$$\lim_{k \to \infty} \rho(A)^k = 0,$$

which implies $\rho(A) < 1$.

Conversely, assume that $\rho(A) < 1$. By Theorem 3.4 (ii), there exists a matrix norm $\|\cdot\|$ such that $\|A\| < 1$. Then $\|A^k\| \le \|A\|^k \to 0$ as $k \to \infty$, and hence $\lim_{k \to \infty} A^k = 0$.
Corollary 3.1 Let $\|\cdot\|$ be a norm defined on $\mathbb{C}^{n \times n}$ with $\|I\| = 1$, and let $A \in \mathbb{C}^{n \times n}$ satisfy $\|A\| < 1$. Then $I - A$ is nonsingular and

$$\|(I - A)^{-1}\| \le \frac{1}{1 - \|A\|}.$$

3.2 Perturbation analysis for linear systems

For a nonsingular matrix $A$, the condition number of $A$ is defined by

$$\kappa(A) \equiv \|A\| \, \|A^{-1}\|. \qquad (3.1)$$

Obviously, the condition number depends on the matrix norm used. When $\kappa(A)$ is small, then $A$ is said to be well-conditioned, whereas if $\kappa(A)$ is large, then $A$ is said to be ill-conditioned. Note that for any $p$-norm, we have

$$1 = \|I\| = \|A A^{-1}\| \le \|A\| \, \|A^{-1}\| = \kappa(A).$$
Let $\tilde{x}$ be an approximation of the exact solution $x$ of $Ax = b$. The error vector is defined as follows:

$$e = \tilde{x} - x,$$

i.e.,

$$\tilde{x} = x + e. \qquad (3.2)$$

The absolute error is given by $\|e\| = \|\tilde{x} - x\|$ for any vector norm. If $x \neq 0$, then the relative error is defined by

$$\frac{\|e\|}{\|x\|} = \frac{\|\tilde{x} - x\|}{\|x\|}.$$

We have, by substituting (3.2) into $Ax = b$,

$$A(x + e) = Ax + Ae = b + Ae.$$

Therefore, $A\tilde{x} = b + Ae \equiv \tilde{b}$; thus $\tilde{x}$ is the exact solution of $A\tilde{x} = \tilde{b}$, where $\tilde{b}$ is a perturbed vector of $b$. Since $x = A^{-1}b$ and $\tilde{x} = A^{-1}\tilde{b}$, we have

$$\|\tilde{x} - x\| = \|A^{-1}(\tilde{b} - b)\| \le \|A^{-1}\| \, \|\tilde{b} - b\|. \qquad (3.3)$$

Similarly, $\|b\| = \|Ax\| \le \|A\| \, \|x\|$, i.e.,

$$\frac{1}{\|x\|} \le \frac{\|A\|}{\|b\|}. \qquad (3.4)$$

Combining (3.3), (3.4) and (3.1), we obtain the following theorem, which gives the effect of perturbations of the vector $b$ on the solution of $Ax = b$ in terms of the condition number.
Theorem 3.7 Let $\tilde{x}$ be an approximate solution of the exact solution $x$ of $Ax = b$. Then

$$\frac{\|\tilde{x} - x\|}{\|x\|} \le \kappa(A)\, \frac{\|\tilde{b} - b\|}{\|b\|}.$$
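Theorem 3.7 can be observed numerically. The sketch below builds a matrix with $\kappa_2(A) \approx 10^6$ (an arbitrary construction of ours: an orthogonal factor times a graded diagonal), perturbs $b$, and checks that the relative error never exceeds the condition-number bound:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
Q = np.linalg.qr(rng.standard_normal((n, n)))[0]
A = Q @ np.diag(np.logspace(0, -6, n))      # singular values from 1 down to 1e-6
x = rng.standard_normal(n)
b = A @ x
db = 1e-10 * rng.standard_normal(n)         # perturbation of b
x_tilde = np.linalg.solve(A, b + db)

kappa = np.linalg.cond(A, 2)
rel_err = np.linalg.norm(x_tilde - x) / np.linalg.norm(x)
bound = kappa * np.linalg.norm(db) / np.linalg.norm(b)
```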
The next theorem gives the effect of perturbations of the coefficient matrix $A$ on the solution of $Ax = b$ in terms of the condition number.

Theorem 3.8 Let $A$ be a nonsingular matrix and $\tilde{A} = A + E$ be a perturbed matrix of $A$ such that $\|E\| \, \|A^{-1}\| < 1$. Let $x$ and $\tilde{x}$ satisfy $Ax = b$ and $\tilde{A}\tilde{x} = b + \delta b$, respectively. Then

$$\frac{\|\tilde{x} - x\|}{\|x\|} \le \frac{\kappa(A)}{1 - \kappa(A)\frac{\|E\|}{\|A\|}} \left(\frac{\|E\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|}\right).$$

Proof: Since $\|A^{-1}E\| \le \|A^{-1}\| \, \|E\| < 1$, by Corollary 3.1 the matrix $I + A^{-1}E$ is nonsingular and $\|(I + A^{-1}E)^{-1}\| \le (1 - \|A^{-1}E\|)^{-1}$. Subtracting $Ax = b$ from $(A + E)\tilde{x} = b + \delta b$ gives $(A + E)(\tilde{x} - x) = \delta b - Ex$, i.e.,

$$\tilde{x} - x = (I + A^{-1}E)^{-1} A^{-1} (\delta b - Ex).$$

Taking norms and using (3.4), we get

$$\frac{\|\tilde{x} - x\|}{\|x\|} \le \big(1 - \|A^{-1}E\|\big)^{-1} \left(\|A^{-1}E\| + \kappa(A)\frac{\|\delta b\|}{\|b\|}\right).$$

By using

$$\|A^{-1}E\| \le \|A^{-1}\| \, \|E\| = \kappa(A)\frac{\|E\|}{\|A\|},$$

we finally have

$$\frac{\|\tilde{x} - x\|}{\|x\|} \le \frac{\kappa(A)}{1 - \kappa(A)\frac{\|E\|}{\|A\|}} \left(\frac{\|E\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|}\right).$$
Theorems 3.7 and 3.8 give upper bounds for the relative error of $\tilde{x}$ in terms of the condition number of $A$. From Theorems 3.7 and 3.8, we know that if $A$ is well-conditioned, i.e., $\kappa(A)$ is small, the relative error in $\tilde{x}$ will be small whenever the relative errors in both $A$ and $b$ are small.
Corollary 3.2 Let $\|\cdot\|$ be any matrix norm with $\|I\| = 1$, and let $A$ be a nonsingular matrix with $A + \delta A$ being a perturbed matrix of $A$ such that

$$r \equiv \|A^{-1}\| \, \|\delta A\| < 1.$$

Then $A + \delta A$ is nonsingular and

$$\frac{\|(A + \delta A)^{-1} - A^{-1}\|}{\|A^{-1}\|} \le \frac{\kappa(A)}{1 - \kappa(A)\frac{\|\delta A\|}{\|A\|}} \cdot \frac{\|\delta A\|}{\|A\|}.$$

Proof: Since $\|A^{-1}\delta A\| \le \|A^{-1}\| \, \|\delta A\| = r < 1$, by Corollary 3.1, $A + \delta A = A(I + A^{-1}\delta A)$ is nonsingular and

$$\|(A + \delta A)^{-1}\| \le \frac{\|A^{-1}\|}{1 - r}.$$

By using the identity

$$B^{-1} = A^{-1} - B^{-1}(B - A)A^{-1}$$

with $B = A + \delta A$, we have

$$(A + \delta A)^{-1} - A^{-1} = -(A + \delta A)^{-1}\, \delta A \, A^{-1}.$$

Then

$$\|(A + \delta A)^{-1} - A^{-1}\| \le \|A^{-1}\| \, \|\delta A\| \, \|(A + \delta A)^{-1}\| \le \frac{\|A^{-1}\|^2 \, \|\delta A\|}{1 - r}.$$

Finally, we obtain

$$\frac{\|(A + \delta A)^{-1} - A^{-1}\|}{\|A^{-1}\|} \le \frac{\|A^{-1}\| \, \|\delta A\|}{1 - r} = \frac{\kappa(A)}{1 - \kappa(A)\frac{\|\delta A\|}{\|A\|}} \cdot \frac{\|\delta A\|}{\|A\|}.$$
3.3 Error analysis on floating point arithmetic

In computers, floating point numbers $f$ are expressed as

$$f = \pm \beta^J \cdot \alpha, \qquad L \le J \le U,$$

where $\beta$ is the base, $J$ is the exponent, and $\alpha$ is the fraction. Usually, $\alpha$ has the following form:

$$\alpha = 0.d_1 d_2 \cdots d_t,$$

where $t$ is the length (precision) of $\alpha$, $d_1 \neq 0$, and $0 \le d_i < \beta$ for $i = 2, \ldots, t$. Let

$$F = \{0\} \cup \{f : f = \pm\beta^J \cdot 0.d_1 d_2 \cdots d_t,\ 0 \le d_i < \beta,\ d_1 \neq 0,\ L \le J \le U\}.$$

Then $F$ contains

$$2(\beta - 1)\beta^{t-1}(U - L + 1) + 1$$

floating point numbers. These numbers are symmetrically distributed in the intervals $[m, M]$ and $[-M, -m]$, where

$$m = \beta^{L-1}, \qquad M = \beta^U(1 - \beta^{-t}). \qquad (3.5)$$

We remark that $F$ is only a finite set, which cannot contain all the real numbers in these two intervals.

Let $fl(x)$ denote the floating point representation of any real number $x$. Then

$$fl(x) = 0, \qquad \text{for } x = 0.$$
If $m \le |x| \le M$, by rounding, $fl(x)$ is the floating point number attaining

$$|fl(x) - x| = \min_{f \in F} |f - x|.$$

By chopping, $fl(x)$ is the floating point number attaining

$$|fl(x) - x| = \min_{f \in F,\ |f| \le |x|} |f - x|.$$

For example, let $\beta = 10$, $t = 3$, $L = 0$ and $U = 2$. We consider the floating point expression of $x = 5.45627$. By rounding, we have $fl(x) = 0.546 \times 10$. By chopping, we have $fl(x) = 0.545 \times 10$. The following theorem gives an estimate of the relative error of floating point representations.

Theorem 3.9 Let $m \le |x| \le M$, where $m$ and $M$ are defined by (3.5). Then

$$fl(x) = x(1 + \delta), \qquad |\delta| \le u,$$

where $u$ is the machine precision, i.e.,

$$u = \begin{cases} \frac{1}{2}\beta^{1-t}, & \text{by rounding},\\[2pt] \beta^{1-t}, & \text{by chopping}. \end{cases}$$
Proof: In the following, we assume that $x \neq 0$ and, without loss of generality, $x > 0$. Let $J$ be the integer satisfying

$$\beta^{J-1} \le x < \beta^J. \qquad (3.6)$$

Since the spacing of the floating point numbers in $[\beta^{J-1}, \beta^J)$ is $\beta^{J-t}$, for the rounding error we have, by (3.6),

$$|fl(x) - x| \le \frac{1}{2}\beta^{J-t} = \frac{1}{2}\beta^{J-1}\beta^{1-t} \le \frac{1}{2}x\,\beta^{1-t},$$

i.e.,

$$\frac{|fl(x) - x|}{x} \le \frac{1}{2}\beta^{1-t}.$$

For the chopping error, we have

$$|fl(x) - x| \le \beta^{J-t} = \beta^{J-1}\beta^{1-t} \le x\,\beta^{1-t},$$

i.e.,

$$\frac{|fl(x) - x|}{x} \le \beta^{1-t}.$$
Let $x = u$. By the left inequality of (3.9), we have

$$(1 + u)^n \le e^{nu}. \qquad (3.10)$$

Let $x = nu$. By the right inequality of (3.9), we have

$$e^{nu} \le 1 + 1.01nu. \qquad (3.11)$$

Combining (3.10) and (3.11), we have

$$(1 + u)^n \le 1 + 1.01nu. \qquad (3.12)$$

By (3.7), (3.8) and (3.12), the proof is complete.
We consider the following example.

Example 3.1. For given $x, y \in \mathbb{R}^n$, estimate the upper bound of $|fl(x^T y) - x^T y|$. Let

$$S_k = fl\Big(\sum_{i=1}^k x_i y_i\Big).$$

By Theorem 3.10, we have

$$S_1 = x_1 y_1 (1 + \gamma_1), \qquad |\gamma_1| \le u,$$

and

$$S_k = fl\big(S_{k-1} + fl(x_k y_k)\big) = \big[S_{k-1} + x_k y_k(1 + \gamma_k)\big](1 + \delta_k), \qquad |\gamma_k|, |\delta_k| \le u.$$

Therefore,

$$fl(x^T y) = S_n = \sum_{i=1}^n x_i y_i (1 + \gamma_i) \prod_{j=i}^n (1 + \delta_j) = \sum_{i=1}^n (1 + \epsilon_i)\, x_i y_i,$$

where

$$1 + \epsilon_i = (1 + \gamma_i) \prod_{j=i}^n (1 + \delta_j)$$

with $\delta_1 = 0$. Thus, if $nu \le 0.01$, we then have by Theorem 3.11,

$$|fl(x^T y) - x^T y| \le \sum_{i=1}^n |\epsilon_i| \, |x_i y_i| \le 1.01nu \sum_{i=1}^n |x_i y_i|.$$
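The bound of Example 3.1 can be tested directly by accumulating a dot product in single precision ($u = 2^{-24}$) and comparing against a double precision reference (a sketch; the sequential loop mirrors the recurrence for $S_k$):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
x = rng.standard_normal(n).astype(np.float32)
y = rng.standard_normal(n).astype(np.float32)

u = 2.0 ** -24                     # unit roundoff of IEEE single precision
s = np.float32(0.0)
for xi, yi in zip(x, y):           # S_k = fl(S_{k-1} + fl(x_k y_k))
    s = np.float32(s + xi * yi)

exact = np.dot(x.astype(np.float64), y.astype(np.float64))
bound = 1.01 * n * u * np.sum(np.abs(x.astype(np.float64) * y.astype(np.float64)))
err = abs(float(s) - exact)
```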
Before we finish this section, let us briefly discuss the floating point analysis of elementary matrix operations. We first introduce the following notation:

$$|E| = [\,|e_{ij}|\,], \qquad E = [e_{ij}] \in \mathbb{R}^{n \times n},$$

and $|E| \le |F|$ means $|e_{ij}| \le |f_{ij}|$ for $i, j = 1, 2, \ldots, n$. Let $A, B \in \mathbb{R}^{n \times n}$ be matrices with entries in $F$, and $\alpha \in F$. By Theorem 3.10, we have

$$fl(\alpha A) = \alpha A + E, \qquad |E| \le u|\alpha A|,$$

and

$$fl(A + B) = (A + B) + E, \qquad |E| \le u|A + B|.$$

From Example 3.1, we also have

$$fl(AB) = AB + E, \qquad |E| \le 1.01nu\,|A|\,|B|.$$

Note that $|A|\,|B|$ may be much larger than $|AB|$. Therefore the relative error of $AB$ may not be small.
3.4 Error analysis on partial pivoting

We will show that if Gaussian elimination with partial pivoting is used to solve $Ax = b$, then the computed solution $\hat{x}$ satisfies

$$(A + E)\hat{x} = b,$$

where $E$ is an error matrix. An upper bound on $E$ is also given. We first study the rounding error of the LU factorization of $A$.

Lemma 3.1 Let $A \in \mathbb{R}^{n \times n}$ with floating point entries. Assume that $A$ has an LU factorization and $6nu \le 1$, where $u$ is the machine precision. Then by using Gaussian elimination, we have $\hat{L}\hat{U} = A + E$, where

$$|E| \le 3nu\big(|A| + |\hat{L}|\,|\hat{U}|\big).$$
Proof: We use induction on $n$. Obviously, Lemma 3.1 is true for $n = 1$. Assume that the lemma holds for $n - 1$. Now, we consider a matrix $A \in \mathbb{R}^{n \times n}$:

$$A = \begin{pmatrix} \alpha & w^T \\ v & A_1 \end{pmatrix},$$

where $A_1 \in \mathbb{R}^{(n-1) \times (n-1)}$. At the first step of Gaussian elimination, we compute the vector $l_1 = fl(v/\alpha)$ and modify the matrix $A_1$ as

$$\tilde{A}_1 = fl\big(A_1 - fl(l_1 w^T)\big).$$

By Theorem 3.10, we have

$$l_1 = v/\alpha + f, \qquad |f| \le u\,\frac{|v|}{|\alpha|}, \qquad (3.13)$$

and

$$\tilde{A}_1 = A_1 - l_1 w^T + F, \qquad |F| \le (2 + u)u\big(|A_1| + |l_1|\,|w|^T\big). \qquad (3.14)$$

For $\tilde{A}_1$, by using the induction assumption, we obtain an LU factorization with a unit lower triangular matrix $\hat{L}_1$ and an upper triangular matrix $\hat{U}_1$ such that

$$\hat{L}_1 \hat{U}_1 = \tilde{A}_1 + E_1, \qquad |E_1| \le 3(n-1)u\big(|\tilde{A}_1| + |\hat{L}_1|\,|\hat{U}_1|\big).$$

Thus, we have

$$\hat{L}\hat{U} = \begin{pmatrix} 1 & 0 \\ l_1 & \hat{L}_1 \end{pmatrix} \begin{pmatrix} \alpha & w^T \\ 0 & \hat{U}_1 \end{pmatrix} = A + E,$$

where

$$E = \begin{pmatrix} 0 & 0 \\ \alpha f & E_1 + F \end{pmatrix}.$$

By using (3.14), we obtain

$$|\tilde{A}_1| \le (1 + 2u + u^2)\big(|A_1| + |l_1|\,|w|^T\big).$$

Therefore, by using the condition $6nu \le 1$, we have

$$\begin{aligned}
|E_1 + F| &\le |E_1| + |F|\\
&\le 3(n-1)u\big(|\tilde{A}_1| + |\hat{L}_1|\,|\hat{U}_1|\big) + (2+u)u\big(|A_1| + |l_1|\,|w|^T\big)\\
&\le 3(n-1)u\Big[(1 + 2u + u^2)\big(|A_1| + |l_1|\,|w|^T\big) + |\hat{L}_1|\,|\hat{U}_1|\Big] + (2+u)u\big(|A_1| + |l_1|\,|w|^T\big)\\
&\le u\big\{3n - 1 + [6n + 3(n-1)u - 5]u\big\}\big(|A_1| + |l_1|\,|w|^T\big) + 3(n-1)u\,|\hat{L}_1|\,|\hat{U}_1|\\
&\le 3nu\big(|A_1| + |l_1|\,|w|^T + |\hat{L}_1|\,|\hat{U}_1|\big).
\end{aligned}$$

Combining with (3.13), we obtain

$$|E| = \begin{pmatrix} 0 & 0 \\ |\alpha|\,|f| & |E_1 + F| \end{pmatrix} \le 3nu \begin{pmatrix} 0 & 0 \\ |v| & |A_1| + |l_1|\,|w|^T + |\hat{L}_1|\,|\hat{U}_1| \end{pmatrix} \le 3nu\left[\begin{pmatrix} |\alpha| & |w|^T \\ |v| & |A_1| \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ |l_1| & |\hat{L}_1| \end{pmatrix}\begin{pmatrix} |\alpha| & |w|^T \\ 0 & |\hat{U}_1| \end{pmatrix}\right] = 3nu\big(|A| + |\hat{L}|\,|\hat{U}|\big).$$

The proof is complete.
Corollary 3.3 Let $A \in \mathbb{R}^{n \times n}$ be nonsingular with floating point entries and $6nu \le 1$. Assume that by using Gaussian elimination with partial pivoting, we obtain

$$\hat{L}\hat{U} = PA + E,$$

where $\hat{L} = [l_{ij}]$ is a unit lower triangular matrix with $|l_{ij}| \le 1$, $\hat{U}$ is an upper triangular matrix, and $P$ is a permutation matrix. Then $E$ satisfies the following inequality:

$$|E| \le 3nu\big(|PA| + |\hat{L}|\,|\hat{U}|\big).$$
After we obtain the LU factorization of $A$, the problem of solving $Ax = b$ becomes the problem of solving the following two triangular systems:

$$Ly = Pb, \qquad Ux = y.$$

Therefore, we need to estimate the rounding error of solving triangular systems.

Lemma 3.2 Let $S \in \mathbb{R}^{n \times n}$ be a nonsingular triangular matrix with floating point entries and $1.01nu \le 0.01$. By using the method proposed in Section 2.1.1 to solve $Sx = b$, we obtain a computed solution $\hat{x}$ which satisfies

$$(S + H)\hat{x} = b, \qquad |H| \le 1.01nu\,|S|.$$

Proof: We use induction on $n$. Without loss of generality, let $S = L$ be a lower triangular matrix. Obviously, Lemma 3.2 is true for $n = 1$. Assume that the lemma is true for $n - 1$. Now, we consider a lower triangular matrix $L \in \mathbb{R}^{n \times n}$. Let $\hat{x}$ be the computed solution of $Lx = b$, and partition $L$, $b$ and $\hat{x}$ as follows:

$$L = \begin{pmatrix} l_{11} & 0 \\ l_1 & L_1 \end{pmatrix}, \qquad b = \begin{pmatrix} b_1 \\ c \end{pmatrix}, \qquad \hat{x} = \begin{pmatrix} \hat{x}_1 \\ \hat{y} \end{pmatrix},$$

where $c, \hat{y} \in \mathbb{R}^{n-1}$ and $L_1 \in \mathbb{R}^{(n-1) \times (n-1)}$. By Theorem 3.10, we have

$$\hat{x}_1 = fl(b_1/l_{11}) = \frac{b_1}{l_{11}(1 + \epsilon_1)}, \qquad |\epsilon_1| \le u. \qquad (3.15)$$

Note that $\hat{y}$ is the computed solution of the $(n-1)$-by-$(n-1)$ system

$$L_1 y = fl(c - \hat{x}_1 l_1).$$

By assumption, we have

$$(L_1 + H_1)\hat{y} = fl(c - \hat{x}_1 l_1),$$

where

$$|H_1| \le 1.01(n-1)u\,|L_1|. \qquad (3.16)$$

By Theorem 3.10 again, we obtain

$$fl(c - \hat{x}_1 l_1) = fl\big(c - fl(\hat{x}_1 l_1)\big) = (I + D')^{-1}\big(c - \hat{x}_1 l_1 - \hat{x}_1 D l_1\big),$$

where $D = \operatorname{diag}(\epsilon_2, \ldots, \epsilon_n)$, $D' = \operatorname{diag}(\epsilon_2', \ldots, \epsilon_n')$ with $|\epsilon_i|, |\epsilon_i'| \le u$, $i = 2, \ldots, n$. Therefore,

$$\hat{x}_1 l_1 + \hat{x}_1 D l_1 + (I + D')(L_1 + H_1)\hat{y} = c,$$

and then $(L + H)\hat{x} = b$, where

$$H = \begin{pmatrix} \epsilon_1 l_{11} & 0 \\ D l_1 & H_1 + D'(L_1 + H_1) \end{pmatrix}.$$

By using (3.15), (3.16) and the condition $1.01nu \le 0.01$, we have

$$\begin{aligned}
|H| &\le \begin{pmatrix} |\epsilon_1|\,|l_{11}| & 0 \\ |D|\,|l_1| & |H_1| + |D'|\big(|L_1| + |H_1|\big) \end{pmatrix} \le \begin{pmatrix} u|l_{11}| & 0 \\ u|l_1| & |H_1| + u\big(|L_1| + |H_1|\big) \end{pmatrix}\\
&\le u\begin{pmatrix} |l_{11}| & 0 \\ |l_1| & \big[1.01(n-1) + 1 + 1.01(n-1)u\big]|L_1| \end{pmatrix} \le 1.01nu\,|L|.
\end{aligned}$$
We then have the main theorem of this section.

Theorem 3.12 Let $A \in \mathbb{R}^{n \times n}$ be a nonsingular matrix with floating point entries and $1.01nu \le 0.01$. If Gaussian elimination with partial pivoting is used to solve $Ax = b$, then we obtain a computed solution $\hat{x}$ which satisfies

$$(A + \delta A)\hat{x} = b, \qquad \|\delta A\|_\infty \le u\big(3n + 5.04n^3\rho\big)\|A\|_\infty, \qquad (3.17)$$

with the growth factor

$$\rho \equiv \frac{1}{\|A\|_\infty} \max_{i,j,k} |a_{ij}^{(k)}|.$$

Proof: By using Gaussian elimination with partial pivoting, we have the following two triangular systems:

$$\hat{L}y = Pb, \qquad \hat{U}x = y.$$
By using Lemma 3.2, the computed solution $\hat{x}$ satisfies

$$(\hat{L} + F)(\hat{U} + G)\hat{x} = Pb,$$

i.e.,

$$(\hat{L}\hat{U} + F\hat{U} + \hat{L}G + FG)\hat{x} = Pb, \qquad (3.18)$$

where

$$|F| \le 1.01nu\,|\hat{L}|, \qquad |G| \le 1.01nu\,|\hat{U}|. \qquad (3.19)$$

Substituting $\hat{L}\hat{U} = PA + E$ into (3.18), we have

$$(A + \delta A)\hat{x} = b,$$

where

$$\delta A = P^T\big(E + F\hat{U} + \hat{L}G + FG\big).$$

By using (3.19), Corollary 3.3 and the condition $1.01nu \le 0.01$, we have

$$|\delta A| \le P^T\big(3nu\,|PA| + (3n + 2.04n)u\,|\hat{L}|\,|\hat{U}|\big) = nu\,P^T\big(3|PA| + 5.04|\hat{L}|\,|\hat{U}|\big). \qquad (3.20)$$

By Corollary 3.3 again, the absolute values of the entries of $\hat{L}$ are less than or equal to 1. Therefore, we have

$$\|\hat{L}\|_\infty \le n. \qquad (3.21)$$

With the growth factor $\rho$ defined above, we have

$$\|\hat{U}\|_\infty \le n\rho\,\|A\|_\infty. \qquad (3.22)$$

Substituting (3.21) and (3.22) into (3.20), we obtain (3.17). The proof is complete.

We remark that $\|\delta A\|$ is usually very small compared with the initial error in the given data. Thus, Gaussian elimination with partial pivoting is numerically stable.
Exercises:

1. Let

$$A = \begin{pmatrix} 1 & 0.999999 \\ 0.999999 & 1 \end{pmatrix}.$$

Compute $A^{-1}$, $\det(A)$ and the condition number of $A$.
2. Prove that $\|AB\|_F \le \|A\|_2\,\|B\|_F$ and $\|AB\|_F \le \|A\|_F\,\|B\|_2$.

3. Prove that $\|A\|_2^2 \le \|A\|_1\,\|A\|_\infty$ for any square matrix $A$.

4. Show that

$$\left\|\begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} \end{pmatrix}\right\|_2 \le \left\|\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\right\|_2.$$

5. Let $A$ be nonsingular. Show that

$$\|A^{-1}\|_2^{-1} = \min_{\|x\|_2 = 1} \|Ax\|_2.$$

6. Show that if $S$ is real and $S = -S^T$, then $I - S$ is nonsingular and the matrix

$$(I - S)^{-1}(I + S)$$

is orthogonal. This is known as the Cayley transform of $S$.

7. Prove that if both $A$ and $A + E$ are nonsingular, then

$$\|(A + E)^{-1} - A^{-1}\| \le \|A^{-1}\| \, \|(A + E)^{-1}\| \, \|E\|.$$

8. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $x, y, z \in \mathbb{R}^n$ be such that $Ax = b$ and $Ay = b + z$. Show that

$$\frac{\|z\|_2}{\|A\|_2} \le \|x - y\|_2 \le \|A^{-1}\|_2\,\|z\|_2.$$

9. Let $A = [a_{ij}]$ be an $m$-by-$n$ matrix. Define

$$|||A|||_l = \max_{i,j} |a_{ij}|.$$

Is $|||\cdot|||_l$ a matrix norm? Give a reason for your answer.

10. Show that if $X \in \mathbb{C}^{n \times n}$ is nonsingular, then $\|A\|_X = \|X^{-1}AX\|_2$ defines a matrix norm.

11. Let $A = LDL^T \in \mathbb{R}^{n \times n}$ be a symmetric positive definite matrix with

$$D = \operatorname{diag}(d_{11}, \ldots, d_{nn}).$$

Show that

$$\kappa_2(A) \ge \frac{\max_i \{d_{ii}\}}{\min_i \{d_{ii}\}}.$$

12. Verify that

$$\|xy^*\|_F = \|xy^*\|_2 = \|x\|_2\,\|y\|_2$$

for any $x, y \in \mathbb{C}^n$.

13. Show that if $0 \neq v \in \mathbb{R}^n$ and $E \in \mathbb{R}^{n \times n}$, then

$$\left\|E\Big(I - \frac{v v^T}{v^T v}\Big)\right\|_F^2 = \|E\|_F^2 - \frac{\|Ev\|_2^2}{v^T v}.$$
Chapter 4

Least Squares Problems

In this chapter, we study linear least squares problems:

$$\min_{y \in \mathbb{R}^n} \|Ay - b\|_2,$$

where the data matrix $A \in \mathbb{R}^{m \times n}$ with $m \ge n$ and the observation vector $b \in \mathbb{R}^m$ are given. We introduce some well-known orthogonal transformations and the QR decomposition for constructing efficient algorithms for these problems. For the literature on least squares problems, we refer to [15, 21, 42, 44, 45, 48].
4.1 Least squares problems

In practice, if we are given $m$ points $t_1, t_2, \ldots, t_m$ with data $y_1, y_2, \ldots, y_m$ on these points, and functions $\phi_1(t), \phi_2(t), \ldots, \phi_n(t)$ defined on these points, we then try to find $f(x, t)$ defined by

$$f(x, t) \equiv \sum_{j=1}^n x_j \phi_j(t)$$

such that the residuals defined by

$$r_i(x) \equiv y_i - f(x, t_i) = y_i - \sum_{j=1}^n x_j \phi_j(t_i), \qquad i = 1, 2, \ldots, m,$$

are as small as possible. In matrix form, we have

$$r(x) = b - Ax,$$

where

$$A = \begin{pmatrix} \phi_1(t_1) & \cdots & \phi_n(t_1) \\ \vdots & & \vdots \\ \phi_1(t_m) & \cdots & \phi_n(t_m) \end{pmatrix},$$
and $b = (y_1, \ldots, y_m)^T$, $x = (x_1, \ldots, x_n)^T$, $r(x) = (r_1(x), \ldots, r_m(x))^T$.

When $m = n$, we can require that $r(x) = 0$, and $x$ can be found by solving the system $Ax = b$. When $m > n$, we require that $r(x)$ reach its minimum under the norm $\|\cdot\|_2$. We therefore introduce the following definition of the least squares problem.

Definition 4.1 Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. Find $x \in \mathbb{R}^n$ such that

$$\|b - Ax\|_2 = \|r(x)\|_2 = \min_{y \in \mathbb{R}^n} \|r(y)\|_2 = \min_{y \in \mathbb{R}^n} \|b - Ay\|_2. \qquad (4.1)$$

This is called the least squares (LS) problem, and $r(x)$ is called the residual.
In the following, we only consider the case of

$$\operatorname{rank}(A) = n < m.$$

We first study the solutions $x$ of the following equation:

$$Ax = b, \qquad A \in \mathbb{R}^{m \times n}. \qquad (4.2)$$

The range of the matrix $A$ is defined by

$$R(A) \equiv \{y \in \mathbb{R}^m : y = Ax,\ x \in \mathbb{R}^n\}.$$

It is easy to see that

$$R(A) = \operatorname{span}\{a_1, \ldots, a_n\},$$

where $a_i$, $i = 1, \ldots, n$, are the column vectors of $A$. The nullspace of $A$ is defined by

$$N(A) \equiv \{x \in \mathbb{R}^n : Ax = 0\}.$$

The dimension of $N(A)$ is denoted by $\operatorname{null}(A)$. The orthogonal complement of a subspace $S \subseteq \mathbb{R}^n$ is defined by

$$S^\perp \equiv \{y \in \mathbb{R}^n : y^T x = 0 \text{ for all } x \in S\}.$$
We have the following theorems for (4.2).

Theorem 4.1 The equation (4.2) has a solution $\iff$ $\operatorname{rank}(A) = \operatorname{rank}([A, b])$.

Theorem 4.2 Let $x$ be a special solution of (4.2). Then the solution set of (4.2) is given by $x + N(A)$.
Corollary 4.1 Assume that the equation (4.2) has a solution. The solution is unique $\iff$ $\operatorname{null}(A) = 0$.

We have the following essential theorem for the solution of (4.1).

Theorem 4.3 The LS problem (4.1) always has solutions. The solution is unique if and only if $\operatorname{null}(A) = 0$.

Proof: Since

$$\mathbb{R}^m = R(A) \oplus R(A)^\perp,$$

the vector $b$ can be expressed uniquely as

$$b = b_1 + b_2,$$

where $b_1 \in R(A)$ and $b_2 \in R(A)^\perp$. For any $x \in \mathbb{R}^n$, since $b_1 - Ax \in R(A)$ is orthogonal to $b_2$, we therefore have

$$\|r(x)\|_2^2 = \|b - Ax\|_2^2 = \|(b_1 - Ax) + b_2\|_2^2 = \|b_1 - Ax\|_2^2 + \|b_2\|_2^2.$$

Note that $\|r(x)\|_2^2$ reaches its minimum if and only if $\|b_1 - Ax\|_2^2$ reaches its minimum. Since $b_1 \in R(A)$, $\|r(x)\|_2^2$ reaches its minimum if and only if

$$Ax = b_1,$$

i.e., $\|b_1 - Ax\|_2^2 = 0$. Thus, by Corollary 4.1, we know that the solution of $Ax = b_1$ is unique, i.e., the solution of (4.1) is unique, if and only if $\operatorname{null}(A) = 0$.
Let

$$X = \{x \in \mathbb{R}^n : x \text{ is a solution of (4.1)}\}.$$

We have

Theorem 4.4 A vector $x \in X$ if and only if

$$A^T A x = A^T b. \qquad (4.3)$$
Proof: Let $x \in X$. By Theorem 4.3, we know that $Ax = b_1$, where $b_1 \in R(A)$, and

$$r(x) = b - Ax = b - b_1 = b_2 \in R(A)^\perp.$$

Therefore,

$$A^T r(x) = A^T b_2 = 0.$$

Substituting $r(x) = b - Ax$ into $A^T r(x) = 0$, we obtain (4.3).

Conversely, let $x \in \mathbb{R}^n$ satisfy $A^T A x = A^T b$. Then for any $y \in \mathbb{R}^n$, we have

$$\|b - A(x + y)\|_2^2 = \|b - Ax\|_2^2 - 2y^T A^T(b - Ax) + \|Ay\|_2^2 = \|b - Ax\|_2^2 + \|Ay\|_2^2 \ge \|b - Ax\|_2^2.$$

Thus, $x \in X$.

We therefore have the following algorithm for LS problems:

(1) Compute $C = A^T A$ and $d = A^T b$.

(2) Find the Cholesky factorization $C = LL^T$.

(3) Solve the triangular linear systems $Ly = d$ and $L^T x = y$.
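The three steps can be sketched as follows (our own wrapper; `np.linalg.cholesky` returns the lower triangular factor $L$, and for brevity the two triangular solves are done with the generic solver):

```python
import numpy as np

def ls_normal_equations(A, b):
    """Solve min ||b - A x||_2 via the normal equations A^T A x = A^T b."""
    C = A.T @ A                    # step (1)
    d = A.T @ b
    L = np.linalg.cholesky(C)      # step (2): C = L L^T
    y = np.linalg.solve(L, d)      # step (3): L y = d ...
    return np.linalg.solve(L.T, y) #           ... then L^T x = y

rng = np.random.default_rng(5)
A = rng.standard_normal((8, 3))    # full column rank with probability 1
b = rng.standard_normal(8)
x = ls_normal_equations(A, b)
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
```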
We remark that the computation of $A^T A$ usually costs $O(n^2 m)$ operations, and some information in the matrix $A$ can be lost. For example, we consider

$$A = \begin{pmatrix} 1 & 1 & 1 \\ \epsilon & 0 & 0 \\ 0 & \epsilon & 0 \\ 0 & 0 & \epsilon \end{pmatrix}.$$

We have

$$A^T A = \begin{pmatrix} 1 + \epsilon^2 & 1 & 1 \\ 1 & 1 + \epsilon^2 & 1 \\ 1 & 1 & 1 + \epsilon^2 \end{pmatrix}.$$

Assume that $\epsilon = 10^{-3}$ and a 6-digit decimal floating point system is used. Then $1 + \epsilon^2 = 1 + 10^{-6}$ is rounded off to $1$, which means that the computed $A^T A$ is singular!
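The same loss of information occurs in binary arithmetic. In IEEE single precision ($u \approx 6 \times 10^{-8}$), taking $\epsilon = 10^{-4}$ makes $1 + \epsilon^2 = 1 + 10^{-8}$ round to exactly 1, so the computed Gram matrix collapses to the rank-one matrix of all ones even though $A$ has full column rank:

```python
import numpy as np

eps = np.float32(1e-4)
A = np.array([[1,   1,   1],
              [eps, 0,   0],
              [0,   eps, 0],
              [0,   0,   eps]], dtype=np.float32)
ATA = A.T @ A    # every entry rounds to exactly 1 in float32
```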
We note that the solutionx of (4.3) can be expressed as
x= (ATA)1ATb.
If we let
A = (ATA)1AT,
then the LS solutionx could be written as
x= Ab.
Actually, the n-by-m matrix A is the Moore-Penrose generalized inverse ofA, whichis unique, see [14, 17, 42]. In general, we have
Definition 4.2 Let X ∈ R^{n×m}. If X satisfies the following conditions:

AXA = A,  XAX = X,  (AX)^T = AX,  (XA)^T = XA,

then X is called the Moore-Penrose generalized inverse of A, denoted by A^†.
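The four conditions of Definition 4.2 are easy to check numerically. The sketch below (assuming NumPy, whose `np.linalg.pinv` computes the Moore-Penrose inverse) also verifies the formula A^† = (A^T A)^{-1} A^T for a full-column-rank A:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))   # full column rank with probability 1
X = np.linalg.pinv(A)             # the n-by-m Moore-Penrose inverse

# the four Penrose conditions of Definition 4.2:
assert np.allclose(A @ X @ A, A)        # A X A = A
assert np.allclose(X @ A @ X, X)        # X A X = X
assert np.allclose((A @ X).T, A @ X)    # (A X)^T = A X
assert np.allclose((X @ A).T, X @ A)    # (X A)^T = X A

# for full column rank, A^+ = (A^T A)^{-1} A^T:
assert np.allclose(X, np.linalg.inv(A.T @ A) @ A.T)
```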
Now we develop the perturbation analysis of LS problems. Assume that there is a perturbation δb on b, and let x and x + δx denote the solutions of the following LS problems, respectively:

min_x ||b − Ax||_2,   min_x ||(b + δb) − Ax||_2.

Then x = A^† b, and

x + δx = A^† (b + δb) = A^† b̃,

where b̃ = b + δb. We have
Theorem 4.5 Let b_1 and δb_1 denote the orthogonal projections of b and δb on R(A), respectively. If b_1 ≠ 0, then

||δx||_2 / ||x||_2 ≤ κ_2(A) ||δb_1||_2 / ||b_1||_2,

where κ_2(A) = ||A||_2 ||A^†||_2 and b̃_1 = b_1 + δb_1.
Proof: Let b_2 denote the orthogonal projection of b on R(A)^⊥. Then b = b_1 + b_2 and A^T b_2 = 0. Note that

A^† b = A^† b_1 + A^† b_2 = A^† b_1 + (A^T A)^{-1} A^T b_2 = A^† b_1.
where x_1 is the first component of the vector x. Setting the coefficient of x to zero, we have the following equation:

1 − 2(||x||_2^2 − αx_1) / ||x − αe_1||_2^2 = 0.

Solving this equation for α, we have α = ±||x||_2. Substituting it into (4.7), we therefore have

Hx = αe_1 = ±||x||_2 e_1.
We remark that for any vector 0 ≠ x ∈ R^n, by Theorem 4.8, one can construct a Householder matrix H such that the last n − 1 components of Hx are zeros. We can use the following two steps to construct the unit vector ω of H:

(1) compute v = x ± ||x||_2 e_1;
(2) compute ω = v / ||v||_2.
Now a natural question is: how to choose the sign in front of ||x||_2? Usually, we choose

v = x + sign(x_1) ||x||_2 e_1,

where x_1 ≠ 0 is the first component of the vector x; see [38]. Since

H = I − 2ωω^T = I − (2 / v^T v) vv^T = I − βvv^T,

where β = 2 / (v^T v), we only need to compute β and v instead of forming ω. Thus, we have the following algorithm.
Algorithm 4.1 (Householder transformation)

function: [v, β] = house(x)
    n = length(x)
    σ = x(2:n)^T x(2:n)
    v(1) = x(1) + sign(x(1)) sqrt(x(1)^2 + σ)
    v(2:n) = x(2:n)
    if σ = 0
        β = 0
    else
        β = 2 / (v(1)^2 + σ)
    end
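A direct NumPy translation of Algorithm 4.1 might look as follows; a sketch, using the convention sign(0) = 1 so the construction also works when x(1) = 0:

```python
import numpy as np

def house(x):
    """Return (v, beta) such that H = I - beta*v*v^T is a Householder
    matrix mapping x to -sign(x1)*||x||_2 * e1."""
    x = np.asarray(x, dtype=float)
    sigma = x[1:] @ x[1:]               # sigma = x(2:n)^T x(2:n)
    v = x.copy()
    sign = 1.0 if x[0] >= 0 else -1.0   # convention: sign(0) = 1
    v[0] = x[0] + sign * np.sqrt(x[0] ** 2 + sigma)
    beta = 0.0 if sigma == 0 else 2.0 / (v[0] ** 2 + sigma)
    return v, beta

x = np.array([3.0, 1.0, 5.0, 1.0])      # ||x||_2 = 6
v, beta = house(x)
H = np.eye(4) - beta * np.outer(v, v)
print(H @ x)                            # approximately (-6, 0, 0, 0)
```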
4.2.2 Givens rotation
A Givens rotation is defined as follows:

G(i, k, θ) = I + s(e_i e_k^T − e_k e_i^T) + (c − 1)(e_i e_i^T + e_k e_k^T),

i.e., G(i, k, θ) agrees with the identity matrix except for the four entries in rows and columns i and k:

G_ii = c,  G_ik = s,  G_ki = −s,  G_kk = c,

where c = cos θ and s = sin θ. It is easy to prove that G(i, k, θ) is an orthogonal matrix. Let x ∈ R^n and y = G(i, k, θ)x. We then have
y_i = cx_i + sx_k,  y_k = −sx_i + cx_k,  y_j = x_j,  j ≠ i, k.
If we want to make y_k = 0, then we only need to take

c = x_i / sqrt(x_i^2 + x_k^2),  s = x_k / sqrt(x_i^2 + x_k^2).

Therefore,

y_i = sqrt(x_i^2 + x_k^2),  y_k = 0.
We remark that for any vector 0 ≠ x ∈ R^n, one can construct a Givens rotation G(i, k, θ) acting on x to make a chosen component of x zero.
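The formulas above can be sketched in NumPy; the helper name `givens` is ours, not the book's:

```python
import numpy as np

def givens(xi, xk):
    # c, s with [[c, s], [-s, c]] @ [xi, xk] = [r, 0], r = sqrt(xi^2 + xk^2)
    r = np.hypot(xi, xk)
    if r == 0.0:
        return 1.0, 0.0
    return xi / r, xk / r

x = np.array([4.0, 7.0, 3.0])
c, s = givens(x[0], x[2])          # rotate in the (1, 3) plane to zero x[2]
G = np.eye(3)
G[0, 0] = c; G[0, 2] = s
G[2, 0] = -s; G[2, 2] = c
print(G @ x)                       # [5, 7, 0]: sqrt(4^2 + 3^2) = 5 moves to y_i
```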
4.3 QR decomposition
Let A ∈ R^{m×n} and b ∈ R^m. By Theorem 3.3 (iii), for any orthogonal matrix Q, we have

||Ax − b||_2 = ||Q^T (Ax − b)||_2.

Therefore, the LS problem

min_x ||Q^T Ax − Q^T b||_2

is equivalent to (4.1). We wish to find a suitable orthogonal matrix Q such that the original LS problem becomes an LS problem that is easier to solve. We have
Theorem 4.9 (QR decomposition) Let A ∈ R^{m×n} (m ≥ n). Then A has a QR decomposition:

A = Q [ R ; 0 ], (4.8)

where Q ∈ R^{m×m} is an orthogonal matrix and R ∈ R^{n×n} is an upper triangular matrix with nonnegative diagonal entries. The decomposition is unique when m = n and A is nonsingular.
Proof: We use induction on n. When n = 1, the result is true by Theorem 4.8. Now, we assume that the theorem is true for all matrices in R^{p×(n−1)} with p ≥ n − 1. Let the first column of A ∈ R^{m×n} be a_1. By Theorem 4.8 again, there exists an orthogonal matrix Q_1 ∈ R^{m×m} such that

Q_1^T a_1 = ||a_1||_2 e_1.

Therefore, we have

Q_1^T A = [ ||a_1||_2  v^T ; 0  A_1 ].

For the matrix A_1 ∈ R^{(m−1)×(n−1)}, we obtain by the induction assumption

A_1 = Q_2 [ R_2 ; 0 ],

where Q_2 ∈ R^{(m−1)×(m−1)} is an orthogonal matrix and R_2 is an upper triangular matrix with nonnegative diagonal entries. Thus, let

Q = Q_1 [ 1  0 ; 0  Q_2 ],   R = [ ||a_1||_2  v^T ; 0  R_2 ].

Then Q and R are the matrices satisfying the conditions of the theorem.
When A ∈ R^{m×m} is nonsingular, we want to show that the QR decomposition is unique. Let

A = QR = Q̃R̃,

where Q, Q̃ ∈ R^{m×m} are orthogonal matrices, and R, R̃ ∈ R^{m×m} are upper triangular matrices with nonnegative diagonal entries. Since A is nonsingular, the diagonal entries of R and R̃ are positive. Therefore, the matrix Q^T Q̃ = R R̃^{-1} is both orthogonal and upper triangular with positive diagonal entries. Thus

Q^T Q̃ = R R̃^{-1} = I,
i.e., Q̃ = Q and R̃ = R.
A complex version of the QR decomposition is needed later on.
Corollary 4.2 Let A ∈ C^{m×n} (m ≥ n). Then A has a QR decomposition:

A = Q [ R ; 0 ],

where Q ∈ C^{m×m} is a unitary matrix and R ∈ C^{n×n} is an upper triangular matrix with nonnegative diagonal entries. The decomposition is unique when m = n and A is nonsingular.
Now we use the QR decomposition to solve the LS problem (4.1). Suppose that A ∈ R^{m×n} (m ≥ n) has linearly independent columns, b ∈ R^m, and A has a QR decomposition (4.8). Let Q be partitioned as

Q = [ Q_1  Q_2 ],

where Q_1 consists of the first n columns of Q, and let

Q^T b = [ Q_1^T ; Q_2^T ] b = [ c_1 ; c_2 ].

Then

||Ax − b||_2^2 = ||Q^T Ax − Q^T b||_2^2 = ||Rx − c_1||_2^2 + ||c_2||_2^2.

Thus x is the solution of the LS problem (4.1) if and only if it is the solution of Rx = c_1. Note that it is much easier to get the solution of (4.1) by solving Rx = c_1 since R is an upper triangular matrix. We have the following algorithm for LS problems:

(1) Compute a QR decomposition of A.
(2) Compute c_1 = Q_1^T b.
(3) Solve the upper triangular system Rx = c_1.
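With NumPy's built-in reduced QR factorization, which returns Q_1 and R directly, the three steps read as follows (a sketch; `np.linalg.solve` stands in for a back-substitution routine):

```python
import numpy as np

def ls_qr(A, b):
    Q1, R = np.linalg.qr(A)        # (1) reduced QR: A = Q1 R
    c1 = Q1.T @ b                  # (2) c1 = Q1^T b
    return np.linalg.solve(R, c1)  # (3) solve the triangular system R x = c1

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))    # full column rank with probability 1
b = rng.standard_normal(6)
x = ls_qr(A, b)
```

Unlike the normal equations method, this approach never forms A^T A, so the conditioning of the problem is not squared.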
Finally, we discuss how to use Householder transformations to compute the QR decomposition of A. Let m = 7 and n = 5. Assume that we have already found Householder transformations H_1 and H_2 such that

H_2 H_1 A = [ × × × × ×
              0 × × × ×
              0 0 + + +
              0 0 + + +
              0 0 + + +
              0 0 + + +
              0 0 + + + ].
Now we construct a Householder transformation H̃_3 ∈ R^{5×5} such that

H̃_3 [ + ; + ; + ; + ; + ] = [ × ; 0 ; 0 ; 0 ; 0 ].

Let H_3 = diag(I_2, H̃_3). We obtain

H_3 H_2 H_1 A = [ × × × × ×
                  0 × × × ×
                  0 0 × × ×
                  0 0 0 × ×
                  0 0 0 × ×
                  0 0 0 × ×
                  0 0 0 × × ].
In general, after n such steps, we reduce the matrix A to the following form:

H_n H_{n−1} ⋯ H_1 A = [ R ; 0 ],

where R is an upper triangular matrix with nonnegative diagonal entries. By setting Q = H_1 ⋯ H_n, we obtain

A = Q [ R ; 0 ].
Thus, we have the following algorithm.
Algorithm 4.2 (QR decomposition: Householder transformation)
for j = 1 : n
    [v, β] = house(A(j:m, j))
    A(j:m, j:n) = (I_{m−j+1} − βvv^T) A(j:m, j:n)
    if j < m
        A(j+1:m, j) = v(2 : m−j+1)
    end
end
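A sketch of Algorithm 4.2 in NumPy, reusing the `house` routine of Algorithm 4.1. For clarity the orthogonal factor Q is accumulated explicitly, whereas the algorithm above instead stores the Householder vectors in the strict lower triangle of A:

```python
import numpy as np

def house(x):
    # Householder vector of Algorithm 4.1 (sign(0) = 1 convention)
    sigma = x[1:] @ x[1:]
    v = x.astype(float).copy()
    sign = 1.0 if x[0] >= 0 else -1.0
    v[0] = x[0] + sign * np.sqrt(x[0] ** 2 + sigma)
    beta = 0.0 if sigma == 0 else 2.0 / (v[0] ** 2 + sigma)
    return v, beta

def qr_householder(A):
    """Return Q (m-by-m orthogonal) and R (m-by-n upper triangular)
    with A = Q R, built from n Householder steps."""
    A = A.astype(float).copy()
    m, n = A.shape
    Q = np.eye(m)
    for j in range(n):
        v, beta = house(A[j:, j])
        Hj = np.eye(m - j) - beta * np.outer(v, v)
        A[j:, j:] = Hj @ A[j:, j:]   # apply H_j = diag(I_j, Hj) from the left
        Q[:, j:] = Q[:, j:] @ Hj     # accumulate Q = H_1 H_2 ... H_n
    return Q, np.triu(A)

rng = np.random.default_rng(1)
A = rng.standard_normal((7, 5))
Q, R = qr_householder(A)
```

Note that with the sign choice of Algorithm 4.1 the diagonal of R need not be nonnegative; flipping the signs of the offending rows of R (and the corresponding columns of Q) recovers the normalization of Theorem 4.9.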
We remark that the QR decomposition is not only a basic tool for solving LS problems but also an important tool for solving some other fundamental problems in NLA.
Exercises:
1. Let A ∈ R^{m×n} have full column rank. Prove that A + E also has full column rank if E satisfies ||E||_2 < 1 / ||A^†||_2, where A^† = (A^T A)^{-1} A^T.
2. Let U = [u_ij] be a nonsingular upper triangular matrix. Show that

κ(U) ≥ max_i |u_ii| / min_i |u_ii|,

where κ(U) = ||U|| ||U^{-1}||.

3. Let A ∈ R^{m×n} with m ≥ n have full column rank. Show that

[ I  A ; A^T  0 ] [ r ; x ] = [ b ; 0 ]

has a solution where x minimizes ||Ax − b||_2.
4. Let x ∈ R^n and P be a Householder transformation such that Px = ||x||_2 e_1. Let G_12, G_23, …, G_{n−1,n} be Givens rotations, and let Q = G_12 G_23 ⋯ G_{n−1,n}. Suppose Qx = ||x||_2 e_1. Is P equal to Q? Give a proof or a counterexample.

5. Let A ∈ R^{m×n}. Show that X = A^† minimizes ||AX − I||_F over all X ∈ R^{n×m}. What is the minimum?
6. Let x = [ x_1 ; x_2 ] ∈ C^2. Find an algorithm to compute the following unitary matrix

Q = [ c  s ; −s̄  c ],  c ∈ R,  c^2 + |s|^2 = 1,

such that the second component of Qx is zero.
7. Suppose an m-by-n matrix A has the form

A = [ A_1 ; A_2 ],

where A_1 is an n-by-n nonsingular matrix and A_2 is an (m − n)-by-n arbitrary matrix. Prove that ||A^†||_2 ≤ ||A_1^{-1}||_2.
8. Consider the following well-known ill-conditioned matrix

A = [ 1  1  1
      ε  0  0
      0  ε  0
      0  0  ε ],  |ε| ≪ 1.
(a) Choose a small ε such that rank(A) = 3. Then compute κ_2(A) to show that A is ill-conditioned.

(b) Find the LS solution with A given as above and b = (3, ε, ε, ε)^T by using

(i) the normal equations method;
(ii) the QR method.
9. Let A = BC, where B ∈ C^{m×r} and C ∈ C^{r×n} with r = rank(A) = rank(B) = rank(C). Show that A^† = C^*(CC^*)^{-1}(B^*B)^{-1}B^*.
10. Let A = UΣV^* ∈ C^{m×n}, where U ∈ C^{m×n} satisfies U^*U = I, V ∈ C^{n×n} satisfies V^*V = I, and Σ is an n-by-n diagonal matrix. Show that A^† = VΣ^†U^*.
11. Prove that A^† = lim_{μ→0} (A^*A + μI)^{-1}A^* = lim_{μ→0} A^*(AA^* + μI)^{-1}.
12. Show that R(A^†) ∩ N(A) = {0}.
13. Let A = [a_ij] ∈ C^{n×n} be idempotent. Then

R(A) ⊕ N(A) = C^n,  rank(A) = Σ_{i=1}^n a_ii.
14. Let A ∈ C^{m×n}. Prove that

R(AA^†) = R(AA^*) = R(A),
R(A^†A) = R(A^*A) = R(A^*) = R(A^†),
N(AA^†) = N(AA^*) = N(A^*) = N(A^†),
N(A^†A) = N(A^*A) = N(A).

Therefore A^†A and AA^† are orthogonal projectors.
15. Prove Corollary 4.2.
Chapter 5
Classical Iterative Methods
We study classical iterative methods for the solution of Ax = b. Iterative methods, originally proposed by Gauss in 1823, Liouville in 1837, and Jacobi in 1845, are quite different from direct methods such as Gaussian elimination; see [2].
Direct methods based on an LU factorization of A become prohibitive in terms of computing time and computer storage if the matrix A is quite large. In some practical situations, such as the discretization of partial differential equations, the matrix size can be as large as several hundreds of thousands. For such problems, direct methods become impractical. Furthermore, most large problems are sparse, and usually the sparsity is lost during LU factorization. Therefore, we have to face a very large matrix with many nonzero entries at the end of the LU factorization, and then storage becomes a crucial issue. For such problems, we can use a class of methods called iterative methods. In this chapter, we only consider some classical iterative methods.
We remark that the disadvantage of classical iterative methods is that the convergence rate may be slow, or the iteration may even diverge; moreover, a suitable stopping criterion needs to be chosen.
5.1 Jacobi and Gauss-Seidel method
5.1.1 Jacobi method
Consider the following linear system
Ax= b
where A = [a_ij] ∈ R^{n×n}. We can write the matrix A in the following form:

A = D − L − U,

where D = diag(a_11, a_22, …, a_nn),
L = − [ 0
        a_21   0
        a_31   a_32   0
        ⋮      ⋮      ⋱
        a_n1   a_n2   ⋯   a_{n,n−1}   0 ],

and

U = − [ 0   a_12   a_13   ⋯   a_1n
            0      a_23   ⋯   a_2n
                   ⋱      ⋱    ⋮
                          0    a_{n−1,n}
                               0 ]

(that is, L and U are the negatives of the strictly lower and strictly upper triangular parts of A). Then it is easy to see that the solution of Ax = b satisfies

x = B_J x + g,

where B_J = D^{-1}(L + U) and g = D^{-1}b.
The matrix B_J is called the Jacobi iteration matrix. The corresponding iteration

x_k = B_J x_{k−1} + g,  k = 1, 2, …, (5.1)

is known as the Jacobi method if an initial vector x_0 = (x_1^(0), x_2^(0), …, x_n^(0))^T is given.
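Componentwise, iteration (5.1) computes x_i^(k) = (b_i − Σ_{j≠i} a_ij x_j^(k−1)) / a_ii. A minimal NumPy sketch with a fixed iteration count (a practical code would add a stopping test):

```python
import numpy as np

def jacobi(A, b, x0, iters=100):
    D = np.diag(A)               # diagonal entries of A
    R = A - np.diag(D)           # off-diagonal part of A, i.e. -(L + U)
    x = x0.astype(float).copy()
    for _ in range(iters):
        x = (b - R @ x) / D      # x_k = D^{-1}((L + U) x_{k-1} + b)
    return x

A = np.array([[4.0, 1.0],
              [2.0, 5.0]])       # strictly diagonally dominant
b = np.array([6.0, 12.0])
x = jacobi(A, b, np.zeros(2))    # converges to the solution (1, 2)
```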
5.1.2 Gauss-Seidel method
In the Jacobi method, to compute the components of the vector

x_{k+1} = (x_1^(k+1), x_2^(k+1), …, x_n^(k+1))^T,

only the components of the vector x_k are used. However, note that to compute x_i^(k+1), we could use x_1^(k+1), x_2^(k+1), …, x_{i−1}^(k+1), which are already available. Thus a natural modification of the Jacobi method is to rewrite the Jacobi iteration (5.1) in the following form:

x_k = (D − L)^{-1} U x_{k−1} + (D − L)^{-1} b,  k = 1, 2, …. (5.2)

The idea is to use each new component as soon as it is available in the computation of the next component. The iteration (5.2) is known as the Gauss-Seidel method.
Note that the matrix D − L is a lower triangular matrix with a_11, …, a_nn on the diagonal. Because these entries are assumed to be nonzero, the matrix D − L is nonsingular. The matrix

B_GS = (D − L)^{-1} U

is called the Gauss-Seidel iteration matrix.
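In code, the idea of using each new component as soon as it is available appears as an in-place update; this sketch mirrors (5.2) without forming the triangular solve explicitly:

```python
import numpy as np

def gauss_seidel(A, b, x0, iters=100):
    n = len(b)
    x = x0.astype(float).copy()
    for _ in range(iters):
        for i in range(n):
            # x[0..i-1] hold the already-updated components of x_k,
            # x[i+1..n-1] still hold components of x_{k-1}
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (b[i] - s) / A[i, i]
    return x

A = np.array([[4.0, 1.0],
              [2.0, 5.0]])            # strictly diagonally dominant
b = np.array([6.0, 12.0])
x = gauss_seidel(A, b, np.zeros(2))   # converges to the solution (1, 2)
```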
5.2 Convergence analysis
5.2.1 Convergence theorems
It is often hard to make a good initial approximation x_0. Thus, it is desirable to have conditions that guarantee the convergence of the Jacobi and Gauss-Seidel methods for an arbitrary choice of the initial approximation.
Both the Jacobi iteration and the Gauss-Seidel iteration can be expressed as

x_{k+1} = Bx_k + g,  k = 0, 1, …. (5.3)

For the Jacobi iteration, we have

B_J = D^{-1}(L + U),  g = D^{-1}b;

and for the Gauss-Seidel iteration, we have

B_GS = (D − L)^{-1}U,  g = (D − L)^{-1}b.

The iteration (5.3) is called a linear stationary iteration, where B ∈ R^{n×n} is called the iteration matrix, g ∈ R^n the constant term, and x_0 ∈ R^n the initial vector. In the following, we give a convergence theorem.
Theorem 5.1 The iteration (5.3) converges for an arbitrary initial guess x_0 if and only if B^k → 0 as k → ∞.
Proof: From x = Bx + g and x_{k+1} = Bx_k + g, we have

x − x_{k+1} = B(x − x_k). (5.4)

Because this is true for any value of k, we can write

x − x_k = B(x − x_{k−1}). (5.5)

Substituting (5.5) into (5.4), we have

x − x_{k+1} = B^2(x − x_{k−1}).

Continuing this process k times, we can write

x − x_{k+1} = B^{k+1}(x − x_0).

This shows that {x_k} converges to the solution x for any choice of x_0 if and only if B^k → 0 as k → ∞.
Recall that B^k → 0 as k → ∞ if and only if the spectral radius ρ(B) < 1. Since |λ_i| ≤ ||B|| for every eigenvalue λ_i of B, a convenient way to see whether ρ(B) < 1 is to check whether ||B|| < 1 by computing ||B|| with the row-sum or column-sum norm. Note that the converse is not true: ||B|| ≥ 1 does not imply divergence. Combining the result of Theorem 5.1 with the above observation, we have the following theorem.
Theorem 5.2 The iteration (5.3) converges for any choice of x_0 if and only if ρ(B) < 1.

In particular, a sufficient condition for the convergence of the Jacobi and Gauss-Seidel methods is that A be strictly diagonally dominant, i.e.,

|a_ii| > Σ_{j=1, j≠i}^n |a_ij|,  i = 1, 2, …
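These quantities are easy to check numerically. For the hypothetical test matrix A = [ 4 1 ; 2 5 ] (ours, not from the book), the row-sum norm of B_J is already below 1, and both spectral radii confirm convergence:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 5.0]])
D = np.diag(np.diag(A))
L = -np.tril(A, -1)                 # splitting A = D - L - U
U = -np.triu(A, 1)

B_J = np.linalg.solve(D, L + U)     # Jacobi iteration matrix D^{-1}(L + U)
B_GS = np.linalg.solve(D - L, U)    # Gauss-Seidel matrix (D - L)^{-1} U

rho = lambda B: max(abs(np.linalg.eigvals(B)))
print(np.linalg.norm(B_J, np.inf))  # 0.4 < 1: the row-sum norm already suffices
print(rho(B_J), rho(B_GS))          # both spectral radii are < 1
```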