Matrix Analysis and Algorithms

Andrew Stuart and Jochen Voss

4th August 2009


Introduction

The three basic problems we will address in this book are as follows. In all cases we are given as data a matrix $A \in \mathbb{C}^{m \times n}$, with $m \ge n$, and, for the first two problems, a vector $b \in \mathbb{C}^m$.

(SLE) denotes simultaneous linear equations ($m = n$): find $x \in \mathbb{C}^n$ such that
\[ Ax = b. \]

(LSQ) denotes least squares ($m \ge n$): find $x \in \mathbb{C}^n$ attaining
\[ \min_{x \in \mathbb{C}^n} \|Ax - b\|_2^2. \]

(EVP) denotes the eigenvalue problem ($m = n$): find $(x, \lambda) \in \mathbb{C}^n \times \mathbb{C}$ such that
\[ Ax = \lambda x, \qquad \|x\|_2 = 1. \]
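To fix ideas, the short NumPy sketch below (an illustration added to these notes; the matrices and vectors are arbitrary example data) solves a small instance of each of the three problems using standard library routines.

```python
import numpy as np

# Illustrative data: a small square matrix for (SLE) and (EVP).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# (SLE): solve Ax = b.
x_sle = np.linalg.solve(A, b)

# (LSQ): minimise ||Cx - d||_2 for a tall matrix C (m > n).
C = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
d = np.array([6.0, 0.0, 0.0])
x_lsq, residual, rank, singular_values = np.linalg.lstsq(C, d, rcond=None)

# (EVP): eigenpairs of A; the returned eigenvectors have unit 2-norm.
eigenvalues, eigenvectors = np.linalg.eig(A)

print(x_sle, x_lsq, eigenvalues)
```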
The book contains an introduction to matrix analysis, and to the basic algorithms of numerical linear algebra. Further results can be found in many text books. The book of Horn and Johnson [HJ85] is an excellent reference for theoretical results about matrix analysis; see also [Bha97]. The subject of linear algebra, and matrix analysis in particular, is treated in an original and illuminating fashion in [Lax97]. For a general introduction to the subject of numerical linear algebra we recommend the book by Trefethen and Bau [TB97]; more theoretical treatments of the subject can be found in Demmel [Dem97], in Golub and Van Loan [GL96] and in Stoer and Bulirsch [SB02]. Higham's book [Hig02] contains a wealth of information about stability and the effect of rounding errors in numerical algorithms; it is this source that we used for almost all theorems we state concerning backward error analysis. The book of Saad [Saa97] covers the subject of iterative methods for linear systems. The symmetric eigenvalue problem is analysed in Parlett [Par80].


Acknowledgement

We are grateful to Menelaos Karavelas, Ian Mitchell and Stuart Price for assistance in the typesetting of this material. We are grateful to a variety of students at Stanford University (CS237A) and at Warwick University (MA398) for many helpful comments which have significantly improved the notes.


Contents

1 Vector and Matrix Analysis
  1.1 Vector Norms and Inner Products
  1.2 Eigenvalues and Eigenvectors
  1.3 Dual Spaces
  1.4 Matrix Norms
  1.5 Structured Matrices

2 Matrix Factorisations
  2.1 Diagonalisation
  2.2 Jordan Canonical Form
  2.3 Singular Value Decomposition
  2.4 QR Factorisation
  2.5 LU Factorisation
  2.6 Cholesky Factorisation

3 Stability and Conditioning
  3.1 Conditioning of SLE
  3.2 Conditioning of LSQ
  3.3 Conditioning of EVP
  3.4 Stability of Algorithms

4 Complexity of Algorithms
  4.1 Computational Cost
  4.2 Matrix-Matrix Multiplication
  4.3 Fast Fourier Transform
  4.4 Bidiagonal and Hessenberg Forms

5 Systems of Linear Equations
  5.1 Gaussian Elimination
  5.2 Gaussian Elimination with Partial Pivoting
  5.3 The QR Factorisation

6 Iterative Methods
  6.1 Linear Methods
  6.2 The Jacobi Method
  6.3 The Gauss-Seidel and SOR Methods
  6.4 Nonlinear Methods
  6.5 The Steepest Descent Method
  6.6 The Conjugate Gradient Method

7 Least Squares Problems
  7.1 LSQ via Normal Equations
  7.2 LSQ via QR factorisation
  7.3 LSQ via SVD

8 Eigenvalue Problems
  8.1 The Power Method
  8.2 Inverse Iteration
  8.3 Rayleigh Quotient Iteration
  8.4 Simultaneous Iteration
  8.5 The QR Algorithm for Eigenvalues
  8.6 Divide and Conquer for Symmetric Problems


Chapter 1
Vector and Matrix Analysis

The purpose of this chapter is to summarise the fundamental theoretical results from linear algebra to which we will frequently refer, and to provide some basic theoretical tools which we will use in our analysis. We study vector and matrix norms, inner products, the eigenvalue problem, orthogonal projections and a variety of special matrices which arise frequently in computational linear algebra.

1.1 Vector Norms and Inner Products

Definition 1.1. A vector norm on $\mathbb{C}^n$ is a mapping $\|\cdot\| : \mathbb{C}^n \to \mathbb{R}$ satisfying

a) $\|x\| \ge 0$ for all $x \in \mathbb{C}^n$ and $\|x\| = 0$ iff $x = 0$,
b) $\|\lambda x\| = |\lambda| \, \|x\|$ for all $\lambda \in \mathbb{C}$, $x \in \mathbb{C}^n$, and
c) $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in \mathbb{C}^n$.

Remark. The definition of a norm on $\mathbb{R}^n$ is identical, but with $\mathbb{C}^n$ replaced by $\mathbb{R}^n$ and $\mathbb{C}$ replaced by $\mathbb{R}$.

Examples.

- the $p$-norm for $1 \le p < \infty$:
  \[ \|x\|_p = \Bigl( \sum_{j=1}^n |x_j|^p \Bigr)^{1/p} \qquad \text{for all } x \in \mathbb{C}^n; \]

- for $p = 2$ we get the Euclidean norm:
  \[ \|x\|_2 = \Bigl( \sum_{j=1}^n |x_j|^2 \Bigr)^{1/2} \qquad \text{for all } x \in \mathbb{C}^n; \]

- for $p = 1$ we get
  \[ \|x\|_1 = \sum_{j=1}^n |x_j| \qquad \text{for all } x \in \mathbb{C}^n; \]

- the infinity norm: $\|x\|_\infty = \max_{1 \le j \le n} |x_j|$.
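As a quick numerical illustration (added here; the vector chosen is arbitrary), the norms above can be evaluated with NumPy's `np.linalg.norm`.

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0, 1.0])   # arbitrary example vector

print(np.linalg.norm(x, 1))        # 1-norm: sum of absolute values -> 8.0
print(np.linalg.norm(x, 2))        # Euclidean norm: sqrt(9 + 16 + 0 + 1)
print(np.linalg.norm(x, np.inf))   # infinity norm: largest absolute entry -> 4.0
print(np.linalg.norm(x, 3))        # general p-norm, here with p = 3
```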
Theorem 1.2. All norms on $\mathbb{C}^n$ are equivalent: for each pair of norms $\|\cdot\|_a$ and $\|\cdot\|_b$ on $\mathbb{C}^n$ there are constants $0 < c_1 \le c_2 < \infty$ with
\[ c_1 \|x\|_a \le \|x\|_b \le c_2 \|x\|_a \qquad \text{for all } x \in \mathbb{C}^n. \]

Proof. Using property b) from the definition of a vector norm it suffices to consider vectors $x \in S = \{ x \in \mathbb{C}^n \mid \|x\|_2 = 1 \}$. Since $\|\cdot\|_a$ is non-zero on all of $S$ we can define $f : S \to \mathbb{R}$ by $f(x) = \|x\|_b / \|x\|_a$. Because the function $f$ is continuous and the set $S$ is compact there are $x_1, x_2 \in S$ with $f(x_1) \le f(x) \le f(x_2)$ for all $x \in S$. Setting $c_1 = f(x_1) > 0$ and $c_2 = f(x_2)$ completes the proof.

Remarks.
1. The same result holds for norms on $\mathbb{R}^n$. The proof transfers to this situation without change.
2. We remark that, if $A \in \mathbb{C}^{n \times n}$ is an invertible matrix and $\|\cdot\|$ a norm on $\mathbb{C}^n$, then $\|x\|_A := \|Ax\|$ is also a norm.

Definition 1.3. An inner product on $\mathbb{C}^n$ is a mapping $\langle \cdot, \cdot \rangle : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ satisfying:

a) $\langle x, x \rangle \in \mathbb{R}^+$ for all $x \in \mathbb{C}^n$ and $\langle x, x \rangle = 0$ iff $x = 0$;
b) $\langle x, y \rangle = \overline{\langle y, x \rangle}$ for all $x, y \in \mathbb{C}^n$;
c) $\langle x, \lambda y \rangle = \lambda \langle x, y \rangle$ for all $\lambda \in \mathbb{C}$, $x, y \in \mathbb{C}^n$;
d) $\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle$ for all $x, y, z \in \mathbb{C}^n$.

Remark. Conditions c) and d) above state that $\langle \cdot, \cdot \rangle$ is linear in the second component. Using the rules for inner products we get
\[ \langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle \qquad \text{for all } x, y, z \in \mathbb{C}^n \]
and
\[ \langle \lambda x, y \rangle = \bar{\lambda} \langle x, y \rangle \qquad \text{for all } \lambda \in \mathbb{C},\ x, y \in \mathbb{C}^n. \]
The inner product is said to be anti-linear in the first component.

Example. The standard inner product on $\mathbb{C}^n$ is given by
\[ \langle x, y \rangle = \sum_{j=1}^n \bar{x}_j y_j \qquad \text{for all } x, y \in \mathbb{C}^n. \tag{1.1} \]

Definition 1.4. Two vectors $x, y \in \mathbb{C}^n$ are orthogonal with respect to an inner product $\langle \cdot, \cdot \rangle$ iff $\langle x, y \rangle = 0$.

Lemma 1.5 (Cauchy-Schwarz inequality). Let $\langle \cdot, \cdot \rangle : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ be an inner product. Then
\[ |\langle x, y \rangle|^2 \le \langle x, x \rangle \, \langle y, y \rangle \tag{1.2} \]
for every $x, y \in \mathbb{C}^n$, and equality holds if and only if $x$ and $y$ are linearly dependent.

Proof. If $y = 0$ both sides of (1.2) vanish and $x$, $y$ are linearly dependent, so we may assume $y \ne 0$. For every $\lambda \in \mathbb{C}$ we have
\[ 0 \le \langle x - \lambda y, x - \lambda y \rangle = \langle x, x \rangle - \bar{\lambda} \langle y, x \rangle - \lambda \langle x, y \rangle + |\lambda|^2 \langle y, y \rangle. \tag{1.3} \]
For $\lambda = \langle y, x \rangle / \langle y, y \rangle$ this becomes
\[ 0 \le \langle x, x \rangle - \frac{\langle x, y \rangle \langle y, x \rangle}{\langle y, y \rangle} - \frac{\langle y, x \rangle \langle x, y \rangle}{\langle y, y \rangle} + \frac{\langle y, x \rangle \langle x, y \rangle}{\langle y, y \rangle} = \langle x, x \rangle - \frac{|\langle x, y \rangle|^2}{\langle y, y \rangle}, \]
and multiplying the result by $\langle y, y \rangle$ gives (1.2).

If equality holds in (1.2) then $x - \lambda y$ in (1.3) must be $0$ and thus $x$ and $y$ are linearly dependent. If on the other hand $x$ and $y$ are linearly dependent, say $x = \alpha y$, then $\lambda = \langle y, \alpha y \rangle / \langle y, y \rangle = \alpha$ and $x - \lambda y = 0$, giving equality in (1.3) and thus in (1.2).

Lemma 1.6. Let $\langle \cdot, \cdot \rangle : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ be an inner product. Then $\|\cdot\| : \mathbb{C}^n \to \mathbb{R}$ defined by
\[ \|x\| = \sqrt{\langle x, x \rangle} \qquad \text{for all } x \in \mathbb{C}^n \]
is a vector norm.

Proof.
a) Since $\langle \cdot, \cdot \rangle$ is an inner product we have $\langle x, x \rangle \ge 0$ for all $x \in \mathbb{C}^n$, i.e. $\langle x, x \rangle$ is real and non-negative. Also we get
\[ \|x\| = 0 \iff \langle x, x \rangle = 0 \iff x = 0. \]
b) We have
\[ \|\lambda x\| = \sqrt{\langle \lambda x, \lambda x \rangle} = \sqrt{\bar{\lambda} \lambda \langle x, x \rangle} = |\lambda| \, \|x\|. \]
c) Using the Cauchy-Schwarz inequality
\[ |\langle x, y \rangle| \le \|x\| \, \|y\| \qquad \text{for all } x, y \in \mathbb{C}^n \]
from Lemma 1.5 we get
\[ \|x + y\|^2 = \langle x + y, x + y \rangle = \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle \le \|x\|^2 + 2 \|x\| \, \|y\| + \|y\|^2 = \bigl( \|x\| + \|y\| \bigr)^2 \]
for all $x, y \in \mathbb{C}^n$. This completes the proof.
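The following small NumPy check (added for illustration; the vectors are random and the tolerances arbitrary) verifies the Cauchy-Schwarz inequality (1.2) and the triangle inequality for the norm induced by the standard inner product.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Standard inner product <x, y> = sum conj(x_j) y_j; np.vdot conjugates its first argument.
inner = np.vdot(x, y)

norm_x = np.sqrt(np.vdot(x, x).real)   # ||x|| = sqrt(<x, x>)
norm_y = np.sqrt(np.vdot(y, y).real)

# Cauchy-Schwarz: |<x, y>|^2 <= <x, x> <y, y>
assert abs(inner) ** 2 <= norm_x ** 2 * norm_y ** 2 + 1e-12

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
norm_sum = np.sqrt(np.vdot(x + y, x + y).real)
assert norm_sum <= norm_x + norm_y + 1e-12
```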
Remark. The angle $\theta$ between two vectors $x$ and $y$ is the unique value $\theta \in [0, \pi]$ with
\[ \cos(\theta) \, \|x\| \, \|y\| = \langle x, y \rangle. \]
When considering the Euclidean norm and inner product on $\mathbb{R}^n$, this definition of angle coincides with the usual, geometric meaning of angles. In any case, two vectors are orthogonal if and only if they have angle $\pi/2$.

We write matrices $A \in \mathbb{C}^{m \times n}$ as
\[ A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}; \]
we write $(A)_{ij} = a_{ij}$ for the $ij$-th entry of $A$.

Definition 1.7. Given $A \in \mathbb{C}^{m \times n}$ we define the adjoint $A^* \in \mathbb{C}^{n \times m}$ by $(A^*)_{ij} = \bar{a}_{ji}$. (For $A \in \mathbb{R}^{m \times n}$ we write $A^T$ instead of $A^*$.)

By identifying the space $\mathbb{C}^n$ of vectors with the space $\mathbb{C}^{n \times 1}$ of $n \times 1$-matrices, we can take the adjoint of a vector. Then we can write the standard inner product as
\[ \langle x, y \rangle = x^* y. \]
Thus, the standard inner product satisfies
\[ \langle Ax, y \rangle = (Ax)^* y = x^* A^* y = \langle x, A^* y \rangle \]
for all $x \in \mathbb{C}^n$, $y \in \mathbb{C}^m$ and all $A \in \mathbb{C}^{m \times n}$. Unless otherwise specified, we will use $\langle \cdot, \cdot \rangle$ to denote the standard inner product (1.1) and $\|\cdot\|_2$ to denote the corresponding Euclidean norm.

The following families of special matrices will be central in what follows:

Definition 1.8.
1. $Q \in \mathbb{C}^{m \times n}$ is unitary if $Q^* Q = I$. (If $Q$ is real then $Q^T Q = I$ and we say $Q$ is orthogonal.)
2. $A \in \mathbb{C}^{n \times n}$ is Hermitian if $A^* = A$. (If $A$ is real, we say $A$ is symmetric.)
3. A Hermitian matrix $A \in \mathbb{C}^{n \times n}$ is positive-definite (resp. positive semi-definite) if $\langle Ax, x \rangle = x^* A x > 0$ (resp. $\ge 0$) for all $x \in \mathbb{C}^n \setminus \{0\}$. In this text, whenever we use the terminology positive-definite or positive semi-definite we are necessarily referring to Hermitian matrices.

Remarks. Unitary matrices have the following properties:

- A matrix $Q$ is unitary if and only if the columns of $Q$ are orthonormal with respect to the standard inner product. In particular, unitary matrices cannot have more columns than rows.
- If $Q$ is a square matrix, $Q^{-1} = Q^*$ and thus $Q Q^* = I$.
- A square matrix $Q$ is unitary if and only if $Q^*$ is unitary.

The standard inner product and norm are invariant under multiplication by a unitary matrix:

Theorem 1.9. Let $\langle \cdot, \cdot \rangle$ denote the standard inner product. Then for any unitary $Q \in \mathbb{C}^{m \times n}$ and any $x, y \in \mathbb{C}^n$ we have $\langle Qx, Qy \rangle = \langle x, y \rangle$ and $\|Qx\|_2 = \|x\|_2$.

Proof. The first claim follows from $\langle Qx, Qy \rangle = \langle x, Q^* Q y \rangle = \langle x, y \rangle$, and using the relation $\|x\|_2 = \sqrt{\langle x, x \rangle}$ gives the second claim.

Other inner products with appropriate properties can give rise to other norms; for example, for matrices $A$ which are Hermitian and positive-definite,
\[ \langle x, y \rangle_A = \langle x, Ay \rangle \tag{1.4} \]
is an inner product and
\[ \|x\|_A = \sqrt{\langle x, x \rangle_A} \tag{1.5} \]
defines a norm (see Exercise 1-2).
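Both the unitary invariance of Theorem 1.9 and the $A$-norm of (1.4)-(1.5) are easy to check numerically. In the sketch below (added as an illustration; building $Q$ from a QR factorisation and $A$ as $B^*B + I$ are merely convenient ways to manufacture a unitary and a Hermitian positive-definite matrix), the assertions hold up to rounding error.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# A unitary matrix Q, from the QR factorisation of a random complex matrix.
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q, _ = np.linalg.qr(Z)

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Theorem 1.9: <Qx, Qy> = <x, y> and ||Qx||_2 = ||x||_2.
assert np.isclose(np.vdot(Q @ x, Q @ y), np.vdot(x, y))
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))

# A Hermitian positive-definite matrix and the induced A-norm (1.5).
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = B.conj().T @ B + np.eye(n)             # Hermitian and positive-definite
norm_A = np.sqrt(np.vdot(x, A @ x).real)   # ||x||_A = sqrt(<x, Ax>)
print(norm_A)
```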
1.2 Eigenvalues and Eigenvectors

Definition 1.10. Given a matrix $A \in \mathbb{C}^{n \times n}$, a vector $x \in \mathbb{C}^n$ is an eigenvector and $\lambda \in \mathbb{C}$ is an eigenvalue (also called a right eigenvalue) of $A$ if
\[ Ax = \lambda x \quad \text{and} \quad x \ne 0. \tag{1.6} \]

When $x$ is an eigenvector of $A$, then for every $\alpha \ne 0$ the vector $\alpha x$ is an eigenvector for the same eigenvalue, since both sides of (1.6) are linear in $x$. Sometimes it is convenient to normalise $x$ by choosing $\|x\|_2 = 1$. Then the eigenvalue problem is to find $(x, \lambda) \in \mathbb{C}^n \times \mathbb{C}$ satisfying
\[ Ax = \lambda x \quad \text{and} \quad \|x\|_2 = 1. \]

Definition 1.11. Given a matrix $A \in \mathbb{C}^{n \times n}$ we define the characteristic polynomial of $A$ as
\[ \chi_A(z) := \det(A - zI). \]

Theorem 1.12. A value $\lambda \in \mathbb{C}$ is an eigenvalue of the matrix $A$ if and only if $\chi_A(\lambda) = 0$.

Proof. $\lambda$ is an eigenvalue of $A$ if and only if there is an $x \ne 0$ with $(A - \lambda I)x = 0$. This is equivalent to the condition that $A - \lambda I$ is singular, which in turn is equivalent to $\det(A - \lambda I) = 0$.

Since $\chi_A$ is a polynomial of degree $n$, there will be $n$ (some possibly repeated) eigenvalues, denoted by $\lambda_1, \dots, \lambda_n$ and determined by $\chi_A(\lambda_k) = 0$.

Definition 1.13. An eigenvalue $\lambda$ has algebraic multiplicity $q$ if $q$ is the largest integer such that $(z - \lambda)^q$ is a factor of the characteristic polynomial $\chi_A(z)$. The geometric multiplicity, $r$, is the dimension of the null space of $A - \lambda I$. An eigenvalue is simple if $q = r = 1$.

If $\lambda$ is an eigenvalue of $A \in \mathbb{C}^{n \times n}$ then $\det(A - \lambda I) = 0$, and so $\det(A^* - \bar{\lambda} I) = \overline{\det(A - \lambda I)} = 0$, which implies that $A^* - \bar{\lambda} I$ has a non-trivial null space. Thus there is a vector $y \ne 0$ with $(A^* - \bar{\lambda} I) y = 0$, i.e. with $y^* A = \lambda y^*$.

Definition 1.14. A vector $y \in \mathbb{C}^n$ with $y^* A = \lambda y^*$ and $y \ne 0$ is known as a left eigenvector of $A \in \mathbb{C}^{n \times n}$ corresponding to the eigenvalue $\lambda$.

Note that, even though the corresponding eigenvalues are the same, the right and left eigenvectors of a matrix are usually different.

Definition 1.15. Matrices $A, B \in \mathbb{C}^{n \times n}$ are similar if $B = S^{-1} A S$ with $S \in \mathbb{C}^{n \times n}$ invertible. The matrix $S$ is a similarity transform.

Remarks. If a matrix $A \in \mathbb{C}^{n \times n}$ has $n$ linearly independent eigenvectors $x_i$ and we arrange them as columns of the matrix $X$, then $X$ is invertible. If we let $\Lambda$ denote a diagonal matrix with the eigenvalues of $A$ on the diagonal, then we may write
\[ AX = X\Lambda. \]
By invertibility of $X$ we have
\[ A = X \Lambda X^{-1}. \tag{1.7} \]
Thus $\Lambda$ is a similarity transform of $A$. It reveals the eigenstru...
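As a closing illustration for this section (added here; the $2 \times 2$ matrix is an arbitrary non-symmetric example), the NumPy sketch below checks Theorem 1.12 and the diagonalisation (1.7): the roots of the characteristic polynomial agree with the computed eigenvalues, $X \Lambda X^{-1}$ reproduces $A$, and a left eigenvector is obtained as a right eigenvector of $A^*$.

```python
import numpy as np

# Non-symmetric example, so that left and right eigenvectors differ.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Eigenvalues as roots of the characteristic polynomial of A.
char_poly = np.poly(A)              # coefficients of det(zI - A)
roots = np.roots(char_poly)

# Eigenvalues and right eigenvectors.
eigenvalues, X = np.linalg.eig(A)
assert np.allclose(np.sort(roots), np.sort(eigenvalues))

# Diagonalisation (1.7): A = X Lambda X^{-1}.
Lam = np.diag(eigenvalues)
assert np.allclose(X @ Lam @ np.linalg.inv(X), A)

# A left eigenvector of A is a right eigenvector of A*, with conjugated eigenvalue:
# if A* y = mu y, then y* A = conj(mu) y*.
mu, Y = np.linalg.eig(A.conj().T)
y = Y[:, 0]
assert np.allclose(y.conj() @ A, mu[0].conj() * y.conj())
```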