
The Eigenvalue Problem of the Symmetric Toeplitz Matrix

Abstract

In this assignment, the methods and algorithms for solving the eigenvalue problem of symmetric Toeplitz matrices are studied. First the Toeplitz system is introduced. Then the methods that can localize the eigenvalues of Toeplitz matrices are studied. Finally, algorithms that can solve the eigenvalue problem of symmetric matrices are presented.


Symbols and Notations

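$\mathbb{R}^{m \times n}$ denotes the vector space of all m-by-n real matrices:

\[
A \in \mathbb{R}^{m \times n} \iff A = (a_{ij}) =
\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix},
\qquad a_{ij} \in \mathbb{R}.
\]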

When a capital letter is used to denote a matrix (e.g., $A$, $B$, $\Delta$), the corresponding lowercase letter with the subscript $ij$ refers to the $(i,j)$ component (e.g., $a_{ij}$, $b_{ij}$, $\delta_{ij}$).

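The n-by-n matrices are said to be square. The n-by-n identity matrix is denoted by $I_n$ and its k-th column by $e_k^{(n)}$:

\[
I_n = \begin{bmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{bmatrix},
\qquad
e_k^{(n)} = (0, \ldots, 0, 1, 0, \ldots, 0)^T \quad \text{(the 1 is in position } k\text{)}.
\]

When the dimension is clear from context, we simply write $I$ and $e_k$, respectively.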

If $A$ and $B$ in $\mathbb{R}^{n \times n}$ satisfy $AB = I$, then $B$ is the inverse of $A$ and is denoted by $A^{-1}$. If $A^{-1}$ exists, then $A$ is said to be nonsingular; otherwise $A$ is singular. $A^{-T}$ denotes $(A^{-1})^T = (A^T)^{-1}$.

If $A = (a) \in \mathbb{R}^{1 \times 1}$, then its determinant is given by $\det(A) = a$. For $A \in \mathbb{R}^{n \times n}$ we have

\[
\det(A) = \sum_{j=1}^{n} (-1)^{j+1} a_{1j} \det(A_{1j}),
\]

where $A_{1j}$ is an $(n-1)$-by-$(n-1)$ matrix obtained by deleting the first row and j-th column of $A$.

$\mathbb{R}^m$ denotes $\mathbb{R}^{m \times 1}$, and for these column vectors we customarily use lowercase letters and denote individual components with single subscripts. Thus, if $A \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^n$, and $y = Ax$, then

\[
y_i = \sum_{j=1}^{n} a_{ij} x_j, \qquad i = 1, \ldots, m.
\]

The set of all linear combinations of $a_1, \ldots, a_n \in \mathbb{R}^m$ is a subspace referred to as the span of $\{a_1, \ldots, a_n\}$:

\[
\operatorname{span}\{a_1, \ldots, a_n\} = \Bigl\{ \sum_{j=1}^{n} \beta_j a_j : \beta_j \in \mathbb{R} \Bigr\}.
\]

If $S \subseteq \mathbb{R}^m$ is a subspace, then there exist independent basis vectors $a_1, \ldots, a_k$ in $S$ such that

\[
S = \operatorname{span}\{a_1, \ldots, a_k\}.
\]

All bases for a subspace $S$ have the same number of elements. This number is the dimension of $S$ and is denoted by $\dim(S)$.

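There are two important subspaces associated with a matrix $A$ in $\mathbb{R}^{m \times n}$. The range of $A$ is defined by

\[
R(A) = \{\, y \in \mathbb{R}^m : y = Ax \text{ for some } x \in \mathbb{R}^n \,\}
\]

and the null space of $A$ by

\[
N(A) = \{\, x \in \mathbb{R}^n : Ax = 0 \,\}.
\]

If $A = [a_1, \ldots, a_n]$ then $R(A) = \operatorname{span}\{a_1, \ldots, a_n\}$.

The rank of a matrix $A$ is defined by

\[
\operatorname{rank}(A) = \dim(R(A)).
\]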

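A set of vectors $\{x_1, \ldots, x_p\}$ in $\mathbb{R}^m$ is orthogonal if $x_i^T x_j = 0$ whenever $i \ne j$ and orthonormal if $x_i^T x_j = \delta_{ij}$. More generally, a collection of subspaces $S_1, \ldots, S_p$ of $\mathbb{R}^m$ is mutually orthogonal if $x^T y = 0$ whenever $x \in S_i$ and $y \in S_j$ for $i \ne j$.

The orthogonal complement of a subspace $S \subseteq \mathbb{R}^m$ is defined by

\[
S^{\perp} = \{\, y \in \mathbb{R}^m : y^T x = 0 \text{ for all } x \in S \,\}.
\]

If $A \in \mathbb{R}^{m \times n}$ and we write

\[
A = \operatorname{diag}(\alpha_1, \ldots, \alpha_k), \qquad k = \min\{m, n\},
\]

then $A = (a_{ij})$ is diagonal and $a_{ii} = \alpha_i$ for $i = 1, \ldots, k$.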

We say that $A \in \mathbb{R}^{n \times n}$ is

symmetric if $A^T = A$
skew-symmetric if $A^T = -A$
positive definite if $x^T A x > 0$ for all $0 \ne x \in \mathbb{R}^n$
non-negative definite if $x^T A x \ge 0$ for all $x \in \mathbb{R}^n$
indefinite if $(x^T A x)(y^T A y) < 0$ for some $x, y \in \mathbb{R}^n$
orthogonal if $A^T A = I_n$
nilpotent if $A^k = 0$ for some $k$
idempotent if $A^2 = A$
positive if $a_{ij} > 0$ for all $i$ and $j$
non-negative if $a_{ij} \ge 0$ for all $i$ and $j$
diagonally dominant if $|a_{ii}| > \sum_{j \ne i} |a_{ij}|$ for all $i$
permutation if $A = [e_{s_1}, \ldots, e_{s_n}]$ where $(s_1, \ldots, s_n)$ is a permutation of $(1, 2, \ldots, n)$.

Contents

1 Introduction
2 Basic Facts on Eigenvalues
3 Toeplitz Systems
4 Localization of the Eigenvalues of Toeplitz Matrices
    4.1 The Embedding
    4.2 Eigenstructure
    4.3 Bounds for the Eigenvalues
    4.4 Optimum Values for the $\alpha_i$
5 The Symmetric Eigenvalue Problem
    5.1 Mathematical Properties underlying symmetric eigenproblem
    5.2 Tridiagonalization and the Symmetric QR Algorithm
    5.3 The Singular Value Decomposition
    5.4 Jacobi Methods
    5.5 Some Special Methods
6 Lanczos Methods
    6.1 Derivation and Convergence Properties
    6.2 Practical Lanczos Procedures
    6.3 Applications to Linear Equations
7 Conclusion
Bibliography

1 Introduction

Many practical problems in engineering and physics lead to eigenvalue problems. Typically, in all these problems, an overdetermined system of equations is given, say $n+1$ equations for $n$ unknowns $\xi_1, \ldots, \xi_n$ of the form

\[
F(x; \lambda) :\equiv
\begin{bmatrix} f_1(\xi_1, \ldots, \xi_n; \lambda) \\ \vdots \\ f_{n+1}(\xi_1, \ldots, \xi_n; \lambda) \end{bmatrix} = 0
\tag{1.1}
\]

in which the functions $f_i$ also depend on an additional parameter $\lambda$.

Usually, (1.1) has a solution $x = [\xi_1, \ldots, \xi_n]^T$ only for specific values $\lambda = \lambda_i$, $i = 1, 2, \ldots$, of this parameter. These values $\lambda_i$ are called eigenvalues of the eigenvalue problem (1.1), and a corresponding solution $x = x(\lambda_i)$ of (1.1) is called an eigensolution belonging to the eigenvalue $\lambda_i$.

The following chapters provide the main theoretical results and algorithms on the eigenvalue problem for symmetric matrices. Chapter 2 introduces the basic facts on eigenvalues. Chapter 3 introduces Toeplitz systems. Chapter 4 covers the localization of the eigenvalues of Toeplitz matrices using additive decomposition, embedding in circulants, and the Fourier transform. Chapter 5 discusses how to solve the eigenvalue problem for symmetric matrices. Chapter 6 is entirely devoted to the solution of sparse matrix problems. The discussion there revolves around the Lanczos method, and we show how various sparse eigenvalue problems can be solved using this important algorithm.


2 Basic Facts on Eigenvalues

A number $\lambda \in \mathbb{C}$ is called an eigenvalue of the matrix $A$ if there is a vector $x \ne 0$ such that $Ax = \lambda x$. Every such vector is called a (right) eigenvector of $A$ associated with the eigenvalue $\lambda$. The set of all eigenvalues is called the spectrum of $A$.

The set

\[
L(\lambda) := \{\, x \mid (A - \lambda I)x = 0 \,\}
\]

forms a linear subspace of $\mathbb{C}^n$ of dimension

\[
\rho(\lambda) = n - \operatorname{rank}(A - \lambda I),
\]

and a number $\lambda \in \mathbb{C}$ is an eigenvalue of $A$ precisely when $L(\lambda) \ne \{0\}$, i.e., when $\rho(\lambda) > 0$ and thus $A - \lambda I$ is singular:

\[
\det(A - \lambda I) = 0.
\]

It is easily seen that $\varphi(\mu) := \det(A - \mu I)$ is an nth-degree polynomial of the form

\[
\varphi(\mu) = (-1)^n \bigl(\mu^n + \alpha_{n-1}\mu^{n-1} + \cdots + \alpha_0\bigr).
\]

It is called the characteristic polynomial of the matrix $A$. Its zeros are the eigenvalues of $A$.

If $\lambda$ is an eigenvalue of $A$, then $\lambda$ is also an eigenvalue of $A^T$.


3 Toeplitz Systems

The class of Toeplitz matrices is extremely important, for a number of theoretical and practical reasons. These matrices arise naturally in a variety of problems, including trigonometric moment problems, optimum filtering, linear prediction and spectral estimation.

The eigenanalysis of Toeplitz matrices is a task often required in signal processing and control. For example, stochastic processes in signal processing applications lead to a correlation matrix that is symmetric, positive definite and Toeplitz. It is therefore of practical importance to solve Toeplitz systems efficiently.

Matrices whose entries are constant along each diagonal arise in many applications and are called Toeplitz matrices. Formally, $T \in \mathbb{R}^{n \times n}$ is Toeplitz if there exist scalars $r_{-n+1}, \ldots, r_0, \ldots, r_{n-1}$ such that $a_{ij} = r_{j-i}$ for all $i$ and $j$. Thus

\[
T = \begin{bmatrix}
r_0 & r_1 & r_2 & r_3 \\
r_{-1} & r_0 & r_1 & r_2 \\
r_{-2} & r_{-1} & r_0 & r_1 \\
r_{-3} & r_{-2} & r_{-1} & r_0
\end{bmatrix}
\qquad (n = 4)
\]

is Toeplitz.

Toeplitz matrices are persymmetric matrices. We say that $B \in \mathbb{R}^{n \times n}$ is persymmetric if it is symmetric about its northeast-southwest diagonal. In this chapter we show how to solve Toeplitz systems in $O(n^2)$ time. The discussion will be restricted to the important case when $T$ is also symmetric and positive definite.

Assume that we have scalars $r_1, \ldots, r_{n-1}$ such that the matrices

\[
T_k = \begin{bmatrix}
1 & r_1 & \cdots & r_{k-2} & r_{k-1} \\
r_1 & 1 & \ddots & & r_{k-2} \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
r_{k-2} & & \ddots & 1 & r_1 \\
r_{k-1} & r_{k-2} & \cdots & r_1 & 1
\end{bmatrix},
\qquad k = 1, \ldots, n,
\]

are all positive definite. (There is no loss of generality in normalizing the diagonal.) Two algorithms will be described:

(1). Durbin’s algorithm for the Yule-Walker problem $T_n y = -(r_1, \ldots, r_n)^T$.

(2). Levinson’s algorithm for the general right-hand-side problem $T_n x = b$.

Durbin’s algorithm

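Algorithm 3.1 (Durbin 1960). [Golub, Matrix Computations, p.127] Given real numbers $r_0, r_1, \ldots, r_n$ $(r_0 = 1)$ such that $T = (r_{|i-j|}) \in \mathbb{R}^{n \times n}$ is positive definite, the following algorithm computes $y \in \mathbb{R}^n$ such that $Ty = -(r_1, \ldots, r_n)^T$.

    $y_1 := -r_1$;  $\beta := 1$;  $\alpha := -r_1$
    For $k = 1, \ldots, n-1$
        $\beta := (1 - \alpha^2)\beta$
        $\alpha := -\bigl(r_{k+1} + \sum_{i=1}^{k} r_{k+1-i}\, y_i\bigr) / \beta$
        For $i = 1, \ldots, k$
            $z_i := y_i + \alpha\, y_{k+1-i}$
        $y_i := z_i$  $(i = 1, \ldots, k)$
        $y_{k+1} := \alpha$

This algorithm requires $n^2$ flops.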
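The recursion above can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: the function name `durbin` and the convention that the input array holds $r_1, \ldots, r_n$ (with $r_0 = 1$ implicit) are assumptions made for this example, and NumPy is assumed to be available.

```python
import numpy as np

def durbin(r):
    """Solve T y = -(r_1, ..., r_n)^T for T = (r_|i-j|) with unit diagonal.

    `r` holds r_1, ..., r_n (r_0 = 1 is implicit). Hypothetical helper.
    """
    r = np.asarray(r, dtype=float)
    n = r.size
    y = np.zeros(n)
    y[0] = -r[0]
    beta, alpha = 1.0, -r[0]
    for k in range(1, n):  # k entries of y are already known
        beta *= 1.0 - alpha * alpha
        # alpha = -(r_{k+1} + sum_{i=1..k} r_{k+1-i} y_i) / beta
        alpha = -(r[k] + r[k - 1::-1] @ y[:k]) / beta
        # z_i = y_i + alpha * y_{k+1-i}; right-hand side is evaluated first
        y[:k] = y[:k] + alpha * y[k - 1::-1]
        y[k] = alpha
    return y
```

A quick check is to build the dense matrix $T$ from $(1, r_1, \ldots, r_{n-1})$ and confirm that `T @ durbin(r)` reproduces $-(r_1, \ldots, r_n)^T$.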


Levinson’s algorithm

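Algorithm 3.2 (Levinson 1947). [Golub, Matrix Computations, p.128] Given $b \in \mathbb{R}^n$ and scalars $r_0, r_1, \ldots, r_n$ $(r_0 = 1)$ such that $T = (r_{|i-j|}) \in \mathbb{R}^{n \times n}$ is positive definite, the following algorithm computes $x \in \mathbb{R}^n$ such that $Tx = b$.

    $y_1 := -r_1$;  $x_1 := b_1$;  $\beta := 1$;  $\alpha := -r_1$
    For $k = 1, \ldots, n-1$
        $\beta := (1 - \alpha^2)\beta$
        $\mu := \bigl(b_{k+1} - \sum_{i=1}^{k} r_{k+1-i}\, x_i\bigr) / \beta$
        $v_i := x_i + \mu\, y_{k+1-i}$  $(i = 1, \ldots, k)$
        $x_i := v_i$  $(i = 1, \ldots, k)$
        $x_{k+1} := \mu$
        If $k < n-1$ then
            $\alpha := -\bigl(r_{k+1} + \sum_{i=1}^{k} r_{k+1-i}\, y_i\bigr) / \beta$
            $z_i := y_i + \alpha\, y_{k+1-i}$  $(i = 1, \ldots, k)$
            $y_i := z_i$  $(i = 1, \ldots, k)$
            $y_{k+1} := \alpha$

This algorithm requires $2n^2$ flops.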
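As an illustration, the Levinson recursion can be written in Python roughly as below. The function name `levinson` and the input convention (`r` holds $r_1, \ldots, r_{n-1}$, unit diagonal implicit) are assumptions for this sketch. The solution $x$ is updated alongside the Yule-Walker vector $y$ produced by the Durbin recursion.

```python
import numpy as np

def levinson(r, b):
    """Solve T x = b for symmetric positive definite T = (r_|i-j|), r_0 = 1.

    `r` holds r_1, ..., r_{n-1}; `b` has length n. Hypothetical helper.
    """
    r = np.asarray(r, dtype=float)
    b = np.asarray(b, dtype=float)
    n = b.size
    x = np.zeros(n)
    y = np.zeros(n)
    x[0] = b[0]
    if n == 1:
        return x
    y[0] = -r[0]
    beta, alpha = 1.0, -r[0]
    for k in range(1, n):
        beta *= 1.0 - alpha * alpha
        # mu = (b_{k+1} - sum_{i=1..k} r_{k+1-i} x_i) / beta
        mu = (b[k] - r[k - 1::-1] @ x[:k]) / beta
        x[:k] = x[:k] + mu * y[k - 1::-1]
        x[k] = mu
        if k < n - 1:
            # Durbin update of the Yule-Walker solution y
            alpha = -(r[k] + r[k - 1::-1] @ y[:k]) / beta
            y[:k] = y[:k] + alpha * y[k - 1::-1]
            y[k] = alpha
    return x
```

Comparing the result with a dense solve (for example `np.linalg.solve(T, b)`) on a small positive definite example is a reasonable sanity check.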


4 Localization of the Eigenvalues of Toeplitz Matrices

This chapter explores the close relationship between Toeplitz and circulant matrices. A very simple embedding result, and a somewhat less direct additive decomposition of perhaps theoretical interest, enable the derivation of bounds for all eigenvalues of symmetric Toeplitz matrices.

The bounds have a number of desirable characteristics, including low computational complexity, which is only $O(\log_2 n)$ operations per bound. This is a consequence of the use of the fast Fourier transform algorithm. Despite this, the bounds are sharper than a few other bounds (Slepian and Landau, 1978; Hertz, 1992), and often give tight estimates for the eigenvalues.

4.1 The Embedding

Suppose $M_n$ denotes the set of $n \times n$ matrices over the complex field. The notations $T \ge 0$ and $T > 0$ mean that $T$ is nonnegative definite and positive definite, respectively.

An arbitrary Toeplitz matrix $T \in M_n$ with elements $T_{ik} = t_{k-i}$ $(0 \le i, k < n)$ can be embedded in a circulant $C \in M_m$ if $m \ge 2n - 1$. In general, this leads to a matrix $C$ which will contain $l = m - 2n + 1$ arbitrary parameters $\alpha_i$. The embedding is defined by the elements $c_i$ of the first row of $C$,

\[
c_i = \begin{cases}
t_i, & 0 \le i < n, \\
\alpha_{i-n}, & n \le i < n + l, \\
t_{i-m}, & n + l \le i < m.
\end{cases}
\tag{4.1.1}
\]

It can be verified that the matrix $C$ so defined is indeed circulant.

The results given in this chapter could account for hermitian Toeplitz matrices over the complex field, but, for simplicity, only real $n \times n$ symmetric Toeplitz matrices will be considered. These matrices are defined by their first rows $t_i$ $(0 \le i < n)$. The modifications for the hermitian Toeplitz case are trivial and will be omitted.

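To illustrate the embedding (4.1.1) two examples will be given, the first being based on the Toeplitz matrix

\[
T = \begin{bmatrix} t_0 & t_1 & t_2 \\ t_1 & t_0 & t_1 \\ t_2 & t_1 & t_0 \end{bmatrix}.
\]

It is convenient to define a matrix $S$ by

\[
S = \begin{bmatrix} \alpha_0 & t_2 & t_1 \\ t_2 & \alpha_0 & t_2 \\ t_1 & t_2 & \alpha_0 \end{bmatrix},
\]

where $\alpha_0$ is an arbitrary parameter. Embedding $T$ in a circulant of order six, using (4.1.1), leads to the matrix $C$

\[
C = \begin{bmatrix}
t_0 & t_1 & t_2 & \alpha_0 & t_2 & t_1 \\
t_1 & t_0 & t_1 & t_2 & \alpha_0 & t_2 \\
t_2 & t_1 & t_0 & t_1 & t_2 & \alpha_0 \\
\alpha_0 & t_2 & t_1 & t_0 & t_1 & t_2 \\
t_2 & \alpha_0 & t_2 & t_1 & t_0 & t_1 \\
t_1 & t_2 & \alpha_0 & t_2 & t_1 & t_0
\end{bmatrix}
= \begin{bmatrix} T & S \\ S & T \end{bmatrix}.
\tag{4.1.2}
\]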

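For the second example, let the desired order of the circulant be eight. Applying (4.1.1) yields the circulant

\[
\begin{bmatrix}
t_0 & t_1 & t_2 & \alpha_0 & \alpha_1 & \alpha_0 & t_2 & t_1 \\
t_1 & t_0 & t_1 & t_2 & \alpha_0 & \alpha_1 & \alpha_0 & t_2 \\
t_2 & t_1 & t_0 & t_1 & t_2 & \alpha_0 & \alpha_1 & \alpha_0 \\
\alpha_0 & t_2 & t_1 & t_0 & t_1 & t_2 & \alpha_0 & \alpha_1 \\
\alpha_1 & \alpha_0 & t_2 & t_1 & t_0 & t_1 & t_2 & \alpha_0 \\
\alpha_0 & \alpha_1 & \alpha_0 & t_2 & t_1 & t_0 & t_1 & t_2 \\
t_2 & \alpha_0 & \alpha_1 & \alpha_0 & t_2 & t_1 & t_0 & t_1 \\
t_1 & t_2 & \alpha_0 & \alpha_1 & \alpha_0 & t_2 & t_1 & t_0
\end{bmatrix}.
\]

In this case, the number of arbitrary parameters $\alpha_i$ would be $l = 3$. However, a symmetry constraint has been enforced upon the circulant, which reduces that number to two. In general, a symmetry constraint reduces the number of free parameters from $l$ to $(l+1)/2$ or $l/2$, depending on whether $l$ is odd or even.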
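The embedding (4.1.1) for a real symmetric Toeplitz matrix can be sketched in Python as follows. The function names `circulant_first_row` and `circulant` are hypothetical and chosen only for this example. The free parameters default to zero, and symmetry of the resulting circulant is the caller's responsibility when several parameters are supplied.

```python
import numpy as np

def circulant_first_row(t, m, alpha=None):
    """First row of the order-m circulant embedding (4.1.1) of the symmetric
    Toeplitz matrix whose first row is t = (t_0, ..., t_{n-1})."""
    t = np.asarray(t, dtype=float)
    n = t.size
    l = m - 2 * n + 1  # number of free parameters
    if l < 0:
        raise ValueError("the embedding needs m >= 2n - 1")
    alpha = np.zeros(l) if alpha is None else np.asarray(alpha, dtype=float)
    if alpha.size != l:
        raise ValueError("expected %d free parameters" % l)
    # t_0..t_{n-1}, then the free parameters, then t_{n-1}..t_1
    # (t_{i-m} = t_{m-i} because T is symmetric)
    return np.concatenate((t, alpha, t[:0:-1]))

def circulant(c):
    """Circulant matrix with first row c, i.e. C[i, k] = c[(k - i) mod m]."""
    c = np.asarray(c)
    m = c.size
    i, k = np.indices((m, m))
    return c[(k - i) % m]
```

For the order-six example, `circulant(circulant_first_row([t0, t1, t2], 6, [a0]))` has $T$ in its leading 3-by-3 block and $S$ in the block to its right.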

4.2 Eigenstructure

A number of aspects pertaining to the eigenvalues and eigenvectors of the block matrices appearing in (4.1.2) are discussed in this section. The results will subsequently be used to obtain bounds for the eigenvalues of symmetric Toeplitz matrices.

In the following, it is assumed that $m = 2n$. This implies that $C$ has even order and only one free parameter.

It can be verified that the matrix $F \in M_{2n}$,

\[
F = \frac{1}{\sqrt{2}} \begin{bmatrix} I & I \\ I & -I \end{bmatrix},
\]

where $I \in M_n$, is unitary and reduces $C \in M_{2n}$ to block-diagonal form, with diagonal blocks $T + S$ and $T - S$.

Note that $F$ is the block analog of the Fourier matrix of order two, which diagonalizes all order-two circulants, and that $C$ is block-circulant, as can be seen from (4.1.2).

Thus, $C \ge 0$ if and only if $T + S \ge 0$ and $T - S \ge 0$, a fact that will be used in the sequel.

Since $S$ is Toeplitz, both $T + S$ and $T - S$ are Toeplitz matrices. It is easy to see that $T + S$ is circulant whereas $T - S$ is not. It is possible to explicitly find the eigenvectors and eigenvalues of these two matrices. To do this, recall that

\[
x_i = \bigl[ w^{0}, w^{i}, w^{2i}, \ldots, w^{(2n-1)i} \bigr]^T,
\]

where $w = e^{j2\pi/2n}$ and $0 \le i < 2n$, are orthogonal eigenvectors of $C \in M_{2n}$. We will denote by $\lambda_i$ the corresponding eigenvalues, which are just the discrete Fourier transform (DFT) coefficients of the first row of $C$,

\[
\lambda_i = \sum_{k=0}^{2n-1} c_k\, e^{j2\pi ik/2n}.
\tag{4.2.1}
\]

Partitioning $x_i$ in two $n \times 1$ blocks,

\[
x_i = \bigl[ u_i^T \;\; v_i^T \bigr]^T,
\]

and pre-multiplying by $C$ leads to

\[
\begin{aligned}
T u_i + S v_i &= \lambda_i u_i, \\
S u_i + T v_i &= \lambda_i v_i.
\end{aligned}
\]

Using the fact that $v_i = (-1)^i u_i$ one obtains

\[
\begin{aligned}
(T + S)\, u_{2i} &= \lambda_{2i}\, u_{2i}, \\
(T - S)\, u_{2i+1} &= \lambda_{2i+1}\, u_{2i+1}.
\end{aligned}
\]

This shows how the eigenvectors and eigenvalues of $T + S$ and $T - S$ are related to those of $C$: the even and odd coefficients of the DFT of the first row of $C$ are the eigenvalues of $T + S$ and $T - S$, respectively. The corresponding eigenvectors are $u_{2i}$ $(0 \le i < n)$ and $u_{2i+1}$ $(0 \le i < n)$, respectively. Note that

\[
u_i = \bigl[ w^{0}, w^{i}, w^{2i}, \ldots, w^{(n-1)i} \bigr]^T,
\]

where $w = e^{j2\pi/2n}$.

Although the vectors in the sets $u_{2i}$ $(0 \le i < n)$ and $u_{2i+1}$ $(0 \le i < n)$ satisfy the orthogonality conditions $\langle u_{2i}, u_{2k} \rangle = 0$ and $\langle u_{2i+1}, u_{2k+1} \rangle = 0$ $(i \ne k)$, one has $\langle u_{2i}, u_{2k+1} \rangle \ne 0$.
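The relation between the DFT of the first row of $C$ and the spectra of $T + S$ and $T - S$ is easy to check numerically. The following Python sketch assumes $m = 2n$ and a real symmetric $T$; the function name `even_odd_spectrum` is made up for this example. Because the first row of $C$ is symmetric, its DFT is real and the sign convention of the transform does not matter.

```python
import numpy as np

def even_odd_spectrum(t, alpha0=0.0):
    """Return T, S and the even/odd DFT coefficients of the first row of
    C = [[T, S], [S, T]] for the symmetric Toeplitz matrix with first row t."""
    t = np.asarray(t, dtype=float)
    n = t.size
    # First row of S: alpha_0, t_{n-1}, ..., t_1
    s = np.concatenate(([alpha0], t[:0:-1]))
    idx = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    T, S = t[idx], s[idx]
    # First row of C is (t, s); its DFT gives the eigenvalues of C
    lam = np.fft.fft(np.concatenate((t, s))).real
    return T, S, lam[0::2], lam[1::2]
```

Sorting the two returned coefficient arrays and comparing them with `np.linalg.eigvalsh(T + S)` and `np.linalg.eigvalsh(T - S)` should show agreement up to rounding.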

4.3 Bounds for the Eigenvalues

Let $\lambda_0 \ge \lambda_1 \ge \cdots \ge \lambda_{m-1}$ denote the ordered eigenvalues of $C$, and $\mu_0 \ge \mu_1 \ge \cdots \ge \mu_{n-1}$ denote the ordered eigenvalues of $T$. For simplicity, assume that $m = 2n$.

This section describes how to obtain bounds for all eigenvalues of $T$. Two different methods are used, the first being based on the embedding described in section 4.1.

It is well known that the eigenvalues of any of the $(n-1) \times (n-1)$ principal submatrices of a symmetric $n \times n$ matrix $M$ separate the eigenvalues of $M$ itself. This is expressed by the classical interlacing inequalities (Horn and Johnson, 1990). Iterating them $n$ times one is led to the relations

\[
\lambda_{i+n} \le \mu_i \le \lambda_i, \qquad (0 \le i < n).
\tag{4.3.1}
\]

Inequalities valid for other $m$ could also be readily obtained.

The $\lambda_i$ are the DFT coefficients of the first row of $C$ (see 4.2.1). Thus, if $n$ is a power of two, they can be found using the fast Fourier transform (FFT) algorithm, and the number of necessary arithmetic operations will be roughly proportional to $\log_2 n$ per bound. Even when $n$ is not a composite number, $m$ can be taken as the first power of two exceeding $2n - 1$, and the FFT still applied.
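A minimal Python sketch of the interlacing bounds follows, under the assumptions $m = 2n$ and free parameter set to zero. The function name `interlacing_bounds` is hypothetical. The eigenvalues of the circulant come from one FFT of its first row and are sorted in decreasing order, so that entry $i$ of the two returned arrays brackets the $i$-th largest eigenvalue of $T$.

```python
import numpy as np

def interlacing_bounds(t):
    """Lower and upper bounds for the eigenvalues of the symmetric Toeplitz
    matrix with first row t, sorted to match decreasing eigenvalue order."""
    t = np.asarray(t, dtype=float)
    n = t.size
    # First row of the order-2n circulant with the free parameter set to 0
    c = np.concatenate((t, [0.0], t[:0:-1]))
    lam = np.sort(np.fft.fft(c).real)[::-1]  # decreasing order
    lower, upper = lam[n:], lam[:n]
    return lower, upper
```

The cost is dominated by the FFT and the sort, so no eigenvalue decomposition of $T$ is required.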

The second method depends on the results given in section 4.2. The matrix $T$ is expressed as

\[
T = \tfrac{1}{2}\bigl[(T + S) + (T - S)\bigr],
\]

which enables the results on the eigenvalues of $T \pm S$ to be used. It can be seen, considering the quadratic form associated with $T$, that the maximum eigenvalue of $T$ cannot exceed half the sum of the maximum eigenvalues of $T + S$ and $T - S$. Similarly, the minimum eigenvalue of $T$ cannot be less than half the sum of the minimum eigenvalues of $T + S$ and $T - S$. Thus

\[
\mu_0 \le \tfrac{1}{2}\Bigl( \max_{0 \le i < n} \lambda_{2i} + \max_{0 \le i < n} \lambda_{2i+1} \Bigr),
\tag{4.3.2}
\]

\[
\mu_{n-1} \ge \tfrac{1}{2}\Bigl( \min_{0 \le i < n} \lambda_{2i} + \min_{0 \le i < n} \lambda_{2i+1} \Bigr),
\tag{4.3.3}
\]

where now the $\lambda_i$ stand for the eigenvalues of $C$ with their natural ordering, expressed by (4.2.1).

These are better bounds than the ones which follow from (4.3.1), namely

\[
\mu_0 \le \max_{0 \le i < 2n} \lambda_i, \qquad \mu_{n-1} \ge \min_{0 \le i < 2n} \lambda_i.
\]

In fact, using a theorem due to Weyl and originally derived in the context of integral equations, one can give upper and lower bounds for all eigenvalues of $T$. The theorem asserts that, if $A$ and $B$ are $n \times n$ hermitian matrices, and if their eigenvalues $\lambda_i(A)$ and $\lambda_i(B)$ are sorted in increasing order, the ordered eigenvalues of $A + B$, $\lambda_k(A + B)$, satisfy

\[
\lambda_k(A) + \lambda_1(B) \le \lambda_k(A + B) \le \lambda_k(A) + \lambda_n(B).
\]

For details about the theorem see Horn and Johnson (1990). Setting, for example, $2A = T + S$ and $2B = T - S$, and using the fact that the eigenvalues of $A$ and $B$ are (up to the factor $1/2$) the even and odd subsequences of the eigenvalues of the circulant $C$, leads immediately to upper and lower bounds for the eigenvalues of $T$,

\[
\lambda_k(T + S) + \lambda_1(T - S) \le 2\lambda_k(T) \le \lambda_k(T + S) + \lambda_n(T - S).
\tag{4.3.4}
\]

The bounds (4.3.2) and (4.3.3) are special cases of these equations.
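The Weyl-type bounds can be evaluated from a single FFT, as in the following Python sketch. It assumes $m = 2n$, a real symmetric $T$, and eigenvalues indexed in increasing order. The function name `weyl_bounds` and the default `alpha0 = 0.0` are choices made for this illustration only.

```python
import numpy as np

def weyl_bounds(t, alpha0=0.0):
    """Bounds lower[k] <= lambda_k(T) <= upper[k] (increasing order) for the
    symmetric Toeplitz matrix T with first row t, following (4.3.4)."""
    t = np.asarray(t, dtype=float)
    c = np.concatenate((t, [alpha0], t[:0:-1]))  # first row of C, m = 2n
    lam = np.fft.fft(c).real
    plus = np.sort(lam[0::2])    # eigenvalues of T + S
    minus = np.sort(lam[1::2])   # eigenvalues of T - S
    lower = 0.5 * (plus + minus[0])
    upper = 0.5 * (plus + minus[-1])
    return lower, upper
```

Swapping the roles of `plus` and `minus` gives a second valid pair of bounds; taking the elementwise maximum of the lower bounds and minimum of the upper bounds can only tighten the result.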

4.4 Optimum Values for the $\alpha_i$

The bounds given before depend on the $l = m - 2n + 1$ parameters $\alpha_i$, and so it is natural to ask what values of the $\alpha_i$ lead to best-possible results. This problem is addressed in this section.

Again, the notation $\lambda_i$ $(0 \le i < m)$ denotes the eigenvalues of the circulant $C$, with the ordering given by (4.2.1). The average value of the eigenvalues is denoted by

\[
\bar{\lambda} = \frac{1}{m} \sum_{k=0}^{m-1} \lambda_k.
\]

If best-possible is considered to mean that the quantity

\[
\varepsilon(\alpha_0, \ldots, \alpha_{l-1}) = \sum_{k=0}^{m-1} (\lambda_k - \bar{\lambda})^2
\]

is to be minimized, the answer is quite simple. One can then show that all the $\alpha_i$ must be zero. When this happens, the eigenvalues of the circulant are clustered as closely as possible in the Euclidean norm around $\bar{\lambda}$.

For brevity, a detailed discussion will be omitted. However, note that

\[
\frac{\partial \lambda_k}{\partial \alpha_i} = e^{j2\pi k(n+i)/m} \ne 0,
\]

and that $\bar{\lambda} = t_0$. It is easy to see that $\partial \varepsilon / \partial \alpha_i = 0$, for $i = 0, 1, \ldots, l-1$, leads to

\[
\sum_{k=0}^{m-1} \lambda_k\, e^{-j2\pi k(n+i)/m} = 0
\]

(the $\lambda_k$ are real, so each one is paired with the complex conjugate of $\partial \lambda_k / \partial \alpha_i$), which implies $m\,c_{n+i} = 0$, that is, $\alpha_i = 0$. Thus, as stated before, the values which render the bounds best-possible are $\alpha_i = 0$ $(i = 0, 1, \ldots, l-1)$, independently of the matrix $T$.
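The same conclusion can also be reached without differentiation, as the following short worked derivation suggests. It relies only on Parseval's identity for the DFT and on the assumption, used throughout this chapter, that the first row $c$ of the circulant is real and built from the entries $t_1, \ldots, t_{n-1}$ (each appearing twice) and the free parameters $\alpha_0, \ldots, \alpha_{l-1}$.

```latex
\begin{aligned}
\sum_{k=0}^{m-1} |\lambda_k|^2 &= m \sum_{r=0}^{m-1} c_r^2
  && \text{(Parseval, since the } \lambda_k \text{ are the DFT of } c\text{)} \\
\bar{\lambda} &= \frac{1}{m}\sum_{k=0}^{m-1} \lambda_k = c_0 = t_0 \\
\sum_{k=0}^{m-1} |\lambda_k - \bar{\lambda}|^2
  &= \sum_{k=0}^{m-1} |\lambda_k|^2 - m\,\bar{\lambda}^2
   = m \sum_{r=1}^{m-1} c_r^2 \\
  &= m \Bigl( 2\sum_{i=1}^{n-1} t_i^2 + \sum_{i=0}^{l-1} \alpha_i^2 \Bigr)
\end{aligned}
```

The first sum in the last line is fixed by $T$, so the spread of the eigenvalues around $\bar{\lambda}$ is smallest exactly when every $\alpha_i$ vanishes.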


5 The Symmetric Eigenvalue Problem

5.1 Mathematical Properties underlying symmetric eigenproblem

Symmetry simplifies the real eigenvalue problem $Ax = \lambda x$ in two ways: it ensures that all of $A$’s eigenvalues are real, and it ensures that there is an orthonormal basis of eigenvectors. Here are some mathematical properties of the symmetric eigenvalue problem.

Theorem 5.1.1: Real Schur Decomposition for Symmetric Matrices. [Golub, Matrix Computations, p.268] If $A \in \mathbb{R}^{n \times n}$ is symmetric, then there exists an orthogonal $Q$ such that

\[
Q^T A Q = \operatorname{diag}(\lambda_1, \ldots, \lambda_n).
\]

Theorem 5.1.2: Courant-Fischer Minimax Characterization. [Golub, Matrix Computations, p.269] If $A \in \mathbb{R}^{n \times n}$ is symmetric, then for $k = 1, \ldots, n$,

\[
\lambda_k(A) = \max_{\dim(S) = k} \; \min_{0 \ne y \in S} \frac{y^T A y}{y^T y},
\]

where $\lambda_k(A)$ denotes the k-th largest eigenvalue of $A$.

Theorem 5.1.3: Wielandt-Hoffman. [Golub, Matrix Computations, p.270] If $A$ and $A + E$ are n-by-n symmetric matrices, then

\[
\sum_{i=1}^{n} \bigl( \lambda_i(A + E) - \lambda_i(A) \bigr)^2 \le \|E\|_F^2.
\]

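Theorem 5.1.4: [Golub, Matrix Computations, p.271] Suppose $A$ and $A + E$ are n-by-n symmetric matrices and that

\[
Q = [\, Q_1, \; Q_2 \,], \qquad Q_1 \in \mathbb{R}^{n \times k}, \; Q_2 \in \mathbb{R}^{n \times (n-k)},
\]

is an orthogonal matrix such that $R(Q_1)$ is an invariant subspace for $A$. Partition the matrices $Q^T A Q$ and $Q^T E Q$ as follows:

\[
Q^T A Q = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix},
\qquad
Q^T E Q = \begin{bmatrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{bmatrix}.
\]

If

\[
\delta = \min_{\substack{\lambda \in \lambda(A_{11}) \\ \mu \in \lambda(A_{22})}} |\lambda - \mu| - \|E_{11}\|_2 - \|E_{22}\|_2 > 0
\]

and $\|E_{21}\|_2 \le \delta/2$, then there exists a matrix $P \in \mathbb{R}^{(n-k) \times k}$ satisfying

\[
\|P\|_2 \le 2\|E_{21}\|_2 / \delta
\]

such that the columns of $\hat{Q}_1 = (Q_1 + Q_2 P)(I + P^T P)^{-1/2}$ form an orthonormal basis for a subspace that is invariant for $A + E$.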

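Theorem 5.1.5: [Golub, Matrix Computations, p.272] Suppose $A \in \mathbb{R}^{n \times n}$ and $S \in \mathbb{R}^{k \times k}$ are symmetric and that

\[
A Q_1 - Q_1 S = E_1,
\]

where $Q_1 \in \mathbb{R}^{n \times k}$ satisfies $Q_1^T Q_1 = I_k$. Then there exist eigenvalues $\mu_1, \ldots, \mu_k \in \lambda(A)$ such that

\[
|\mu_i - \lambda_i(S)| \le \sqrt{2}\, \|E_1\|_2, \qquad i = 1, \ldots, k.
\]

Theorem 5.1.6: [Golub, Matrix Computations, p.273] Suppose $A \in \mathbb{R}^{n \times n}$ and $S \in \mathbb{R}^{k \times k}$ are symmetric and that

\[
A X_1 - X_1 S = F_1,
\]

where $X_1 \in \mathbb{R}^{n \times k}$ satisfies $\sigma_k(X_1) > 0$. Then there exist $\mu_1, \ldots, \mu_k \in \lambda(A)$ such that

\[
|\mu_i - \lambda_i(S)| \le \sqrt{2}\, \frac{\|F_1\|_2}{\sigma_k(X_1)}, \qquad i = 1, \ldots, k.
\]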

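Theorem 5.1.7: [Golub, Matrix Computations, p.273] If $A \in \mathbb{R}^{n \times n}$ is symmetric and $Q_1 \in \mathbb{R}^{n \times k}$ satisfies $Q_1^T Q_1 = I_k$, then

\[
\min_{S \in \mathbb{R}^{k \times k}} \|A Q_1 - Q_1 S\|_F
= \|A Q_1 - Q_1 (Q_1^T A Q_1)\|_F
= \|(I - Q_1 Q_1^T) A Q_1\|_F.
\]

Theorem 5.1.8: [Golub, Matrix Computations, p.274] Suppose $A \in \mathbb{R}^{n \times n}$ is symmetric and $Q_1 \in \mathbb{R}^{n \times k}$ satisfies $Q_1^T Q_1 = I_k$. If

\[
Z^T (Q_1^T A Q_1) Z = \operatorname{diag}(\theta_1, \ldots, \theta_k) = D
\]

is the Schur decomposition of $Q_1^T A Q_1$ and $Q_1 Z = [\, y_1, \ldots, y_k \,]$, then

\[
\|A y_i - \theta_i y_i\|_2 = \|(I - Q_1 Q_1^T) A Q_1 Z e_i\|_2 \le \|(I - Q_1 Q_1^T) A Q_1\|_2
\]

for $i = 1, \ldots, k$. The $\theta_i$ are called Ritz values, the $y_i$ are called Ritz vectors, and the $(\theta_i, y_i)$ are called Ritz pairs.

Theorem 5.1.9: Sylvester Law of Inertia. [Golub, Matrix Computations, p.274] If $A \in \mathbb{R}^{n \times n}$ is symmetric and $X \in \mathbb{R}^{n \times n}$ is nonsingular, then $A$ and $X^T A X$ have the same inertia.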

5.2 Tridiagonalization and the Symmetric QR Algorithm

The methods for actually computing the eigenvalues and eigenvectors of a matrix $A$ usually are preceded by a reduction step, in which the matrix $A$ is transformed to a “similar” matrix $B$ having the same eigenvalues as $A$. The matrix $B = (b_{ik})$ has a simpler structure than $A$ ($B$ is either a tridiagonal matrix, $b_{ik} = 0$ for $|i - k| > 1$, or a Hessenberg matrix, $b_{ik} = 0$ for $i \ge k + 2$), so that the standard methods for computing eigenvalues and eigenvectors are computationally less expensive when applied to $B$ than when applied to $A$.

Algorithm 5.2.1: Householder Tridiagonalization. [Golub, Matrix Computations, p.277] Given an n-by-n symmetric matrix $A$, the following algorithm overwrites $A$ with $T = U_0^T A U_0$, where $T$ is tridiagonal and $U_0 = P_1 \cdots P_{n-2}$ is the product of Householder transformations.

    For $k = 1, \ldots, n-2$
        Determine a Householder matrix $\bar{P}_k$ of order $n-k$ such that
            $\bar{P}_k \, (a_{k+1,k}, \ldots, a_{nk})^T = (x, 0, \ldots, 0)^T$.
        $A := P_k A P_k^T$,  $P_k = \operatorname{diag}(I_k, \bar{P}_k)$

This algorithm requires $\tfrac{2}{3}n^3$ flops.
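A compact Python sketch of Householder tridiagonalization is given below. It is an illustration under stated assumptions rather than a tuned implementation: the function name `householder_tridiagonalize` is hypothetical, the reflectors are applied to full rows and columns without exploiting symmetry, and the accumulated matrix `U` plays the role of $U_0 = P_1 \cdots P_{n-2}$.

```python
import numpy as np

def householder_tridiagonalize(A):
    """Return (T, U) with T = U^T A U tridiagonal and U orthogonal."""
    A = np.array(A, dtype=float)   # work on a copy
    n = A.shape[0]
    U = np.eye(n)
    for k in range(n - 2):
        x = A[k + 1:, k]
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue               # column already has the required zeros
        # Householder vector v with (I - 2 v v^T) x = -sign(x_0) ||x|| e_1
        v = x.copy()
        v[0] += np.copysign(norm_x, x[0])
        v /= np.linalg.norm(v)
        # Apply the reflector from the left, then from the right
        A[k + 1:, :] -= 2.0 * np.outer(v, v @ A[k + 1:, :])
        A[:, k + 1:] -= 2.0 * np.outer(A[:, k + 1:] @ v, v)
        # Accumulate U := U * P_k
        U[:, k + 1:] -= 2.0 * np.outer(U[:, k + 1:] @ v, v)
    return A, U
```

Entries of the returned `T` outside the three central diagonals are zero only up to rounding, so a production code would set them to zero explicitly or store just the two diagonals.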

Algorithm 5.2.2: Implicit Symmetric QR Step with Wilkinson Shift. [Golub, Matrix Computations, p.280] Given an unreduced symmetric tridiagonal matrix $T \in \mathbb{R}^{n \times n}$, the following algorithm overwrites $T$ with $Z^T T Z$, where $Z = J_1 \cdots J_{n-1}$ is a product of Givens rotations with the property that $Z^T (T - \mu I)$ is upper triangular and $\mu$ is that eigenvalue of $T$’s trailing 2-by-2 principal submatrix closer to $t_{nn}$.

    $d := (t_{n-1,n-1} - t_{nn}) / 2$
    $\mu := t_{nn} - t_{n,n-1}^2 \big/ \bigl[ d + \operatorname{sign}(d) \sqrt{d^2 + t_{n,n-1}^2} \bigr]$
    $x := t_{11} - \mu$
    $z := t_{21}$
    For $k = 1, \ldots, n-1$
        Determine $c = \cos(\theta)$ and $s = \sin(\theta)$ such that
            $\begin{bmatrix} c & s \\ -s & c \end{bmatrix} \begin{bmatrix} x \\ z \end{bmatrix} = \begin{bmatrix} * \\ 0 \end{bmatrix}$
        $T := J_k^T T J_k$,  $J_k = J(k, k+1, \theta)$
        If $k < n-1$ then
            $x := t_{k+1,k}$
            $z := t_{k+2,k}$

This algorithm requires about $14n$ flops and $n$ square roots. If a given orthogonal matrix $Q$ is overwritten with $Q J_1 \cdots J_{n-1}$, then an additional $4n^2$ flops are needed.
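For illustration, the effect of one step can be reproduced in Python with an explicit shifted QR factorization. This sketch is not the implicit Givens-based procedure of Algorithm 5.2.2: it forms $T - \mu I = QR$ with `np.linalg.qr` and returns $RQ + \mu I$, which in exact arithmetic gives the same matrix up to signs of the off-diagonal entries but costs $O(n^3)$ instead of $O(n)$. The function names are hypothetical.

```python
import numpy as np

def wilkinson_shift(T):
    """Eigenvalue of the trailing 2-by-2 block of T closer to T[-1, -1].
    Assumes T[-1, -2] != 0 (T is unreduced)."""
    d = (T[-2, -2] - T[-1, -1]) / 2.0
    b = T[-1, -2]
    sign_d = 1.0 if d >= 0.0 else -1.0
    return T[-1, -1] - b * b / (d + sign_d * np.hypot(d, b))

def shifted_qr_step(T):
    """One explicit QR step with Wilkinson shift on a symmetric tridiagonal T."""
    T = np.asarray(T, dtype=float)
    n = T.shape[0]
    mu = wilkinson_shift(T)
    Q, R = np.linalg.qr(T - mu * np.eye(n))
    return R @ Q + mu * np.eye(n)
```

Repeating `T = shifted_qr_step(T)` typically drives the last off-diagonal entry toward zero quickly, after which the problem can be deflated as described in the symmetric QR algorithm.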

Algorithm 5.2.3: Symmetric QR Algorithm. [Golub, Matrix Computations, p.281] Given an n-by-n symmetric matrix $A$ and $\varepsilon$, a small multiple of the unit roundoff, the following algorithm overwrites $A$ with $Q^T A Q = D + E$ where $Q$ is orthogonal, $D$ is diagonal, and $E$ satisfies $\|E\|_2 \approx u \|A\|_2$.

    Using Algorithm 5.2.1, compute the tridiagonalization
        $A := (P_1 \cdots P_{n-2})^T A (P_1 \cdots P_{n-2})$
    Repeat
        Set $a_{i+1,i}$ and $a_{i,i+1}$ to zero if
            $|a_{i+1,i}| = |a_{i,i+1}| \le \varepsilon \bigl( |a_{ii}| + |a_{i+1,i+1}| \bigr)$
        for any $i = 1, \ldots, n-1$.
        Find the largest $q$ and the smallest $p$ such that if
            $A = \begin{bmatrix} A_{11} & 0 & 0 \\ 0 & A_{22} & 0 \\ 0 & 0 & A_{33} \end{bmatrix}$  (diagonal blocks of order $p$, $n-p-q$, $q$)
        then $A_{33}$ is diagonal and $A_{22}$ has no zero subdiagonal elements.
        If $q = n$ then quit.
        Apply Algorithm 5.2.2 to $A_{22}$:
            $A := \operatorname{diag}(I_p, \bar{Z}, I_q)^T \, A \, \operatorname{diag}(I_p, \bar{Z}, I_q)$
        Go to Repeat.

This algorithm requires about $\tfrac{2}{3}n^3$ flops if $Q$ is not accumulated and about $5n^3$ flops if $Q$ is accumulated.

5.3 The Singular Value Decomposition

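Algorithm 5.3.1: QR with Column Pivoting. [Golub, Matrix Computations, p.165] Given $A \in \mathbb{R}^{m \times n}$, the following algorithm computes the factorization $A\Pi = QR$ defined by

\[
A\Pi = Q \begin{bmatrix} R_{11} & R_{12} \\ 0 & 0 \end{bmatrix},
\qquad R_{11} \in \mathbb{R}^{r \times r}, \; R_{12} \in \mathbb{R}^{r \times (n-r)},
\]

where $r = \operatorname{rank}(A)$, $Q$ is orthogonal, $R_{11}$ is upper triangular, and $\Pi$ is a permutation. The permutation $\Pi = [\, e_{c_1}, \ldots, e_{c_n} \,]$ is determined according to the strategy

\[
\|v_p^{(k-1)}\|_2 = \max\bigl\{ \|v_k^{(k-1)}\|_2, \ldots, \|v_n^{(k-1)}\|_2 \bigr\}.
\]

The element $a_{ij}$ is overwritten by $r_{ij}$ $(i \le j)$.

    $\gamma_j := \sum_{i=1}^{m} a_{ij}^2$  $(j = 1, \ldots, n)$
    $c_j := j$  $(j = 1, \ldots, n)$
    For $k = 1, \ldots, n$
        Determine $p$ $(k \le p \le n)$ so $\gamma_p = \max_{k \le j \le n} \gamma_j$.
        If $\gamma_p = 0$ then quit
        else
            Interchange $c_k$ and $c_p$, $\gamma_k$ and $\gamma_p$, and $a_{ik}$ and $a_{ip}$ for $i = 1, \ldots, m$.
            Determine a Householder $\tilde{Q}_k$ such that
                $\tilde{Q}_k \, (a_{kk}, \ldots, a_{mk})^T = (*, 0, \ldots, 0)^T$.
            $A := \operatorname{diag}(I_{k-1}, \tilde{Q}_k)\, A$
            $\gamma_j := \gamma_j - a_{kj}^2$  $(j = k+1, \ldots, n)$

This algorithm requires $2mnr - r^2(m+n) + 2r^3/3$ flops where $r = \operatorname{rank}(A)$.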

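Algorithm 5.3.2: Golub-Kahan SVD Step. [Golub, Matrix Computations, p.292] Given a bidiagonal matrix $B \in \mathbb{R}^{n \times n}$ having no zeros on its diagonal or superdiagonal, the following algorithm overwrites $B$ with the bidiagonal matrix $\bar{B} = \bar{U}^T B \bar{V}$ where $\bar{U}$ and $\bar{V}$ are orthogonal and $\bar{V}$ is essentially the orthogonal matrix that would be obtained by applying Algorithm 5.2.2 to $T = B^T B$.

    Let $\mu$ be the eigenvalue of the trailing 2-by-2 submatrix of $T = B^T B$ that is closer to $t_{nn}$.
    $y := t_{11} - \mu$
    $z := t_{12}$
    For $k = 1, \ldots, n-1$
        Determine $c = \cos(\theta)$ and $s = \sin(\theta)$ such that
            $\begin{bmatrix} y & z \end{bmatrix} \begin{bmatrix} c & s \\ -s & c \end{bmatrix} = \begin{bmatrix} * & 0 \end{bmatrix}$
        $B := B\, J(k, k+1, \theta)$
        $y := b_{kk}$
        $z := b_{k+1,k}$
        Determine $c = \cos(\theta)$ and $s = \sin(\theta)$ such that
            $\begin{bmatrix} c & s \\ -s & c \end{bmatrix} \begin{bmatrix} y \\ z \end{bmatrix} = \begin{bmatrix} * \\ 0 \end{bmatrix}$
        $B := J(k, k+1, \theta)^T B$
        If $k < n-1$ then
            $y := b_{k,k+1}$
            $z := b_{k,k+2}$

An efficient implementation of this algorithm would store $B$’s diagonal and superdiagonal in vectors $(a_1, \ldots, a_n)$ and $(f_2, \ldots, f_n)$ respectively and would require $20n$ flops and $2n$ square roots. Accumulating $U$ requires $4mn$ flops. Accumulating $V$ requires $4n^2$ flops.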

Algorithm 5.3.3: The SVD Algorithm. [Golub, Matrix Computations, p.293] Given $A \in \mathbb{R}^{m \times n}$ $(m \ge n)$ and $\varepsilon$, a small multiple of the unit roundoff, the following algorithm overwrites $A$ with $U^T A V = D + E$, where $U \in \mathbb{R}^{m \times m}$ is orthogonal, $V \in \mathbb{R}^{n \times n}$ is orthogonal, $D \in \mathbb{R}^{m \times n}$ is diagonal, and $E$ satisfies $\|E\|_2 \approx u \|A\|_2$.

    Using Householder bidiagonalization, compute
        $A := (U_1 \cdots U_n)^T A (V_1 \cdots V_{n-2})$
    Repeat
        Set $a_{i,i+1}$ to zero if $|a_{i,i+1}| \le \varepsilon \bigl( |a_{ii}| + |a_{i+1,i+1}| \bigr)$ for any $i = 1, \ldots, n-1$.
        Find the largest $q$ and the smallest $p$ such that if
            $A = \begin{bmatrix} A_{11} & 0 & 0 \\ 0 & A_{22} & 0 \\ 0 & 0 & A_{33} \\ 0 & 0 & 0 \end{bmatrix}$  (block rows of size $p$, $n-p-q$, $q$, $m-n$)
        then $A_{33}$ is diagonal and $A_{22}$ has nonzero superdiagonal.
        If $q = n$ then quit.
        If any diagonal entry in $A_{22}$ is zero, then zero the superdiagonal entry in the same row and go to Repeat.
        Apply Algorithm 5.3.2 to $A_{22}$:
            $A := \operatorname{diag}(I_p, \bar{U}, I_{q+m-n})^T \, A \, \operatorname{diag}(I_p, \bar{V}, I_q)$
        Go to Repeat.

The amount of work required by this algorithm and its numerical properties are shown in Table 5.3.1.

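Table 5.3.1. SVD flop counts ($m \ge n$)

Required | Golub-Reinsch SVD | Chan SVD
$\Sigma$ | $2mn^2 - \tfrac{2}{3}n^3$ | $mn^2 + n^3$
$\Sigma, V$ | $2mn^2 + 4n^3$ | $mn^2 + \tfrac{17}{3}n^3$
$\Sigma, U$ | $2m^2n + 4mn^2$ | $2m^2n + \tfrac{19}{3}n^3$
$\Sigma, U_1$ | $7mn^2 - n^3$ | $3mn^2 + \tfrac{16}{3}n^3$
$\Sigma, U, V$ | $2m^2n + 4mn^2 + \tfrac{14}{3}n^3$ | $2m^2n + 11n^3$
$\Sigma, U_1, V$ | $7mn^2 + \tfrac{11}{3}n^3$ | $3mn^2 + 10n^3$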

5.4 Jacobi Methods

Jacobi (1846) proposed a method for reducing a symmetric matrix $A \in \mathbb{R}^{n \times n}$ to diagonal form using what we have been calling Givens rotations. Although his method has essentially been eclipsed by the symmetric QR algorithm, it is important to understand because of its significant role in parallel computation. Jacobi’s method is also useful for finding the eigenvalues of nearly diagonal symmetric matrices.

Algorithm 5.4.1: Serial Jacobi. [Golub, Matrix Computations, p.299] Given a symmetric matrix $A \in \mathbb{R}^{n \times n}$ and $\varepsilon \ge u$, the unit roundoff, the following algorithm overwrites $A$ with $U^T A U = D + E$ where $U$ is orthogonal, $D$ is diagonal, and $E$ has a zero diagonal and satisfies $\|E\|_F \le \varepsilon \|A\|_F$. Here $\operatorname{off}(A)$ denotes the sum of the squares of the off-diagonal entries of $A$.

    $\delta := \varepsilon \|A\|_F$
    Do Until $\operatorname{off}(A) \le \delta^2$
        For $p = 1, 2, \ldots, n-1$
            For $q = p+1, \ldots, n$
                Find $J = J(p, q, \theta)$ such that the $(p, q)$ entry of $J^T A J$ is zero.
                $A := J^T A J$

This algorithm requires $2n^3$ flops per sweep. An additional $2n^3$ flops are required if the orthogonal matrix $U$ is accumulated.
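The cyclic sweep can be sketched in Python as below. The sketch favors clarity over speed: each rotation is applied as a full matrix product, which costs far more than the two-row, two-column update a real implementation would use. The function name `jacobi_eigen`, the default tolerance, and the sweep limit are assumptions for this example. The rotation angle is taken from the smaller root of $t^2 + 2\tau t - 1 = 0$ with $\tau = (a_{qq} - a_{pp}) / (2a_{pq})$.

```python
import numpy as np

def jacobi_eigen(A, eps=1e-12, max_sweeps=50):
    """Cyclic Jacobi iteration. Returns (eigenvalues, U) with U^T A U ~ diagonal."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    U = np.eye(n)
    delta2 = (eps * np.linalg.norm(A, "fro")) ** 2

    def off(M):
        # Sum of squares of the off-diagonal entries
        return np.sum(M * M) - np.sum(np.diag(M) ** 2)

    for _ in range(max_sweeps):
        if off(A) <= delta2:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p, q] == 0.0:
                    continue
                tau = (A[q, q] - A[p, p]) / (2.0 * A[p, q])
                sign = 1.0 if tau >= 0.0 else -1.0
                t = sign / (abs(tau) + np.hypot(1.0, tau))
                c = 1.0 / np.hypot(1.0, t)
                s = t * c
                # Givens rotation J(p, q, theta)
                J = np.eye(n)
                J[p, p] = J[q, q] = c
                J[p, q] = s
                J[q, p] = -s
                A = J.T @ A @ J   # zeroes the (p, q) entry
                U = U @ J
    return np.diag(A).copy(), U
```

The eigenvalues are returned in the order in which they appear on the diagonal, so they usually need sorting before comparison with another solver.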

5.5 Some Special Methods

By exploiting the rich mathematical structure of the symmetric eigenproblem, it is possible to devise useful alternatives to the symmetric QR algorithm. Many of these techniques are appropriate when only a few eigenvalues and/or eigenvectors are desired. Two such methods are described in this section: bisection and Rayleigh quotient iteration.

Bisection

Let $T_r$ denote the leading r-by-r principal submatrix of

\[
T = \begin{bmatrix}
a_1 & b_2 & & 0 \\
b_2 & a_2 & \ddots & \\
& \ddots & \ddots & b_n \\
0 & & b_n & a_n
\end{bmatrix}
\tag{5.5.1}
\]

and define the polynomials $p_0(x), \ldots, p_n(x)$ by

\[
p_0(x) = 1, \qquad p_r(x) = \det(T_r - xI)
\]

for $r = 1, 2, \ldots, n$. A simple determinantal expansion can be used to show that

\[
p_r(x) = (a_r - x)\, p_{r-1}(x) - b_r^2\, p_{r-2}(x), \qquad r = 2, \ldots, n.
\]

Because $p_n(x)$ can be evaluated in $O(n)$ flops, it is feasible to find its roots using the method of bisection. For example, if $p_n(y)\, p_n(z) < 0$ and $y < z$, then the iteration

    Do While $|y - z| > u(|y| + |z|)$
        $x := (y + z)/2$
        if $p_n(x)\, p_n(y) < 0$
            then $z := x$
            else $y := x$

is guaranteed to converge to a zero of $p_n(x)$, i.e., to an eigenvalue of $T$. The iteration converges linearly: the error is approximately halved at each step.

Sometimes it is necessary to compute the k-th largest eigenvalue of $T$ for some prescribed value of $k$. This can be done efficiently by using the bisection idea and the following classical result:

Theorem 5.5.1: Sturm Sequence Property. [Golub, Matrix Computations, p.306] If the tridiagonal matrix $T$ in (5.5.1) is unreduced, then the eigenvalues of $T_{r-1}$ strictly separate the eigenvalues of $T_r$:

\[
\lambda_r(T_r) < \lambda_{r-1}(T_{r-1}) < \lambda_{r-1}(T_r) < \cdots < \lambda_2(T_r) < \lambda_1(T_{r-1}) < \lambda_1(T_r).
\]

Moreover, if $a(\lambda)$ denotes the number of sign changes in the sequence

\[
\{\, p_0(\lambda), p_1(\lambda), \ldots, p_n(\lambda) \,\}
\]

then $a(\lambda)$ equals the number of $T$’s eigenvalues that are less than $\lambda$. (Convention: $p_r(\lambda)$ has the opposite sign from $p_{r-1}(\lambda)$ if $p_r(\lambda) = 0$.)

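Suppose we wish to compute $\lambda_k(T)$. From the Gershgorin circle theorem it follows that $\lambda_k(T) \in [y, z]$ where

\[
y = \min_i \bigl( a_i - |b_i| - |b_{i+1}| \bigr), \qquad
z = \max_i \bigl( a_i + |b_i| + |b_{i+1}| \bigr), \qquad
b_1 = b_{n+1} = 0.
\]

With these starting values, it is clear from the Sturm sequence property that the iteration

    Do While $|z - y| > u(|y| + |z|)$
        $x := (y + z)/2$
        if $a(x) \ge n - k + 1$                                   (5.5.2)
            then $z := x$
            else $y := x$

produces a sequence of subintervals that are repeatedly halved in length but which always contain $\lambda_k(T)$. (The test reflects the ordering used above: $\lambda_k(T)$ is the k-th largest eigenvalue, so it lies below $x$ exactly when at least $n - k + 1$ eigenvalues are less than $x$.)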
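The bisection scheme can be sketched in Python as follows. Two deliberate deviations from the text are assumed here and should be read as choices of this example: the Sturm count is computed from the ratios $d_r = p_r(x)/p_{r-1}(x)$ rather than from the polynomial values themselves (which avoids overflow), and eigenvalues are indexed from the smallest, so `k = 1` returns the smallest eigenvalue. The function names and the iteration cap are hypothetical.

```python
import numpy as np

def sturm_count(a, b, x):
    """Number of eigenvalues less than x of the symmetric tridiagonal matrix
    with diagonal a (length n) and off-diagonal b (length n - 1)."""
    count = 0
    d = 1.0
    for i in range(len(a)):
        off2 = b[i - 1] ** 2 if i > 0 else 0.0
        d = (a[i] - x) - off2 / d      # d = p_{i+1}(x) / p_i(x)
        if d == 0.0:
            d = -1e-300                # convention: a zero counts as a sign change
        if d < 0.0:
            count += 1
    return count

def kth_smallest_eigenvalue(a, b, k, tol=1e-14, max_iter=200):
    """Bisection for the k-th smallest eigenvalue (k = 1, ..., n)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    babs = np.abs(np.concatenate(([0.0], b, [0.0])))
    radius = babs[:-1] + babs[1:]      # Gershgorin radii
    y, z = np.min(a - radius), np.max(a + radius)
    for _ in range(max_iter):
        if z - y <= tol * (abs(y) + abs(z)):
            break
        x = 0.5 * (y + z)
        if sturm_count(a, b, x) >= k:
            z = x                      # at least k eigenvalues lie below x
        else:
            y = x
    return 0.5 * (y + z)
```

To obtain the k-th largest eigenvalue in the ordering of the text, call the function with `n - k + 1` in place of `k`.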

During the execution of (5.5.2), information about the location of other eigenvalues is obtained. By systematically keeping track of this information it is possible to devise an efficient scheme for computing “contiguous” subsets of $\lambda(T)$, e.g., $\{\lambda_k(T), \lambda_{k+1}(T), \ldots, \lambda_{k+j}(T)\}$.

If selected eigenvalues of a general symmetric matrix $A$ are desired, then it is necessary first to compute the tridiagonalization $T = U_0^T A U_0$ before the above bisection schemes can be applied. This can be done using Algorithm 5.2.1 or by the Lanczos algorithm discussed in the next chapter. In either case, the corresponding eigenvectors can be readily found via inverse iteration, since tridiagonal systems can be solved in $O(n)$ flops.

Rayleigh Quotient Iteration

Suppose $A \in \mathbb{R}^{n \times n}$ is symmetric and that $x$ is a given nonzero n-vector. A simple differentiation reveals that

\[
\lambda = r(x) \equiv \frac{x^T A x}{x^T x}
\]

minimizes $\|(A - \lambda I)x\|_2$. The scalar $r(x)$ is called the Rayleigh quotient of $x$. Clearly, if $x$ is an approximate eigenvector, then $r(x)$ is a reasonable choice for the corresponding eigenvalue. On the other hand, if $\lambda$ is an approximate eigenvalue, then inverse iteration theory tells us that the solution to $(A - \lambda I)x = b$ will almost always be a good approximate eigenvector.

Combining these two ideas in the natural way gives rise to the Rayleigh quotient iteration:

    $x_0$ given, $\|x_0\|_2 = 1$
    For $k = 0, 1, \ldots$
        $\mu_k := r(x_k)$
        Solve $(A - \mu_k I)\, z_{k+1} = x_k$ for $z_{k+1}$
        $x_{k+1} := z_{k+1} / \|z_{k+1}\|_2$

Note that for any $k$ we have

\[
(A + E_k)\, z_{k+1} = \mu_k z_{k+1},
\]

where

\[
E_k = -\, x_k z_{k+1}^T / \|z_{k+1}\|_2^2.
\]

Since $\|E_k\|_2 = 1/\|z_{k+1}\|_2$, it follows that $|\mu_k - \lambda| \le 1/\|z_{k+1}\|_2$ for some $\lambda \in \lambda(A)$.


6 Lanczos Methods

In this chapter we develop the Lanczos method, a technique that isapplicable to large, sparse, symmetric eigenproblems. The method involvestridiagonalizing the given matrix A . However, unlike the Householderapproach, no intermediate (and full) submatrices are generated. Equallyimportant, information about A ’s external eigenvalues tends to emergelong before the tridiagonalization is complete. This makes the Lanczosalgorithm particularly useful in situations where a few of A ’s largest orsmallest eigenvalues are desired.

6.1 Derivation and Convergence Properties

Suppose $A \in \mathbb{R}^{n \times n}$ is large, sparse, and symmetric and assume that a few of its largest and/or smallest eigenvalues are desired. This problem can be solved by a method attributed to Lanczos (1950). The method generates a sequence of tridiagonal matrices $\{T_j\}$ with the property that the extremal eigenvalues of $T_j \in \mathbb{R}^{j \times j}$ are progressively better estimates of $A$'s extremal eigenvalues.

Algorithm 6.1.1: The Lanczos Algorithm. [Golub, Matrix Computations, p.325]

Given a symmetric $A \in \mathbb{R}^{n \times n}$ and $w \in \mathbb{R}^n$ having unit 2-norm, the following algorithm computes a $j$-by-$j$ symmetric tridiagonal matrix $T_j$ with the property that $\lambda(T_j) \subset \lambda(A)$. The diagonal and subdiagonal elements of $T_j$ are stored in $\alpha_1, \ldots, \alpha_j$ and $\beta_1, \ldots, \beta_{j-1}$, respectively.

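$v_i = 0 \quad (i = 1, \ldots, n)$
$\beta_0 = 1$
$j = 0$
Do While $(\beta_j \neq 0)$
    If $(j \neq 0)$ then
        For $i = 1, \ldots, n$
            $t = w_i$, $w_i = v_i / \beta_j$, $v_i = -\beta_j t$
    $v = Aw + v$
    $j = j + 1$
    $\alpha_j = w^T v$
    $v = v - \alpha_j w$
    $\beta_j = \|v\|_2$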

Note that $A$ is not altered during the entire process. Thus, only a procedure for computing matrix-vector products involving $A$ need be supplied. If sparsity is exploited in this procedure and only $kn$ flops are involved in each call $(k \ll n)$, then each Lanczos step requires approximately $(4 + k)n$ flops to execute.

The eigenvalues of $T_j$ can be found using the symmetric QR algorithm or any of the special methods of section 5.5.
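A minimal NumPy sketch of Algorithm 6.1.1 follows. It takes a user-supplied `matvec` callback in place of $A$, reflecting the remark that only matrix-vector products are needed. The names `lanczos`, `tridiagonal`, `max_steps` and `breakdown_tol` are illustrative assumptions, as is the replacement of the exact test $\beta_j \neq 0$ by a small threshold. No reorthogonalization is performed.

```python
import numpy as np


def lanczos(matvec, w, max_steps, breakdown_tol=1e-12):
    """Run up to max_steps Lanczos steps starting from the vector w.

    matvec(x) must return A @ x for a symmetric A; A itself is never formed
    or modified here. Returns (alpha, beta), the diagonal and subdiagonal
    of the tridiagonal matrix T_j.
    """
    w = np.array(w, dtype=float)
    w /= np.linalg.norm(w)                # unit 2-norm starting vector
    v = np.zeros_like(w)
    beta_j = 1.0                          # beta_0 = 1
    alpha, beta = [], []
    for j in range(max_steps):
        if j > 0:
            # t = w, w = v / beta_j, v = -beta_j * t (done for all i at once)
            w, v = v / beta_j, -beta_j * w
        v = matvec(w) + v
        a = w @ v                         # alpha_j = w^T v
        v = v - a * w
        beta_j = np.linalg.norm(v)        # beta_j = ||v||_2
        alpha.append(a)
        if beta_j <= breakdown_tol:       # assumed stand-in for beta_j == 0
            break
        beta.append(beta_j)
    return np.array(alpha), np.array(beta[:len(alpha) - 1])


def tridiagonal(alpha, beta):
    """Assemble T_j from its diagonal and subdiagonal."""
    return np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
```

The Ritz values can then be obtained with, for example, `np.linalg.eigvalsh(tridiagonal(alpha, beta))`, which is used here only as a convenient stand-in for the symmetric QR algorithm or the methods of section 5.5.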

6.2 Practical Lanczos Procedures

Lanczos with Complete Reorthogonalization

Let $r_0, \ldots, r_{j-1} \in \mathbb{R}^n$ be given and suppose that Householder matrices $P_0, \ldots, P_{j-1}$ have been computed such that $(P_0 \cdots P_{j-1})^T [r_0, \ldots, r_{j-1}]$ is upper triangular. Denote the first $j$ columns of $(P_0 \cdots P_{j-1})$ by $[q_1, \ldots, q_j]$. Now suppose that we are given a vector $r_j \in \mathbb{R}^n$ and wish to compute a unit vector $q_{j+1}$ in the direction of

$$w = r_j - \sum_{i=1}^{j} (q_i^T r_j) q_i \in \mathrm{span}\{q_1, \ldots, q_j\}^{\perp}.$$

If a Householder matrix $P_j$ is determined so $(P_0 \cdots P_j)^T [r_0, \ldots, r_j]$ is upper triangular, then it follows that the $(j+1)$-st column of $P_0 \cdots P_j$ is the desired unit vector.

If we incorporate these Householder computations into the Lanczos process, we can produce Lanczos vectors that are orthogonal to working accuracy:

$r_0 = q_1$ (given unit vector)
Determine $P_0 = I - 2 v_0 v_0^T / v_0^T v_0$ so $P_0 r_0 = e_1$.
$\alpha_1 = q_1^T A q_1$
Do $j = 1, \ldots, n-1$
    $r_j = (A - \alpha_j I) q_j - \beta_{j-1} q_{j-1} \quad (\beta_0 q_0 \equiv 0)$
    $w = (P_{j-1} \cdots P_0) r_j$
    Determine $P_j = I - 2 v_j v_j^T / v_j^T v_j$ such that $P_j w = (w_1, \ldots, w_j, \beta_j, 0, \ldots, 0)^T$
    $q_{j+1} = (P_0 \cdots P_j) e_{j+1}$
    $\alpha_{j+1} = q_{j+1}^T A q_{j+1}$

This is an example of a complete reorthogonalization Lanczos scheme.
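The Householder-based scheme can be sketched as follows. The helper names `reflect`, `householder` and `lanczos_full_reorth` are assumptions made for this illustration. One deliberate deviation should be noted: the sign of each Householder reflection is chosen to avoid cancellation, so that $P_0 r_0 = \pm e_1$ (hence $q_1 = \pm r_0$) and the computed $\beta_j$ may be negative. The eigenvalues of the resulting tridiagonal matrix are unaffected by these signs.

```python
import numpy as np


def reflect(v, x):
    """Apply P = I - 2 v v^T / (v^T v) to x (P = I when v is zero)."""
    vv = v @ v
    return x if vv == 0.0 else x - (2.0 * (v @ x) / vv) * v


def householder(x, k):
    """Return (v, beta) with P x = (x[0], ..., x[k-1], beta, 0, ..., 0)."""
    v = np.zeros_like(x)
    v[k:] = x[k:]
    norm = np.linalg.norm(x[k:])
    beta = -norm if x[k] >= 0 else norm   # sign chosen to avoid cancellation
    v[k] -= beta
    return v, beta


def lanczos_full_reorth(A, q1, steps):
    """Lanczos with complete reorthogonalization via Householder matrices.

    Returns (alpha, beta, Q), where the columns of Q are the Lanczos vectors.
    Requires steps <= n.
    """
    n = A.shape[0]
    r = q1 / np.linalg.norm(q1)
    v0, _ = householder(r, 0)             # P_0 r_0 = +-e_1
    vs = [v0]
    Q = [reflect(v0, np.eye(n)[0])]       # q_1 = P_0 e_1 = +-r_0
    alpha = [Q[0] @ A @ Q[0]]
    beta = []
    for j in range(1, steps):
        r = A @ Q[-1] - alpha[-1] * Q[-1]
        if j > 1:
            r -= beta[-1] * Q[-2]
        w = r
        for v in vs:                      # w = P_{j-1} ... P_0 r_j
            w = reflect(v, w)
        v, b = householder(w, j)          # P_j w = (w_1..w_j, beta_j, 0..0)
        vs.append(v)
        q = np.zeros(n)
        q[j] = 1.0
        for v in reversed(vs):            # q_{j+1} = P_0 ... P_j e_{j+1}
            q = reflect(v, q)
        Q.append(q)
        beta.append(b)
        alpha.append(q @ A @ q)
    return np.array(alpha), np.array(beta), np.column_stack(Q)
```

Because every $q_{j+1}$ is a column of a product of reflections, the returned `Q` satisfies `Q.T @ Q` close to the identity regardless of how small the $\beta_j$ become. The price is that all Householder vectors must be stored and applied at each step.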

Block Lanczos

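Like the simple power method, the Lanczos algorithm has a block analog. Suppose $n = rp$ and consider the decomposition

$$Q^T A Q = T = \begin{bmatrix} M_1 & B_1^T & & \cdots & 0 \\ B_1 & M_2 & \ddots & & \vdots \\ & \ddots & \ddots & \ddots & \\ \vdots & & \ddots & \ddots & B_{r-1}^T \\ 0 & \cdots & & B_{r-1} & M_r \end{bmatrix} \qquad (6.2.1)$$

where

$$Q = [X_1, \ldots, X_r], \qquad X_i \in \mathbb{R}^{n \times p}$$

is orthogonal, each $M_i \in \mathbb{R}^{p \times p}$, and each $B_i \in \mathbb{R}^{p \times p}$ is upper triangular.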

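Comparing blocks in $AQ = QT$ shows that

$$A X_j = X_{j-1} B_{j-1}^T + X_j M_j + X_{j+1} B_j, \qquad X_0 B_0^T \equiv 0$$

for $j = 1, 2, \ldots, r-1$. From the orthogonality of $Q$ it follows that

$$M_j = X_j^T A X_j, \qquad j = 1, 2, \ldots, r.$$

Moreover,

$$X_{j+1} B_j = R_j$$

represents the QR factorization of

$$R_j = A X_j - X_j M_j - X_{j-1} B_{j-1}^T \in \mathbb{R}^{n \times p}.$$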

These observations suggest that the block tridiagonal matrix $T$ in (6.2.1) can be generated as follows:

$X_1 \in \mathbb{R}^{n \times p}$ given, with $X_1^T X_1 = I_p$.
$M_1 = X_1^T A X_1$
For $j = 1, \ldots, r-1$
    $R_j = A X_j - X_j M_j - X_{j-1} B_{j-1}^T \quad (X_0 B_0^T \equiv 0)$    (6.2.2)
    $X_{j+1} B_j = R_j$ (QR factorization)
    $M_{j+1} = X_{j+1}^T A X_{j+1}$
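Recurrence (6.2.2) translates almost line by line into NumPy. The sketch below is illustrative: `block_lanczos` and `assemble_block_tridiagonal` are names chosen here, `np.linalg.qr` supplies the factorization $X_{j+1} B_j = R_j$, and no reorthogonalization between blocks is attempted, so orthogonality of $Q$ can only be expected for a modest number of block steps.

```python
import numpy as np


def block_lanczos(A, X1, r):
    """Run the block recurrence (6.2.2) for r blocks.

    X1 must have orthonormal columns. Returns the lists
    X = [X_1, ..., X_r], M = [M_1, ..., M_r], B = [B_1, ..., B_{r-1}].
    """
    X = [X1]
    M = [X1.T @ A @ X1]
    B = []
    for j in range(r - 1):
        R = A @ X[j] - X[j] @ M[j]
        if j > 0:
            R -= X[j - 1] @ B[j - 1].T    # the term X_{j-1} B_{j-1}^T
        Xn, Bj = np.linalg.qr(R)          # X_{j+1} B_j = R_j, B_j upper triangular
        X.append(Xn)
        B.append(Bj)
        M.append(Xn.T @ A @ Xn)
    return X, M, B


def assemble_block_tridiagonal(M, B):
    """Build T of (6.2.1) from the diagonal blocks M_j and the blocks B_j."""
    p, r = M[0].shape[0], len(M)
    T = np.zeros((r * p, r * p))
    for j in range(r):
        T[j * p:(j + 1) * p, j * p:(j + 1) * p] = M[j]
    for j in range(r - 1):
        T[(j + 1) * p:(j + 2) * p, j * p:(j + 1) * p] = B[j]      # below diagonal
        T[j * p:(j + 1) * p, (j + 1) * p:(j + 2) * p] = B[j].T    # above diagonal
    return T
```

With `Q = np.hstack(X)`, the matrix `Q.T @ A @ Q` should agree with `assemble_block_tridiagonal(M, B)` up to rounding error as long as the blocks stay mutually orthogonal.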

s-Step Lanczos

The block Lanczos algorithm (6.2.2) can be used in an iterative fashion to calculate selected eigenvalues of $A$. To fix ideas, suppose we wish to calculate the $p$ largest eigenvalues. If $X_1 \in \mathbb{R}^{n \times p}$ is a given matrix having orthonormal columns, we may proceed as follows:

1. Generate $X_2, \ldots, X_s \in \mathbb{R}^{n \times p}$ via the block Lanczos algorithm.

2. Form $T_s = [X_1, \ldots, X_s]^T A [X_1, \ldots, X_s]$, an $sp$-by-$sp$, $p$-diagonal matrix.

3. Compute an orthogonal matrix $U = [u_1, \ldots, u_{sp}]$ such that $U^T T_s U = \mathrm{diag}(\theta_1, \ldots, \theta_{sp})$ where $\theta_1 \ge \cdots \ge \theta_{sp}$.

4. Set $X_1 = [X_1, \ldots, X_s][u_1, \ldots, u_p]$.

5. If $\|A X_1 - X_1 T_s\|_F$ is still too large, go to 1.

This is the block analog of the s-step Lanczos algorithm. The same idea can also be used to compute several of $A$'s smallest eigenvalues or a mixture of both large and small eigenvalues.

6.3 Applications to Linear Equations

The Lanczos iteration can be used to solve large sparse systems of linear equations. The algorithm is discussed in [Golub, Matrix Computations, p.344], where it is identified with the widely known method of conjugate gradients.


7 Conclusion

In this paper, we introduced the Toeplitz matrix, which is extremely important for solving problems in signal processing. First, we studied an important method for localizing the eigenvalues of a Toeplitz matrix. Then we introduced the method for diagonalizing a symmetric matrix and discussed its importance. Finally, we discussed several important algorithms for calculating the eigenvalues of a symmetric matrix. In chapter 5, we described the symmetric QR algorithm, Rayleigh quotient iteration, and the bisection method. In chapter 6, the Lanczos methods were introduced.


Bibliography

Gene H. Golub, Charles F. van Loan (1983). Matrix Computations, The Johns Hopkins University Press.

Paulo Jorge S. G. Ferreira (1994). “Localization of the Eigenvalues of Toeplitz Matrices Using Additive Decomposition, Embedding in Circulants, and the Fourier Transform”, Proc. 10th IFAC Symposium on System Identification, Sys ID’94, Copenhagen, Denmark, July 1994, pp. 271-276.

J. Stoer, R. Bulirsch (1980). Introduction to Numerical Analysis, Springer-Verlag New York Inc.

G. Strang (1976). Linear Algebra and Its Applications, Academic Press, New York.

L. Mirsky (1955). An Introduction to Linear Algebra, Oxford University Press, London.

J. M. Ortega (1972). Numerical Analysis: A Second Course, Academic Press, New York.

A. S. Householder (1964). The Theory of Matrices in Numerical Analysis, Blaisdell Publication Co., New York.
