
Parallel Numerical Algorithms
Chapter 5 – Eigenvalue Problems
Section 5.2 – Eigenvalue Computation

Michael T. Heath and Edgar Solomonik
Department of Computer Science
University of Illinois at Urbana-Champaign
CS 554 / CSE 512


Outline

1 Basics

2 Power Iteration

3 QR Iteration

4 Krylov Methods

5 Other Methods


Eigenvalues and Eigenvectors

Given n × n matrix A, find scalar λ and nonzero vector x such that

Ax = λx

λ is eigenvalue and x is corresponding eigenvector

A always has n eigenvalues, but they may be neither real nor distinct

May need to compute only one or a few eigenvalues, or all n eigenvalues

May or may not need corresponding eigenvectors


Problem Transformations

Shift: for scalar σ, eigenvalues of A − σI are eigenvalues of A shifted by σ, λi − σ

Inversion: for nonsingular A, eigenvalues of A^-1 are reciprocals of eigenvalues of A, 1/λi

Powers: for integer k > 0, eigenvalues of A^k are kth powers of eigenvalues of A, λi^k

Polynomial: for polynomial p(t), eigenvalues of p(A) are values of p evaluated at eigenvalues of A, p(λi)
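
A minimal NumPy check of these mappings (matrix size, shift, and polynomial chosen arbitrarily; spectra are sorted before comparison since eigenvalue order is not guaranteed):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
lam = np.linalg.eigvals(A)
sigma = 0.5

srt = np.sort_complex  # normalize ordering before comparing spectra

# Shift: eig(A - sigma*I) = lam - sigma
assert np.allclose(srt(np.linalg.eigvals(A - sigma * np.eye(4))), srt(lam - sigma))
# Inversion: eig(A^-1) = 1/lam
assert np.allclose(srt(np.linalg.eigvals(np.linalg.inv(A))), srt(1 / lam))
# Powers: eig(A^3) = lam^3
assert np.allclose(srt(np.linalg.eigvals(np.linalg.matrix_power(A, 3))), srt(lam**3))
# Polynomial p(t) = t^2 + 2t + 3: eig(p(A)) = p(lam)
pA = A @ A + 2 * A + 3 * np.eye(4)
assert np.allclose(srt(np.linalg.eigvals(pA)), srt(lam**2 + 2 * lam + 3))
```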


Similarity Transformations

B is similar to A if there is nonsingular T such that

B = T^-1 A T

Then

By = λy ⇒ T^-1 A T y = λy ⇒ A(Ty) = λ(Ty)

so A and B have same eigenvalues, and if y is eigenvector of B, then x = Ty is eigenvector of A

Similarity transformations preserve eigenvalues, and eigenvectors are easily recovered
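
A quick NumPy confirmation of this recovery (random T, generically nonsingular; all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
T = rng.standard_normal((n, n))   # generically nonsingular
B = np.linalg.solve(T, A @ T)     # B = T^-1 A T

lamB, Y = np.linalg.eig(B)        # eigenpairs of B
X = T @ Y                         # columns x = T y are eigenvectors of A
# Each column satisfies A x = lambda x up to roundoff
assert np.allclose(A @ X, X * lamB)
```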


Similarity Transformations

Forms attainable by similarity transformation

A                     T            B
distinct eigenvalues  nonsingular  diagonal
real symmetric        orthogonal   real diagonal
complex Hermitian     unitary      real diagonal
normal                unitary      diagonal
arbitrary real        orthogonal   real block triangular (real Schur)
arbitrary             unitary      upper triangular (Schur)
arbitrary             nonsingular  almost diagonal (Jordan)


Preliminary Reduction

Eigenvalues easier to compute if matrix first reduced to simpler form by similarity transformation

Diagonal or triangular most desirable, but cannot always be reached in finite number of steps

Preliminary reduction usually to tridiagonal form (for symmetric matrix) or Hessenberg form (for nonsymmetric matrix)

Preliminary reduction usually done by orthogonal transformations, using algorithms similar to QR factorization, but transformations must be applied from both sides to maintain similarity


Parallel Algorithms for Eigenvalues

Algorithms for computing eigenvalues and eigenvectors employ basic operations such as

vector updates (saxpy)
inner products
matrix-vector and matrix-matrix multiplication
solution of triangular systems
orthogonal (QR) factorization

In many cases, parallel implementations will be based on parallel algorithms we have already seen for these basic operations, although there will sometimes be new sources of parallelism


Power Iteration

Simplest method for computing one eigenvalue-eigenvector pair is power iteration, which in effect takes successively higher powers of matrix times initial starting vector

x0 = arbitrary nonzero vector
for k = 1, 2, ...
    yk = A xk-1
    xk = yk / ‖yk‖∞
end

If A has unique eigenvalue λ1 of maximum modulus, then power iteration converges to eigenvector corresponding to dominant eigenvalue
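
A serial NumPy sketch of this loop (tolerance, iteration cap, and the Rayleigh-quotient eigenvalue estimate are assumptions beyond the slide):

```python
import numpy as np

def power_iteration(A, x0, tol=1e-10, max_iter=1000):
    # Minimal sketch of the loop above; tol and max_iter are assumptions
    x = x0 / np.linalg.norm(x0, np.inf)
    for _ in range(max_iter):
        y = A @ x
        x_new = y / np.linalg.norm(y, np.inf)
        # Signed eigenvalue estimate via Rayleigh quotient
        lam = x @ (A @ x) / (x @ x)
        # Compare up to sign, since the iterate may alternate sign
        # when the dominant eigenvalue is negative
        if min(np.linalg.norm(x_new - x, np.inf),
               np.linalg.norm(x_new + x, np.inf)) < tol:
            return lam, x_new
        x = x_new
    return lam, x
```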


Power Iteration

Convergence rate of power iteration depends on ratio |λ2/λ1|, where λ2 is eigenvalue having second largest modulus

It may be possible to choose shift, A − σI, so that ratio is more favorable and yields more rapid convergence

Shift must then be added to result to obtain eigenvalue of original matrix


Parallel Power Iteration

Power iteration requires repeated matrix-vector products, which are easily implemented in parallel for dense or sparse matrix, as we have seen

Additional communication may be required for normalization, shifts, convergence test, etc.

Though easily parallelized, power iteration is often too slow to be useful in this form


Inverse Iteration

Inverse iteration is power iteration applied to A^-1, which converges to eigenvector corresponding to dominant eigenvalue of A^-1, which is reciprocal of smallest eigenvalue of A

Inverse of A is not computed explicitly, but only factorization of A (and only once) to solve system of linear equations at each iteration

x0 = arbitrary nonzero vector
for k = 1, 2, ...
    Solve A yk = xk-1 for yk
    xk = yk / ‖yk‖∞
end
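
A sketch that factors the (optionally shifted) matrix once with SciPy and reuses the factorization every iteration; the shift σ, stopping rule, and Rayleigh-quotient estimate are assumptions:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration(A, x0, sigma=0.0, tol=1e-10, max_iter=100):
    # Factor A - sigma*I once; each iteration is two triangular solves
    lu, piv = lu_factor(A - sigma * np.eye(A.shape[0]))
    x = x0 / np.linalg.norm(x0, np.inf)
    for _ in range(max_iter):
        y = lu_solve((lu, piv), x)
        x_new = y / np.linalg.norm(y, np.inf)
        if min(np.linalg.norm(x_new - x, np.inf),
               np.linalg.norm(x_new + x, np.inf)) < tol:
            break
        x = x_new
    # Rayleigh quotient gives eigenvalue of the original (unshifted) A
    lam = x_new @ (A @ x_new) / (x_new @ x_new)
    return lam, x_new
```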


Inverse Iteration

Shifting strategy can greatly accelerate convergence

Inverse iteration is especially useful for computing eigenvector corresponding to approximate eigenvalue, since it converges rapidly when approximate eigenvalue is used as shift

Inverse iteration also useful for computing eigenvalue closest to any given value β, since if β is used as shift, then desired eigenvalue corresponds to smallest eigenvalue of shifted matrix


Parallel Inverse Iteration

Inverse iteration requires initial factorization of matrix A and solution of triangular systems at each iteration, so it appears to be much less amenable to efficient parallel implementation than power iteration

However, inverse iteration is often used to compute eigenvector in situations where

approximate eigenvalue is already known, so using it as shift yields very rapid convergence
matrix has previously been reduced to simpler form (e.g., tridiagonal) for which linear system is easy to solve


Simultaneous Iteration

To compute many eigenvalue-eigenvector pairs, could apply power iteration to several starting vectors simultaneously, giving simultaneous iteration

X0 = arbitrary n × q matrix of rank q
for k = 1, 2, ...
    Xk = A Xk-1
end

span(Xk) converges to invariant subspace determined by q largest eigenvalues of A, provided |λq| > |λq+1|

Normalization is needed at each iteration, and columns of Xk become increasingly ill-conditioned basis for span(Xk)


Orthogonal Iteration

Both issues addressed by computing QR factorization at each iteration, giving orthogonal iteration

X0 = arbitrary n × q matrix of rank q
for k = 1, 2, ...
    Compute reduced QR factorization Q̂k Rk = Xk-1
    Xk = A Q̂k
end

Converges to block triangular form, and blocks are triangular where moduli of consecutive eigenvalues are distinct
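
A minimal NumPy sketch of this loop (the fixed iteration count and random start are assumptions):

```python
import numpy as np

def orthogonal_iteration(A, q, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((A.shape[0], q))
    for _ in range(iters):
        Q, R = np.linalg.qr(X)   # reduced QR re-orthonormalizes the basis
        X = A @ Q
    Q, _ = np.linalg.qr(X)
    # Q approximately spans the dominant invariant subspace; Q^T A Q
    # reveals the q largest eigenvalues on its diagonal when their
    # moduli are distinct
    return Q, Q.T @ A @ Q
```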


Parallel Orthogonal Iteration

Orthogonal iteration requires matrix-matrix multiplication and QR factorization at each iteration, both of which we know how to implement in parallel with reasonable efficiency


QR Iteration

If we take X0 = I, then orthogonal iteration should produce all eigenvalues and eigenvectors of A

Orthogonal iteration can be reorganized to avoid explicit formation and factorization of matrices Xk

Instead, sequence of unitarily similar matrices is generated by computing QR factorization at each iteration and then forming reverse product, giving QR iteration

A0 = A
for k = 1, 2, ...
    Compute QR factorization Qk Rk = Ak-1
    Ak = Rk Qk
end
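
An unshifted dense sketch of this loop (the iteration count is an assumption; practical codes work on Hessenberg form with shifts and deflation):

```python
import numpy as np

def qr_iteration(A, iters=500):
    Ak = A.copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q        # reverse product: Ak+1 = Rk Qk = Qk^T Ak Qk
    return Ak             # approaches (block) triangular form
```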


QR Iteration

In simple form just given, each iteration of QR method requires Θ(n^3) work

Work per iteration is reduced to Θ(n^2) if matrix is in Hessenberg form, or Θ(n) if symmetric matrix is in tridiagonal form

Preliminary reduction is usually done by Householder or Givens transformations

In addition, number of iterations required is reduced by preliminary reduction of matrix

Convergence rate also enhanced by judicious choice of shifts
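
SciPy exposes the preliminary reduction directly; a brief check that it is a similarity transformation (matrix size arbitrary):

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))

# Reduce A to upper Hessenberg form with A = Q H Q^T (a similarity,
# so eigenvalues are preserved); QR iteration is then applied to H
H, Q = hessenberg(A, calc_q=True)
assert np.allclose(Q @ H @ Q.T, A)
assert np.allclose(np.tril(H, -2), 0)   # zero below first subdiagonal
```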


Parallel QR Iteration

Preliminary reduction can be implemented efficiently in parallel, using algorithms analogous to parallel QR factorization for dense matrix

But subsequent QR iteration for reduced matrix is inherently serial, and permits little parallel speedup for this portion of algorithm

This may not be of great concern if iterative phase is relatively small portion of total time, but it does limit efficiency and scalability


Reduction to Hessenberg/Tridiagonal

To reduce to Hessenberg form, can execute QR-style elimination on each column

If Hi is the Householder transformation used to annihilate ith subcolumn, perform similarity transformation

A^(i+1) = Hi A^(i) Hi^T

More generally, run QR on b subcolumns of A to reduce to bandwidth b

B^(i+1) = Qi B^(i) Qi^T

To avoid fill, Qi^T must not operate on the b columns which Qi reduces

Once reduction to banded form is complete, subsequent eliminations introduce fill (bulges), but the bulges can be chased


Reduction from Banded to Tridiagonal

Two-sided successive orthogonal reductions generate bulges of fill when applied to a banded matrix

Bulges can be chased with pipelined parallelism


Krylov Subspace Methods

Krylov subspace methods reduce matrix to Hessenberg (or tridiagonal) form using only matrix-vector multiplication

For arbitrary starting vector x0, if

Kk = [ x0  Ax0  ···  A^(k-1) x0 ]

then

Kn^-1 A Kn = Cn

where Cn is upper Hessenberg (in fact, companion matrix)
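
A small NumPy check of this identity (explicit powers of A appear only for the demonstration, never in a practical method):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
x0 = rng.standard_normal(n)

# Krylov matrix K_n = [x0, A x0, ..., A^(n-1) x0]
K = np.column_stack([np.linalg.matrix_power(A, j) @ x0 for j in range(n)])
C = np.linalg.solve(K, A @ K)            # C = K_n^-1 A K_n

assert np.allclose(np.tril(C, -2), 0)    # upper Hessenberg
# Companion structure: first n-1 columns are shifted unit vectors
assert np.allclose(C[:, :-1], np.eye(n)[:, 1:])
```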


Krylov Subspace Methods

To obtain better conditioned basis for span(Kn), compute QR factorization

Qn Rn = Kn

so that

Qn^H A Qn = Rn Cn Rn^-1 ≡ H

with H upper Hessenberg


Krylov Subspace Methods

Equating kth columns on each side of equation A Qn = Qn H yields recurrence

A qk = h1k q1 + ··· + hkk qk + hk+1,k qk+1

relating qk+1 to preceding vectors q1, ..., qk

Premultiplying by qj^H and using orthonormality,

hjk = qj^H A qk,  j = 1, ..., k

These relationships yield Arnoldi iteration, which produces upper Hessenberg matrix column by column using only matrix-vector multiplication by A and inner products of vectors


Arnoldi Iteration

x0 = arbitrary nonzero starting vector
q1 = x0 / ‖x0‖2
for k = 1, 2, ...
    uk = A qk
    for j = 1 to k
        hjk = qj^H uk
        uk = uk − hjk qj
    end
    hk+1,k = ‖uk‖2
    if hk+1,k = 0 then stop
    qk+1 = uk / hk+1,k
end
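
A dense NumPy sketch of this loop (the step count m and returned shapes are assumptions; real arithmetic only, so complex A would need conjugated inner products):

```python
import numpy as np

def arnoldi(A, x0, m):
    n = x0.size
    Q = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    Q[:, 0] = x0 / np.linalg.norm(x0)
    for k in range(m):
        u = A @ Q[:, k]
        for j in range(k + 1):               # orthogonalize against q_1..q_k
            H[j, k] = Q[:, j] @ u
            u -= H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(u)
        if H[k + 1, k] == 0:                 # breakdown: invariant subspace
            return Q[:, :k + 1], H[:k + 1, :k + 1]
        Q[:, k + 1] = u / H[k + 1, k]
    return Q, H

# Ritz values are eigenvalues of the leading m x m block of H,
# e.g. ritz = np.linalg.eigvals(H[:-1, :])
```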


Arnoldi Iteration

If

Qk = [ q1 ··· qk ]

then

Hk = Qk^H A Qk

is upper Hessenberg matrix

Eigenvalues of Hk, called Ritz values, are approximate eigenvalues of A, and Ritz vectors given by Qk y, where y is eigenvector of Hk, are corresponding approximate eigenvectors of A

Eigenvalues of Hk must be computed by another method, such as QR iteration, but this is easier problem if k ≪ n


Arnoldi Iteration

Arnoldi iteration expensive in work and storage because each new vector qk must be orthogonalized against all previous columns of Qk, which must be stored

So Arnoldi process usually restarted periodically with carefully chosen starting vector

Ritz values and vectors produced are often good approximations to eigenvalues and eigenvectors of A after relatively few iterations

Work and storage costs drop dramatically if matrix symmetric or Hermitian, since recurrence then has only three terms and Hk is tridiagonal (so usually denoted Tk), yielding Lanczos iteration


Lanczos Iteration

q0 = 0
β0 = 0
x0 = arbitrary nonzero starting vector
q1 = x0 / ‖x0‖2
for k = 1, 2, ...
    uk = A qk
    αk = qk^H uk
    uk = uk − βk-1 qk-1 − αk qk
    βk = ‖uk‖2
    if βk = 0 then stop
    qk+1 = uk / βk
end
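
A minimal sketch of m steps for real symmetric A (no reorthogonalization, so orthogonality of the basis degrades in practice; m is an assumption):

```python
import numpy as np

def lanczos(A, x0, m):
    n = x0.size
    Q = np.zeros((n, m + 1))
    alpha, beta = np.zeros(m), np.zeros(m)
    Q[:, 0] = x0 / np.linalg.norm(x0)
    q_prev, b_prev = np.zeros(n), 0.0
    for k in range(m):
        u = A @ Q[:, k]
        alpha[k] = Q[:, k] @ u
        u -= b_prev * q_prev + alpha[k] * Q[:, k]
        beta[k] = np.linalg.norm(u)
        if beta[k] == 0:                 # breakdown: invariant subspace
            return alpha[:k + 1], beta[:k]
        Q[:, k + 1] = u / beta[k]
        q_prev, b_prev = Q[:, k], beta[k]
    return alpha, beta[:m - 1]

# Ritz values are eigenvalues of T_m = tridiag(beta, alpha, beta),
# e.g. via scipy.linalg.eigh_tridiagonal(alpha, beta)
```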


Lanczos Iteration

αk and βk are diagonal and subdiagonal entries of symmetric tridiagonal matrix Tk

As with Arnoldi, Lanczos iteration does not produce eigenvalues and eigenvectors directly, but only tridiagonal matrix Tk, whose eigenvalues and eigenvectors must be computed by another method to obtain Ritz values and vectors

If βk = 0, then algorithm appears to break down, but in that case invariant subspace has already been identified (i.e., Ritz values and vectors are already exact at that point)


Lanczos Iteration

In principle, if Lanczos algorithm is run until k = n, resulting tridiagonal matrix is orthogonally similar to A

In practice, rounding error causes loss of orthogonality, invalidating this expectation

Problem can be overcome by reorthogonalizing vectors as needed, but expense can be substantial

Alternatively, can ignore problem, in which case algorithm still produces good eigenvalue approximations, but multiple copies of some eigenvalues may be generated


Krylov Subspace Methods

Virtue of Arnoldi and Lanczos iterations is ability to produce good approximations to extreme eigenvalues for k ≪ n

Moreover, they require only one matrix-vector multiplication by A per step and little auxiliary storage, so are ideally suited to large sparse matrices

If eigenvalues are needed in middle of spectrum, say near σ, then algorithm can be applied to matrix (A − σI)^-1, assuming it is practical to solve systems of form (A − σI)x = y
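
SciPy's ARPACK wrapper applies exactly this shift-and-invert strategy; a brief sketch on a 1-D Laplacian (the shift and eigenvalue count are arbitrary choices):

```python
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

# Symmetric tridiagonal test matrix (1-D Laplacian)
n = 1000
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format='csc')

# Six eigenvalues nearest sigma via shift-and-invert Lanczos:
# internally ARPACK factors (T - sigma*I) and iterates with its inverse
sigma = 1.0
vals = eigsh(T, k=6, sigma=sigma, return_eigenvectors=False)
```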


Parallel Krylov Subspace Methods

Krylov subspace methods composed of

vector updates (saxpy)
inner products
matrix-vector multiplication
computing eigenvalues/eigenvectors of tridiagonal matrices

Parallel implementation requires implementing each of these in parallel as before

For early iterations, Hessenberg or tridiagonal matrices generated are too small to benefit from parallel implementation, but Ritz values and vectors need not be computed until later


Jacobi Method

Jacobi method for symmetric matrix starts with A0 = A and computes sequence

Ak+1 = Jk^T Ak Jk

where Jk is plane rotation that annihilates symmetric pair of off-diagonal entries in Ak

Plane rotations are applied repeatedly from both sides in systematic sweeps through matrix until magnitudes of all off-diagonal entries are reduced below tolerance

Resulting diagonal matrix is orthogonally similar to original matrix, so diagonal entries are eigenvalues, and eigenvectors given by product of plane rotations
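
A cyclic-sweep sketch in NumPy (tolerance and sweep cap are assumptions; the dense similarity update is for clarity, where practical codes touch only rows and columns p and q):

```python
import numpy as np

def jacobi_eig(A, tol=1e-12, max_sweeps=30):
    A = A.astype(float).copy()
    n = A.shape[0]
    V = np.eye(n)                         # accumulated product of rotations
    for _ in range(max_sweeps):
        if np.linalg.norm(np.tril(A, -1)) < tol:
            break                         # off-diagonal part negligible
        for p in range(n - 1):            # one sweep through all pairs (p, q)
            for q in range(p + 1, n):
                if abs(A[p, q]) < tol:
                    continue
                # Rotation angle chosen to annihilate A[p, q]
                theta = (A[q, q] - A[p, p]) / (2 * A[p, q])
                sgn = 1.0 if theta >= 0 else -1.0
                t = sgn / (abs(theta) + np.hypot(theta, 1))
                c = 1 / np.hypot(t, 1)
                s = t * c
                J = np.eye(n)
                J[p, p] = J[q, q] = c
                J[p, q], J[q, p] = s, -s
                A = J.T @ A @ J           # A_{k+1} = J_k^T A_k J_k
                V = V @ J
    return np.diag(A), V                  # eigenvalues and eigenvectors
```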


Parallel Jacobi Method

Jacobi method, though slower than QR iteration serially, parallelizes better

Parallel implementation of Jacobi method performs many annihilations simultaneously, at locations chosen so that rotations do not interfere with each other (analogous to parallel Givens QR factorization)

Computations are still rather fine grained, however, so effectiveness is limited on parallel computers with unfavorable ratio of communication to computation speed


Bisection or Spectrum-Slicing

For real symmetric matrix, can determine how many eigenvalues are less than given real number σ

By systematically choosing various values for σ (slicing spectrum at σ) and monitoring resulting count, any eigenvalue can be isolated as accurately as desired

For example, by computing inertia via LDL^T factorization of A − σI for various σ, individual eigenvalues can be isolated as accurately as desired using interval bisection technique


Sturm Sequence

Another spectrum-slicing method for computing individual eigenvalues is based on Sturm sequence property of symmetric matrices

Let pr(σ) denote determinant of leading principal minor of A − σI of order r

Zeros of pr(σ) strictly separate those of pr−1(σ), and number of agreements in sign of successive members of sequence pr(σ), for r = 1, ..., n, gives number of eigenvalues of A strictly greater than σ

Determinants pr(σ) easy to compute if A transformed to tridiagonal form before applying Sturm sequence technique
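
For a tridiagonal matrix the count is cheap; a sketch of the slice count and bisection, using the equivalent count of eigenvalues below σ (sign changes rather than sign agreements), with the bracketing interval assumed to enclose the spectrum (e.g., Gershgorin bounds):

```python
def count_below(d, e, sigma):
    # Number of eigenvalues of the symmetric tridiagonal matrix with
    # diagonal d and off-diagonal e that are less than sigma, via the
    # Sturm sequence in ratio form q_r = p_r(sigma) / p_{r-1}(sigma)
    count = 0
    q = 1.0
    for r in range(len(d)):
        q = d[r] - sigma - (e[r - 1] ** 2 / q if r > 0 else 0.0)
        if q == 0.0:
            q = 1e-300            # perturb to avoid division by zero
        if q < 0:
            count += 1
    return count

def bisect_eigenvalue(d, e, i, lo, hi, tol=1e-12):
    # Isolate the i-th smallest eigenvalue (0-indexed) within [lo, hi]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(d, e, mid) <= i:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```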


Parallel Multisection

Spectrum-slicing methods can be implemented in parallel by assigning disjoint intervals of real line to different tasks, and then each task carries out bisection process using serial spectrum-slicing method for its subinterval

Both for efficiency of spectrum-slicing technique and for storage scalability, matrix is first reduced (in parallel) to tridiagonal form, so that each task can store entire (reduced) matrix

Load imbalance is potential problem, since different subintervals may contain different numbers of eigenvalues


Parallel Multisection

Clustering of eigenvalues cannot be anticipated in advance, so dynamic load balancing may be required to achieve reasonable efficiency

Eigenvectors must still be determined, usually by inverse iteration

Since each task holds entire (tridiagonal) matrix, each can in principle carry out inverse iteration for its eigenvalues without any communication among tasks

Orthogonality of resulting eigenvectors cannot be guaranteed, however, and thus communication is required to orthogonalize them


Divide-and-Conquer Method

Another method for computing eigenvalues and eigenvectors of real symmetric tridiagonal matrix is based on divide-and-conquer

Express symmetric tridiagonal matrix T as

T = [ T1  O ] + β u u^T
    [ O   T2 ]

Can now compute eigenvalues and eigenvectors of smaller matrices T1 and T2

To relate these back to eigenvalues and eigenvectors of original matrix requires solution of secular equation, which can be done reliably and efficiently
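
A small NumPy illustration of this rank-one tearing (split point arbitrary; the secular-equation solve that recombines the halves is omitted):

```python
import numpy as np

# Symmetric tridiagonal T with diagonal d and off-diagonal e
n, m = 8, 4                      # split after row m
rng = np.random.default_rng(0)
d = rng.standard_normal(n)
e = rng.standard_normal(n - 1)
T = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)

# Rank-one tearing: beta = e[m-1], u has ones at positions m-1 and m
beta = e[m - 1]
u = np.zeros(n)
u[m - 1] = u[m] = 1.0
B = T - beta * np.outer(u, u)    # block diag(T1, T2) with adjusted corners
T1, T2 = B[:m, :m], B[m:, m:]
assert np.allclose(B[:m, m:], 0) and np.allclose(B[m:, :m], 0)

# The two half-size problems
lam_blocks = np.concatenate([np.linalg.eigvalsh(T1), np.linalg.eigvalsh(T2)])
assert np.allclose(np.sort(np.linalg.eigvalsh(B)), np.sort(lam_blocks))
# Recovering eig(T) = eig(B + beta*u*u^T) from lam_blocks is the
# secular-equation step (omitted)
```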


Divide-and-Conquer Method

Applying this approach recursively yields divide-and-conquer algorithm that is naturally parallel

Parallelism in solving secular equations grows as parallelism in processing independent tridiagonal matrices shrinks, and vice versa

Algorithm is complicated to implement, and difficult questions of numerical stability, eigenvector orthogonality, and load balancing must be addressed


References

Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. van der Vorst, eds., Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide, SIAM, 2000

J. W. Demmel, M. T. Heath, and H. A. van der Vorst, Parallel numerical linear algebra, Acta Numerica 2:111-197, 1993

P. J. Eberlein and H. Park, Efficient implementation of Jacobi algorithms and Jacobi sets on distributed memory architectures, J. Parallel Distrib. Comput., 8:358-366, 1990

G. W. Stewart, A Jacobi-like algorithm for computing the Schur decomposition of a non-Hermitian matrix, SIAM J. Sci. Stat. Comput. 6:853-864, 1985

G. W. Stewart, A parallel implementation of the QR algorithm, Parallel Comput. 5:187-196, 1987
