
Page 1

Krylov subspace methods and exascale computations:

good match or lost case?

Zdeněk Strakoš, Charles University in Prague and Czech Academy of Sciences

http://www.karlin.mff.cuni.cz/˜strakos

SPPEXA Symposium, Munich, January 2016

Page 2

Personal prehistory

Strakoš, Z., Efficiency and Optimizing of Algorithms and Programs on the Host Computer / Array Processor System, Parallel Computing, 4, 1987, pp. 189-209.

● Host Computer (0.2 MFlops) / Array Processor (up to 10 MFlops).

● Large instruction overhead and slow data transfers.

● Pipelining, several arithmetic units.

● Possible overlap of data transfers and arithmetic.

● Slow scalar operations.

Even after thirty years, the basic problems and principles are not that much different.

Page 3

Preconditioned algebraic CG

$r_0 = b - A x_0$ , solve $M z_0 = r_0$ , $p_0 = z_0$

For $n = 1, \dots, n_{\max}$

  $\alpha_{n-1} = \dfrac{z_{n-1}^{*} r_{n-1}}{p_{n-1}^{*} A p_{n-1}}$

  $x_n = x_{n-1} + \alpha_{n-1} p_{n-1}$ , stop when the stopping criterion is satisfied

  $r_n = r_{n-1} - \alpha_{n-1} A p_{n-1}$

  $M z_n = r_n$ , solve for $z_n$

  $\beta_n = \dfrac{z_n^{*} r_n}{z_{n-1}^{*} r_{n-1}}$

  $p_n = z_n + \beta_n p_{n-1}$

End
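For concreteness, the recursion above can be written out in a few lines of NumPy. This is a minimal illustrative sketch, not the code discussed in the talk; the function name pcg, the callable M_solve, and the relative-residual stopping test are placeholders for whatever matrix, preconditioner, and stopping criterion one actually uses.

```python
import numpy as np

def pcg(A, b, M_solve, x0, tol=1e-10, n_max=1000):
    """Minimal preconditioned CG sketch following the recursion on this slide.

    A       : symmetric positive definite matrix (NumPy array)
    b       : right-hand side
    M_solve : callable returning z = M^{-1} r (application of the preconditioner)
    x0      : initial guess
    """
    x = x0.copy()
    r = b - A @ x                  # r_0 = b - A x_0
    z = M_solve(r)                 # solve M z_0 = r_0
    p = z.copy()                   # p_0 = z_0
    zr_old = z @ r
    for n in range(1, n_max + 1):
        Ap = A @ p
        alpha = zr_old / (p @ Ap)  # alpha_{n-1} = z*_{n-1} r_{n-1} / p*_{n-1} A p_{n-1}
        x += alpha * p             # x_n = x_{n-1} + alpha_{n-1} p_{n-1}
        r -= alpha * Ap            # r_n = r_{n-1} - alpha_{n-1} A p_{n-1}
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):   # stopping criterion (here: relative residual)
            break
        z = M_solve(r)             # solve M z_n = r_n
        zr_new = z @ r
        beta = zr_new / zr_old     # beta_n = z*_n r_n / z*_{n-1} r_{n-1}
        p = z + beta * p           # p_n = z_n + beta_n p_{n-1}
        zr_old = zr_new
    return x, n
```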

Page 4

Obstacles for parallelization

● Synchronized recursion.

● Matrix-vector multiplication and vector updates are linear and (possibly) fast. Preconditioning is expensive (substantial global communication).

● Scalar coefficients bring in nonlinearity and require inner products (see the sketch after this list). However, for the approximation power of the methods, nonlinearity is essential!

● Parallelization can lead to numerical instabilities.
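To make the synchronization issue concrete, here is a rough distributed-memory sketch. The row-block data distribution, the use of mpi4py, and all names are my own assumptions, not taken from the slides; the preconditioner is omitted in order to isolate the two global reductions that every plain CG iteration needs.

```python
import numpy as np
from mpi4py import MPI   # assumption: mpi4py is available; run with mpiexec -n <p> python ...

comm = MPI.COMM_WORLD

def dot_global(u, v):
    # local dot product followed by a global reduction: this is the
    # synchronization point that limits the parallel efficiency of CG
    return comm.allreduce(u @ v, op=MPI.SUM)

def cg_iteration(A_rows, x, r, p, rr_old):
    """One unpreconditioned CG step; A_rows is the local block of rows of A and
    x, r, p are the corresponding local slices of the vectors.
    Per iteration: one exchange for the matrix-vector product and two global reductions."""
    p_full = np.concatenate(comm.allgather(p))   # assemble p for the local matvec
    Ap = A_rows @ p_full                         # local rows of A times the full vector p
    alpha = rr_old / dot_global(p, Ap)           # global reduction no. 1
    x = x + alpha * p
    r = r - alpha * Ap
    rr_new = dot_global(r, r)                    # global reduction no. 2
    p = r + (rr_new / rr_old) * p
    return x, r, p, rr_new
```

Adding a preconditioner means at least one further application of M^{-1} per iteration, which for domain-decomposition or multigrid preconditioners brings additional global communication, which is the point of the slide.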

Page 5

Parallel (communication sensitive) algorithms?

● Block recursion in order to increase the arithmetic/communication ratio (a minimal sketch follows below).

● Numerical stability is crucial.

● Stopping criteria can save the case. Size of the blocks?

● Preconditioning means an approximate solution of a part of the problem.

State-of-the-art in the algorithmic developments:

E. Carson, Communication-Avoiding Krylov Subspace Methods in Theory and Practice, PhD Thesis, UC at Berkeley, CA, 2015.
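As a rough illustration of the block idea referenced above (a generic sketch of the s-step building blocks, not Carson's actual algorithms): s matrix-vector products build a block of Krylov basis vectors with no inner products at all, and a single block operation then delivers every scalar product needed for several iterations.

```python
import numpy as np

def s_step_basis(A, z, s):
    """Monomial Krylov basis [z, Az, ..., A^s z]: s matvecs and no inner products.
    In a distributed setting this 'matrix powers' part needs only neighbour communication."""
    V = np.empty((len(z), s + 1))
    V[:, 0] = z
    for j in range(s):
        V[:, j + 1] = A @ V[:, j]
    return V

def gram_block(V):
    """All inner products among the basis vectors in one block operation; in parallel
    this is a single (block) global reduction instead of one reduction per coefficient."""
    return V.T @ V

# Caveat matching the bullet above: the monomial basis becomes ill-conditioned very
# quickly, so practical s-step methods use Newton or Chebyshev bases and keep s small.
```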

Page 6

Outline

1. Philosophy of using Krylov subspace methods

2. Nonlinear model reduction

3. Inexact Krylov?

4. Operator and algebraic preconditioning

5. Krylov subspaces and discretization

6. Stopping criteria?

Page 7

1 Plethora of Krylov subspace methods

● Thorough analysis and fair comparison of several important methods should be given priority over the overproduction of algorithmic variations.

● Krylov subspace methods are efficient provided that they "do justice to the inner nature of the problem." (C. Lanczos, 1947). Infinite-dimensional considerations are very useful.

● Oversimplification is dangerous: widespread worst-case analysis restricted to the operator only, universal contraction-based bounds, asymptotic considerations, unjustified or hidden restrictive assumptions.

● Results pointing out difficulties should be taken as an inspiration. Instead, they are unwanted and often labeled as "negative."

Page 8

1 Málek and S, SIAM Spotlight, 2015

Page 9

Outline

1. Philosophy of using Krylov subspace methods

2. Nonlinear model reduction

3. Inexact Krylov?

4. Operator and algebraic preconditioning

5. Krylov subspaces and discretization

6. Stopping criteria?

Page 10

2 Operator form of the BVP and preconditioning

Let $V$ be a real (infinite-dimensional) Hilbert space with the inner product $(\cdot,\cdot)_V : V \times V \to \mathbb{R}$, and let $V^{\#}$ be the dual space of bounded linear functionals on $V$. Consider a bounded and coercive operator $A : V \to V^{\#}$ and the equation in $V^{\#}$

$$A x = b , \qquad A : V \to V^{\#}, \quad x \in V, \quad b \in V^{\#} .$$

Using the Riesz map,

$$(\tau A x - \tau b, v)_V = 0 \quad \text{for all } v \in V .$$

The Riesz map $\tau$ can be interpreted as a transformation of the original problem $Ax = b$ in $V^{\#}$ into the equation in $V$

$$\tau A x = \tau b , \qquad \tau A : V \to V, \quad x \in V, \quad \tau b \in V ,$$

which is (unfortunately) called preconditioning.
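A standard concrete instance of this framework (an illustration consistent with the setting of Málek and S (2015), not stated on this slide): take $V = H^1_0(\Omega)$ with the gradient inner product and $A$ given by a diffusion operator in weak form.

```latex
% V = H^1_0(\Omega), \quad (u,v)_V = \int_\Omega \nabla u \cdot \nabla v \,\mathrm{d}x
\[
  \langle A u, v \rangle = \int_\Omega k \,\nabla u \cdot \nabla v \,\mathrm{d}x ,
  \qquad 0 < k_0 \le k \le k_1 \quad \text{(bounded and coercive)} ,
\]
\[
  (\tau f, v)_V = \langle f, v \rangle \quad \text{for all } v \in V ,
\]
% so applying \tau amounts to solving a Poisson problem, and the "preconditioned"
% equation \tau A x = \tau b is the infinite-dimensional analogue of (inverse)
% Laplacian preconditioning of A.
```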

Page 11

2 Model reduction using Krylov subspaces

Let $B$ ($= \tau A$) be a bounded linear operator on the Hilbert space $V$. Choose $z_0$ ($= \tau b - \tau A x_0$) $\in V$ and consider the Krylov sequence

$$z_0, \quad z_1 = B z_0, \quad z_2 = B z_1 = B^2 z_0, \quad \dots, \quad z_n = B z_{n-1} = B^n z_0, \quad \dots$$

Determine a sequence of operators $B_n$ defined on the sequence of nested subspaces $V_n = \operatorname{span}\{z_0, \dots, z_{n-1}\}$, with the projector $E_n$ onto $V_n$, such that (Vorobyev (1958, 1965))

$$z_1 = B z_0 = B_n z_0,$$
$$z_2 = B^2 z_0 = B_n^2 z_0,$$
$$\vdots$$
$$z_{n-1} = B^{n-1} z_0 = B_n^{n-1} z_0,$$
$$E_n z_n = E_n B^n z_0 = B_n^n z_0.$$
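For self-adjoint $B$ (the setting of the next slide) these conditions are equivalent to matching moments, which is the standard way of reading the Vorobyev construction (see, e.g., Liesen and S (2013)); in the notation above:

```latex
\[
  (B^k z_0, z_0)_V \;=\; (B_n^k z_0, z_0)_V , \qquad k = 0, 1, \dots, 2n-1 ,
\]
% i.e. the reduced operator B_n reproduces the first 2n moments of B with
% respect to the initial vector z_0.
```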

Page 12

2 Bounded self-adjoint operators in V

$$B x = f \;\longleftrightarrow\; \omega(\lambda), \quad \int F(\lambda)\,\mathrm{d}\omega(\lambda)$$

$$\uparrow \qquad\qquad\qquad\qquad \uparrow$$

$$T_n\, y_n = \|f\|_V\, e_1 \;\longleftrightarrow\; \omega^{(n)}(\lambda), \quad \sum_{i=1}^{n} \omega_i^{(n)}\, F\big(\theta_i^{(n)}\big)$$

Using $F(\lambda) = \lambda^{-1}$ gives (assuming coercivity)

$$\int_{\lambda_L}^{\lambda_U} \lambda^{-1}\,\mathrm{d}\omega(\lambda) \;=\; \sum_{i=1}^{n} \omega_i^{(n)} \big(\theta_i^{(n)}\big)^{-1} \;+\; \frac{\|u - u_n\|_a^2}{\|f\|_V^2}$$

Stieltjes (1894) and Vorobyev (1958) moment problems for self-adjoint bounded operators reduce to the Gauss-Christoffel quadrature (1814). No one would consider describing it by contraction.
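The identity can be checked directly in its matrix form. The following NumPy sketch is my own illustration (test matrix, sizes, and the plain Lanczos implementation are assumptions, not from the talk): with $x_0 = 0$, so $r_0 = b$, it compares $b^T A^{-1} b$ with the $n$-point Gauss quadrature term plus the squared A-norm (energy norm) of the CG error.

```python
import numpy as np

def lanczos(A, v, n):
    """n steps of the Lanczos recurrence for symmetric A (no reorthogonalization);
    returns the Jacobi matrix T_n and the orthonormal Krylov basis Q_n."""
    N = len(v)
    Q = np.zeros((N, n + 1))
    alpha, beta = np.zeros(n), np.zeros(n + 1)
    Q[:, 0] = v / np.linalg.norm(v)
    for j in range(n):
        w = A @ Q[:, j] - (beta[j] * Q[:, j - 1] if j > 0 else 0.0)
        alpha[j] = Q[:, j] @ w
        w = w - alpha[j] * Q[:, j]
        beta[j + 1] = np.linalg.norm(w)
        Q[:, j + 1] = w / beta[j + 1]
    T = np.diag(alpha) + np.diag(beta[1:n], 1) + np.diag(beta[1:n], -1)
    return T, Q[:, :n]

rng = np.random.default_rng(0)
N, n = 200, 15
C = rng.standard_normal((N, N))
A = C @ C.T + N * np.eye(N)                    # symmetric positive definite test matrix
b = rng.standard_normal(N)

T, Q = lanczos(A, b, n)                        # x_0 = 0, so r_0 = b
x = np.linalg.solve(A, b)                      # "exact" solution
x_n = Q @ np.linalg.solve(T, np.linalg.norm(b) * np.eye(n)[:, 0])   # CG iterate via Lanczos

lhs = b @ x                                              # b^T A^{-1} b
gauss = np.linalg.norm(b) ** 2 * np.linalg.inv(T)[0, 0]  # ||b||^2 times the n-point Gauss quadrature
err_A = (x - x_n) @ A @ (x - x_n)                        # squared A-norm (energy norm) of the error
print(lhs, gauss + err_A)                                # the two numbers agree up to rounding
```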

Page 13

2 CG in Hilbert spaces

$r_0 = b - A x_0 \in V^{\#}$ , $p_0 = \tau r_0 \in V$

For $n = 1, 2, \dots, n_{\max}$

  $\alpha_{n-1} = \dfrac{\langle r_{n-1}, \tau r_{n-1}\rangle}{\langle A p_{n-1}, p_{n-1}\rangle} = \dfrac{(\tau r_{n-1}, \tau r_{n-1})_V}{(\tau A p_{n-1}, p_{n-1})_V}$

  $x_n = x_{n-1} + \alpha_{n-1} p_{n-1}$ , stop when the stopping criterion is satisfied

  $r_n = r_{n-1} - \alpha_{n-1} A p_{n-1}$

  $\beta_n = \dfrac{\langle r_n, \tau r_n\rangle}{\langle r_{n-1}, \tau r_{n-1}\rangle} = \dfrac{(\tau r_n, \tau r_n)_V}{(\tau r_{n-1}, \tau r_{n-1})_V}$

  $p_n = \tau r_n + \beta_n p_{n-1}$

End

Hayes (1954); Vorobyev (1958, 1965); Karush (1952); Stesin (1954). Superlinear convergence for (identity + compact) operators.

Page 14

Outline

1. Philosophy of using Krylov subspace methods

2. Matching moments model reduction

3. Inexact Krylov?

4. Operator and algebraic preconditioning

5. Krylov subspaces and discretization

6. Stopping criteria?

Page 15

3 Delay of convergence due to inexactness

[Figure: CG convergence history in finite precision arithmetic, two panels (left: iterations 0-100, right: iterations 0-800), logarithmic vertical scale from 10^0 down to 10^{-15}, iteration number on the horizontal axis; plotted quantities: residual, smooth upper bound, backward error, loss of orthogonality, error of the approximate solution.]

Here the numerical inexactness is due to roundoff. How much may we relax the accuracy of the most costly operations without causing an unwanted delay and/or affecting the maximal attainable accuracy?
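A toy NumPy experiment in the spirit of this question (my own illustration, not the computation behind the figure above): perturb every matrix-vector product by a relative error of size eta and observe the level at which the A-norm of the error stagnates.

```python
import numpy as np

def cg_inexact(A, b, eta, n_iter):
    """Plain CG in which every matrix-vector product is perturbed by a relative error
    of size eta: a toy model of relaxing the accuracy of the most costly operation."""
    rng = np.random.default_rng(1)
    x_exact = np.linalg.solve(A, b)
    x = np.zeros_like(b)
    r, p = b.copy(), b.copy()
    rr = r @ r
    for _ in range(n_iter):
        Ap = A @ p
        Ap = Ap + eta * np.linalg.norm(Ap) * rng.standard_normal(len(b))   # inexact matvec
        alpha = rr / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return np.sqrt((x - x_exact) @ A @ (x - x_exact))   # final A-norm (energy norm) of the error

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((100, 100)))
A = Q @ np.diag(np.logspace(0, 4, 100)) @ Q.T            # SPD test matrix, condition number 1e4
b = rng.standard_normal(100)
for eta in [0.0, 1e-12, 1e-8, 1e-4]:
    print(eta, cg_inexact(A, b, eta, 200))               # attainable accuracy degrades as eta grows
```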

Page 16

Outline

1. Philosophy of using Krylov subspace methods

2. Nonlinear model reduction

3. Inexact Krylov?

4. Operator and algebraic preconditioning

5. Krylov subspaces and discretization

6. Stopping criteria?

Page 17

4 Restriction to finite dimensional subspace Vh

Let $\Phi_h = (\phi_1^{(h)}, \dots, \phi_N^{(h)})$ be a basis of the subspace $V_h \subset V$, and let $\Phi_h^{\#} = (\phi_1^{(h)\#}, \dots, \phi_N^{(h)\#})$ be the canonical basis of its dual $V_h^{\#}$ ($V_h^{\#} = A V_h$). Using the coordinates in $\Phi_h$ and in $\Phi_h^{\#}$,

$$\langle f, v \rangle \to \mathbf{v}^{*} \mathbf{f} ,$$

$$(u, v)_V \to \mathbf{v}^{*} \mathbf{M} \mathbf{u} , \qquad \mathbf{M}_{ij} = (\phi_j, \phi_i)_V , \quad i, j = 1, \dots, N ,$$

$$A u \to \mathbf{A} \mathbf{u} , \qquad A u = A \Phi_h \mathbf{u} = \Phi_h^{\#} \mathbf{A} \mathbf{u} ; \qquad \mathbf{A}_{ij} = a(\phi_j, \phi_i) , \quad i, j = 1, \dots, N ,$$

$$\tau f \to \mathbf{M}^{-1} \mathbf{f} , \qquad \tau f = \tau \Phi_h^{\#} \mathbf{f} = \Phi_h \mathbf{M}^{-1} \mathbf{f} ;$$

we get, with $b = \Phi_h^{\#} \mathbf{b}$, $x_n = \Phi_h \mathbf{x}_n$, $p_n = \Phi_h \mathbf{p}_n$, $r_n = \Phi_h^{\#} \mathbf{r}_n$, the algebraic CG formulation.

Page 18

4 Galerkin discretization gives matrix CG in Vh

$r_0 = b - A x_0$ , solve $M z_0 = r_0$ , $p_0 = z_0$

For $n = 1, \dots, n_{\max}$

  $\alpha_{n-1} = \dfrac{z_{n-1}^{*} r_{n-1}}{p_{n-1}^{*} A p_{n-1}}$

  $x_n = x_{n-1} + \alpha_{n-1} p_{n-1}$ , stop when the stopping criterion is satisfied

  $r_n = r_{n-1} - \alpha_{n-1} A p_{n-1}$

  $M z_n = r_n$ , solve for $z_n$

  $\beta_n = \dfrac{z_n^{*} r_n}{z_{n-1}^{*} r_{n-1}}$

  $p_n = z_n + \beta_n p_{n-1}$

End

Günnel, Herzog, Sachs (2014); Málek, S (2015)

Page 19

4 Observations

● Unpreconditioned CG, i.e. $M = I$, corresponds to the discretization basis $\Phi$ being orthonormal with respect to $(\cdot, \cdot)_V$.

● Orthogonalization of the discretization basis results in the unpreconditioned algebraic CG applied to the preconditioned algebraic system. The matrix of this preconditioned algebraic system is, however, not sparse! (A numerical sketch of this equivalence follows below.)

● Any algebraic preconditioning applied to the algebraic system arising from the discretization can be interpreted within this operator preconditioning framework.
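A small numerical check of the second observation (illustrative only: random SPD matrices stand in for the stiffness and Gram matrices, and the script reuses the pcg sketch given after the "Preconditioned algebraic CG" slide): CG preconditioned by the Gram matrix $M = LL^T$ and unpreconditioned CG applied to the dense transformed system $L^{-1} A L^{-T}$ deliver the same approximation, related by the change of variables $y = L^T x$. In exact arithmetic even the individual iterates coincide under this change of variables; here only the computed solutions are compared.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

# Assumptions: random SPD stand-ins for the stiffness matrix A and the Gram matrix M,
# and pcg() is the illustrative sketch defined earlier in these notes.
rng = np.random.default_rng(2)
N = 50
C = rng.standard_normal((N, N)); A = C @ C.T + N * np.eye(N)   # "stiffness" matrix (SPD)
G = rng.standard_normal((N, N)); M = G @ G.T + N * np.eye(N)   # Gram matrix of the basis (SPD)
b = rng.standard_normal(N)
L = cholesky(M, lower=True)                                    # M = L L^T

# (a) CG preconditioned by the Gram matrix M
x_pcg, _ = pcg(A, b, lambda r: np.linalg.solve(M, r), np.zeros(N), tol=1e-13)

# (b) unpreconditioned CG applied to the transformed (dense!) system
#     (L^{-1} A L^{-T}) y = L^{-1} b with y = L^T x: the orthonormalized-basis view
A_hat = solve_triangular(L, solve_triangular(L, A, lower=True).T, lower=True)
b_hat = solve_triangular(L, b, lower=True)
y, _ = pcg(A_hat, b_hat, lambda r: r, np.zeros(N), tol=1e-13)
x_cg = solve_triangular(L.T, y, lower=False)

print(np.linalg.norm(x_pcg - x_cg) / np.linalg.norm(x_cg))     # agreement up to rounding
```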

Page 20

Outline

1. Philosophy of using Krylov subspace methods

2. Nonlinear model reduction

3. Inexact Krylov?

4. Operator and algebraic preconditioning

5. Krylov subspaces and discretization

6. Stopping criteria?

Page 21

5 Conjugate gradient method - first n steps

$$T_n = \begin{pmatrix} \alpha_1 & \beta_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_n \\ & & \beta_n & \alpha_n \end{pmatrix}$$

is the Jacobi matrix of the orthogonalization coefficients, and the CG method is formulated by

$$T_n y_n = \|\tau r_0\|_V\, e_1 , \qquad x_n = x_0 + Q_n y_n , \quad x_n \in V_n .$$

Infinite-dimensional Krylov subspace methods perform discretization via model reduction.

Page 22

Outline

1. Philosophy of using Krylov subspace methods

2. Nonlinear model reduction

3. Inexact Krylov?

4. Operator and algebraic preconditioning

5. Krylov subspaces and discretization

6. Stopping criteria?

Page 23

6 L-shape domain, Papež, Liesen, S (2014)

[Figure: two surface plots over the L-shaped domain; see the caption below.]

Exact solution $x$ (left) and the discretization error $x - x_h$ (right) in the Poisson model problem, linear FEM, adaptive mesh refinement.

Quasi-equilibrated discretization error over the domain.

Page 24

6 L-shape domain, Papež, Liesen, S (2014)

[Figure: two surface plots over the L-shaped domain, both on the scale of 10^{-4}; see the caption below.]

Algebraic error $x_h - x_h^{(n)}$ (left) and the total error $x - x_h^{(n)}$ (right) after the number of CG iterations guaranteeing

$$\|x - x_h\|_a = \|\nabla (x - x_h)\| \gg \|x - x_n\|_A .$$

Page 25

Conclusions

● Krylov subspace methods adapt to the problem. Exploiting this adaptation is the key to their efficient use.

● They are expensive and by their nature recursive. Therefore they cannot be efficient without being fast, i.e., without powerful preconditioning.

● The individual steps modeling-analysis-discretization-computation should not be considered separately within isolated disciplines. They form a single problem.

● Fast HPC computations result from appropriate handling of all involved issues, including numerical stability and a posteriori error analysis leading to appropriate stopping criteria.

● There are many difficult but exciting challenges ahead. In order to resolve them, we should fairly admit that they exist.

Page 26

References

● J. Málek and Z.S., Preconditioning and the Conjugate Gradient Method in the Context of Solving PDEs, SIAM Spotlight Series, SIAM (2015)

● T. Gergelits and Z.S., Composite convergence bounds based on Chebyshev polynomials and finite precision conjugate gradient computations, Numer. Alg. 65, 759-782 (2014)

● J. Papež, J. Liesen and Z.S., Distribution of the discretization and algebraic error in numerical solution of partial differential equations, Linear Alg. Appl. 449, 89-114 (2014)

● J. Liesen and Z.S., Krylov Subspace Methods, Principles and Analysis, Oxford University Press (2013)

● Z.S. and P. Tichý, On efficient numerical approximation of the bilinear form $c^{*} A^{-1} b$, SIAM J. Sci. Comput. 33, 565-587 (2011)

Page 27

Thank you for your patience!

Page 28

Czech and German Elephant

Králický Sněžník - Glatzer Schneeberg

German Artists' Union Jetscher

Artist Amei Hallenger

Made by Co Forster, Zuckmantel (Zlaté Hory)

1932