The University of Edinburgh
Preconditioners for Sparse Approximations and Big Data Optimization
Jacek Gondzio
thanks to my collaborator:
K. Fountoulakis
Padova, February 18th, 2020
Introduction
New applications:
• signal/image processing (compression, reconstruction, deblurring)
• modern computational statistics (inverse problems, classification, machine learning)
Overarching mathematical problem:
• dimension reduction
Optimization faces new challenges: the huge scale of available data needs new processing algorithms.
Outline
• Homotopy: “replace a difficult problem with a sequence of (easy to solve) auxiliary problems”
  – Interior Point Methods
  – Big Data optimization
• 2nd-order methods for optimization
  – Inexact Newton method
  – Preconditioners needed
  – Matrix-free methods
• The same idea of preconditioner in IPMs and Big Data
• Big Data problem (2^40 ≈ 10^12 variables)
• Conclusions
Interior Point Methods (IPMs)
Primal-Dual Pair of Quadratic Programs

Primal:                                Dual:
  min  c^T x + (1/2) x^T Q x            max  b^T y − (1/2) x^T Q x
  s.t. Ax = b,                          s.t. A^T y + s − Qx = c,
       x ≥ 0;                                s ≥ 0.

Lagrangian:
  L(x, y) = c^T x + (1/2) x^T Q x − y^T (Ax − b) − s^T x.

Optimality Conditions:
  Ax = b,
  A^T y + s − Qx = c,
  XSe = 0  (i.e., x_j · s_j = 0 ∀j),
  (x, s) ≥ 0,
where X = diag{x_1, …, x_n}, S = diag{s_1, …, s_n}, e = (1, …, 1) ∈ R^n.
Interior-Point Framework
The log barrier −log x_j “replaces” the inequality x_j ≥ 0.
[figure: the logarithmic barrier function −ln x]

We derive the first-order optimality conditions for the primal barrier problem:
  Ax = b,
  −Qx + A^T y + s = c,
  XSe = µe,
and apply the Newton method to solve this system of (nonlinear) equations.
First-Order Optimality Conditions for QP:
  Ax = b,
  A^T y + s − Qx = c,
  XSe = 0,
  (x, s) ≥ 0.

First-Order Optimality Conditions for Barrier QP:
  Ax = b,
  A^T y + s − Qx = c,
  XSe = µe,
  (x, s) > 0.
Homotopy/Continuation
LP/QP is about “solving” the following equation:
  x_j · s_j = 0   ∀j = 1, 2, …, n.
Do not guess! Replace this equation with
  x_j · s_j = µ   ∀j = 1, 2, …, n,
and decrease µ gradually to zero.
Let the intelligent IPMs find out what the optimal partition is.
The pronunciation of the Greek letter µ
Robert De Niro, Taxi Driver (1976)
Big Data Optimization
Sparse Approximation
• Machine Learning: Classification with SVMs
• Statistics: Estimate x from observations
• Wavelet-based signal/image reconstruction & restoration
• Compressed Sensing (Signal Processing)

All such problems lead to the same dense, potentially very large QP.
Binary Classification
ℓ1-regularized:   min_x τ‖x‖_1   + Σ_{i=1}^m log(1 + exp(−b_i x^T a_i))
ℓ2-regularized:   min_x τ‖x‖_2^2 + Σ_{i=1}^m log(1 + exp(−b_i x^T a_i))
Bayesian Statistics Viewpoint
Estimate x from observations
  y = Ax + e,
where y are the observations and e is Gaussian noise.
  → min_x ‖y − Ax‖_2^2

If the prior on x is Laplacian (log p(x) = −λ‖x‖_1 + K), then
  min_x τ‖x‖_1 + ‖Ax − b‖_2^2

Tibshirani, J. of the Royal Statistical Society B 58 (1996) 267–288.
ℓ1-regularization
  min_x τ‖x‖_1 + φ(x).

Unconstrained optimization ⇒ easy.
Serious issue: nondifferentiability of ‖·‖_1.

Two possible tricks:
• Splitting x = u − v with u, v ≥ 0.
• Smoothing with the pseudo-Huber approximation, which replaces ‖x‖_1 with
    ψµ(x) = Σ_{i=1}^n (√(µ^2 + x_i^2) − µ)
  (see the sketch below).
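To make the smoothing concrete, here is a minimal NumPy sketch (an illustration, not the authors' code) of ψµ and its elementwise derivatives; note how the Hessian diagonal approaches 1/µ exactly where x_i ≈ 0:

import numpy as np

def pseudo_huber(x, mu):
    # psi_mu(x) = sum_i (sqrt(mu^2 + x_i^2) - mu), a smooth surrogate for ||x||_1
    r = np.sqrt(mu**2 + x**2)
    value = np.sum(r - mu)
    grad = x / r                 # tends to sign(x_i) as mu -> 0
    hess_diag = mu**2 / r**3     # -> 0 where |x_i| >> mu, -> 1/mu where x_i ~ 0
    return value, grad, hess_diag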
Huber: [figure comparing |x| with the pseudo-Huber approximation ψµ(x)]
Continuation
Embed the inexact Newton method into a homotopy approach:

• Inequalities u ≥ 0, v ≥ 0 −→ use IPM:
  replace z ≥ 0 with −µ log z and drive µ to zero.
• Pseudo-Huber regression −→ use continuation:
  replace |x_i| with µ(√(1 + x_i^2/µ^2) − 1) and drive µ to zero.

Questions:
• Theory?
• Practice?
Main Tool: Inexact Newton Method
Replace an exact Newton direction
  ∇²f(x) ∆x = −∇f(x)
with an inexact one:
  ∇²f(x) ∆x = −∇f(x) + r,
where the error r is small: ‖r‖ ≤ η‖∇f(x)‖, η ∈ (0,1).

The NLP community usually writes it as:
  ‖∇²f(x) ∆x + ∇f(x)‖_2 ≤ η‖∇f(x)‖_2,   η ∈ (0,1).
Dembo, Eisenstat & Steihaug, SIAM J. on Numerical Analysis 19 (1982) 400–408.
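In code, the inexactness is just a loose relative tolerance on the inner linear solve. A hedged sketch (SciPy's CG; the fixed forcing term eta and the absence of a line search are simplifying assumptions):

from scipy.sparse.linalg import cg

def inexact_newton_step(grad_f, hess_f, x, eta=0.5):
    # Solve H(x) dx = -g(x) only approximately: ||H dx + g|| <= eta * ||g||
    # (rtol is the relative residual tolerance in recent SciPy versions)
    g = grad_f(x)
    dx, _ = cg(hess_f(x), -g, rtol=eta)
    return x + dx    # a practical method would safeguard this with a line search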
Inexact Primal-Dual IPM:
G., Convergence Analysis of an Inexact Feasible IPM for Convex QP, SIAM J. on Optimization 23 (2013), No 3, 1510–1527.
G., Matrix-Free Interior Point Method, Computational Optimization and Applications 51 (2012) 457–480.

Primal-Dual Newton Conjugate Gradient Method:
Fountoulakis and G., A Second-order Method for Strongly Convex ℓ1-regularization Problems, Mathematical Programming 156 (2016) 189–219.
Dassios, Fountoulakis and G., A Preconditioner for a Primal-Dual Newton Conjugate Gradient Method for Compressed Sensing Problems, SIAM J. on Sci. Comput. 37 (2015) A2783–A2812.
Compressed Sensing and Continuation
Replace
  min_x f(x) = τ‖W*x‖_1 + (1/2)‖Ax − b‖_2^2,   −→ x_τ
with
  min_x fµ(x) = τψµ(W*x) + (1/2)‖Ax − b‖_2^2,   −→ x_{τ,µ}

Solve approximately a family of problems for a (short) decreasing sequence of µ's: µ0 > µ1 > µ2 > ⋯

Theorem (brief description)
There exists a µ̄ such that for all µ ≤ µ̄ the difference of the two solutions satisfies
  ‖x_{τ,µ} − x_τ‖_2 = O(µ^{1/2})   ∀τ, µ ≤ µ̄.
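The continuation strategy itself is a short outer loop, warm-starting each smoothed problem from the previous solution. A sketch (the reduction factor and step count are illustrative assumptions, and solve_smoothed stands for an inner solver such as pdNCG):

def continuation(solve_smoothed, x0, mu0=1.0, factor=0.1, steps=5):
    # Approximately minimize tau*psi_mu(W*x) + 0.5*||Ax - b||^2 over a short
    # decreasing sequence mu0 > mu1 > ..., warm-starting each subproblem
    x, mu = x0, mu0
    for _ in range(steps):
        x = solve_smoothed(x, mu)   # inner solver, run to loose accuracy
        mu *= factor
    return x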
Convergence of the primal-dual Newton CG
Use inexact Newton directions:
  ‖∇²fµ(x) ∆x + ∇fµ(x)‖_2 ≤ η‖∇fµ(x)‖_2,   η ∈ (0,1),
computed by the Newton CG method.

Theorem (Primal convergence)
Let {x^k}_{k=0}^∞ be a sequence generated by pdNCG. Then the sequence {x^k}_{k=0}^∞ converges to the primal (perturbed) solution x_{τ,µ}.

Theorem (Rate of convergence)
If the forcing factor η_k satisfies lim_{k→∞} η_k = 0, then pdNCG converges superlinearly.
Continuation
(three examples of ℓ1-regularization)
Three examples of ℓ1-regularization
• Compressed Sensing
  with K. Fountoulakis and P. Zhlobich
    min_x τ‖x‖_1 + (1/2)‖Ax − b‖_2^2,   A ∈ R^{m×n}
• Compressed Sensing (Coherent and Redundant Dictionaries)
  with I. Dassios and K. Fountoulakis
    min_x τ‖W*x‖_1 + (1/2)‖Ax − b‖_2^2,   W ∈ C^{n×l}, A ∈ R^{m×n}
  (think of Total Variation)
• Big Data optimization (Machine Learning)
  with K. Fountoulakis
Example 1: Compressed Sensing
with K. Fountoulakis and P. Zhlobich

Large dense quadratic optimization problem:
  min_x τ‖x‖_1 + (1/2)‖Ax − b‖_2^2,
where A ∈ R^{m×n} is a very special matrix.

Fountoulakis, G., Zhlobich, Matrix-free IPM for Compressed Sensing Problems, Math. Prog. Computation 6 (2014) 1–31.
Software available at http://www.maths.ed.ac.uk/ERGO/
Restricted Isometry Property (RIP)
• rows of A are orthogonal to each other (A is built of a subset of rows of an orthonormal matrix U ∈ R^{n×n}):
  A A^T = I_m.
• small subsets of columns of A are nearly orthogonal to each other: the Restricted Isometry Property (RIP)
  ‖Ā^T Ā − (m/n) I_k‖ ≤ δ_k ∈ (0,1),
where Ā denotes any m×k column submatrix of A (see the next slide).

Candès, Romberg & Tao, Comm. on Pure and Appl. Maths 59 (2005) 1207–1233.
Restricted Isometry Property
Matrix Ā ∈ R^{m×k} (k ≪ n) is built of a subset of columns of A ∈ R^{m×n}.
[figure: selecting k columns of A to form Ā]
  Ā^T Ā ≈ (m/n) I_k.

This yields a very well conditioned optimization problem.
Problem Reformulation
  min_x τ‖x‖_1 + (1/2)‖Ax − b‖_2^2

Replace x = x^+ − x^− to be able to use |x| = x^+ + x^−.
Use |x_i| = z_i + z_{i+n} to replace ‖x‖_1 with 1_{2n}^T z.
(Increases the problem dimension from n to 2n.)

  min_{z≥0} c^T z + (1/2) z^T Q z,

where
  Q = [  A^T ] [ A  −A ] = [  A^T A   −A^T A ]
      [ −A^T ]             [ −A^T A    A^T A ]  ∈ R^{2n×2n}
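Note that Q is never formed: its 2n×2n action on z = [u; v] reduces to two products with A. A small matrix-free sketch (assuming A supports the @ operator):

import numpy as np

def Q_matvec(A, z):
    # Q = [[A^T A, -A^T A], [-A^T A, A^T A]] applied to z = [u; v] without forming Q
    n = z.size // 2
    u, v = z[:n], z[n:]
    t = A.T @ (A @ (u - v))
    return np.concatenate([t, -t])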
Preconditioner
Approximate

  M = [  A^T A   −A^T A ]   [ Θ_1^{-1}          ]
      [ −A^T A    A^T A ] + [          Θ_2^{-1} ]

with

  P = (m/n) [  I_n   −I_n ]   [ Θ_1^{-1}          ]
            [ −I_n    I_n ] + [          Θ_2^{-1} ]

We expect (optimal partition):
• k entries of Θ^{-1} → 0, k ≪ 2n,
• 2n − k entries of Θ^{-1} → ∞.
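A pleasant consequence of this structure: pairing z_i with z_{i+n} decouples P into n independent 2×2 systems, so P^{-1} is applied in O(n) time. A sketch (c stands for m/n; inv_theta1 and inv_theta2 are the diagonals of Θ_1^{-1}, Θ_2^{-1}):

import numpy as np

def apply_P_inv(r, inv_theta1, inv_theta2, c):
    # Solve P z = r with P = c*[[I, -I], [-I, I]] + diag(inv_theta1, inv_theta2);
    # each pair (z_i, z_{i+n}) solves a 2x2 system, inverted here by Cramer's rule
    n = r.size // 2
    r1, r2 = r[:n], r[n:]
    det = c * (inv_theta1 + inv_theta2) + inv_theta1 * inv_theta2
    z1 = ((c + inv_theta2) * r1 + c * r2) / det
    z2 = (c * r1 + (c + inv_theta1) * r2) / det
    return np.concatenate([z1, z2])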
Spectral Properties of P−1M
Theorem
• Exactly n eigenvalues of P−1M are 1.
• The remaining n eigenvalues satisfy
|λ(P−1M)− 1| ≤ δk +n
mδkL,
where δk is the RIP-constant, andL is a threshold of “large” (Θ1+Θ2)−1.
Fountoulakis, G., ZhlobichMatrix-free IPM for Compressed Sensing Problems,Math. Prog. Computation 6 (2014), pp. 1–31.
Preconditioning
[figure: matrix–vector products per CG/PCG call (unpreconditioned vs preconditioned CG), and the spread of λ(M) vs λ(P^{-1}M) per CG/PCG call]

−→ good clustering of eigenvalues

mf-IPM compares favourably with NestA on easy problems (NestA: Becker, Bobin and Candès).
SPARCO problems
Comparison on 18 out of 26 classes of problems (all but 6 complex and 2 installation-dependent ones).

Solvers compared:
PDCO, Saunders and Kim, Stanford;
ℓ1-ℓs, Kim, Koh, Lustig, Boyd, Gorinevsky, Stanford;
FPC-AS-CG, Wen, Yin, Goldfarb, Zhang, Rice;
SPGL1, Van Den Berg, Friedlander, Vancouver; and
mf-IPM, Fountoulakis, G., Zhlobich, Edinburgh.

On 36 runs (noisy and noiseless problems), mf-IPM:
• is the fastest on 11,
• is the second best on 14, and
• overall is very robust.
Linear Algebra Perspective
Convert a difficulty into an advantage.

Interestingly, the same trick works in IPMs and in Big Data!
Linear Algebra of IPMs for LP/QP
Newton direction:

  [ A    0    0 ] [ ∆x ]   [ ξ_p ]
  [ −Q  A^T   I ] [ ∆y ] = [ ξ_d ]
  [ S    0    X ] [ ∆s ]   [ ξ_µ ]

Eliminate ∆s from the second equation and get

  [ −Q − Θ^{-1}   A^T ] [ ∆x ]   [ ξ_d − X^{-1} ξ_µ ]
  [      A         0  ] [ ∆y ] = [       ξ_p        ]

where Θ = X S^{-1} is a diagonal scaling matrix.

Eliminate ∆x from the first equation and get

  (A (Q + Θ^{-1})^{-1} A^T) ∆y = g.
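For a tiny dense instance the whole elimination fits in a few lines (a didactic sketch only: a real IPM would factorize sparsely or, as advocated in this talk, solve the system inexactly with preconditioned CG):

import numpy as np

def newton_dy(A, Q, x, s, g):
    # Normal equations step: (A (Q + Theta^{-1})^{-1} A^T) dy = g, Theta = X S^{-1}
    inv_theta = s / x                          # diagonal of Theta^{-1}
    K = np.linalg.inv(Q + np.diag(inv_theta))  # dense inverse: fine only for toy sizes
    M = A @ K @ A.T
    return np.linalg.solve(M, g)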
IPM Linear Algebra: Splitting Preconditioner
For “basic” variables, x_B → x*_B > 0 and s_B → s*_B = 0, hence
  Θ_j = x_j / s_j ≈ x_j^2 / (x_j s_j) = O(µ^{-1})   ∀j ∈ B.
For “non-basic” variables, x_N → x*_N = 0 and s_N → s*_N > 0, hence
  Θ_j = x_j / s_j ≈ (x_j s_j) / s_j^2 = O(µ)   ∀j ∈ N.

Convert a difficulty into an advantage → exploit the property:
As µ → 0:
  Θ_B → ∞,  Θ_B^{-1} → 0;
  Θ_N → 0,  Θ_N^{-1} → ∞.
Oliveira & Sorensen, LAA 394 (2005) 1–24.

Guess a basic/nonbasic partition A = [B | N], with B invertible.
Scale the augmented system matrix H with E^{-1} = diag(Θ_B^{1/2}, Θ_N^{1/2}, Θ_B^{-1/2} B^{-1}):

  E^{-1} H E^{-T}
    = E^{-1} [ −Θ_B^{-1}      0        B^T ] E^{-T}
             [     0      −Θ_N^{-1}   N^T ]
             [     B          N        0  ]
    = [ −I_m      0       I_m ]
      [   0   −I_{n−m}    W   ]
      [  I_m     W^T      0   ],   where W = Θ_N^{1/2} N^T B^{-T} Θ_B^{-1/2}.

With µ → 0 we have Θ_B^{-1} → 0, Θ_N → 0, hence
  W = Θ_N^{1/2} N^T B^{-T} Θ_B^{-1/2} ≈ 0.
Property of Sparse Approximations
  min_x f(x) = τ‖x‖_1 + ‖Ax − b‖_2^2

Assume a sparse solution exists: x = [x_B | x_Z] with x_Z = 0.
Partition the matrix A = [A_B | A_Z] accordingly.

Then only a small subset of the Hessian A^T A is “relevant”:

  A^T A = [ A_B^T A_B   A_B^T A_Z ]
          [ A_Z^T A_B   A_Z^T A_Z ]
Splitting x = u − v and IPMs
There is a need to solve equations with A^T A + Θ^{-1}, and
• Θ_j^{-1} → 0, for j in the sparse part B (x_j > 0),
• Θ_j^{-1} → ∞, for j in the zero part Z (x_j ≈ 0).

Then

  A^T A + Θ^{-1} = [ A_B^T A_B + Θ_B^{-1}   A_B^T A_Z            ]   [ A_B^T A_B           ]
                   [ A_Z^T A_B              A_Z^T A_Z + Θ_Z^{-1} ] ≈ [            Θ_Z^{-1} ]

From RIP: A_B^T A_B ≈ (m/n) I.
Smoothing with pseudo-Huber approximation
There is a need to solve equations with A^T A + ∇²ψµ(x), and
• ψµ(x_j) ≫ 0 and ∇²ψµ(x)_j → 0, for j in part B,
• ψµ(x_j) ≈ 0 and ∇²ψµ(x)_j → 1/µ, for j in part Z.

Then

  A^T A + ∇²ψµ(x) = [ A_B^T A_B + ∇²ψµ(x_B)   A_B^T A_Z              ]   [ A_B^T A_B          ]
                    [ A_Z^T A_B                A_Z^T A_Z + ∇²ψµ(x_Z) ] ≈ [           (1/µ) I  ]
Example 2: CS, Coherent & Redundant Dict.
with I. Dassios and K. Fountoulakis.

Large dense quadratic optimization problem:
  min_x τ‖W*x‖_1 + (1/2)‖Ax − b‖_2^2,
where A ∈ R^{m×n} and W ∈ C^{n×l} is a dictionary.

Dassios, Fountoulakis and G., A Preconditioner for a Primal-Dual Newton Conjugate Gradient Method for Compressed Sensing Problems, SIAM J. on Sci. Comput. 37 (2015) A2783–A2812.
Software available at http://www.maths.ed.ac.uk/ERGO/
W-Restricted Isometry Property (W-RIP)
• rows of A are nearly orthogonal to each other, i.e., there exists a small constant δ such that
  ‖A A^T − I_m‖ ≤ δ.
• W-Restricted Isometry Property (W-RIP): there exists a constant δ_q such that
  (1 − δ_q)‖Wz‖_2^2 ≤ ‖AWz‖_2^2 ≤ (1 + δ_q)‖Wz‖_2^2
for all at most q-sparse z ∈ C^l.

Candès, Eldar & Needell, Appl. and Comp. Harmonic Anal. 31 (2011) 59–73.
Preconditioner
Approximate
  H = τ∇²ψµ(W*x) + A^T A
with
  P = τ∇²ψµ(W*x) + ρ I_n.

We expect (optimal partition):
• k entries of W*x ≫ 0, k ≪ l, with ∇²ψµ(W*x)_j → 0,
• l − k entries of W*x ≈ 0, with ∇²ψµ(W*x)_j → 1/µ.

The preconditioner approximates well the second derivative of the pseudo-Huber regularization.
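Whatever form P takes, it enters CG simply as an operator applying P^{-1}. A generic sketch with SciPy (H_matvec and apply_P_inv are hypothetical placeholders for the Hessian product and the preconditioner solve):

from scipy.sparse.linalg import LinearOperator, cg

def pcg_solve(H_matvec, rhs, apply_P_inv, rtol=1e-1):
    # Preconditioned CG: M approximates the action of P^{-1} on a vector
    n = rhs.size
    H = LinearOperator((n, n), matvec=H_matvec)
    M = LinearOperator((n, n), matvec=apply_P_inv)
    z, info = cg(H, rhs, M=M, rtol=rtol)
    return z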
Spectral Properties of P^{-1}H

Theorem
• The eigenvalues of P^{-1}H satisfy
  |λ(P^{-1}H) − 1| ≤ η(δ, δ_q, ρ) / ρ,
where δ_q is the W-RIP constant, δ is another small constant, and η(δ, δ_q, ρ) is some simple function.

Dassios, Fountoulakis and G., A Preconditioner for a Primal-Dual Newton Conjugate Gradient Method for Compressed Sensing Problems, SIAM J. on Sci. Comput. 37 (2015) A2783–A2812.
CS: Coherent and Redundant Dictionaries
−→ good clustering of eigenvalues
pdNCG outperforms TFOCS on several examples (TFOCS: Becker, Candès and Grant).
Example 3: Big Data and Optimization
with K. Fountoulakis.

Large dense quadratic optimization problem:
  min_x τ‖x‖_1 + (1/2)‖Ax − b‖_2^2,
where A ∈ R^{m×n}.

Fountoulakis and G., Performance of First- and Second-Order Methods for ℓ1-regularized Least Squares Problems, COAP 65 (2016) 605–635.

Software available at http://www.maths.ed.ac.uk/ERGO/trillion/
Simple example for ℓ1-regularization
  min_x τ‖x‖_1 + ‖Ax − b‖_2^2

Special matrix given in SVD form A = Q Σ G^T.

Matlab generator: http://www.maths.ed.ac.uk/ERGO/trillion/

The user controls:
• the condition number κ(A),
• the sparsity of the matrix A.
Simple example for ℓ1-regularization
Suppose m ≥ n. Write A in the SVD form:

  A = Q [ diag{σ_1, σ_2, …, σ_n} ] G^T,
        [            0           ]

where Q is an m×m orthogonal matrix, σ_1, σ_2, …, σ_n are the singular values of A, and G is a product of Givens rotations
  G = G(i_1, j_1, θ_1) G(i_2, j_2, θ_2) … G(i_K, j_K, θ_K),
hence it is an orthonormal matrix.
Observe that
  A^T A = G diag{σ_1^2, σ_2^2, …, σ_n^2} G^T.
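A minimal NumPy version of such a generator (a sketch, not the authors' Matlab code; the rotation count and the geometric spacing of the singular values are illustrative assumptions) applies A^T A = G diag(σ^2) G^T matrix-free, one cheap rotation at a time:

import numpy as np

def G_matvec(x, rotations, transpose=False):
    # Apply G = G(i1,j1,th1)...G(iK,jK,thK) (or G^T) to x, one rotation at a time
    y = x.copy()
    seq = rotations if transpose else reversed(rotations)
    for i, j, th in seq:
        c, s = np.cos(th), (-np.sin(th) if transpose else np.sin(th))
        y[i], y[j] = c * y[i] - s * y[j], s * y[i] + c * y[j]
    return y

def AtA_matvec(x, rotations, sigma2):
    # A^T A x = G diag(sigma^2) G^T x; kappa(A) is controlled via the sigma spacing
    return G_matvec(sigma2 * G_matvec(x, rotations, transpose=True), rotations)

n, K, kappa = 1_000, 3_000, 1e2
rng = np.random.default_rng(0)
i = rng.integers(0, n, K)
j = (i + 1 + rng.integers(0, n - 1, K)) % n        # guarantees j != i
rotations = list(zip(i, j, rng.uniform(0, 2 * np.pi, K)))
sigma2 = np.logspace(0, -2 * np.log10(kappa), n)   # sigma in [1/kappa, 1]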
Excessive Computational Tests (4 months of CPU)
• FISTA (Fast Iterative Shrinkage-Thresholding Alg)
• PCDM (Parallel Coordinate Descent Method)
• PSSgb (Projected Scaled Subgrad, Gafni-Bertsekas)
• pdNCG (primal-dual Newton Conjugate Gradients)
The 1st-order methods:
• work well if the condition number κ(A) ≤ 10^2,
• struggle when κ(A) ≥ 10^3,
• stall when κ(A) ≥ 10^4.

The 2nd-order method (pdNCG, diagonal preconditioner):
• works well if the condition number κ(A) ≤ 10^6.
Let us go big: a trillion (2^40) variables
  n (billions)   Processors   Memory (TB)   time (s)
             1           64         0.192       1923
             4          256         0.768       1968
            16         1024         3.072       1986
            64         4096        12.288       1970
           256        16384        49.152       1990
         1,024        65536       196.608       2006

ARCHER (ranked 25 on top500.com, 11 March 2015)
Linpack Performance (Rmax): 1,642.54 TFlop/s
Theoretical Peak (Rpeak): 2,550.53 TFlop/s

Fountoulakis and G., Performance of First- and Second-Order Methods for ℓ1-regularized Least Squares Problems, COAP 65 (2016) 605–635.
Conclusions
2nd-order methods for optimization:
• employ the inexact Newton method,
• rely on preconditioners,
• enjoy matrix-free implementation.

Trick:
• find the “essential subspace” and
• exploit it to simplify the linear algebra
  – works in IPMs for LP
  – works in Newton CG for ℓ1-regularization
Simple, reliable test example for ℓ1-regularization:http://www.maths.ed.ac.uk/ERGO/trillion/
Baby test example: RCDC vs pdNCG
[figure: objective function value vs CPU time for pdNCG and RCDC on two test problems (log–log axes)]
Dimensions: m = 4×10^3, n = 2×10^3; x* has 50 non-zero elements, randomly positioned.

RCDC interrupted after 10^9 iterations (31 hours).
  ID   rhs   Accuracy    mfIPM   ℓ1-ℓs    pdco   fpc_as_cg   spgl1
   2    b    3.0e-04        61      48     687           9   40000
        b    1.0e-11        65      98   40007       40002      22
   3    b    7.0e-04       241     462    4941         106   40000
        b    1.0e-08       415    1612   40157         212     148
   5    b    2.0e-03      5991    9842   28203         521   40000
        b    2.0e-05      7953   19684   41283         874    2567
  10    b    1.0e-03      4775    8529    6203       40002   40000
        b    9.0e-10      4567    8192   41227       40161   40000
 701    b    2.0e-02       947    1794    5967        1049   40000
        b    7.0e-09      1341    2656   42041       40017   15239
 702    b    4.0e-03       809    1574    3341       40001   40000
        b    1.0e-07      1123    3030   49563       40157   11089
A better linearization
  τ Dx + A^T (Ax − b) = 0,      (here Dx = ∇ψµ(x))
where D := diag(D_1, …, D_n) with D_i := (µ^2 + x_i^2)^{-1/2} ∀i = 1, …, n.

Set g = Dx. Use the easier form of the equations.

Difficult:
  τ g + A^T (Ax − b) = 0,
  g = Dx.

Easy:
  τ g + A^T (Ax − b) = 0,
  D^{-1} g = x.

Chan, Golub, Mulet, SIAM J. on Sci. Comput. 20 (1999) 1964–1977.
A better linearization. Example: g_i = 0.99.
[figure: bad linearization g_i = D_i x_i vs good linearization D_i^{-1} g_i = x_i]