TRANSCRIPT
Math 1080: Numerical Linear Algebra
Chapter 5: Solving Ax = b by Optimization
M. M. Sussman, [email protected]
Office Hours: MW 1:45PM-2:45PM, Thack 622
March 2015
1 / 66
Some definitions
- A matrix A is "SPD" (Symmetric Positive Definite) if
  1. A = A^T
  2. x · Ax = ⟨x, Ax⟩ = x^T A x > 0 for x ≠ 0
- A is "negative definite" if −A is SPD
- A is "skew-symmetric" if A^T = −A
- A is "indefinite" if it is symmetric and neither SPD nor negative definite.
- Two symmetric matrices satisfy A > B if (A − B) is SPD
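These properties are easy to check numerically. A minimal MATLAB/Octave sketch (the test matrix is an arbitrary example, not from the slides): a symmetric A is SPD exactly when chol succeeds, or equivalently when all of its eigenvalues are positive.

  A = [4 1; 1 3];             % arbitrary symmetric test matrix (assumption)
  issym = isequal(A, A');     % A = A^T ?
  [~, p] = chol(A);           % p == 0 exactly when A is positive definite
  lams = eig(A);              % all eigenvalues positive is an equivalent test
  isSPD = issym && (p == 0) && all(lams > 0);
  fprintf('symmetric %d, chol flag %d, SPD %d\n', issym, p, isSPD);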
2 / 66
Important facts
- All eigenvalues of a symmetric matrix are real
- A symmetric matrix has an orthogonal basis of eigenvectors
- A symmetric matrix A is positive definite if and only if all λ(A) > 0.
3 / 66
A-inner product
If A is SPD then

  ⟨x, y⟩_A = x^T A y

is a weighted inner product on R^n, the A-inner product, and

  ‖x‖_A = √⟨x, x⟩_A = √(x^T A x)

is a weighted norm, the A-norm.
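As a concrete illustration, both quantities are one-liners in MATLAB/Octave (a small sketch; the SPD matrix and vectors are arbitrary examples, not from the slides):

  A = [4 1; 1 3];                  % arbitrary SPD matrix (assumption)
  x = [1; 2];  y = [3; -1];
  ipA  = x' * A * y;               % <x, y>_A = x^T A y
  nrmA = sqrt(x' * A * x);         % ||x||_A = sqrt(x^T A x)
  disp([ipA, nrmA])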
4 / 66
Proof
Bilinear:

  ⟨u + v, y⟩_A = (u + v)^T A y = u^T A y + v^T A y = ⟨u, y⟩_A + ⟨v, y⟩_A

and

  ⟨αu, y⟩_A = (αu)^T A y = α (u^T A y) = α ⟨u, y⟩_A.

Positive:

  ⟨x, x⟩_A = x^T A x > 0 for x ≠ 0, since A is SPD.

Symmetric:

  ⟨x, y⟩_A = x^T A y = (x^T A y)^T = y^T A^T x = y^T A x = ⟨y, x⟩_A.

Hence ⟨x, x⟩_A induces a norm ‖x‖_A = √⟨x, x⟩_A.
5 / 66
The A-norm is a weighted 2-norm
The norm induced by a 2 × 2 SPD matrix A is just a weighted 2-norm.

Proof:
- Let (λ1, φ1) and (λ2, φ2) be the two eigenvalue/eigenvector pairs of A, with φ1, φ2 orthonormal
- Any vector can be written v = α1 φ1 + α2 φ2
- ‖v‖²_A = (α1 φ1 + α2 φ2)^T A (α1 φ1 + α2 φ2)
- ‖v‖²_A = (α1 φ1 + α2 φ2)^T (α1 λ1 φ1 + α2 λ2 φ2)
- ‖v‖²_A = α1² λ1 + α2² λ2
6 / 66
Homework
Text, Exercises 258, 259
7 / 66
Topics
The connection to optimization
Descent Methods
Application to Stationary Iterative Methods
Parameter selection
The Steepest Descent Method
8 / 66
Finding a solution = finding a minimum
For A SPD, the unique solution of
Ax = b
is also the unique minimizer of the functional
  J(x) = (1/2)⟨x, Ax⟩ − ⟨x, b⟩ = (1/2) x^T A x − x^T b

Proof: take the partial derivatives of J and set them to zero. Using A = A^T,

  ∂J/∂xi = (1/2)((Ax)i + (A^T x)i) − bi = (Ax)i − bi = 0,

i.e. Ax = b at the minimizer.
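The claim is easy to check numerically; here is a small MATLAB/Octave sketch (the SPD matrix and right-hand side are arbitrary test data, not from the slides) confirming that x = A\b gives a smaller J than nearby points:

  A = [3 1; 1 2];  b = [1; 1];          % arbitrary SPD test data (assumption)
  J = @(x) 0.5 * x' * A * x - x' * b;   % the quadratic functional
  xstar = A \ b;                        % solution of Ax = b
  for k = 1:5
      y = xstar + 0.1 * randn(2, 1);    % random nearby point
      assert(J(y) > J(xstar));          % J is larger away from the minimizer
  end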
9 / 66
Notation
x = A^{-1} b is the argument that minimizes J(y):

  x = arg min_{y ∈ R^N} J(y)
10 / 66
Theorem 260
Let A be SPD. The solution of Ax = b is the unique minimizer of J(x).
- For any y ≠ 0,

    J(x + y) = J(x) + (1/2) y^T A y > J(x)

- And

    ‖x − y‖²_A = 2 (J(y) − J(x)).
11 / 66
Proof of first claim
J(x + y) = (1/2)(x + y)^T A (x + y) − (x + y)^T b
         = (1/2) x^T A x + y^T A x + (1/2) y^T A y − x^T b − y^T b
         = ((1/2) x^T A x − x^T b) + (y^T A x − y^T b) + (1/2) y^T A y
         = J(x) + y^T (Ax − b) + (1/2) y^T A y
         = J(x) + (1/2) y^T A y > J(x).

A is SPD, so y ≠ 0 ⟹ y^T A y > 0.
12 / 66
Proof of second claim
LHS:

‖x − y‖²_A = (x − y)^T A (x − y)
           = x^T A x − y^T A x − x^T A y + y^T A y     (use Ax = b and x^T A y = y^T A x)
           = y^T A y − 2 y^T b + x^T b

RHS:

2 (J(y) − J(x)) = y^T A y − 2 y^T b − x^T A x + 2 x^T b     (use Ax = b)
                = y^T A y − 2 y^T b − x^T (Ax − b) + x^T b
                = y^T A y − 2 y^T b + x^T b
13 / 66
The 2 × 2 case

- 2 × 2 system Ax = b with

    A = [ a  c ]
        [ c  d ].

- Eigenvalues: (a − λ)(d − λ) − c² = 0, so

    λ = ((a + d) ± √((a + d)² − 4(ad − c²))) / 2

- Positive definite (λ > 0) ⟹ (a + d) > 0 and c² − ad < 0
14 / 66
The 2 × 2 case cont'd

- Write x = [x1, x2]^T. Then

    J(x1, x2) = (1/2) [x1, x2] [a c; c d] [x1; x2] − [x1, x2] [b1; b2]
              = (1/2)(a x1² + 2c x1 x2 + d x2²) − (b1 x1 + b2 x2).

- The surface z = J(x1, x2) is a paraboloid opening up if and only if a > 0, d > 0, and c² − ad < 0.
- Hessian matrix is SPD (Theorem 260)

[Figure: the surface z = J(x1, x2)]
15 / 66
Paraboloid opening upward (again)
- Consider b1 = b2 = 0: x1, x2 are components of the error
- J(x1, x2) = (1/2)(a x1² + 2c x1 x2 + d x2²)
- Set x1 = (r/√a) cos θ and x2 = (r/√d) sin θ:

    J = (1/2)( r² cos² θ + r² sin² θ + 2 r² (c/√(ad)) sin θ cos θ )
      = (r²/2) ( 1 + (c/√(ad)) sin 2θ )

- J > 0 for all θ if and only if |c|/√(ad) < 1, i.e. c² < ad.
16 / 66
Topics
The connection to optimization
Descent Methods
Application to Stationary Iterative Methods
Parameter selection
The Steepest Descent Method
17 / 66
General Descent Method, Algorithm 263

Given
1. Ax = b
2. a quadratic functional J(·) that is minimized at x = A^{-1} b
3. a maximum number of iterations itmax
4. an initial guess x^0:

Compute r^0 = b − A x^0
for n=1:itmax
   (∗)  Choose a direction vector d^n
   (∗∗) Find α = α_n by solving the 1d minimization problem:
            α_n = arg min_α J(x^n + α d^n)
   x^{n+1} = x^n + α_n d^n
   r^{n+1} = b − A x^{n+1}
   if converged, exit, end
end
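A direct MATLAB/Octave transcription of Algorithm 263 might look like the sketch below; the direction-choosing function handle pick_d and the relative-residual stopping test are illustrative assumptions, and the step-length formula is the one derived a few slides later.

  function x = descent(A, b, x0, itmax, tol, pick_d)
      % Generic descent method for SPD A (a sketch of Algorithm 263).
      x = x0;
      r = b - A*x;
      for n = 1:itmax
          d = pick_d(r, n);                 % (*) choose a direction vector
          alpha = (d' * r) / (d' * A * d);  % (**) minimize J along d
          x = x + alpha * d;                % update iterate
          r = b - A*x;                      % update residual
          if norm(r) <= tol * norm(b)       % assumed convergence test
              return
          end
      end
  end

For steepest descent one would pass pick_d = @(r, n) r; for Gauss-Seidel-like descent, a handle that cycles through the unit vectors.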
18 / 66
Steepest descent
- Functional: J(x) = (1/2) x^T A x − x^T b
- Descent direction: d^n = −∇J(x^n)
19 / 66
−∇J = r when J(x) = (1/2) x^T A x − x^T b

- J = (1/2) x^T A x − x^T b
- ∂J/∂xi = (1/2)(Ax)i + (1/2)(x^T A)i − bi
- But (1/2)(x^T A)i = (1/2)(A^T x)i = (1/2)(Ax)i, since A = A^T
- So ∂J/∂xi = (Ax)i − bi = −ri
- −∇J = r
20 / 66
Formula for α
Proof of following is a homework problem:
For arbitrary d^n,

    α_n = (d^n)^T r^n / ((d^n)^T A d^n)

Hint: set d/dα (J(x + α d)) = 0 and solve.
21 / 66
Homework
Text, 264.
22 / 66
Notation
- Use the ⟨·, ·⟩ notation
- α for steepest descent becomes

    α_n = ⟨d^n, r^n⟩ / ⟨d^n, A d^n⟩ = ⟨d^n, r^n⟩ / ⟨d^n, d^n⟩_A.
23 / 66
Descent demonstration
descentdemo.m
24 / 66
Different descent methods: descent direction dn
- Steepest descent direction: d^n = −∇J(x^n)
- Random directions: d^n = a randomly chosen vector
- Gauss-Seidel-like descent: d^n cycles through the standard basis of unit vectors e1, e2, ..., eN and repeats if necessary.
- Conjugate directions: d^n cycles through an A-orthogonal set of vectors.
25 / 66
Different descent methods: choice of functional J
- If A is SPD the most common choice is

    J(x) = (1/2) x^T A x − x^T b

- Minimum residual methods: for general A,

    J(x) = (1/2) ⟨b − Ax, b − Ax⟩.

- Various combinations, such as residuals plus updates:

    J(x) = (1/2) ⟨b − Ax, b − Ax⟩ + (1/2) ⟨x − x^n, x − x^n⟩
26 / 66
Homework
Text, 265, 268
27 / 66
Topics
The connection to optimization
Descent Methods
Application to Stationary Iterative Methods
Parameter selection
The Steepest Descent Method
28 / 66
Stationary Iterative Methods
- A = M − N
- Iterative method: M x^{n+1} = b + N x^n
- Equivalently: M(x^{n+1} − x^n) = b − A x^n
29 / 66
Examples
- FOR: M = ρI, N = ρI − A
- Jacobi: M = diag(A)
- Gauss-Seidel: M = D + L (lower triangular part of A).
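For a concrete matrix these splittings are one-liners in MATLAB/Octave; a small sketch (the matrix, right-hand side, and ρ are arbitrary examples, not from the slides), where tril(A) is the lower triangle of A including the diagonal:

  A = [4 -1 0; -1 4 -1; 0 -1 4];   % arbitrary SPD test matrix (assumption)
  b = ones(3, 1);  rho = 4;        % arbitrary data and FOR parameter (assumption)
  M_for = rho * eye(3);     N_for = M_for - A;   % FOR
  M_jac = diag(diag(A));    N_jac = M_jac - A;   % Jacobi
  M_gs  = tril(A);          N_gs  = M_gs  - A;   % Gauss-Seidel: M = D + L
  x = zeros(3, 1);
  x = M_gs \ (b + N_gs * x);       % one step of M x_{n+1} = b + N x_n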
30 / 66
Householder lemma (269)
Let A be SPD.
Let x^n be given by M(x^{n+1} − x^n) = b − A x^n.
Let e^n = x − x^n. Then

  ⟨e^n, A e^n⟩ − ⟨e^{n+1}, A e^{n+1}⟩ = ⟨(x^{n+1} − x^n), P(x^{n+1} − x^n)⟩

where P = M + M^T − A.
31 / 66
Convergence of FOR, Jacobi, GS and SOR
Corollary 270: For A SPD, P SPD ⟹ convergence is monotonic in the A-norm:

  ‖e^n‖_A > ‖e^{n+1}‖_A > ... −→ 0 as n → ∞.

The Householder lemma in this case gives

  ⟨e^n, A e^n⟩ − ⟨e^{n+1}, A e^{n+1}⟩ = ⟨(x^{n+1} − x^n), P(x^{n+1} − x^n)⟩, so

  ⟨e^n, A e^n⟩ > ⟨e^{n+1}, A e^{n+1}⟩, or ‖e^{n+1}‖_A < ‖e^n‖_A.
32 / 66
Proof
- ‖e^n‖_A ≥ 0 and monotone decreasing, so it has a limit
- Cauchy criterion: (⟨e^n, A e^n⟩ − ⟨e^{n+1}, A e^{n+1}⟩) → 0
- Householder gives ‖x^{n+1} − x^n‖_P → 0
- So that M(x^{n+1} − x^n) = b − A x^n → 0, i.e. the residuals tend to zero and x^n → A^{-1} b = x
33 / 66
Monotone convergence
1. FOR converges monotonically in ‖·‖_A if

   P = M + M^T − A = 2ρI − A > 0, i.e. if ρ > (1/2) λmax(A).

2. Jacobi converges monotonically in the A-norm if diag(A) > (1/2) A.
3. Gauss-Seidel converges monotonically in the A-norm for SPD A in all cases.
4. SOR converges monotonically in the A-norm if 0 < ω < 2.
34 / 66
Proofs
1. For Jacobi, M = diag(A), so

   P = M + M^T − A = 2 diag(A) − A > 0 if diag(A) > (1/2) A.

2. For GS, A = D + L + L^T (since A is SPD), so

   P = M + M^T − A = (D + L) + (D + L)^T − A
     = D + L + D + L^T − (D + L + L^T) = D > 0.

3. For SOR,

   P = M + M^T − A = M + M^T − (M − N) = M^T + N
     = ω^{-1} D + L^T + ((1 − ω)/ω) D − U = ((2 − ω)/ω) D > 0   (since U = L^T),

   for 0 < ω < 2.
4. Convergence of FOR in the A-norm is left as an exercise.
35 / 66
Homework
Text, Exercises 275, 278. In Exercise 278, you are to:
1. Give details of the proof that P > 0 implies convergence of FOR.
2. Prove the Householder lemma.
   Hint: expand the right side using P = M + M^T − A and the identities x^{n+1} − x^n = e^n − e^{n+1} and ⟨y, Wz⟩ = ⟨W^T y, z⟩.
36 / 66
Topics
The connection to optimization
Descent Methods
Application to Stationary Iterative Methods
Parameter selection
The Steepest Descent Method
37 / 66
Picking ρ for FOR
- FOR: ρ_optimal = (λmax + λmin)/2
- This gives the fastest overall convergence for fixed ρ
- Hard to calculate.
- What about ρ_n = "best possible for this step"?
38 / 66
Variable ρ
Given x^0 and a maximum number of iterations, itmax:

for n=0:itmax
   Compute r^n = b − A x^n
   Compute ρ_n via a few auxiliary calculations
   x^{n+1} = x^n + (ρ_n)^{-1} r^n
   if converged, exit, end
end
39 / 66
FOR again
Residuals r^n = b − A x^n for FOR satisfy

   r^{n+1} = r^n − ρ^{-1} A r^n.

Proof:

   x^{n+1} = x^n + ρ^{-1} r^n.

Multiply by "−A" and add b to both sides:

   b − A x^{n+1} = b − A x^n − ρ^{-1} A r^n.
40 / 66
Homework
Text, Exercise 281. Easy! Preparation for the following work.
41 / 66
Option 1: Residual Minimization

Pick ρ_n to minimize ‖r^{n+1}‖²_2:

  ‖r^{n+1}‖²_2 = ‖r^n − ρ^{-1} A r^n‖²_2 = ⟨r^n − ρ^{-1} A r^n, r^n − ρ^{-1} A r^n⟩.

Since r^n is fixed, this is a simple function of ρ:

  J(ρ) = ⟨r^n − ρ^{-1} A r^n, r^n − ρ^{-1} A r^n⟩
       = ⟨r^n, r^n⟩ − 2 ⟨r^n, A r^n⟩ ρ^{-1} + ⟨A r^n, A r^n⟩ ρ^{-2}.

Taking J′(ρ) = 0 and solving for ρ = ρ_optimal gives

  ρ_optimal = ⟨A r^n, A r^n⟩ / ⟨r^n, A r^n⟩.

The cost of using this optimal value at each step: two extra dot products per step.
42 / 66
Option 2: J minimization

Minimize J(x^{n+1}):

  φ(ρ) = J(x^n + ρ^{-1} r^n)
       = (1/2) ⟨x^n + ρ^{-1} r^n, A(x^n + ρ^{-1} r^n)⟩ − ⟨x^n + ρ^{-1} r^n, b⟩

Expanding, setting φ′(ρ) = 0 and solving, as before, gives

  ρ_optimal = ⟨r^n, A r^n⟩ / ⟨r^n, r^n⟩.

- Option 2 requires SPD A
- Faster than Option 1
43 / 66
Algorithm for Option 2
Given x^1, the matrix A, and a maximum number of iterations, itmax:

r^1 = b − A x^1
for n=1:itmax
   ρ_n = ⟨A r^n, r^n⟩ / ⟨r^n, r^n⟩
   x^{n+1} = x^n + (1/ρ_n) r^n
   if satisfied, exit, end
   r^{n+1} = b − A x^{n+1}
end
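A compact MATLAB/Octave version of this loop might be the following sketch; the relative-residual stopping test is an assumption, and the commented-out line shows the Option 1 choice of ρ_n for comparison.

  function x = for_adaptive(A, b, x, itmax, tol)
      % FOR with the parameter rho_n chosen anew at every step (Option 2).
      for n = 1:itmax
          r = b - A*x;
          if norm(r) <= tol * norm(b), return, end   % assumed convergence test
          Ar  = A * r;
          rho = (r' * Ar) / (r' * r);      % Option 2: minimizes J(x^{n+1}); needs SPD A
          % rho = (Ar' * Ar) / (r' * Ar);  % Option 1: minimizes ||r^{n+1}||_2
          x = x + r / rho;
      end
  end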
44 / 66
Homework
Text, Exercise 283. Implement and test Options 1 and 2.
45 / 66
The “normal” equations
- A e = r, so ‖r‖²_2 = ‖A e‖²_2 = ‖e‖²_{A^T A}
- A^T A is SPD, so minimizing ‖e‖²_{A^T A} amounts to solving

    (A^T A) x = A^T b.

- Both the bandwidth and the condition number of A^T A are much larger than those of A, but A^T A is SPD.
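The squaring of the condition number (Theorem 284, next slide) is easy to observe numerically. A quick MATLAB/Octave check with an arbitrary SPD test matrix (the 1D Laplacian used here is an assumption, not from the slides):

  n = 10;
  A = 2*eye(n) - diag(ones(n-1,1), 1) - diag(ones(n-1,1), -1);   % SPD tridiagonal test matrix
  fprintf('cond(A) = %g, cond(A''*A) = %g, cond(A)^2 = %g\n', ...
          cond(A), cond(A'*A), cond(A)^2);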
46 / 66
Condition number of normal equations (284)
Let A be N × N and invertible. Then A^T A is SPD.
If A is SPD then λ(A^T A) = λ(A)² and

  cond_2(A^T A) = [cond_2(A)]².

Proof
- Symmetry: (A^T A)^T = A^T (A^T)^T = A^T A
- Positivity: ⟨x, A^T A x⟩ = ⟨Ax, Ax⟩ = ‖Ax‖² > 0 for x ≠ 0, since A is invertible
- Condition number: for A SPD,

    cond_2(A^T A) = cond_2(A²) = λmax(A²)/λmin(A²) = (λmax(A)/λmin(A))² = (cond_2(A))².

- Squaring the condition number greatly increases the number of iterations
47 / 66
Topics
The connection to optimization
Descent Methods
Application to Stationary Iterative Methods
Parameter selection
The Steepest Descent Method
48 / 66
Steepest descent
- Reduce J as much as possible in the gradient direction
- Uses the Option 2 parameter choice
49 / 66
Steepest descent algorithm
Given
- SPD A
- x^0, r^0 = b − A x^0
- maximum number of iterations itmax

for n=0:itmax
   r^n = b − A x^n
   α_n = ⟨r^n, r^n⟩ / ⟨r^n, A r^n⟩
   x^{n+1} = x^n + α_n r^n
   if converged, exit, end
end
αn was called 1/ρn earlier.
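In MATLAB/Octave the loop is only a few lines; a sketch (the relative-residual stopping test is an assumption):

  function [x, n] = steepest_descent(A, b, x, itmax, tol)
      % Steepest descent for SPD A: direction d^n = r^n, step alpha_n = <r,r>/<r,Ar>.
      for n = 0:itmax
          r = b - A*x;
          if norm(r) <= tol * norm(b), return, end   % assumed convergence test
          alpha = (r' * r) / (r' * (A * r));
          x = x + alpha * r;
      end
  end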
50 / 66
Example: It is still slow!
- N = 2, x = (x1, x2)^T.
- A = [2 0; 0 50], b = [2; 0]
- Then

  J(x) = (1/2) [x1, x2] [2 0; 0 50] [x1; x2] − [x1, x2] [2; 0]
       = (1/2)(2 x1² + 50 x2²) − 2 x1
       = x1² − 2 x1 + 25 x2² + 1 − 1
       = (x1 − 1)²/1² + x2²/(1/5)² − 1     (level sets are ellipses).
51 / 66
Steepest descent illustration
x^0 = (11, 1)
x^4 = (6.37, 0.54)
limit = (1, 0)
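Running four steps of steepest descent on this example (a quick MATLAB/Octave check, reusing the update from the algorithm above) reproduces the iterate quoted on this slide:

  A = [2 0; 0 50];  b = [2; 0];
  x = [11; 1];                               % x^0 from the slide
  for n = 1:4
      r = b - A*x;
      x = x + ((r' * r) / (r' * A * r)) * r; % one steepest descent step
  end
  disp(x')                                   % approximately (6.37, 0.54)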
52 / 66
SD is no faster than FOR (287)
Let A be SPD and κ = λmax(A)/λmin(A). The steepest descent method converges to the solution of Ax = b for any x^0. The error x − x^n satisfies

  ‖x − x^n‖_A ≤ ((κ − 1)/(κ + 1))^n ‖x − x^0‖_A

and

  J(x^n) − J(x) ≤ ((κ − 1)/(κ + 1))^n (J(x^0) − J(x)).
But no eigenvalue estimates are required!
53 / 66
Proof
- ⟨x − x^n, A(x − x^n)⟩ = ⟨x, b⟩ + 2 J(x^n)
- Minimizing J is the same as minimizing ‖x − x^n‖_A
- Steepest descent reduces J maximally
- FOR reduces ‖x − x^n_FOR‖_A with a known bound
- Doing one iteration of each, starting from the same x^{n−1}:

  ‖x − x^n‖_A ≤ ‖x − x^n_FOR‖_A ≤ ((κ − 1)/(κ + 1)) ‖x − x^{n−1}‖_A
54 / 66
Theorem 288: inequality is sharp!
- Ax = b
- If x^0 = x − (λ2 φ1 ± λ1 φ2), where φ1 and φ2 are the eigenvectors of λ1 = λmin(A) and λ2 = λmax(A) respectively,
- then the convergence rate is precisely

  (λ2 − λ1)/(λ2 + λ1) = (κ − 1)/(κ + 1)
55 / 66
Proof
x^0 = x − (λ2 φ1 ± λ1 φ2)
e^0 = λ2 φ1 ± λ1 φ2
r^0 = A e^0 = λ1 λ2 φ1 ± λ1 λ2 φ2
A r^0 = λ1² λ2 φ1 ± λ1 λ2² φ2

α_0 = ⟨r^0, r^0⟩ / ⟨r^0, A r^0⟩
    = (λ1² λ2² + λ1² λ2²) / (λ1³ λ2² + λ1² λ2³)
    = 2 / (λ1 + λ2)

x^1 = x^0 + α_0 r^0
    = x − (λ2 − (2/(λ1 + λ2)) λ1 λ2) φ1 − (±λ1 ∓ (2/(λ1 + λ2)) λ1 λ2) φ2
    = x − λ2 ((λ2 − λ1)/(λ2 + λ1)) φ1 − λ1 ((±λ1 ∓ λ2)/(λ1 + λ2)) φ2
    = x − ((λ2 − λ1)/(λ2 + λ1)) (λ2 φ1 ∓ λ1 φ2)

So e^1 = ((λ2 − λ1)/(λ2 + λ1)) (λ2 φ1 ∓ λ1 φ2) has the same form as e^0 with the sign flipped, and the error is reduced by exactly the factor (λ2 − λ1)/(λ2 + λ1) at every step.
56 / 66
Homework
Text, Exercise 291
57 / 66
Exercise 290: Convection-Diffusion Equation
−ε Δu + c · ∇u = f   inside Ω = [0,1] × [0,1]
c = [1; 0]
u = g   on ∂Ω

- The convection term is discretized as

    c · ∇u = ∂u/∂x ≈ (u(i+1, j) − u(i−1, j)) / (2h)
58 / 66
Comments about CDE
- c is like a wind, blowing the solution to the right.
- Interesting behavior for ε < ‖c‖
- Pe = "cell Péclet number"
- Discretization becomes unstable as Pe becomes large.
59 / 66
Modify jacobi2d.m
- Add in the convection term.
  - Ends up with a factor of h in the numerator
- Debug with f = 1, g = x − y.
  - Exact solution is u = g
  - Also the exact discrete solution!
  - Diffusion is debugged already
- Debug with N = 5
  - Can see the solution on the screen
  - h = 0.2, so you can recognize solution values on sight
- Start from the exact solution: must get the answer in 1 iteration.
- Start from exact, run many iterations: must stay exact
- Start from zero: do you get the exact solution?
  - If not, program in the exact solution and print the norm of the error at each iteration.
  - Double-check the original code!
60 / 66
“Real” problem
- N = 50, f = x + y, g = 0
- Jacobi, steepest descent, residual minimization
- ε = 1.0, 0.1, 0.01, 0.001
61 / 66
Solution, ε = 1.0
62 / 66
Solution, ε = 0.1
63 / 66
Solution, ε = 0.01
64 / 66
Solution, ε = 0.001
65 / 66
Iterative methods
Numbers of iterations

  ε        Jacobi   Steepest descent   Residual minimization
  1.0        3318        3246                 3281
  0.1       17161        1652                 1690
  0.01      13235         130                  113
  0.001     15735    diverges                 1663
66 / 66