TRANSCRIPT
1
Nonlinear Equations
Jyun-Ming Chen
2
Contents
• Bisection
• False Position
• Newton
• Quasi-Newton
• Inverse Interpolation
• Method Comparison
3
Solve the Problem Numerically
• Consider the problem in the following general form:
f(x) = 0
• Many methods to choose from:
– Interval Bisection Method
– Newton
– Secant
– …
4
Interval Bisection Method
• Recall the following theorem from calculus
• Intermediate Value Theorem
– If f(x) is continuous on [a,b] and k is a constant that lies between f(a) and f(b), then there is a value x ∈ [a,b] such that f(x) = k
5
Bisection Method (cont)
• Simply setting k = 0
• Observe:
– if sign( f(a) ) ≠ sign( f(b) )
– then there is a point x ∈ [a, b] such that f(x) = 0
6
Definition
• non-trivial interval [a,b]:f(a) ≠ 0, f(b) ≠ 0
and
sign( f(a) ) ≠ sign( f(b) )
sign(-2) = -1
sign(+5) = 1
7
Idea
• Start with a non-trivial interval [a,b]
• Set c = (a+b)/2
• Three possible cases:
⑴ f(c) = 0, solution found
⑵ f(c) ≠ 0, [c,b] nontrivial
⑶ f(c) ≠ 0, [a,c] nontrivial
• Keep shrinking the interval until convergence
• → ⑴ problem solved
• → ⑵⑶ a new, smaller nontrivial interval of ½ the size
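The three cases above can be sketched in Python as follows. This is a minimal illustration, not the code shown on the next slide: the function names, the stopping rule on interval length, and the test polynomial are assumptions made for the example.

```python
def sign(v):
    """Return -1, 0 or +1, matching sign(-2) = -1 and sign(+5) = 1."""
    return (v > 0) - (v < 0)

def bisection(f, a, b, eps=1e-6):
    """Halve a non-trivial interval [a, b] (a < b) until its length is at most eps."""
    fa, fb = f(a), f(b)
    if sign(fa) * sign(fb) >= 0:
        raise ValueError("[a, b] is not a non-trivial interval")
    while b - a > eps:
        c = (a + b) / 2
        fc = f(c)
        if fc == 0:                 # case (1): solution found
            return c
        if sign(fc) != sign(fb):    # case (2): [c, b] is non-trivial
            a, fa = c, fc
        else:                       # case (3): [a, c] is non-trivial
            b, fb = c, fc
    return (a + b) / 2

# Illustrative polynomial (an assumption for this sketch)
root = bisection(lambda x: x**3 - 3*x**2 - x + 9, -10.0, 10.0)
print(root)
```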
8
Algorithm
What’s wrong with this code?
9
Remarks
• Convergence
– Guaranteed once a nontrivial interval is found
• Convergence Rate
– A quantitative measure of how fast the algorithm is
– An important characteristic for comparing algorithms
10
Convergence Rate of Bisection
• Let:
– Length of initial interval: $L_0$
– After k iterations, length of interval is $L_k$
– $L_k = L_0 / 2^k$
– Algorithm stops when $L_k \le \text{eps}$
• Plug in some values…
Let $L_0 = 1$, $\text{eps} = 10^{-6}$: $k \ge \log_2 \frac{L_0}{\text{eps}} = \log_2 10^6 \approx 19.93$
This is quite slow, compared to other methods…
Meaning of eps
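The iteration count follows directly from $L_0 / 2^k \le \text{eps}$. A small sketch (the helper name is hypothetical):

```python
import math

def bisection_iterations(L0, eps):
    """Smallest integer k with L0 / 2**k <= eps."""
    return math.ceil(math.log2(L0 / eps))

k = bisection_iterations(1.0, 1e-6)
print(k)  # 20, since log2(10**6) is about 19.93
```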
11
How to get initial (nontrivial) interval [a,b] ?
• Hint from the physical problem
• For a polynomial equation, the following theorem is applicable:
roots (real and complex) of the polynomial
$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$
satisfy the bound:
$|x| \le 1 + \frac{1}{|a_n|} \max(|a_0|, |a_1|, \ldots, |a_{n-1}|)$
12
Example
• $x^3 - 3x^2 - x + 9 = 0$
• Roots are bounded by $|x| \le 1 + \frac{1}{1}\max(|9|, |-1|, |-3|) = 10$
• Hence, real roots are in [-10,10]
• Roots are –1.5251 and 2.2626 ± 0.8844i (complex)
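The bound is a one-line computation. A sketch, assuming coefficients are listed from the highest power down (the function name is hypothetical):

```python
def root_bound(coeffs):
    """coeffs = [a_n, ..., a_1, a_0]; returns 1 + max(|a_0|..|a_{n-1}|) / |a_n|."""
    a_n, rest = coeffs[0], coeffs[1:]
    return 1 + max(abs(c) for c in rest) / abs(a_n)

bound = root_bound([1, -3, -1, 9])   # x^3 - 3x^2 - x + 9
print(bound)  # 10.0
```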
13
Other Theorems for Polynomial Equations
• Sturm theorem: – The number of real roots of an algebraic
equation with real coefficients whose real roots are simple over an interval, the endpoints of which are not roots, is equal to the difference between the number of sign changes of the Sturm chains formed for the interval ends.
14
Sturm Chain
15
Example
$f(x) = x^5 - 3x - 1$
$x = -1.21465,\ -0.334734,\ 0.0802951 \pm 1.32836i,\ 1.38879$
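The chain itself did not survive in this transcript, so the sketch below assumes the standard construction: $p_0 = f$, $p_1 = f'$, and $p_{i+1} = -\mathrm{rem}(p_{i-1}, p_i)$. All function names are hypothetical. Exact rational arithmetic avoids sign errors from rounding.

```python
from fractions import Fraction

def poly_rem(num, den):
    """Remainder of polynomial division; coefficients highest power first."""
    num = list(num)
    while len(num) >= len(den):
        factor = num[0] / den[0]
        for i in range(len(den)):
            num[i] -= factor * den[i]
        num.pop(0)                      # leading term is now zero
    while num and num[0] == 0:          # strip remaining leading zeros
        num.pop(0)
    return num

def sturm_chain(coeffs):
    p0 = [Fraction(c) for c in coeffs]
    n = len(p0) - 1
    p1 = [c * (n - i) for i, c in enumerate(p0[:-1])]   # derivative
    chain = [p0, p1]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            break
        chain.append([-c for c in r])
    return chain

def sign_changes(chain, x):
    values = []
    for p in chain:
        v = Fraction(0)
        for c in p:                     # Horner evaluation
            v = v * x + c
        if v != 0:
            values.append(v)
    return sum(1 for u, w in zip(values, values[1:]) if u * w < 0)

def count_real_roots(coeffs, a, b):
    """Number of distinct real roots between a and b (endpoints not roots)."""
    chain = sturm_chain(coeffs)
    return sign_changes(chain, a) - sign_changes(chain, b)

f = [1, 0, 0, 0, -3, -1]                # x^5 - 3x - 1
print(count_real_roots(f, -10, 10))     # 3
```

The count of 3 agrees with the three real roots listed in the example.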
16
Sturm Theorem (cont)
• For roots with multiplicity:
– The theorem does not apply, but …
– The new equation f(x)/gcd(f(x), f'(x)) = 0 can be used instead
• All of its roots are simple
• Its roots are the same as those of f(x)
17
Sturm Chain by Maxima
18
Maxima (cont)
19
Descartes' Sign Rule
• A method of determining the maximum number of positive and negative real roots of a polynomial.
• For positive roots, start with the sign of the coefficient of the lowest power. Count the number of sign changes n as you proceed from the lowest to the highest power (ignoring powers which do not appear). Then n is the maximum number of positive roots.
• For negative roots, starting with a polynomial f(x), write a new polynomial f(-x) with the signs of all odd powers reversed, while leaving the signs of the even powers unchanged. Then proceed as before to count the number of sign changes n. Then n is the maximum number of negative roots.
3 positive roots
4 negative roots
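The polynomial behind the counts above was not preserved, so this sketch applies the rule to $x^3 - 3x^2 - x + 9$ as an assumed example. Function names are hypothetical.

```python
def sign_changes(coeffs):
    """Count sign changes in a coefficient list, ignoring zero coefficients."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def descartes_bounds(coeffs):
    """coeffs = [a_0, a_1, ..., a_n], lowest power first.
    Returns (max positive roots, max negative roots)."""
    flipped = [-c if i % 2 else c for i, c in enumerate(coeffs)]  # f(-x)
    return sign_changes(coeffs), sign_changes(flipped)

bounds = descartes_bounds([9, -1, -3, 1])   # x^3 - 3x^2 - x + 9
print(bounds)  # (2, 1)
```

These are upper bounds only: this polynomial has no positive real root and one negative real root.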
20
False Position Method
• x2 is defined as the intersection of the x-axis and the line through (x0, f0) and (x1, f1)
• Choose [x0,x2] or [x2,x1], whichever is non-trivial
• Continue in the same way as bisection
• Compare with bisection, where x2 = (x1+x0)/2
21
False Position (cont)
Determine intersection point
• Using similar triangles:
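$\frac{x_2 - x_0}{f_0} = \frac{x_2 - x_1}{f_1}$
$x_2\left(\frac{1}{f_0} - \frac{1}{f_1}\right) = \frac{x_0}{f_0} - \frac{x_1}{f_1}$
$x_2 = \frac{f_0 f_1}{f_1 - f_0}\left(\frac{x_0}{f_0} - \frac{x_1}{f_1}\right)$
$x_2 = \frac{1}{f_0 - f_1}(x_1 f_0 - x_0 f_1)$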
22
False Position (cont)
Alternatively, the straight line passing thru (x0,f0) and (x1,f1)
Intersection: simply set y=0 to get x
$y = f_0 + \frac{f_1 - f_0}{x_1 - x_0}(x - x_0)$
$0 = f_0 + \frac{f_1 - f_0}{x_1 - x_0}(x - x_0) \;\Rightarrow\; x = x_0 - f_0\,\frac{x_1 - x_0}{f_1 - f_0}$
23
Example
$f(x) = 3x + \sin x - e^x = 0$
$x_0 = 0,\ x_1 = 1,\ f_0 f_1 < 0$
k | xk (Bisection) | fk | xk (False Position) | fk
1 | 0.5 | | 0.471 |
2 | 0.25 | | 0.372 |
3 | 0.375 | | 0.362 |
4 | 0.3125 | | 0.360 |
5 | 0.34375 | -0.042 | 0.360 | 2.93×10^-5
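A minimal sketch of false position applied to this example. The function name, tolerance, and stopping rule on |f| are assumptions; the update is $x_2 = (x_1 f_0 - x_0 f_1)/(f_0 - f_1)$.

```python
import math

def false_position(f, x0, x1, tol=1e-10, max_iter=100):
    """Return the list of iterates x2 produced from the bracket [x0, x1]."""
    f0, f1 = f(x0), f(x1)
    if f0 * f1 >= 0:
        raise ValueError("[x0, x1] is not a non-trivial interval")
    iterates = []
    for _ in range(max_iter):
        x2 = (x1 * f0 - x0 * f1) / (f0 - f1)   # chord meets the x-axis
        f2 = f(x2)
        iterates.append(x2)
        if abs(f2) < tol:
            break
        if f0 * f2 < 0:        # [x0, x2] is non-trivial
            x1, f1 = x2, f2
        else:                  # [x2, x1] is non-trivial
            x0, f0 = x2, f2
    return iterates

f = lambda x: 3*x + math.sin(x) - math.exp(x)
iterates = false_position(f, 0.0, 1.0)
print([round(x, 3) for x in iterates[:5]])
```

The first two iterates round to 0.471 and 0.372, matching the table.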
24
False Position
• Always better than bisection?
25
Newton's Method
Also known as Newton-Raphson method
Graphical Derivation
[Figure: tangent line thru (x0, f0)]
slope $= f'(x_0) = f_0'$
$y - f_0 = f_0'(x - x_0)$
Intersection with x-axis (set $y = 0$):
$-f_0 = f_0'(x_1 - x_0)$
$x_{k+1} = x_k - \frac{f_k}{f_k'},\quad k = 1, 2, 3, \ldots$
26
Newton’s Method (cont)
• Derived using Taylor’s expansion
$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2 + \cdots$
If $x$ is near $x_0$ and $f''(x_0)$ is not too large, then
$\hat{f}(x) = f(x_0) + f'(x_0)(x - x_0)$
is a good approximation of $f(x)$
27
Taylor’s Expansion (cont)
Take the root of $\hat{f}(x) = 0$ as the next iterate for $f(x) = 0$
$\hat{f}(x) = 0 \;\Rightarrow\; x_{k+1} = x_k - \frac{f_k}{f_k'},\quad k = 1, 2, 3, \ldots$
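The iteration translates directly into code. A minimal sketch; the function name, tolerance, and the test equation $x^2 - 2 = 0$ are assumptions for illustration.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - f(x)/f'(x) until |f(x)| < tol."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        x = x - fx / df(x)
    return x

root = newton(lambda x: x**2 - 2, lambda x: 2*x, 1.0)
print(root)
```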
28
Example
• The ancient Babylonians used the following formula to compute the square root of a number a; explain why this works:
$x_{k+1} = \frac{1}{2}\left(x_k + \frac{a}{x_k}\right)$
Finding square root of 1 (a=1) with x0 = 2
k | xk | Error
0 | 2 | 1
1 | ½(2 + 1/2) = 1.25 | 2.5×10^-1
2 | ½(1.25 + 1/1.25) = 1.025 | 2.5×10^-2
3 | ½(1.025 + 1/1.025) = 1.00030 | 3×10^-4
4 | 1 + 4.7×10^-8 | 4.7×10^-8
29
Newton’s Method
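$\sqrt{a}$ is one of the roots of $f(x) = x^2 - a = 0$
Use Newton's Method to solve $f(x) = 0$:
$f(x) = x^2 - a,\quad f'(x) = 2x$
$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} = x_k - \frac{x_k^2 - a}{2x_k} = \frac{2x_k^2 - x_k^2 + a}{2x_k} = \ldots = \frac{1}{2}\left(x_k + \frac{a}{x_k}\right)$
For a = 1: $f(x) = x^2 - 1 = 0$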
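The square-root iteration can be checked numerically. A sketch with a hypothetical helper name, using a = 1 and x0 = 2:

```python
def babylonian_sqrt(a, x0, steps):
    """Return [x0, x1, ..., x_steps] for x_{k+1} = (x_k + a/x_k) / 2."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(0.5 * (x + a / x))
    return xs

xs = babylonian_sqrt(1.0, 2.0, 4)
for k, x in enumerate(xs):
    print(k, x, abs(x - 1.0))   # iterate and its error
```

The error roughly squares at each step, which is the quadratic convergence discussed later.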
30
Fast Inverse Square Root
To understand the magic number 0x5f3759df, read: Chris Lomont or Paul Hsieh
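For context, a Python sketch of the commonly cited form of the trick: reinterpret the bits of a 32-bit float, subtract half of them from the magic number to get an initial guess, then apply one Newton step for $f(y) = 1/y^2 - x$. The function name is hypothetical, and this is an illustration rather than a faithful port of any particular source.

```python
import struct

def fast_inv_sqrt(x):
    """Approximate 1/sqrt(x) for positive x using the 0x5f3759df bit trick."""
    i = struct.unpack("<I", struct.pack("<f", x))[0]   # float bits as uint32
    i = 0x5f3759df - (i >> 1)                          # initial guess in bit form
    y = struct.unpack("<f", struct.pack("<I", i))[0]   # back to float
    return y * (1.5 - 0.5 * x * y * y)                 # one Newton step

approx = fast_inv_sqrt(4.0)
print(approx)   # close to 0.5
```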
31
Definition
• Error of the ith iterate: $\varepsilon_i = x_i - \alpha$, where $\alpha$ is the converged value (i.e. $\lim_{i\to\infty} x_i = \alpha$)
• Order of a method m, satisfies $\lim_{k\to\infty} \frac{E_{k+1}}{E_k^m} = \text{constant}$, where $E_k$ is an error bound of $\varepsilon_k$
32
Linear Convergence of Bisection
[Figure: nested intervals [a0, b0], [a1, b1], [a2, b2] around the root, with lengths L0, L1, L2]
$L_0 = b_0 - a_0$
$L_1 = b_1 - a_1 = L_0 / 2$
$L_2 = b_2 - a_2 = L_1 / 2$
With $[a_0, b_0]$, a reasonable approx. of the root is $\frac{a_0 + b_0}{2}$
The error bound is $\frac{b_0 - a_0}{2} = \frac{L_0}{2}$
33
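Linear Convergence of Bisection (cont)
$E_0 = L_0/2,\quad E_1 = L_1/2,\quad E_2 = L_2/2,\ \ldots$
$\frac{E_{k+1}}{E_k} = \frac{L_{k+1}/2}{L_k/2} = \frac{1}{2}$, or $\lim_{k\to\infty} \frac{E_{k+1}}{E_k^1} = \frac{1}{2}$
• We say the order of bisection method is one, or the method has linear convergence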
34
Quadratic Convergence of Newton
• Let x* be the converged solution
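• Recall
$f(x) = f(x^*) + f'(x^*)(x - x^*) + \frac{1}{2} f''(x^*)(x - x^*)^2 + \cdots$
$f'(x) = f'(x^*) + f''(x^*)(x - x^*) + \cdots$
$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}$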
35
Quadratic Convergence of Newton (cont)
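• Subtracting x*, with $\varepsilon_k = x_k - x^*$ and $f(x^*) = 0$:
$\varepsilon_{k+1} = x_{k+1} - x^* = x_k - x^* - \frac{f(x_k)}{f'(x_k)} = \frac{\varepsilon_k f'(x_k) - f(x_k)}{f'(x_k)}$
$= \frac{\varepsilon_k\left[f'(x^*) + f''(x^*)\varepsilon_k\right] - \left[f'(x^*)\varepsilon_k + \frac{1}{2} f''(x^*)\varepsilon_k^2\right]}{f'(x_k)}$
$= \frac{1}{2}\frac{f''(x^*)}{f'(x_k)}\varepsilon_k^2 \approx \frac{1}{2}\frac{f''(x^*)}{f'(x^*)}\varepsilon_k^2$
Or we say Newton's method has quadratic convergence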
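The order can also be estimated from three consecutive errors: if $e_{k+1} \approx C e_k^m$, then $m \approx \log(e_{k+1}/e_k) / \log(e_k/e_{k-1})$. A sketch using Newton on $x^2 - 2 = 0$; the helper name and test problem are assumptions.

```python
import math

def estimate_order(errors):
    """Estimate m from the last three errors, assuming e_{k+1} ~ C * e_k**m."""
    e0, e1, e2 = errors[-3:]
    return math.log(e2 / e1) / math.log(e1 / e0)

alpha = math.sqrt(2.0)          # known root, used only to measure errors
x, errors = 2.0, []
for _ in range(4):
    errors.append(abs(x - alpha))
    x = x - (x * x - 2.0) / (2.0 * x)   # Newton step

order = estimate_order(errors)
print(order)   # close to 2
```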
36
Example: Newton’s Method
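• $f(x) = x^3 - 3x^2 - x + 9 = 0$
Recall thm: $a_3 = 1,\ a_2 = -3,\ a_1 = -1,\ a_0 = 9$
$|x| \le 1 + \frac{1}{1}\max(|9|, |-1|, |-3|) = 10$, so $x \in [-10, 10]$
$f(x) = x^3 - 3x^2 - x + 9$
$f'(x) = 3x^2 - 6x - 1$
choose $x_0 = 0$
k | xk
0 | 0
1 | 9
2 | 6.41
… | …
46 | -1.5250
Worse than bisection !?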
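The first rows of the table are easy to reproduce. A sketch with a hypothetical helper that records every iterate:

```python
def newton_iterates(f, df, x0, n):
    """Return [x0, x1, ..., xn] from n Newton steps."""
    xs = [x0]
    for _ in range(n):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs

f = lambda x: x**3 - 3*x**2 - x + 9
df = lambda x: 3*x**2 - 6*x - 1
xs = newton_iterates(f, df, 0.0, 2)
print(xs)   # starts 0.0, 9.0, then about 6.41
```

The first step jumps from 0 to 9, far from the root at about -1.525, which is why so many iterations are needed.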
37
Why?
• Plot f(x)
• Plot xk vs. k
[Figure: xk vs. k, with the iterates eventually settling at -1.525]
38
Newton Iteration
[Figure: chart of xk vs. k]
39
Case 1:
40
Case 2:
Diverge to
41
Recall Quadratic Convergence of Newton’s
• The previous example showed the importance of initial guess x0
• If you have a good x0, will you always get quadratic convergence?
– The problem of multiple roots
$\varepsilon_{k+1} \approx \frac{1}{2}\frac{f''(x^*)}{f'(x^*)}\varepsilon_k^2$
42
Example
• $f(x) = (x+1)^3 = 0$
• Convergence is linear near multiple roots
Prove this!!
43
Multiple Root
• If x* is a root of f(x)=0, then (x-x*) can be factored out of f(x)
– $f(x) = (x - x^*)\,g(x)$
• For multiple roots:
– $f(x) = (x - x^*)^k g(x)$
– k>1 and g(x) has no factor of (x-x*)
44
Multiple Root (cont)
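$f(x) = (x - x^*)^k g(x)$
$f'(x) = k(x - x^*)^{k-1} g(x) + (x - x^*)^k g'(x) = (x - x^*)^{k-1}\left[k\,g(x) + (x - x^*) g'(x)\right]$
$k = 1$: $f'(x) = g(x) + (x - x^*) g'(x)$, so $f'(x^*) = g(x^*) \ne 0$
$k > 1$: $f'(x^*) = (x^* - x^*)^{k-1}[\,\cdots] = 0$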
Implication:
45
Remedies for Multiple Roots
$x_{n+1} = x_n - k\,\frac{f(x_n)}{f'(x_n)}$
• where k is the multiplicity of the root
• Get quadratic convergence!
• Problem: do not know k in advance!
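Both behaviours can be seen on $f(x) = (x+1)^3$. In this sketch (names hypothetical), plain Newton reduces the error by a factor of 2/3 per step, while the step scaled by the multiplicity k = 3 lands on the root at once for this particular f.

```python
def newton_step(f, df, x, k=1):
    """One step of x <- x - k * f(x)/f'(x); k = 1 is plain Newton."""
    return x - k * f(x) / df(x)

f = lambda x: (x + 1)**3
df = lambda x: 3 * (x + 1)**2

x, errors = 0.0, []
for _ in range(5):
    errors.append(abs(x + 1))        # distance to the root x* = -1
    x = newton_step(f, df, x)
ratios = [e1 / e0 for e0, e1 in zip(errors, errors[1:])]
print(ratios)                        # each ratio is about 2/3: linear convergence

scaled = newton_step(f, df, 0.0, k=3)
print(scaled)                        # -1.0
```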
46
Modified Newton’s Method
$f(x) = (x - x^*)^k g(x)$, $k \ge 1$: multiplicity of the root $x^*$
$f'(x) = k(x - x^*)^{k-1} g(x) + (x - x^*)^k g'(x)$
Define a new function $F(x) = \frac{f(x)}{f'(x)}$
Use Newton's method to find the root of $F(x) = 0$
Check: the root of $f(x) = 0$ is also the root of $F(x) = 0$
47
Modified Newton’s Method (cont)
$F(x) = \frac{f(x)}{f'(x)} = \frac{(x - x^*)^k g(x)}{k(x - x^*)^{k-1} g(x) + (x - x^*)^k g'(x)} = \frac{(x - x^*)\,g(x)}{k\,g(x) + (x - x^*)\,g'(x)}$
$F(x) = 0$ has no multiple roots (hence, Newton always converges quadratically)
48
Example
$f(x) = (x - 1)^3 \sin((x - 1)^2)$
49
Quasi-Newton’s Method
• Recall Newton: $x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}$
• The denominator requires differentiation and extra coding
• The derivative might not be explicitly available (e.g., tabulated data)
• May be too time-consuming to compute in higher dimensions
50
Quasi-Newton (cont)
• Quasi: $x_{k+1} = x_k - \frac{f(x_k)}{g_k}$
• where gk is a good and easily computed approx. to f'(xk)
• The convergence rate is usually inferior to that of Newton's
51
Secant Method
– Use the slope of secant to replace the slope of tangent
– Need two points to start
$g_k = \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}$
Or,
$x_{k+1} = x_k - f(x_k)\,\frac{x_k - x_{k-1}}{f(x_k) - f(x_{k-1})}$
Order: 1.62
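A minimal secant sketch. The function name, tolerance, and the guard against a flat secant are assumptions; the test equation is the one from the earlier bisection and false position example.

```python
import math

def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Iterate x_{k+1} = x_k - f_k (x_k - x_{k-1}) / (f_k - f_{k-1})."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if abs(f1) < tol or f1 == f0:   # converged, or secant slope is zero
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    return x1

f = lambda x: 3*x + math.sin(x) - math.exp(x)
root = secant(f, 0.0, 1.0)
print(root)
```

Unlike false position, the two most recent points are always kept, so the root need not stay bracketed.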
52
Idea:
• x2: Intersection of x-axis and a line interpolating (x0, f0) & (x1, f1)
• x3: Intersection of x-axis and a line interpolating (x1, f1) & (x2, f2)
• xk+1: Intersection of x-axis and a line interpolating (xk-1, fk-1) & (xk, fk)
[Figure: secant line through the points at x0 and x1, crossing the x-axis at x2]
53
Comparison
• Newton's method
• False Position
[Figure: Newton step from xk to xk+1 along the tangent with slope f'(xk)]
54
Secant vs. False Position
[Figure panels: False Position, Secant]
55
Beyond Linear Approximations
• Both secant and Newton use linear approximations
• Higher order approximation yields better accuracy?
• Try to fit a quadratic polynomial g(x) thru the following three points: g(xi) = f(xi), i = k, k–1, k–2
• Let xk+1 be the root of g(x) = 0
– Could have two roots; choose the one near xk
• This is called Muller's Method
56
Muller's Method
• See Textbook
• g(x) passes through (xk-2, fk-2), (xk-1, fk-1), (xk, fk)
Order: 1.84
57
Finding the Interpolating Quadratic Polynomial g(x)
$g(x) = a_2 x^2 + a_1 x + a_0$
$g(x_{k-2}) = a_2 x_{k-2}^2 + a_1 x_{k-2} + a_0 = f_{k-2}$
$g(x_{k-1}) = a_2 x_{k-1}^2 + a_1 x_{k-1} + a_0 = f_{k-1}$
$g(x_k) = a_2 x_k^2 + a_1 x_k + a_0 = f_k$
3 eqns to solve; unknowns: a2, a1, a0
Or,
$g(x) = f_k + \frac{f_k - f_{k-1}}{x_k - x_{k-1}}(x - x_k) + \frac{1}{x_k - x_{k-2}}\left[\frac{f_k - f_{k-1}}{x_k - x_{k-1}} - \frac{f_{k-1} - f_{k-2}}{x_{k-1} - x_{k-2}}\right](x - x_k)(x - x_{k-1})$
Double-check !
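One way to do the double-check numerically is to build g(x) from the second form and confirm that it reproduces all three data points. The function name and the sample points (taken from $f(x) = x^3 - x$) are assumptions for this sketch.

```python
def quadratic_through(p_km2, p_km1, p_k):
    """Quadratic g(x) through (x_{k-2}, f_{k-2}), (x_{k-1}, f_{k-1}), (x_k, f_k)."""
    (xa, fa), (xb, fb), (xc, fc) = p_km2, p_km1, p_k
    d1 = (fc - fb) / (xc - xb)          # slope between k and k-1
    d0 = (fb - fa) / (xb - xa)          # slope between k-1 and k-2
    d2 = (d1 - d0) / (xc - xa)          # second divided difference
    return lambda x: fc + d1 * (x - xc) + d2 * (x - xc) * (x - xb)

pts = [(2.0, 6.0), (1.2, 0.528), (0.5, -0.375)]
g = quadratic_through(*pts)
residuals = [g(x) - fx for x, fx in pts]
print(residuals)   # all near zero
```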
58
Summary
Iterative Methods for Solving f(x)=0
• Basic Idea:
– Local approximation + iterative computation
– At kth step, construct a polynomial p(x) of degree n, then solve p(x) = 0; take one of the roots as the next iterate, xk+1
• In other words,
– construct p(x)
– solve p(x) = 0; find the intersection between y=p(x) and x-axis
– choose one root
59
Revisit Newton
[Figure: tangent at xk crossing the x-axis at xk+1]
$p(x) = f(x_k) + f'(x_k)(x - x_k)$
$x_{k+1}$ is the root of $p(x)$: $x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}$
p(x) is a linear approximation passing thru (xk, fk) with the slope fk'
60
Revisit Secant
$p(x) = f_k + \frac{f_k - f_{k-1}}{x_k - x_{k-1}}(x - x_k)$
p(x) is a linear approximation passing thru (xk-1, fk-1) and (xk, fk) with the secant slope
[Figure: secant line through the points at xk-1 and xk]
61
Revisit Muller
• p(x) is a parabola (2nd degree approximation) passing thru three points
• Heuristic: choose the root that is closer to the previous iterate
• Potential problem:
– No solution (parabola and x-axis do not intersect!)
62
Categorize by Starting Condition
• Bisection and False Position
– Require non-trivial interval [a,b]
– Convergence guaranteed
• Newton: one point
– x0 → x1 → …
• Secant: two points
– x0 x1 → x2 → …
• Muller: three points
– x0 x1 x2 → x3 → …
• These methods converge faster but can diverge …
63
A Slightly Different Method:Inverse Interpolation
• Basic Idea (still the same)
– Local approximation + iterative computation
• Method:
– At kth step, construct a polynomial g(y) of degree n; then compute the next iterate by setting y = 0:
$x_{k+1} = g(y = 0)$
64
Inverse Linear Interpolation
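• Secant: Inverse linear Interpolation
[Figure: points (xk-1, fk-1), (xk, fk), (xk+1, fk+1) in the x-y plane]
$\frac{x - x_k}{y - f_k} = \frac{x_k - x_{k-1}}{f_k - f_{k-1}}$
$x = x_k + \frac{x_k - x_{k-1}}{f_k - f_{k-1}}(y - f_k)$
$x_{k+1} = x_k - f_k\,\frac{x_k - x_{k-1}}{f_k - f_{k-1}}$
x = g(y), xk+1 = g(0)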
65
Inverse Quadratic Interpolation
• Find another parabola: x = g(y)
• Set the next iterate xk+1 = g(0)
$g(y) = x_k + \frac{x_k - x_{k-1}}{f_k - f_{k-1}}(y - f_k) + \frac{1}{f_k - f_{k-2}}\left[\frac{x_k - x_{k-1}}{f_k - f_{k-1}} - \frac{x_{k-1} - x_{k-2}}{f_{k-1} - f_{k-2}}\right](y - f_k)(y - f_{k-1})$
66
Example (IQI)
• Solve $f(x) = x^3 - x = 0$
– x0 = 2, x1 = 1.2, x2 = 0.5
k | xk+1 | xk-2 | fk-2 | xk-1 | fk-1 | xk | fk
1 | 0.8102335181319 | 2 | 6 | 1.2 | 0.528 | 0.5 | -0.375
2 | 1.3884057934643 | 1.2 | 0.528 | 0.5 | -0.375 | 0.810234 | -0.278333
3 | 1.5252259894989 | 0.5 | -0.375 | 0.810234 | -0.27833 | 1.388406 | 1.287983
4 | 0.9414762279683 | 0.810234 | -0.27833 | 1.388406 | 1.287983 | 1.525226 | 2.022929
5 | 0.9844320642825 | 1.388406 | 1.287983 | 1.525226 | 2.022929 | 0.941476 | -0.106973
6 | 1.0010409722070 | 1.525226 | 2.022929 | 0.941476 | -0.10697 | 0.984432 | -0.030413
7 | 1.0000043435326 | 0.941476 | -0.10697 | 0.984432 | -0.03041 | 1.001041 | 0.002085
8 | 0.9999999997110 | 0.984432 | -0.03041 | 1.001041 | 0.002085 | 1.000004 | 8.69E-06
9 | 1.0000000000000 | 1.001041 | 0.002085 | 1.000004 | 8.69E-06 | 1 | -5.78E-10
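The first row of the table can be reproduced with a short sketch. It uses the Lagrange form of the inverse quadratic evaluated at y = 0, which is algebraically equivalent to the formula on the previous slide; the function name is hypothetical.

```python
def iqi_step(x0, x1, x2, f0, f1, f2):
    """Value at y = 0 of the quadratic x = g(y) through the three points."""
    return (x0 * f1 * f2 / ((f0 - f1) * (f0 - f2))
            + x1 * f0 * f2 / ((f1 - f0) * (f1 - f2))
            + x2 * f0 * f1 / ((f2 - f0) * (f2 - f1)))

f = lambda x: x**3 - x
xs = [2.0, 1.2, 0.5]
for _ in range(5):
    a, b, c = xs[-3:]                   # the three most recent iterates
    xs.append(iqi_step(a, b, c, f(a), f(b), f(c)))
print(xs[3])   # about 0.8102335, the k = 1 row
```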
67
Professional Root Finder
• Need guaranteed convergence and high convergence rate
• Combine bisection and Newton (or inverse quadratic interpolation)
– Perform Newton step whenever possible (convergence is achieved)
– If diverge, switch to bisection
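A minimal sketch of this combination, assuming a simple acceptance rule: take the Newton step when it stays inside the current bracket, otherwise bisect. The function name, tolerance, and acceptance test are illustrative choices, not a production algorithm.

```python
def newton_bisection(f, df, a, b, tol=1e-12, max_iter=100):
    """Safeguarded Newton on a non-trivial interval [a, b] with a < b."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("[a, b] is not a non-trivial interval")
    x = 0.5 * (a + b)
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        if fa * fx < 0:                 # root lies in [a, x]
            b, fb = x, fx
        else:                           # root lies in [x, b]
            a, fa = x, fx
        dfx = df(x)
        x_new = x - fx / dfx if dfx != 0 else None
        if x_new is None or not (a < x_new < b):
            x_new = 0.5 * (a + b)       # Newton step rejected: bisect
        x = x_new
    return x

f = lambda x: x**3 - 3*x**2 - x + 9
df = lambda x: 3*x**2 - 6*x - 1
root = newton_bisection(f, df, -10.0, 10.0)
print(root)
```

On the earlier cubic the first Newton step (from 0 to 9) leaves the bracket and is replaced by a bisection step, after which Newton steps are accepted.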
68
Brent’s Method
• Guaranteed to converge
• Combine root bracketing, bisection and inverse quadratic interpolation
– van Wijngaarden-Dekker-Brent method
– Zbrent in NR
• Brent uses a similar idea in one-dimensional optimization problems
– Brent in NR