lecture6.handout



    Numerical Methods I: Solving Nonlinear Equations

    Aleksandar Donev, Courant Institute, NYU¹

    [email protected]

    ¹ Course G63.2010.001 / G22.2420-001, Fall 2010

    October 14th, 2010



    Final Presentations

    The final project write-up will be due Sunday Dec. 26th by midnight (I have to start grading by 12/27 due to University deadlines).

    You will also need to give a 15-minute presentation in front of me and other students.

    Our last class is officially scheduled for Tuesday 12/14, 5-7pm, and the final exam for Thursday 12/23, 5-7pm. Neither of these is good! By the end of next week, October 23rd, please let me know the following:

    Are you willing to present early, Thursday December 16th, during the usual class time?

    Do you want to present during the officially scheduled last class, Thursday 12/23, 5-7pm?

    If neither of the above, tell me when you cannot present between Monday Dec. 20th and Thursday Dec. 23rd (finals week).


    Basics of Nonlinear Solvers

    Fundamentals

    Simplest problem: Root finding in one dimension:

    \[ f(x) = 0 \quad \text{with } x \in [a, b] \]

    Or more generally, solving a square system of nonlinear equations

    \[ \mathbf{f}(\mathbf{x}) = \mathbf{0} \quad\Leftrightarrow\quad f_i(x_1, x_2, \ldots, x_n) = 0 \;\text{ for } i = 1, \ldots, n. \]

    There can be no closed-form answer, so just as for eigenvalues, we need iterative methods.

    Most generally, starting from $m \geq 1$ initial guesses $x_0, x_1, \ldots, x_m$, iterate:

    \[ x_{k+1} = \phi(x_k, x_{k-1}, \ldots, x_{k-m}). \]



    Basics of Nonlinear Solvers

    Order of convergence

    Consider one-dimensional root finding and let the actual root be $\alpha$, $f(\alpha) = 0$.

    A sequence of iterates $x_k$ that converges to $\alpha$ has order of convergence $p \geq 1$ if, as $k \to \infty$,

    \[ \frac{|x_{k+1} - \alpha|}{|x_k - \alpha|^p} = \frac{|e_{k+1}|}{|e_k|^p} \to C = \mathrm{const}, \]

    where the constant satisfies $0 < C < 1$ if $p = 1$ (linear convergence).


    Basics of Nonlinear Solvers

    Local vs. global convergence

    A good initial guess is extremely important in nonlinear solvers! Assume we are looking for a unique root $a \leq \alpha \leq b$ starting with an initial guess $a \leq x_0 \leq b$.

    A method has local convergence if it converges to a given root $\alpha$ for any initial guess that is sufficiently close to $\alpha$ (in a neighborhood of the root).

    A method has global convergence if it converges to the root for any initial guess.

    General rule: global convergence requires a slower (careful) method but is safer.

    It is best to first use a global method to find a good initial guess close to $\alpha$, and then switch to a faster local method.



    Basics of Nonlinear Solvers

    Conditioning of root finding

    \[ f(\alpha + \delta\alpha) \approx f(\alpha) + \delta\alpha\, f'(\alpha) = \delta f \]

    \[ |\delta\alpha| \approx \frac{|\delta f|}{|f'(\alpha)|} \quad\Rightarrow\quad \kappa_{\mathrm{abs}} = \left| f'(\alpha) \right|^{-1}. \]

    The problem of finding a simple root is well-conditioned when $|f'(\alpha)|$ is far from zero.

    Finding roots with multiplicity $m > 1$ is ill-conditioned:

    \[ f'(\alpha) = \cdots = f^{(m-1)}(\alpha) = 0 \quad\Rightarrow\quad |\delta\alpha| \approx \left[ \frac{|\delta f|}{\left| f^{(m)}(\alpha) \right|} \right]^{1/m}. \]

    Note that finding roots of algebraic equations (polynomials) is a separate subject of its own that we skip.



    One Dimensional Root Finding

    The bisection and Newton algorithms



    One Dimensional Root Finding

    Bisection

    The first step is to locate a root by searching for a sign change, i.e., finding $a_0$ and $b_0$ such that

    \[ f(a_0)\, f(b_0) < 0. \]
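    The rest of the bisection slide does not survive in this transcript, so here is a minimal MATLAB sketch of the idea: repeatedly halve the bracket, keeping the half where the sign change persists. The function handle and tolerance below are illustrative choices, not from the handout.

        f = @(x) cos(x) - x;        % illustrative function with a root near 0.739
        a = 0; b = 1;               % bracket with f(a)*f(b) < 0
        tol = 1e-10;
        while b - a > tol
            m = (a + b)/2;          % midpoint of the current bracket
            if f(a)*f(m) <= 0
                b = m;              % the root lies in [a, m]
            else
                a = m;              % the root lies in [m, b]
            end
        end
        root = (a + b)/2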


    One Dimensional Root Finding

    Newton's Method

    Bisection is a slow but sure method. It uses no information about the value of the function or its derivatives.

    Better convergence, of order $p = (1 + \sqrt{5})/2 \approx 1.618$ (the golden ratio), can be achieved by using the value of the function at two points, as in the secant method.

    Achieving second-order convergence requires also evaluating the function derivative.

    Linearize the function around the current guess using a Taylor series:

    \[ f(x_{k+1}) \approx f(x_k) + (x_{k+1} - x_k)\, f'(x_k) = 0 \]

    \[ \Rightarrow\quad x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}. \]
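    As a quick illustration of the update above, here is a minimal MATLAB sketch of the Newton iteration; the function, its derivative, the starting guess, and the tolerance are all illustrative choices, not part of the handout.

        f  = @(x) cos(x) - x;             % illustrative f with a simple root
        fp = @(x) -sin(x) - 1;            % its derivative f'(x)
        x = 1.0;                          % initial guess (must be reasonably good)
        for k = 1:20
            dx = -f(x)/fp(x);             % Newton step: x_{k+1} = x_k - f(x_k)/f'(x_k)
            x = x + dx;
            if abs(dx) < 1e-12, break; end
        end
        x                                 % converges to the root near 0.7391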


    One Dimensional Root Finding

    Convergence of Newton's Method

    Taylor series with remainder:

    \[ f(\alpha) = 0 = f(x_k) + (\alpha - x_k)\, f'(x_k) + \frac{1}{2} (\alpha - x_k)^2 f''(\xi), \quad \text{for some } \xi \in [x_k, \alpha]. \]

    After dividing by $f'(x_k) \neq 0$ we get

    \[ \left[ x_k - \frac{f(x_k)}{f'(x_k)} \right] - \alpha = \frac{1}{2} (\alpha - x_k)^2\, \frac{f''(\xi)}{f'(x_k)} \]

    \[ x_{k+1} - \alpha = e_{k+1} = \frac{1}{2}\, e_k^2\, \frac{f''(\xi)}{f'(x_k)}, \]

    which shows second-order convergence:

    \[ \frac{|x_{k+1} - \alpha|}{|x_k - \alpha|^2} = \frac{|e_{k+1}|}{|e_k|^2} = \frac{|f''(\xi)|}{2\, |f'(x_k)|} \to \frac{|f''(\alpha)|}{2\, |f'(\alpha)|}. \]


    One Dimensional Root Finding

    Proof of Local Convergence

    \[ \frac{|x_{k+1} - \alpha|}{|x_k - \alpha|^2} = \frac{|f''(\xi)|}{2\, |f'(x_k)|} \leq M = \sup_{\alpha - |e_0| \,\leq\, x, y \,\leq\, \alpha + |e_0|} \frac{|f''(x)|}{2\, |f'(y)|} \]

    \[ M\, |x_{k+1} - \alpha| = E_{k+1} \leq \left( M\, |x_k - \alpha| \right)^2 = E_k^2, \]

    which will converge if $E_0 < 1$, i.e., if the initial guess is sufficiently close to the root, $|e_0| < M^{-1}$.


    One Dimensional Root Finding

    Fixed-Point Iteration

    Another way to devise iterative root finding is to rewrite $f(x) = 0$ in an equivalent form

    \[ x = \phi(x). \]

    Then we can use the fixed-point iteration

    \[ x_{k+1} = \phi(x_k), \]

    whose fixed point (limit), if it converges, is $x^\star = \alpha$.

    For example, recall from the first lecture solving $x^2 = c$ via the Babylonian method for square roots,

    \[ x_{n+1} = \phi(x_n) = \frac{1}{2} \left( \frac{c}{x_n} + x_n \right), \]

    which converges (quadratically) for any non-zero initial guess.
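    A few lines of MATLAB make the quadratic convergence visible; this is just a sketch, with $c = 2$ and the starting guess chosen arbitrarily.

        c = 2; x = 1;                       % any non-zero initial guess works
        for k = 1:6
            x = (c/x + x)/2;                % one fixed-point (Babylonian) step
            fprintf('%d  %.16f\n', k, x);   % correct digits roughly double each step
        end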


    One Dimensional Root Finding


    Convergence theory

    It can be proven that the fixed-point iteration $x_{k+1} = \phi(x_k)$ converges if $\phi(x)$ is a contraction mapping:

    \[ \left| \phi'(x) \right| \leq K < 1 \]

    for all $x$ in a neighborhood of the fixed point.
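    To see this in action, here is a tiny MATLAB illustration (an assumed example, not from the handout): $\phi(x) = \cos(x)$ is a contraction near its fixed point, since $|\phi'(x)| = |\sin(x)| \approx 0.67 < 1$ there, so the iteration converges linearly.

        phi = @(x) cos(x);      % contraction near the fixed point alpha ~ 0.7391
        x = 1.0;
        for k = 1:40
            x = phi(x);         % error shrinks by roughly |phi'(alpha)| ~ 0.67 per step
        end
        x                       % approaches the fixed point of x = cos(x)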


    Applications of general convergence theory

    Think of Newton's method

    \[ x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} \]

    as a fixed-point iteration $x_{k+1} = \phi(x_k)$ with iteration function

    \[ \phi(x) = x - \frac{f(x)}{f'(x)}. \]

    We can directly show quadratic convergence because (also see homework)

    \[ \phi'(x) = \frac{f(x)\, f''(x)}{[f'(x)]^2} \quad\Rightarrow\quad \phi'(\alpha) = 0, \]

    \[ \phi''(\alpha) = \frac{f''(\alpha)}{f'(\alpha)} \neq 0. \]


    One Dimensional Root Finding


    Stopping Criteria

    A good library function for root finding has to implement careful termination criteria.

    An obvious option is to terminate when the residual becomes small,

    \[ |f(x_k)| < \varepsilon, \]

    which is only good for very well-conditioned problems, $|f'(\alpha)| \sim 1$.

    Another option is to terminate when the increment becomes small,

    \[ |x_{k+1} - x_k| < \varepsilon. \]

    For fixed-point iteration

    \[ x_{k+1} - x_k = e_{k+1} - e_k \approx \left[ \phi'(\alpha) - 1 \right] e_k \quad\Rightarrow\quad e_k \approx \frac{x_{k+1} - x_k}{\phi'(\alpha) - 1}, \]

    so we see that the increment test works for rapidly converging iterations ($\phi'(\alpha) \ll 1$).
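    In code, the two tests are often combined; the sketch below is an illustrative Newton loop with made-up tolerances (not library code) that stops on either a small residual or a small increment.

        f  = @(x) x^3 - 2;  fp = @(x) 3*x^2;   % illustrative function and derivative
        x = 1.5;  epsf = 1e-14;  epsx = 1e-14;
        for k = 1:50
            dx = -f(x)/fp(x);
            x  = x + dx;
            % residual test (needs good conditioning) OR increment test (needs fast convergence)
            if abs(f(x)) < epsf || abs(dx) < epsx*(1 + abs(x))
                break
            end
        end
        x                                       % the cube root of 2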


    One Dimensional Root Finding


    In practice

    A robust but fast algorithm for root finding would combine bisection with Newton's method.

    Specifically, a method like Newton's, which can easily take huge steps in the wrong direction and lead far from the current point, must be safeguarded by a method that ensures one does not leave the search interval and that the zero is not missed.

    Once $x_k$ is close to $\alpha$, the safeguard will not be used and quadratic or faster convergence will be achieved.

    Newton's method requires first-order derivatives, so other methods that require function evaluations only are often preferred. MATLAB's function fzero combines bisection, the secant method, and inverse quadratic interpolation, and is fail-safe.


    One Dimensional Root Finding


    Find zeros of $a \sin(x) + b \exp(-x^2/2)$ in MATLAB

        % f=@mfile uses a function in an m-file
        % Parameterized functions are created with:
        a = 1; b = 2;
        f = @(x) a*sin(x) + b*exp(-x.^2/2);   % Handle

        figure(1)
        ezplot(f, [-5, 5]); grid

        x1 = fzero(f, [-2, 0])
        [x2, f2] = fzero(f, 2.0)

        x1 = -1.227430849357917
        x2 = 3.155366415494801
        f2 = 2.116362640691705e-16


    One Dimensional Root Finding


    Figure of f(x)

    [Figure: plot of $a\sin(x) + b\exp(-x^2/2)$ for $x \in [-5, 5]$; not reproduced in this transcript.]


    Systems of Non-Linear Equations


    Multi-Variable Taylor Expansion

    We are after solving a square system of nonlinear equations for some variables $\mathbf{x}$:

    \[ \mathbf{f}(\mathbf{x}) = \mathbf{0} \quad\Leftrightarrow\quad f_i(x_1, x_2, \ldots, x_n) = 0 \;\text{ for } i = 1, \ldots, n. \]

    It is convenient to focus on one of the equations, i.e., consider a scalar function $f(\mathbf{x})$.

    The usual Taylor series is replaced by

    \[ f(\mathbf{x} + \Delta\mathbf{x}) = f(\mathbf{x}) + \mathbf{g}^T (\Delta\mathbf{x}) + \frac{1}{2} (\Delta\mathbf{x})^T \mathbf{H}\, (\Delta\mathbf{x}) \]

    where the gradient vector is

    \[ \mathbf{g} = \nabla_{\mathbf{x}} f = \left[ \frac{\partial f}{\partial x_1},\; \frac{\partial f}{\partial x_2},\; \cdots,\; \frac{\partial f}{\partial x_n} \right]^T \]

    and the Hessian matrix is

    \[ \mathbf{H} = \nabla^2_{\mathbf{x}} f = \left[ \frac{\partial^2 f}{\partial x_i\, \partial x_j} \right]_{ij}. \]


    Systems of Non-Linear Equations


    Newton's Method for Systems of Equations

    It is much harder, if not impossible, to do globally convergent methods like bisection in higher dimensions!

    A good initial guess is therefore a must when solving systems, and Newton's method can be used to refine the guess.

    The first-order Taylor series is

    \[ \mathbf{f}(\mathbf{x}_k + \Delta\mathbf{x}) \approx \mathbf{f}(\mathbf{x}_k) + \mathbf{J}(\mathbf{x}_k)\, \Delta\mathbf{x} = \mathbf{0}, \]

    where the Jacobian $\mathbf{J}$ has the gradients of $f_i(\mathbf{x})$ as rows:

    \[ \left[ \mathbf{J}(\mathbf{x}) \right]_{ij} = \frac{\partial f_i}{\partial x_j}. \]

    So taking a Newton step requires solving a linear system:

    \[ \mathbf{J}(\mathbf{x}_k)\, \Delta\mathbf{x} = -\mathbf{f}(\mathbf{x}_k), \quad \text{but denote } \mathbf{J} \equiv \mathbf{J}(\mathbf{x}_k), \]

    \[ \mathbf{x}_{k+1} = \mathbf{x}_k + \Delta\mathbf{x} = \mathbf{x}_k - \mathbf{J}^{-1} \mathbf{f}(\mathbf{x}_k). \]
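    As an illustration of the update above, here is a minimal MATLAB sketch for a small $2 \times 2$ system; the particular equations, the analytic Jacobian, and the starting guess are illustrative choices, not taken from the lecture.

        f = @(x) [ x(1)^2 + x(2)^2 - 1 ;     % unit circle
                   x(2) - x(1)^2 ];          % parabola
        J = @(x) [ 2*x(1)   2*x(2) ;         % analytic Jacobian, rows = gradients of f_i
                  -2*x(1)   1      ];
        x = [0.8; 0.5];                      % initial guess
        for k = 1:20
            dx = J(x) \ (-f(x));             % solve the linear system J*dx = -f
            x  = x + dx;
            if norm(dx) < 1e-12, break; end
        end
        x                                    % intersection of the circle and the parabola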

    Systems of Non-Linear Equations


    Convergence of Newton's Method

    Newton's method converges quadratically if started sufficiently close to a root $\mathbf{x}^\star$ at which the Jacobian is not singular.

    \[ \mathbf{x}_{k+1} - \mathbf{x}^\star = \mathbf{e}_{k+1} = \mathbf{x}_k - \mathbf{J}^{-1}\mathbf{f}(\mathbf{x}_k) - \mathbf{x}^\star = \mathbf{e}_k - \mathbf{J}^{-1}\mathbf{f}(\mathbf{x}_k), \]

    but using a second-order Taylor series

    \[ \mathbf{J}^{-1}\mathbf{f}(\mathbf{x}_k) \approx \mathbf{J}^{-1}\left[ \mathbf{f}(\mathbf{x}^\star) + \mathbf{J}\,\mathbf{e}_k + \frac{1}{2}\,\mathbf{e}_k^T\,\mathbf{H}\,\mathbf{e}_k \right] = \mathbf{e}_k + \frac{\mathbf{J}^{-1}}{2}\,\mathbf{e}_k^T\,\mathbf{H}\,\mathbf{e}_k \]

    \[ \left\| \mathbf{e}_{k+1} \right\| = \left\| \frac{\mathbf{J}^{-1}}{2}\,\mathbf{e}_k^T\,\mathbf{H}\,\mathbf{e}_k \right\| \leq \frac{\left\|\mathbf{J}^{-1}\right\| \left\|\mathbf{H}\right\|}{2}\, \left\|\mathbf{e}_k\right\|^2. \]

    Fixed-point iteration theory generalizes to multiple variables; e.g., the role of $\phi'(\alpha)$ is played by the Jacobian of the iteration function.


    Systems of Non-Linear Equations


    In practice

    It is much harder to construct general robust solvers in higher dimensions, and some problem-specific knowledge is required.

    There is no built-in function for solving nonlinear systems in MATLAB, but the Optimization Toolbox has fsolve (a short usage sketch follows below).

    In many practical situations there is some continuity of the problem, so that a previous solution can be used as an initial guess.

    For example, implicit methods for differential equations have a time-dependent Jacobian $\mathbf{J}(t)$, and in many cases the solution $\mathbf{x}(t)$ evolves smoothly in time.

    For large problems, specialized sparse-matrix solvers need to be used.

    In many cases derivatives are not provided, but there are some techniques for automatic differentiation.
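    A minimal sketch of calling fsolve on a small illustrative system (the system and the starting guess are assumptions for the example; fsolve itself requires the Optimization Toolbox):

        f  = @(x) [ x(1)^2 + x(2)^2 - 1 ; x(2) - x(1)^2 ];   % illustrative system
        x0 = [0.8; 0.5];                                      % initial guess
        [x, fval, exitflag] = fsolve(f, x0)                   % fsolve estimates the Jacobian itself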


    Intro to Unconstrained Optimization


    Formulation

    Optimization problems are among the most important in engineering and finance, e.g., minimizing production cost, maximizing profits, etc.:

    \[ \min_{\mathbf{x} \in \mathbb{R}^n} f(\mathbf{x}) \]

    where $\mathbf{x}$ are some variable parameters and $f : \mathbb{R}^n \to \mathbb{R}$ is a scalar objective function.

    Observe that one only needs to consider minimization, since

    \[ \max_{\mathbf{x} \in \mathbb{R}^n} f(\mathbf{x}) = -\min_{\mathbf{x} \in \mathbb{R}^n} \left[ -f(\mathbf{x}) \right]. \]

    A local minimum $\mathbf{x}^\star$ is optimal in some neighborhood,

    \[ f(\mathbf{x}^\star) \leq f(\mathbf{x}) \quad \forall \mathbf{x} \;\text{ s.t. }\; \left\| \mathbf{x} - \mathbf{x}^\star \right\| \leq R \]

    for some $R > 0$ (think of finding the bottom of a valley).

    Finding the global minimum is generally not possible for arbitrary functions (think of finding Mt. Everest without a satellite).


    Intro to Unconstrained Optimization


    Connection to nonlinear systems

    Assume that the objective function is differentiable (i.e., the first-order Taylor series converges, or the gradient exists).

    Then a necessary condition for a local minimizer is that $\mathbf{x}^\star$ be a critical point,

    \[ \mathbf{g}(\mathbf{x}^\star) = \nabla_{\mathbf{x}} f(\mathbf{x}^\star) = \left[ \frac{\partial f}{\partial x_i}(\mathbf{x}^\star) \right]_i = \mathbf{0}, \]

    which is a system of nonlinear equations!

    In fact similar methods, such as Newton or quasi-Newton, apply to both problems.

    Vice versa, observe that solving $\mathbf{f}(\mathbf{x}) = \mathbf{0}$ is equivalent to the optimization problem

    \[ \min_{\mathbf{x}} \; \mathbf{f}(\mathbf{x})^T \mathbf{f}(\mathbf{x}), \]

    although this is only recommended under special circumstances.


    Intro to Unconstrained Optimization


    Sufficient Conditions

    Assume now that the objective function is twice-differentiable (i.e., the Hessian exists).

    A critical point $\mathbf{x}^\star$ is a local minimum if the Hessian is positive definite,

    \[ \mathbf{H}(\mathbf{x}^\star) = \nabla^2_{\mathbf{x}} f(\mathbf{x}^\star) \succ 0, \]

    which means that the minimum really looks like a valley or a convex bowl (a small numerical check of this condition is sketched after this slide).

    At any local minimum the Hessian is positive semi-definite,

    \[ \nabla^2_{\mathbf{x}} f(\mathbf{x}^\star) \succeq 0. \]

    Methods that require Hessian information converge fast but are expensive (next class).
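    In MATLAB, positive definiteness of a computed Hessian can be checked with eig or chol; the matrix below is just an assumed example of a Hessian evaluated at a critical point.

        H = [ 2  0 ;                   % illustrative Hessian H(x*) at a critical point
              0  6 ];
        eig(H)                         % all eigenvalues > 0  =>  positive definite
        [R, p] = chol(H);              % alternatively: p == 0 iff H is positive definite
        p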


    Intro to Unconstrained Optimization


    Direct-Search Methods

    A direct search method only requires $f(\mathbf{x})$ to be continuous but not necessarily differentiable, and requires only function evaluations.

    Methods that do a search similar to that in bisection can be devised in higher dimensions also, but they may fail to converge and are usually slow.

    The MATLAB function fminsearch uses the Nelder-Mead or simplex-search method, which can be thought of as rolling a simplex downhill to find the bottom of a valley. But there are many others, and this is an active research area.

    Curse of dimensionality: as the number of variables (dimensionality) $n$ becomes larger, direct search becomes hopeless, since the number of samples needed grows as $2^n$!


    Intro to Unconstrained Optimization


    Minimum of $100(x_2 - x_1^2)^2 + (a - x_1)^2$ in MATLAB

        % Rosenbrock or "banana" function:
        a = 1;
        banana = @(x) 100*(x(2) - x(1)^2)^2 + (a - x(1))^2;

        % This function must accept array arguments!
        banana_xy = @(x1, x2) 100*(x2 - x1.^2).^2 + (a - x1).^2;

        figure(1); ezsurf(banana_xy, [0, 2, 0, 2])

        [x, y] = meshgrid(linspace(0, 2, 100));
        figure(2); contourf(x, y, banana_xy(x, y), 100)

        % Correct answers are x = [1, 1] and f(x) = 0
        [x, fval] = fminsearch(banana, [-1.2, 1], optimset('TolX', 1e-8))
        x = 0.999999999187814   0.999999998441919
        fval = 1.099088951919573e-18


    Intro to Unconstrained Optimization


    Figure of Rosenbrock f(x)

    [Figure: surface and contour plots of the Rosenbrock function $100(x_2 - x_1^2)^2 + (a - x_1)^2$; not reproduced in this transcript.]


    Conclusions


    Conclusions/Summary

    Root finding is well-conditioned for simple roots (unit multiplicity), ill-conditioned otherwise.

    Methods for solving nonlinear equations are always iterative, and the order of convergence matters: second order is usually good enough.

    A good method uses a higher-order unsafe method, such as Newton's method, near the root, but safeguards it with something like the bisection method.

    Newton's method is second-order but requires derivative/Jacobian evaluation. In higher dimensions, having a good initial guess for Newton's method becomes very important.

    Quasi-Newton methods can alleviate the cost of solving the Jacobian linear system.
