Newton Handouts




Nonlinear Optimization
Overview of methods; the Newton method with line search

Niclas Börlin

Department of Computing Science

Umeå University

[email protected]

November 19, 2007

© 2007 Niclas Börlin, CS, UmU

Methods for unconstrained optimization

Overview

Most deterministic methods for unconstrained optimization have the following features:

They are iterative, i.e. they start with an initial guess x_0 of the variables and try to find better points {x_k}, k = 1, ....

They are descent methods, i.e. at each iteration k,

    f(x_{k+1}) < f(x_k).

Line search

In the line search strategy, the algorithm chooses a search direction p_k and searches along this direction from x_k for a new point with a lower function value f(x_k + α p_k), where the scalar α > 0 is called the step length.

In theory we would like optimal step lengths, but in practice it is more efficient to test trial step lengths until we find one that gives us a good enough point.
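As a concrete illustration of this iterative, trial-step structure, here is a minimal Python sketch (not from the handout); the quadratic test function, the steepest-descent direction, and the step-halving rule are all assumptions chosen for brevity:

    import numpy as np

    def f(x):
        # Toy objective (an assumption for this sketch): a convex quadratic.
        return x[0]**2 + 10 * x[1]**2

    def grad_f(x):
        return np.array([2 * x[0], 20 * x[1]])

    x = np.array([2.0, 1.0])              # initial guess x_0
    for k in range(50):
        p = -grad_f(x)                    # a descent direction (steepest descent)
        alpha = 1.0
        # Test trial step lengths until the descent condition f(x_{k+1}) < f(x_k) holds.
        while f(x + alpha * p) >= f(x):
            alpha /= 2
        x = x + alpha * p
    print(x)                              # approaches the minimizer (0, 0)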



[Figure: contour plot of f with a line search step along the direction p from the iterate x_k.]

    Trust-region

In the trust-region strategy, the algorithm defines a region of trust around x_k where the current model function m_k is trusted.

The region of trust is usually defined as

    ‖p‖_2 ≤ Δ,

where the scalar Δ is called the trust-region radius.

A candidate step p is found by approximately solving the following subproblem:

    min_p m_k(x_k + p)  s.t.  ‖p‖_2 ≤ Δ.

If the candidate step does not produce a good enough new point, we shrink the trust-region radius and re-solve the subproblem.
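As an illustration (not from the handout), the sketch below implements this shrink-and-resolve loop in Python, with a quadratic model m_k and the classical Cauchy-point step as the approximate subproblem solver; the test function and all constants are assumptions:

    import numpy as np

    def f(x): return x[0]**2 + 10 * x[1]**2
    def grad_f(x): return np.array([2 * x[0], 20 * x[1]])
    def hess_f(x): return np.diag([2.0, 20.0])

    x, Delta = np.array([2.0, 1.0]), 1.0          # iterate and trust-region radius
    for k in range(50):
        g, B = grad_f(x), hess_f(x)
        if np.linalg.norm(g) < 1e-8:
            break
        # Approximately solve  min_p m_k(x_k + p)  s.t.  ||p||_2 <= Delta
        # using the Cauchy point: the model minimizer along -g inside the region.
        gBg = g @ B @ g
        tau = 1.0 if gBg <= 0 else min(1.0, np.linalg.norm(g)**3 / (Delta * gBg))
        p = -(tau * Delta / np.linalg.norm(g)) * g
        # Compare actual vs. model-predicted reduction to judge the step.
        predicted = -(g @ p + 0.5 * p @ B @ p)
        rho = (f(x) - f(x + p)) / predicted
        if rho < 0.25:
            Delta *= 0.25                          # poor step: shrink and re-solve
        else:
            x = x + p                              # good step: accept the point
            if rho > 0.75:
                Delta = min(2 * Delta, 10.0)       # very good step: expand region
    print(x)                                       # approaches the minimizer (0, 0)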


[Figure: contour plot of f with the trust region around the iterate x_k.]

In the line search strategy, the direction is chosen first, followed by the distance.

In the trust-region strategy, the maximum distance is chosen first, followed by the direction.

[Figures: contour plots of f comparing a line search step along p with a trust-region step from the iterate x_k.]

Convergence

Convergence rate
Linear convergence
Quadratic convergence
Local vs. global convergence
Globalization strategies

    Convergence rate

In order to compare different iterative methods, we need an efficiency measure.

Since we do not know the number of iterations in advance, the computational complexity measure used by direct methods cannot be used.

Instead, the concept of a convergence rate is defined.


Assume we have a sequence {x_k} that converges to a solution x*.

Define the sequence of errors as

    e_k = x_k − x*

and note that

    lim_{k→∞} e_k = 0.

We say that the sequence {x_k} converges to x* with rate r and rate constant C if

    lim_{k→∞} ‖e_{k+1}‖ / ‖e_k‖^r = C

and C < ∞.
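This limit can be estimated numerically from consecutive errors via r ≈ log(‖e_{k+1}‖/‖e_k‖) / log(‖e_k‖/‖e_{k−1}‖). A small sketch (an illustration, not from the slides) applies this to the Newton-Raphson iteration for √2, which converges with r = 2:

    import numpy as np

    # Newton-Raphson iteration for g(x) = x^2 - 2, whose root is sqrt(2).
    xs = [1.5]
    for _ in range(4):
        x = xs[-1]
        xs.append(x - (x**2 - 2) / (2 * x))

    e = [abs(x - np.sqrt(2)) for x in xs]          # errors e_k = |x_k - x*|
    for k in range(1, 3):
        # Estimated rate and rate constant from three consecutive errors.
        r = np.log(e[k + 1] / e[k]) / np.log(e[k] / e[k - 1])
        C = e[k + 1] / e[k]**r
        print(f"k={k}: r = {r:.2f}, C = {C:.2f}")  # r approaches 2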


    Quadratic convergence, examples

For r = 2, C = 0.1 and e_0 = 1, the sequence becomes

    1, 10⁻¹, 10⁻³, 10⁻⁷, ...

For r = 2, C = 3 and e_0 = 1, the sequence diverges:

    1, 3, 27, ...

For r = 2, C = 3 and e_0 = 0.1, the sequence becomes

    0.1, 0.03, 0.0027, ...,

i.e. it converges despite C > 1.

For quadratic convergence, the constant C is of lesser importance. Instead it is important that the initial approximation is close enough to the solution, i.e. that e_0 is small.
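The three sequences above follow the error recurrence e_{k+1} = C·e_k^r with r = 2; the short Python check below reproduces them (the helper name quad_errors is an illustrative choice):

    def quad_errors(C, e0, n=4):
        # Generate errors from the quadratic-convergence model e_{k+1} = C * e_k**2.
        e, seq = e0, [e0]
        for _ in range(n - 1):
            e = C * e**2
            seq.append(e)
        return seq

    print(quad_errors(0.1, 1.0))   # [1, 0.1, 0.001, 1e-07]        -> converges
    print(quad_errors(3.0, 1.0))   # [1, 3, 27, 2187]              -> diverges
    print(quad_errors(3.0, 0.1))   # [0.1, 0.03, 0.0027, 2.19e-05] -> converges, C > 1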


    Local vs. global convergence

A method is called locally convergent if it produces a sequence that converges toward a minimizer x* provided a close enough starting approximation.

A method is called globally convergent if it produces a sequence that converges toward a minimizer x* from any starting approximation.

Note that global convergence does not imply convergence toward a global minimizer.


    Globalization strategies

The line search and trust-region methods are sometimes called globalization strategies, since they modify a core method (typically locally convergent) to become globally convergent.

There are two efficiency requirements on any globalization strategy:

Far from the solution, it should stop the method from going out of control.

Close to the solution, when the core method is efficient, it should interfere as little as possible.


    Descent directions

Consider the Taylor expansion of the objective function along a search direction p:

    f(x_k + αp) = f(x_k) + α pᵀ∇f_k + (1/2) α² pᵀ∇²f(x_k + tp) p,

for some t ∈ (0, α).

Any direction p such that pᵀ∇f_k < 0 is called a descent direction, since it guarantees that f(x_k + αp) < f(x_k) for all small enough α > 0.

If the search direction has the form

    p_k = −B_k⁻¹ ∇f_k,

the descent condition becomes

    p_kᵀ ∇f_k = −∇f_kᵀ B_k⁻¹ ∇f_k < 0,

which is satisfied whenever B_k is positive definite.

Line search

Ideally we would like to find the global minimizer of f(x_k + αp_k) with respect to α at every iteration. This is called an exact line search.

However, it is possible to construct inexact line search methods that produce an adequate reduction of f at a minimal cost.

Inexact line search methods construct a number of candidate values for α and stop when certain conditions are satisfied.
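For comparison, here is a sketch of an exact line search on a toy quadratic, using scipy.optimize.minimize_scalar as the one-dimensional minimizer (the bracket (0, 1) and the test function are assumptions; for this convex φ the local minimizer is global):

    import numpy as np
    from scipy.optimize import minimize_scalar

    def f(x): return x[0]**2 + 10 * x[1]**2
    def grad_f(x): return np.array([2 * x[0], 20 * x[1]])

    x = np.array([2.0, 1.0])
    p = -grad_f(x)                                    # a descent direction
    phi = lambda alpha: f(x + alpha * p)              # phi(alpha) = f(x_k + alpha p_k)
    res = minimize_scalar(phi, bounds=(0.0, 1.0), method="bounded")
    print(res.x, phi(res.x) < f(x))                   # optimal step; decrease: True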


    Overview

    Exact and inexact line searches

    The Sufficient Decrease Condition

    Backtracking

    The Curvature Condition

    The Wolfe Condition

    The Sufficient Decrease Condition

Mathematically, the descent condition f(x_{k+1}) < f(x_k) is not enough to guarantee convergence toward a minimizer; the decrease must also be sufficiently large. The sufficient decrease (Armijo) condition requires

    f(x_k + αp_k) ≤ f(x_k) + c₁ α ∇f_kᵀ p_k

for some constant c₁ ∈ (0, 1).
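A minimal backtracking implementation of this condition (an illustrative sketch; the constants alpha = 1, rho = 0.5, c1 = 1e-4 are conventional but assumed here):

    import numpy as np

    def backtracking(f, grad_f, x, p, alpha=1.0, rho=0.5, c1=1e-4):
        # Shrink alpha until the sufficient decrease (Armijo) condition holds:
        # f(x + alpha p) <= f(x) + c1 alpha grad_f(x)^T p.
        fx, slope = f(x), grad_f(x) @ p        # slope < 0 for a descent direction
        while f(x + alpha * p) > fx + c1 * alpha * slope:
            alpha *= rho
        return alpha

    # Usage on a toy quadratic:
    f = lambda x: x[0]**2 + 10 * x[1]**2
    grad_f = lambda x: np.array([2 * x[0], 20 * x[1]])
    x = np.array([2.0, 1.0])
    p = -grad_f(x)
    print(backtracking(f, grad_f, x, p))       # an acceptable step length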


The Newton Method

The Newton-Raphson method in ℝ
The Classical Newton minimization method in ℝⁿ
Geometrical interpretation; the model function
Properties of the Newton method
Ensuring a descent direction
The modified Newton algorithm with line search


    Properties of the Newton method

Advantages:

It converges quadratically toward a stationary point.

Disadvantages:

It does not necessarily converge toward a minimizer.

It may diverge if the starting approximation is too far from the solution.

It will fail if ∇²f(x_k) is not invertible for some k.

It requires second-order information ∇²f(x_k).

Newton's method is rarely used in its classical formulation. However, many methods may be seen as approximations of Newton's method.


    Ensuring a descent direction

Since the Newton search direction p^N is written as

    p^N = −B_k⁻¹ ∇f_k

with B_k = ∇²f_k, p^N will be a descent direction if ∇²f_k is positive definite.

If ∇²f_k is not positive definite, the Newton direction p^N may not be a descent direction.

In that case we choose B_k as a positive definite approximation of ∇²f_k.

Performed in a proper way, this modification yields an algorithm that converges toward a minimizer. Furthermore, close to the solution the Hessian is usually positive definite, so the modification will only be performed far from the solution.



The positive definite approximation B_k of the Hessian may be found with minimal extra effort. The search direction p is calculated as the solution of

    ∇²f(x) p = −∇f(x).

If ∇²f(x) is positive definite, the matrix factorization

    ∇²f(x) = LDLᵀ

may be used, where the diagonal elements of D are positive.

If ∇²f(x) is not positive definite, at some point during the factorization a diagonal element will be d_ii ≤ 0. In this case, the element may be replaced with a suitable positive entry.

Finally, the factorization is used to calculate the search direction:

    (LDLᵀ) p = −∇f(x).
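A sketch of this modified factorization for a small dense Hessian (an illustration; the replacement value delta is an assumed tuning constant, and library routines such as scipy.linalg.ldl compute only the unmodified factorization):

    import numpy as np

    def modified_ldl(A, delta=1e-3):
        # LDL^T factorization of symmetric A; a pivot d_jj <= 0 encountered
        # during the factorization is replaced by a small positive value.
        n = A.shape[0]
        L, d = np.eye(n), np.zeros(n)
        for j in range(n):
            d[j] = A[j, j] - (L[j, :j]**2) @ d[:j]
            if d[j] <= delta:
                d[j] = delta                           # the modification step
            for i in range(j + 1, n):
                L[i, j] = (A[i, j] - (L[i, :j] * L[j, :j]) @ d[:j]) / d[j]
        return L, d

    def newton_direction(H, g):
        # Solve (L D L^T) p = -g by two triangular solves and a diagonal scaling.
        L, d = modified_ldl(H)
        y = np.linalg.solve(L, -g)                     # L y = -g
        return np.linalg.solve(L.T, y / d)             # L^T p = D^{-1} y

    # With an indefinite Hessian the modified direction is still a descent direction:
    H = np.array([[1.0, 0.0], [0.0, -2.0]])
    g = np.array([1.0, 1.0])
    p = newton_direction(H, g)
    print(p, g @ p < 0)                                # True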


    The modified Newton algorithm with line search

Specify a starting approximation x_0 and a convergence tolerance ε. Repeat for k = 0, 1, ...:

If ‖∇f(x_k)‖ < ε, stop.

Compute the modified LDLᵀ factorization of the Hessian.

Solve

    (LDLᵀ) p^N_k = −∇f(x_k)

for the search direction p^N_k.

Perform a line search to determine the new approximation

    x_{k+1} = x_k + α_k p^N_k.
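Putting the pieces together, a sketch of the whole algorithm (assuming the newton_direction and backtracking helpers from the earlier sketches are in scope; the Rosenbrock test function, tolerance, and iteration cap are illustrative choices):

    import numpy as np

    def modified_newton(f, grad_f, hess_f, x0, eps=1e-8, max_iter=100):
        x = np.asarray(x0, dtype=float)                # starting approximation x_0
        for k in range(max_iter):
            g = grad_f(x)
            if np.linalg.norm(g) < eps:                # stop if ||grad f(x_k)|| < eps
                break
            p = newton_direction(hess_f(x), g)         # modified LDL^T solve
            alpha = backtracking(f, grad_f, x, p)      # line search for step length
            x = x + alpha * p                          # x_{k+1} = x_k + alpha_k p_k
        return x

    # Usage on the Rosenbrock function, a standard test problem:
    f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
    grad_f = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                                 200 * (x[1] - x[0]**2)])
    hess_f = lambda x: np.array([[2 - 400 * (x[1] - 3 * x[0]**2), -400 * x[0]],
                                 [-400 * x[0], 200.0]])
    print(modified_newton(f, grad_f, hess_f, [-1.2, 1.0]))  # approaches (1, 1)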
