newton handouts
TRANSCRIPT
8/12/2019 Newton Handouts
Methods for unconstrained optimization
Convergence
Descent directions
Line search
The Newton Method
Nonlinear Optimization
Overview of methods; the Newton method with line search
Niclas Börlin
Department of Computing Science
Umeå University
November 19, 2007
© 2007 Niclas Börlin, CS, UmU — Nonlinear Optimization; The Newton method w/ line search
Overview
Most deterministic methods for unconstrained optimization have the following features:

They are iterative, i.e. they start with an initial guess x_0 of the variables and try to find better points {x_k}, k = 1, . . ..

They are descent methods, i.e. at each iteration k,
    f(x_{k+1}) < f(x_k).

In the line search strategy, a search direction p_k is chosen and the step is determined by approximately solving
    min_{α>0} f(x_k + α p_k),
where the scalar α is called the step length.

In theory we would like optimal step lengths, but in practice it is more efficient to test trial step lengths until we find one that gives us a good enough point.
[Figure: contours of f around the iterate x_k with the search direction p]
Trust-region
In the trust-region strategy, the algorithm defines a region of trust around x_k where the current model function m_k is trusted.

The region of trust is usually defined as
    ‖p‖_2 ≤ Δ,
where the scalar Δ is called the trust-region radius.

A candidate step p is found by approximately solving the following subproblem:
    min_p m_k(x_k + p)  subject to  ‖p‖_2 ≤ Δ.

If the candidate step does not produce a good enough new point, we shrink the trust-region radius and re-solve the subproblem.
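The shrink-and-resolve loop described above can be sketched as follows. This is an illustrative implementation, not code from the handout: the subproblem is solved only approximately via the Cauchy point (the model minimizer along the steepest-descent direction, clipped to the radius), and the acceptance threshold eta is a typical but arbitrary choice.

```python
import numpy as np

def cauchy_point(g, B, delta):
    """Approximate solution of min_p g.p + 0.5 p.B.p subject to ||p||_2 <= delta."""
    gBg = g @ B @ g
    if gBg > 0:
        # Clip the unconstrained minimizer along -g to the trust-region radius.
        tau = min(1.0, np.linalg.norm(g) ** 3 / (delta * gBg))
    else:
        tau = 1.0  # negative curvature along -g: go to the boundary
    return -tau * delta / np.linalg.norm(g) * g

def trust_region_step(f, grad, hess, x, delta, eta=0.1):
    """One trust-region iteration: shrink delta until the candidate step is good enough."""
    g, B = grad(x), hess(x)
    while True:
        p = cauchy_point(g, B, delta)
        model_decrease = -(g @ p + 0.5 * p @ B @ p)
        actual_decrease = f(x) - f(x + p)
        if model_decrease > 0 and actual_decrease / model_decrease >= eta:
            return x + p, delta   # good enough new point: accept the step
        delta *= 0.5              # otherwise shrink the radius and re-solve
```

On the quadratic f(x) = ‖x‖², starting at x = (2, 0) with Δ = 1, the Cauchy point is the full boundary step toward the origin and is accepted immediately.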
[Figure: contours of f with the trust region of radius Δ around the iterate x_k]
In the line search strategy, the direction is chosen first, followed by the distance.

In the trust-region strategy, the maximum distance is chosen first, followed by the direction.
[Figures: the line search step along p from x_k (left) vs. the trust-region step within radius Δ around x_k (right)]
Convergence rate
In order to compare different iterative methods, we need an efficiency measure.

Since we do not know the number of iterations in advance, the computational complexity measure used by direct methods cannot be used.

Instead, the concept of a convergence rate is defined.
Assume we have a sequence {x_k} that converges to a solution x*. Define the sequence of errors as
    e_k = x_k − x*
and note that
    lim_{k→∞} e_k = 0.

We say that the sequence {x_k} converges to x* with rate r and rate constant C if
    lim_{k→∞} ‖e_{k+1}‖ / ‖e_k‖^r = C
and C < ∞.
Quadratic convergence, examples

For r = 2, C = 0.1 and e_0 = 1, the sequence becomes
    1, 10^-1, 10^-3, 10^-7, . . .

For r = 2, C = 3 and e_0 = 1, the sequence diverges:
    1, 3, 27, . . .

For r = 2, C = 3 and e_0 = 0.1, the sequence becomes
    0.1, 0.03, 0.0027, . . . ,
i.e. it converges despite C > 1.

For quadratic convergence, the constant C is of lesser importance. Instead it is important that the initial approximation is close enough to the solution, i.e. that e_0 is small.
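The three examples follow directly from the model recursion e_{k+1} = C · e_k^r, and a few lines of Python (illustrative, not part of the handout) reproduce them:

```python
def error_sequence(C, r, e0, n):
    """Generate n error terms from the recursion e_{k+1} = C * e_k**r."""
    errors = [e0]
    for _ in range(n - 1):
        errors.append(C * errors[-1] ** r)
    return errors

# r = 2, C = 0.1, e0 = 1: quadratic convergence, errors 1, 1e-1, 1e-3, 1e-7, ...
print(error_sequence(0.1, 2, 1.0, 4))
# r = 2, C = 3, e0 = 1: the sequence diverges, 1, 3, 27, ...
print(error_sequence(3.0, 2, 1.0, 3))
# r = 2, C = 3, e0 = 0.1: converges despite C > 1, 0.1, 0.03, 0.0027, ...
print(error_sequence(3.0, 2, 0.1, 3))
```

The last example makes the slide's point concrete: with r = 2 the error is squared before being scaled by C, so any e_0 small enough that C · e_0 < 1 starts a collapsing sequence.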
Local vs. global convergence
A method is called locally convergent if it produces a sequence that converges toward a minimizer x* provided a close enough starting approximation.

A method is called globally convergent if it produces a sequence that converges toward a minimizer x* provided any starting approximation.

Note that global convergence does not imply convergence toward a global minimizer.
Globalization strategies
The line search and trust-region methods are sometimes called globalization strategies, since they modify a core method (typically locally convergent) to become globally convergent.

There are two efficiency requirements on any globalization strategy:

Far from the solution, it should stop the core method from going out of control.

Close to the solution, where the core method is efficient, it should interfere as little as possible.
Descent directions
Consider the Taylor expansion of the objective function along a search direction p:
    f(x_k + αp) = f(x_k) + α p^T ∇f_k + (1/2) α² p^T ∇²f(x_k + βp) p,
for some β ∈ (0, α).

Any direction p such that p^T ∇f_k < 0 is called a descent direction, since f(x_k + αp) < f(x_k) for all sufficiently small α > 0.
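The sign condition p^T ∇f_k < 0 can be checked numerically; a minimal illustration (the function and numbers are chosen for this example, not taken from the slides):

```python
import numpy as np

def is_descent_direction(grad, p):
    """p is a descent direction at x iff p . grad f(x) < 0."""
    return p @ grad < 0

f = lambda x: x @ x               # f(x) = ||x||^2, with gradient 2x
x = np.array([1.0, 2.0])
g = 2 * x                         # gradient at x: (2, 4)
p = np.array([-1.0, 0.0])         # p . g = -2 < 0: a descent direction

assert is_descent_direction(g, p)
alpha = 0.1
assert f(x + alpha * p) < f(x)    # a small step along p decreases f
```

As the Taylor expansion predicts, the first-order term α p^T ∇f_k dominates for small α, so the function value drops along any direction with negative slope.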
If the search direction has the form
    p_k = −B_k^-1 ∇f_k,
the descent condition becomes
    p_k^T ∇f_k = −∇f_k^T B_k^-1 ∇f_k < 0,
which is satisfied whenever B_k is positive definite.

Ideally we would like to find the global minimizer of f(x_k + α p_k) with respect to α at every iteration. This is called an exact line search.

However, it is possible to construct inexact line search methods that produce an adequate reduction of f at a minimal cost.

Inexact line search methods construct a number of candidate values for α and stop when certain conditions are satisfied.
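An inexact line search of this kind can be sketched as backtracking: start from a trial step length and shrink it until a sufficient decrease condition holds. This sketch uses the Armijo condition with conventional (but here arbitrary) constants c1 and rho; it is an illustration, not the handout's algorithm.

```python
import numpy as np

def backtracking_line_search(f, grad, x, p, alpha=1.0, c1=1e-4, rho=0.5):
    """Shrink alpha until f(x + alpha*p) <= f(x) + c1*alpha*(p . grad f(x))."""
    slope = p @ grad(x)           # must be negative: p is a descent direction
    while f(x + alpha * p) > f(x) + c1 * alpha * slope:
        alpha *= rho              # candidate rejected: try a shorter step
    return alpha

# Usage on f(x) = ||x||^2 with the steepest-descent direction:
f = lambda x: x @ x
g = lambda x: 2 * x
x = np.array([3.0, 0.0])
a = backtracking_line_search(f, g, x, -g(x))
assert f(x - a * g(x)) < f(x)     # the accepted step gives a good enough point
```

Each trial value of α is a "candidate" in the sense of the slide; the Armijo test is one example of the "certain conditions" that stop the search.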
Exact and inexact line searches
The Sufficient Decrease Condition
Backtracking
The Curvature Condition
The Wolfe Condition
The Sufficient Decrease Condition

Mathematically, the descent condition is f(x_k + α p_k) < f(x_k).
The Newton-Raphson method in R^1
The Classical Newton minimization method in R^n
Geometrical interpretation; the model function
Properties of the Newton method
Ensuring a descent direction
The modified Newton algorithm with line search
Properties of the Newton method

Advantages:
It converges quadratically toward a stationary point.

Disadvantages:
It does not necessarily converge toward a minimizer.
It may diverge if the starting approximation is too far from the solution.
It will fail if ∇²f(x_k) is not invertible for some k.
It requires second-order information ∇²f(x_k).

Newton's method is rarely used in its classical formulation. However, many methods may be seen as approximations of Newton's method.
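For reference, the classical formulation these properties describe solves ∇²f(x_k) p = −∇f(x_k) and always takes the full step. A minimal sketch (function and names chosen for illustration):

```python
import numpy as np

def classical_newton(grad, hess, x0, tol=1e-10, max_iter=50):
    """Classical Newton: x_{k+1} = x_k - hess(x_k)^{-1} grad(x_k), full step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        # Raises if the Hessian is singular -- one of the listed drawbacks.
        x = x - np.linalg.solve(hess(x), g)
    return x

# Minimizing f(x) = x1^4 + x2^2, whose only minimizer is the origin.
x_star = classical_newton(lambda x: np.array([4 * x[0] ** 3, 2 * x[1]]),
                          lambda x: np.array([[12 * x[0] ** 2, 0.0],
                                              [0.0, 2.0]]),
                          [1.0, 1.0])
```

Note that nothing here prevents convergence toward a maximizer or saddle point, nor divergence from a poor start; those failure modes are exactly what the modified algorithm below addresses.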
Ensuring a descent direction
Since the Newton search direction p^N is written as
    p^N = −B_k^-1 ∇f_k
with B_k = ∇²f_k, p^N will be a descent direction if ∇²f_k is positive definite.

If ∇²f_k is not positive definite, the Newton direction p^N may not be a descent direction.

In that case we choose B_k as a positive definite approximation of ∇²f_k.

Performed in a proper way, this modified algorithm will converge toward a minimizer. Furthermore, close to the solution the Hessian is usually positive definite, so the modification is only performed far from the solution.
The positive definite approximation B_k of the Hessian may be found with minimal extra effort. The search direction p is calculated as the solution of
    ∇²f(x) p = −∇f(x).

If ∇²f(x) is positive definite, the matrix factorization
    ∇²f(x) = LDL^T
may be used, where the diagonal elements of D are positive.

If ∇²f(x) is not positive definite, at some point during the factorization a diagonal element will be d_ii ≤ 0. In this case, the element may be replaced with a suitable positive entry.

Finally, the factorization is used to calculate the search direction:
    (LDL^T) p = −∇f(x).
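A minimal sketch of this modification follows. The pivot floor delta is an illustrative choice; production implementations (e.g. Gill-Murray modified Cholesky) choose the replacement entry more carefully, but the structure is the same: factorize, fix non-positive pivots as they appear, then solve by substitution.

```python
import numpy as np

def modified_ldlt(A, delta=1e-4):
    """LDL^T factorization of symmetric A, forcing each pivot d_ii >= delta."""
    n = A.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for j in range(n):
        d[j] = A[j, j] - L[j, :j] ** 2 @ d[:j]
        if d[j] < delta:            # non-positive (or tiny) pivot: replace it
            d[j] = delta
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - (L[i, :j] * L[j, :j]) @ d[:j]) / d[j]
    return L, d

def modified_newton_direction(H, g):
    """Solve (L D L^T) p = -g, giving a guaranteed descent direction p."""
    L, d = modified_ldlt(H)
    y = np.linalg.solve(L, -g)          # forward substitution
    return np.linalg.solve(L.T, y / d)  # diagonal scaling and back substitution
```

For a positive definite Hessian the factorization is exact and p is the pure Newton direction; for an indefinite one, the replaced pivots yield a positive definite B_k = LDL^T, so p^T ∇f = −∇f^T B_k^-1 ∇f < 0 as required.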
The modified Newton algorithm with line search
Specify a starting approximation x_0 and a convergence tolerance τ. Repeat for k = 0, 1, . . .:

If ‖∇f(x_k)‖ < τ, stop.

Compute the modified LDL^T factorization of the Hessian.

Solve
    (LDL^T) p_k^N = −∇f(x_k)
for the search direction p_k^N.

Perform a line search to determine the new approximation
    x_{k+1} = x_k + α_k p_k^N.
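The full algorithm can be sketched in a few lines. This is an illustrative implementation with two substitutions, both noted in the comments: an eigenvalue clamp stands in for the modified LDL^T factorization (same effect: a positive definite B_k), and backtracking with the Armijo condition stands in for the unspecified line search.

```python
import numpy as np

def modified_newton(f, grad, hess, x0, tol=1e-8, max_iter=200):
    """Modified Newton with line search.

    Sketch: the slides modify an LDL^T factorization; for brevity this
    version clamps the Hessian eigenvalues instead, which likewise yields
    a positive definite B_k and hence a descent direction.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:          # convergence test
            break
        w, V = np.linalg.eigh(hess(x))
        w = np.maximum(w, 1e-3)              # force positive definiteness
        p = V @ ((V.T @ -g) / w)             # solve B_k p = -grad f(x_k)
        alpha = 1.0                          # backtracking (Armijo) line search
        while f(x + alpha * p) > f(x) + 1e-4 * alpha * (p @ g):
            alpha *= 0.5
        x = x + alpha * p
    return x

# The Rosenbrock function, with minimizer (1, 1):
f = lambda x: (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2
grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
                           200 * (x[1] - x[0] ** 2)])
hess = lambda x: np.array([[2 - 400 * (x[1] - 3 * x[0] ** 2), -400 * x[0]],
                           [-400 * x[0], 200.0]])
x_star = modified_newton(f, grad, hess, [-1.2, 1.0])
```

Close to the solution the Hessian is positive definite and the unit step is accepted, so the globalization strategy interferes as little as possible, exactly as required of it in the earlier slide.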