Advanced Algorithms in Nonlinear Optimization


TRANSCRIPT

  • Advanced Algorithms in Nonlinear Optimization

    Philippe Toint

    Department of Mathematics, University of Namur, Belgium

    ( [email protected] )

    Belgian Francqui Chair, Leuven, April 2009

  • Outline

    1 Nonlinear optimization: motivation, past and perspectives

    2 Trust region methods for unconstrained problems

    3 Derivative free optimization, filters and other topics

    4 Convex constraints and interior-point methods

    5 The use of problem structure for large-scale applications

    6 Regularization methods and nonlinear step control

    7 Conclusions


  • What is optimization?

    The best choice subject to constraints

    best: a criterion, the objective function

    choice: variables whose values may be chosen

    constraints: restrictions on the allowed values of the variables

  • More formally

    variables: $x = (x_1, x_2, \ldots, x_n)$
    objective function: minimize/maximize $f(x)$
    constraints: $c(x) \geq 0$

    Note: maximizing $f(x)$ is equivalent to minimizing $-f(x)$.

    $$\min_x f(x) \quad \text{such that} \quad c(x) \geq 0$$

    (the general nonlinear optimization problem)

    (+ conditions on $x$, $f$ and $c$)
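    As a concrete (hypothetical) illustration of how such a problem is posed numerically, here is a minimal sketch using SciPy; the objective and constraint below are toy choices, not from the lecture:

    # Toy instance of  min_x f(x)  such that  c(x) >= 0  (illustrative only)
    import numpy as np
    from scipy.optimize import minimize

    def f(x):
        # a simple nonlinear objective (Rosenbrock-like)
        return (x[0] - 1.0) ** 2 + 10.0 * (x[1] - x[0] ** 2) ** 2

    def c(x):
        # feasible set: the unit disc, expressed as c(x) >= 0
        return 1.0 - x[0] ** 2 - x[1] ** 2

    res = minimize(f, np.zeros(2), constraints=[{"type": "ineq", "fun": c}])
    print(res.x, res.fun)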

  • Nature optimizes


  • Applications: PAL design (1)

    Design of modern Progressive Adaptive Lenses:

    vary optical power of lenses while minimizing astigmatism

    Loos, Greiner, Seidel (1997)

  • Applications: PAL design (2)

    Achievements: Loos, Greiner, Seidel (1997)

    [Figure panels: uncorrected, short distance, long distance, PAL]
  • Applications: PAL design (3)

    Is this nonlinear (i.e., difficult)?

    Assume the lens surface is z=z(x, y). Theoptical poweris

    p(x, y) = N3

    2

    1 +

    z

    x2

    2z

    y2+

    1 +

    z

    y2

    2z

    x2 2 z

    x

    z

    y

    2z

    xy

    where

    N=N(x, y) = 1

    1 +

    zx

    2+

    zy

    2.

    Thesurface astigmatismis then

    a(x, y) =2

    p(x, y) N4

    z

    x

    z

    y

    2z

    xy

    2

  • Applications: Food sterilization (1)

    A common problem in the food processing industry:

    keep a max of vitamins while killing a prescribed fraction of the bacteria

    heating in steam/hot water autoclaves

    Sachs (2003)

  • Applications: Food sterilization (2)

    Model: coupled PDEs

    Concentration $C$ of micro-organisms and other nutrients:
    $$\frac{\partial C}{\partial t}(x,t) = -K[\theta(x,t)]\, C(x,t),$$
    where $\theta(x,t)$ is the temperature, and where
    $$K[\theta] = K_1\, e^{-K_2\left(\frac{1}{\theta} - \frac{1}{\theta_r}\right)} \quad \text{(Arrhenius equation)}$$

    Evolution of temperature:
    $$c(\theta)\, \frac{\partial \theta}{\partial t} = \nabla \cdot \left[k(\theta)\, \nabla \theta\right],$$
    (with suitable boundary conditions: coolant, initial temperature, ...)

  • Applications: biological parameter estimation (1)

    K-channel in the model of a neuron membrane:

    Sansom (2001)

    Doyle et al. (1998)

  • Applications: biological parameter estimation (2)

    Where are these neurons?

    in a Pacific spiny lobster!

    Simmers, Meyrand and Moulin (1995)

  • Applications: biological parameter estimation (3)

    After gathering experimental data (applying a current to the cell):

    estimate the biological model parameters that best fit experiments

    Model:

    Activation: $p$ independent gates
    Deactivation: $n_h$ gates with different dynamics
    ⇒ $n_h + 2$ coupled ODEs for the voltage, the activation level, and the partial inactivation levels

    5-point BDF for 50 000 time steps ⇒ very nonlinear!

  • Applications: data assimilation for weather forecasting (1)

    (Attempt to) predict...

    tomorrow's weather
    the ocean's average temperature next month
    the future gravity field
    future currents in the ionosphere
    ...

  • Applications: data assimilation for weather forecasting (2)

    Data: temperature, wind, pressure, . . . everywhere and at all times!

    May involve up to 250,000,000 variables!

  • Applications: data assimilation for weather forecasting (3)

    The principle: minimize the deviation between the model and past observations.

    [Figure: temperature vs. days]

    Known situation 2.5 days ago and background prediction
    Record temperature for the past 2.5 days
    Run the model to minimize the difference between model and observations:
    $$\min_{x_0}\ \tfrac12\, \|x_0 - x_b\|^2_{B^{-1}} + \tfrac12 \sum_{i=0}^{N} \left\|H\, M(t_i, x_0) - b_i\right\|^2_{R_i^{-1}}$$
    Predict the temperature for the next day
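    To make the structure of this objective concrete, here is a small sketch of the cost function in code. It assumes a linear observation operator H given as a matrix, a caller-supplied model M, and inverse covariance matrices Binv and Rinvs; all of these names are illustrative, not part of the original material:

    import numpy as np

    def fourdvar_cost(x0, xb, Binv, model, H, obs, Rinvs):
        # J(x0) = 1/2 ||x0 - xb||^2_{B^{-1}}
        #       + 1/2 sum_i ||H M(t_i, x0) - b_i||^2_{R_i^{-1}}
        d = x0 - xb
        J = 0.5 * d @ (Binv @ d)                 # background (prior) term
        for i, (b, Rinv) in enumerate(zip(obs, Rinvs)):
            r = H @ model(i, x0) - b             # misfit at observation time t_i
            J += 0.5 * r @ (Rinv @ r)
        return J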
  • Applications: data assimilation for weather forecasting (4)

    Analysis of the ocean's heat content: CERFACS (2009)

    Much better fit!

  • Applications: aeronautical structure design

    minimize weight while maintaining structural integrity

    mass reduction during optimization

    SAMTECH (2009)

  • Applications: asteroid trajectory matching

    find today's asteroid whose orbital parameters best match one observed 50 years ago

    Milani, Sansaturio et al. (2005)

  • Applications: discrete choice modelling (1)

    Context: simulation of individual choices in transportation (or other domains)
    (mode, route, time of departure, ...)

    Random utility theory
    An individual $i$ assigns to alternative $j$ the utility
    $$U_{ij} = [\text{parameters} \times \text{explaining factors}] + [\text{random error}]$$

    Illustration:
    $$U_{\text{bus}} = \beta \cdot \text{distance} - 1.2 \cdot \text{price of ticket} - 2.1 \cdot \text{delay wrt car travel} + \varepsilon$$

  • Applications: discrete choice modelling (2)

    Probability that individual $i$ chooses alternative $j$ rather than alternative $k$, given by
    $$\text{prob}(U_{ij} \geq U_{ik} \text{ for all } k)$$

    Data: mobility surveys (MOBEL)

    find the parameters in the utility function to

    maximize likelihood of observed behaviours

  • Applications: discrete choice modelling (3)

    [Figure: estimated densities (Normal, Lognormal, Spline), using advanced optimization vs. using standard statistics]

    Estimation of the value of time lost in congested traffic (with and without advanced optimization)

  • Applications: Poisson image denoising (1)

    Consider a two-dimensional image with noise proportional to the signal:
    $$z_{ij} = u_{ij} + n\, f(u_{ij}),$$
    where $n$ is random Gaussian noise. How to recover the original $u_{ij}$?

    Use the pixel values as much as possible while minimizing sharp transitions (gradients).

    This leads to the optimization problem
    $$\min_u\ \sum_{ij}\left(u_{ij} - z_{ij}\,\log(u_{ij})\right) + \lambda\, \|\nabla u\|.$$
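    A sketch of this objective in code might look as follows; the forward-difference gradient and the smoothing parameter eps are standard implementation choices that the slide does not specify:

    import numpy as np

    def poisson_tv_objective(u, z, lam, eps=1e-8):
        # sum_ij (u_ij - z_ij log u_ij)  +  lam * sum_ij |(grad u)_ij|  (smoothed)
        fidelity = np.sum(u - z * np.log(u))        # Poisson data-fidelity term
        dx = np.diff(u, axis=0, append=u[-1:, :])   # forward differences in x
        dy = np.diff(u, axis=1, append=u[:, -1:])   # forward differences in y
        penalty = np.sum(np.sqrt(dx ** 2 + dy ** 2 + eps))
        return fidelity + lam * penalty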

  • Applications: Poisson image denoising (2)

    Some spectacular results: a 512×512 picture with 95% noise

    Chan and Chen (2007)
  • Applications: shock simulation in video games

    Optimize the realism of the motion of multiple rigid bodies in space

    ⇒ complementarity problem
    $$0 \leq \nabla_q \phi[q(t)]\, v(t) \ \perp\ \phi(q(t)) \geq 0$$
    ($q(t)$ = positions, $v(t) = \frac{dq}{dt}(t)$ = velocities)

    ⇒ a system of inequalities and equalities, used in real time for video animation

    Anitescu and Potra (1996)

  • Applications: finance

    1 risk management

    2 portfolio analysis

    3 FX markets

    [Figure: investment distribution for the BoJ 1991-2004 (Normal vs. Spline; Optimized vs. Standard)]

    4 ...

    Everybody loves an optimizer!

  • Where does optimization come from?

    "We are like dwarfs standing on the shoulders of giants, such that we can see more things and further away than they could. And this, not because our sight would be more powerful or our height more advantageous, but because we are carried and heightened by the high stature of the giants."

    Bernard de Chartres (1130-1160)


  • Euclid (300 BC) and Al-Khwarizmi (783-850) [portraits]

  • Isaac Newton (1642-1727) and Leonhard Euler (1707-1783) [portraits]

  • J. de Lagrange (1735-1813) and Friedrich Gauss (1777-1855) [portraits]

  • Augustin Cauchy (1789-1857) and George Dantzig (1914-2005) [portraits]

  • Michael Powell and Roger Fletcher [portraits]

  • Return to the mathematical problem

    $$\min_x f(x) \quad \text{such that} \quad c(x) \geq 0$$

    Difficulties:

    the objective function $f(x)$ is typically complicated (nonlinear)
    it is also often costly to compute
    there may be many variables
    the constraints $c(x)$ may define a complicated geometry

  • An example unconstrained problem

    minimize: $f(x,y) = -10x^2 + 10y^2 + 4\sin(xy) - 2x + x^4$

    [Contour plot of $f$]

    Two local minima: $(-2.20,\, 0.32)$ and $(2.30,\, -0.34)$

    How to find them?

  • Trust-region methods

    iterative algorithms

    find local solutions only

    Algorithm 1.1: The trust-region framework

    Until an (approximate) solution is found:
    Step 1: use a model of the nonlinear function(s) within a region where it can be trusted
    Step 2: compute a step giving sufficient decrease of the model
    Step 3: measure achieved and predicted reductions
    Step 4: decrease the region radius if unsuccessful

  • Illustration

    $x_0 = (0.71, 3.27)$ and $f(x_0) = 97.630$
    Contours of $f$ | Contours of $m_0$ around $x_0$ (quadratic model)

    [Contour plots]

    The iterations ($\rho_k = \Delta f / \Delta m_k$):

    k | Δk | sk           | f(xk+sk) | Δf/Δmk | xk+1
    0 | 1  | (0.05, 0.93) | 43.742   | 0.998  | x0+s0
    1 | 2  | (0.62, 1.78) | 2.306    | 1.354  | x1+s1
    2 | 4  | (3.21, 0.00) | 6.295    | 0.004  | x2
    3 | 2  | (1.90, 0.08) | 29.392   | 0.649  | x2+s2
    4 | 2  | (0.32, 0.15) | 31.131   | 0.857  | x3+s3
    5 | 4  | (0.03, 0.02) | 31.176   | 1.009  | x4+s4

    [Contour plots showing the successive trust regions and iterates]


  • Path of iterates, and from another x0

    [Two contour plots: the paths of iterates from the two starting points]

  • And then...

    Does it (always) work?

    The answer tomorrow! (and on subsequent days, for a (biased) survey of new optimization methods)

    Thank you for your attention

  • Lesson 2: Trust-region methods for unconstrained problems

  • The basic text for this course

    A. R. Conn, N. I. M. Gould and Ph. L. Toint, Trust-Region Methods, No. 1 in the MPS-SIAM Series on Optimization, SIAM, Philadelphia, USA, 2000.

  • 2.1: Background material

  • Scalar mean-value theorems

    Let $S$ be an open subset of $\mathbb{R}^n$, and suppose $f : S \to \mathbb{R}$ is continuously differentiable throughout $S$. Then, if the segment $x + \alpha s \in S$ for all $\alpha \in [0,1]$,
    $$f(x+s) = f(x) + \langle \nabla_x f(x + \xi s),\, s\rangle$$
    for some $\xi \in [0,1]$.

    Let $S$ be an open subset of $\mathbb{R}^n$, and suppose $f : S \to \mathbb{R}$ is twice continuously differentiable throughout $S$. Then, if the segment $x + \alpha s \in S$ for all $\alpha \in [0,1]$,
    $$f(x+s) = f(x) + \langle \nabla_x f(x),\, s\rangle + \tfrac12 \langle s,\, \nabla_{xx} f(x + \xi s)\, s\rangle$$
    for some $\xi \in [0,1]$.

  • Vector mean-value theorem

    Let $S$ be an open subset of $\mathbb{R}^n$, and suppose $F : S \to \mathbb{R}^m$ is continuously differentiable throughout $S$. Then, if the segment $x + \alpha s \in S$ for all $\alpha \in [0,1]$,
    $$F(x+s) = F(x) + \int_0^1 \nabla_x F(x + \alpha s)\, s \, d\alpha.$$

  • Taylor's scalar approximation theorems (1)

    Let $S$ be an open subset of $\mathbb{R}^n$, and suppose $f : S \to \mathbb{R}$ is continuously differentiable throughout $S$. Suppose further that $\nabla_x f(x)$ is Lipschitz continuous at $x$, with Lipschitz constant $\gamma(x)$ in some appropriate vector norm. Then, if the segment $x + \alpha s \in S$ for all $\alpha \in [0,1]$,
    $$|f(x+s) - m(x+s)| \leq \tfrac12\, \gamma(x)\, \|s\|^2,$$
    where $m(x+s) = f(x) + \langle \nabla_x f(x),\, s\rangle$.

  • Newton's method

    Solve $F(x) = 0$.

    Idea: solve the linear approximation
    $$F(x) + J(x)\, s = 0$$

    quadratic local convergence
    ... but not globally convergent

    Yet it is the basis of everything that follows.
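    A minimal sketch of this iteration, assuming the Jacobian J is available as a function (the stopping rule is an obvious addition, not part of the slide):

    import numpy as np

    def newton(F, J, x, tol=1e-10, max_iter=50):
        # Solve F(x) = 0 by repeatedly solving the linear model F(x) + J(x) s = 0.
        for _ in range(max_iter):
            Fx = F(x)
            if np.linalg.norm(Fx) <= tol:
                break
            s = np.linalg.solve(J(x), -Fx)   # Newton step
            x = x + s
        return x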

  • Unconstrained optimality conditions

    Suppose that $f \in C^1$, and that $x_*$ is a local minimizer of $f(x)$. Then $\nabla_x f(x_*) = 0$.

    Suppose that $f \in C^2$, and that $x_*$ is a local minimizer of $f(x)$. Then the above holds and the objective function's Hessian at $x_*$ is positive semi-definite, that is
    $$\langle s,\, \nabla_{xx} f(x_*)\, s\rangle \geq 0 \ \text{ for all } s \in \mathbb{R}^n.$$

    $$\langle s,\, \nabla_{xx} f(x_*)\, s\rangle > 0 \ \text{ for all } s \neq 0 \in \mathbb{R}^n \ \Rightarrow\ \text{strict local solution}$$

  • Constrained optimality conditions (1)

    $$\min f(x) \quad \text{subject to} \quad c_i(x) = 0 \text{ for } i \in E, \quad c_i(x) \geq 0 \text{ for } i \in I$$

    Active set:

    [Figure: feasible region $C$ with boundary faces $F_{\{1\}}, F_{\{2\}}, F_{\{3\}}, F_{\{1,2\}}, F_{\{1,3\}}, F_{\{2,3\}}$ defined by $c_1(x) = 0$, $c_2(x) = 0$, $c_3(x) = 0$, and points $x_1, x_2, x_3$]

    $$A(x_1) = \{1, 2\}, \qquad A(x_2) = A(x_3) = \{3\}$$

  • Constrained optimality conditions (2): first order

    Ignore constraint qualification!

    Suppose that $f, c \in C^1$, and that $x_*$ is a local solution. Then there exists a vector of Lagrange multipliers $y_*$ such that
    $$\nabla_x f(x_*) = \sum_{i \in E \cup I} [y_*]_i\, \nabla_x c_i(x_*)$$
    $$c_i(x_*) = 0 \ \text{ for all } i \in E$$
    $$c_i(x_*) \geq 0 \ \text{ and } \ [y_*]_i \geq 0 \ \text{ for all } i \in I$$
    $$\text{and } c_i(x_*)\, [y_*]_i = 0 \ \text{ for all } i \in I.$$

    Lagrangian: $\mathcal{L}(x, y) = f(x) - \sum_{i \in E \cup I} y_i\, c_i(x)$

  • Constrained optimality conditions (3): second order

    Suppose that $f, c \in C^2$, and that $x_*$ is a local minimizer of $f(x)$. Then there exists a vector of Lagrange multipliers $y_*$ such that the first-order conditions hold and
    $$\langle s,\, \nabla_{xx}\mathcal{L}(x_*, y_*)\, s\rangle \geq 0 \ \text{ for all } s \in N_+,$$
    where $N_+$ is the set of vectors $s$ such that
    $$\langle s,\, \nabla_x c_i(x_*)\rangle = 0 \ \text{ for all } i \in E \cup \{j \in A(x_*) \cap I \mid [y_*]_j > 0\}$$
    and
    $$\langle s,\, \nabla_x c_i(x_*)\rangle \geq 0 \ \text{ for all } i \in \{j \in A(x_*) \cap I \mid [y_*]_j = 0\}.$$

    Strict complementarity: $\langle s,\, \nabla_{xx}\mathcal{L}(x_*, y_*)\, s\rangle > 0$ for all $s \in N_+$ ($s \neq 0$) ⇒ strict local solution

  • Optimality conditions (convex 1)

    Assume now that $C$ is convex.

    Normal cone of $C$ at $x \in C$:
    $$N(x) \stackrel{\text{def}}{=} \{\, y \in \mathbb{R}^n \mid \langle y,\, u - x\rangle \leq 0,\ \forall u \in C \,\}$$

    Tangent cone of $C$ at $x \in C$:
    $$T(x) \stackrel{\text{def}}{=} N(x)^0 = \text{cl}\{\, \lambda(u - x) \mid \lambda \geq 0 \text{ and } u \in C \,\}$$

  • Optimality conditions (convex 2)

    [Figure: the Moreau decomposition]


    Suppose that $C \neq \emptyset$ is convex and closed, that $f$ is continuously differentiable in $C$, and that $x_*$ is a first-order critical point for the minimization of $f$ over $C$. Then, provided that constraint qualification holds,
    $$-\nabla_x f(x_*) \in N(x_*).$$

  • Conjugate gradients

    Idea: minimize a convex quadratic on successive nested Krylov subspaces.

    Algorithm 2.1: Conjugate gradients (CG)

    Given $x_0$, set $g_0 = Hx_0 + c$ and let $p_0 = -g_0$.
    For $k = 0, 1, \ldots$, until convergence, perform the iteration
    $$\alpha_k = \|g_k\|_2^2 \,/\, \langle p_k,\, Hp_k\rangle$$
    $$x_{k+1} = x_k + \alpha_k p_k$$
    $$g_{k+1} = g_k + \alpha_k Hp_k$$
    $$\beta_k = \|g_{k+1}\|_2^2 \,/\, \|g_k\|_2^2$$
    $$p_{k+1} = -g_{k+1} + \beta_k p_k$$
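    A direct transcription of Algorithm 2.1 in code (dense H for simplicity; the convergence test on ||g_k||, left open on the slide, is taken as a simple tolerance):

    import numpy as np

    def conjugate_gradients(H, c, x0, tol=1e-8):
        # Minimize (1/2)<x,Hx> + <c,x> for symmetric positive definite H.
        x = x0.copy()
        g = H @ x + c
        p = -g
        for _ in range(len(c)):
            if np.linalg.norm(g) <= tol:
                break
            Hp = H @ p
            alpha = (g @ g) / (p @ Hp)
            x = x + alpha * p
            g_new = g + alpha * Hp
            beta = (g_new @ g_new) / (g @ g)
            p = -g_new + beta * p
            g = g_new
        return x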

  • Preconditioning

    Idea: change the variables $\bar{x} = Rx$ and define $M = R^T R$.

    Algorithm 2.2: Preconditioned CG

    Given $x_0$, set $g_0 = Hx_0 + c$, and let $v_0 = M^{-1} g_0$ and $p_0 = -v_0$.
    For $k = 0, 1, \ldots$, until convergence, perform the iteration
    $$\alpha_k = \langle g_k,\, v_k\rangle \,/\, \langle p_k,\, Hp_k\rangle$$
    $$x_{k+1} = x_k + \alpha_k p_k$$
    $$g_{k+1} = g_k + \alpha_k Hp_k$$
    $$v_{k+1} = M^{-1} g_{k+1}$$
    $$\beta_k = \langle g_{k+1},\, v_{k+1}\rangle \,/\, \langle g_k,\, v_k\rangle$$
    $$p_{k+1} = -v_{k+1} + \beta_k p_k$$

  • Lanczos method

    Idea: compute an orthonormal basis of the successive nested Krylov subspaces
    ⇒ makes $Q_k^T H Q_k$ tridiagonal

    Algorithm 2.3: Lanczos

    Given $r_0$, set $y_0 = r_0$, $q_{-1} = 0$.
    For $k = 0, 1, \ldots$, perform the iteration
    $$\gamma_k = \|y_k\|_2$$
    $$q_k = y_k / \gamma_k$$
    $$\delta_k = \langle q_k,\, Hq_k\rangle$$
    $$y_{k+1} = Hq_k - \delta_k q_k - \gamma_k q_{k-1}$$
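    A sketch of this three-term recurrence (dense H, and no reorthogonalization, which a practical code would add):

    import numpy as np

    def lanczos(H, r0, m):
        # m steps of Lanczos: orthonormal Q and tridiagonal T = Q^T H Q.
        n = len(r0)
        Q = np.zeros((n, m))
        delta, gamma = np.zeros(m), np.zeros(m)
        y, q_prev = r0.astype(float).copy(), np.zeros(n)
        for k in range(m):
            gamma[k] = np.linalg.norm(y)
            if gamma[k] == 0.0:                 # invariant subspace found
                Q, delta, gamma = Q[:, :k], delta[:k], gamma[:k]
                break
            q = y / gamma[k]
            Q[:, k] = q
            delta[k] = q @ (H @ q)
            y = H @ q - delta[k] * q - gamma[k] * q_prev
            q_prev = q
        T = np.diag(delta) + np.diag(gamma[1:], 1) + np.diag(gamma[1:], -1)
        return Q, T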

  • Another view on the conjugate-gradients method

    Conjugate gradients = Lanczos + $LDL^T$ (Cholesky):
    $$\underbrace{\text{Lanczos} + \text{Cholesky}}_{\text{conjugate gradients}}$$
    in one of the Krylov subspaces.

  • 2.2: The trust-region algorithm

  • The trust-region idea

    use a model of the objective function
    define a trust region where it is thought adequate:
    $$B_k = \{\, x \in \mathbb{R}^n \mid \|x - x_k\|_k \leq \Delta_k \,\}$$
    find a trial point by sufficiently decreasing the model in $B_k$
    compute the objective function at the trial point
    compare achieved vs. predicted reductions
    reduce $\Delta_k$ if unsatisfactory

  • The basic trust-region algorithm

    Algorithm 2.4: Basic trust-region algorithm (BTR)

    Step 0: Initialization. $x_0$ and $\Delta_0$ given, compute $f(x_0)$ and set $k = 0$.
    Step 1: Model definition. Choose $\|\cdot\|_k$ and define a model $m_k$ in $B_k$.
    Step 2: Step calculation. Compute $s_k$ that sufficiently reduces the model $m_k$ with $x_k + s_k \in B_k$.
    Step 3: Acceptance of the trial point. Compute $f(x_k + s_k)$ and define
    $$\rho_k = \frac{f(x_k) - f(x_k + s_k)}{m_k(x_k) - m_k(x_k + s_k)}.$$
    If $\rho_k \geq \eta_1$, then define $x_{k+1} = x_k + s_k$; otherwise define $x_{k+1} = x_k$.
    Step 4: Trust-region radius update. Set
    $$\Delta_{k+1} \in \begin{cases} [\Delta_k, \infty) & \text{if } \rho_k \geq \eta_2, \\ [\gamma_2 \Delta_k, \Delta_k] & \text{if } \rho_k \in [\eta_1, \eta_2), \\ [\gamma_1 \Delta_k, \gamma_2 \Delta_k] & \text{if } \rho_k < \eta_1. \end{cases}$$
    Increment $k$ by one and go to Step 1.
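    A compact sketch of Algorithm 2.4 with a quadratic model and a Cauchy-point step; the constants eta1 = 0.01, eta2 = 0.9 and the factors 0.5 and 2 are typical choices within the intervals the algorithm allows:

    import numpy as np

    def btr(f, grad, hess, x, delta=1.0, eta1=0.01, eta2=0.9,
            tol=1e-6, max_iter=500):
        # Basic trust-region method (BTR) with a Cauchy step.
        for _ in range(max_iter):
            g, H = grad(x), hess(x)
            gnorm = np.linalg.norm(g)
            if gnorm <= tol:
                break
            # Cauchy step: minimize the model along -g within the region
            gHg = g @ (H @ g)
            t = delta / gnorm if gHg <= 0 else min(gnorm ** 2 / gHg, delta / gnorm)
            s = -t * g
            pred = -(g @ s + 0.5 * s @ (H @ s))       # predicted reduction (> 0)
            rho = (f(x) - f(x + s)) / pred            # achieved vs. predicted
            if rho >= eta1:
                x = x + s                             # accept the trial point
            if rho >= eta2:
                delta *= 2.0                          # very successful: expand
            elif rho < eta1:
                delta *= 0.5                          # unsuccessful: shrink
        return x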

  • 2.3: Basic convergence theory

  • Assumptions

    $f \in C^2$
    $f(x) \geq \kappa_{\rm lbf}$ and $\|\nabla_{xx} f(x)\| \leq \kappa_{\rm ufh}$
    $m_k \in C^2(B_k)$
    $m_k(x_k) = f(x_k)$
    $g_k \stackrel{\text{def}}{=} \nabla_x m_k(x_k) = \nabla_x f(x_k)$
    $\|\nabla_{xx} m_k(x)\| \leq \kappa_{\rm umh} - 1$ for all $x \in B_k$
    uniformly equivalent norms: $\tfrac{1}{\kappa_{\rm une}}\, \|x\| \leq \|x\|_k \leq \kappa_{\rm une}\, \|x\|$

    ... but use $\|\cdot\|_k = \|\cdot\|_2$ in what follows!

  • The Cauchy step

    Idea: minimize $m_k$ on the Cauchy arc
    $$x_k^C(t) \stackrel{\text{def}}{=} \{\, x \mid x = x_k - t\, g_k,\ t \geq 0 \text{ and } x \in B_k \,\}.$$

    [Figure: the Cauchy point]

  • The Cauchy point for quadratic models

    Three cases when minimizing the quadratic $m_k$ along the Cauchy arc:

    [Figure: the three cases]

    $$m_k(x_k) - m_k(x_k^C) \geq \tfrac12\, \|g_k\| \min\left[\frac{\|g_k\|}{\beta_k},\, \Delta_k\right]$$

  • The Cauchy point for general models

    Three cases when minimizing the general $m_k$ along the Cauchy arc:

    [Figure: the three cases]

    $$m_k(x_k) - m_k(x_k^{AC}) \geq \kappa_{\rm dcp}\, \|g_k\| \min\left[\frac{\|g_k\|}{\beta_k},\, \Delta_k\right]$$

  • The meaning of sufficient decrease

    In both cases, we get the

    Sufficient decrease condition:
    $$m_k(x_k) - m_k(x_k + s_k) \geq \kappa_{\rm mdc}\, \|g_k\| \min\left[\frac{\|g_k\|}{\beta_k},\, \Delta_k\right]$$

    Immediate consequence:
    Suppose that $\nabla_x f(x_k) \neq 0$. Then $m_k(x_k + s_k) < m_k(x_k)$ and $s_k \neq 0$.

    ⇒ $\rho_k$ is well defined!

  • The exact minimizer is OK

    Suppose that, for all $k$, $s_k$ ensures that
    $$m_k(x_k) - m_k(x_k + s_k) \geq \kappa_{\rm amm}\, \left[\, m_k(x_k) - m_k(x_k^M)\, \right],$$
    where $x_k^M$ is the model minimizer. Then sufficient decrease is obtained.

    [Figure: model contours and the model minimizer]

  • Taylor and minimum radius

    For all $k$,
    $$|f(x_k + s_k) - m_k(x_k + s_k)| \leq \kappa_{\rm ubh}\, \Delta_k^2.$$

    Suppose that $g_k \neq 0$ and that
    $$\Delta_k \leq \frac{\kappa_{\rm mdc}\, \|g_k\|\, (1 - \eta_2)}{\kappa_{\rm ubh}}.$$
    Then iteration $k$ is very successful and $\Delta_{k+1} \geq \Delta_k$.

    Suppose that $\|g_k\| \geq \kappa_{\rm lbg} > 0$ for all $k$. Then there is a constant $\kappa_{\rm lbd} > 0$ such that, for all $k$,
    $$\Delta_k \geq \kappa_{\rm lbd}.$$

  • First-order convergence (1)

    Suppose that there are only finitely many successful iterations. Then $x_k = x_*$ for all sufficiently large $k$ and $x_*$ is first-order critical.

    Suppose that there are infinitely many successful iterations. Then
    $$\liminf_{k \to \infty}\, \|\nabla_x f(x_k)\| = 0.$$

    Idea: infinite descent if not critical.

  • First-order convergence (2)

    Suppose that there are infinitely many successful iterations.

    Then
    $$\lim_{k \to \infty}\, \|\nabla_x f(x_k)\| = 0.$$

    [Illustration of the proof argument, on subsequences of iterations with $\|g_k\| \geq \epsilon_1 > 0$]

  • Convex models (1)

    Suppose that $\lambda_{\min}[\nabla_{xx} m_k(x)] \geq \epsilon$ for all $x \in [x_k, x_k + s_k]$ and for some $\epsilon > 0$. Then
    $$\|s_k\| \leq \frac{2}{\epsilon}\, \|g_k\|.$$

    Idea: $m_k$ curves upwards!

    Suppose that $\{x_{k_i}\} \to x_*$ where $x_*$ is first-order critical, and that there is a constant $\kappa_{\rm smh} > 0$ such that
    $$\min_{x \in B_k}\, \lambda_{\min}[\nabla_{xx} m_k(x)] \geq \kappa_{\rm smh}$$
    whenever $x_k$ is sufficiently close to $x_*$. Suppose finally that $\nabla_{xx} f(x_*)$ is nonsingular. Then the complete sequence of iterates $\{x_k\}$ converges to $x_*$.

    Idea: steps too short to escape the local basin.

  • Convex models (2)

    But...

    [Two contour plots]

  • Asymptotically exact Hessians

    Assume also that
    $$\lim_{k \to \infty}\, \|\nabla_{xx} f(x_k) - \nabla_{xx} m_k(x_k)\| = 0 \quad \text{whenever} \quad \lim_{k \to \infty}\, \|g_k\| = 0.$$

    Suppose that $\{x_{k_i}\} \to x_*$ where $x_*$ is first-order critical, that $s_k \neq 0$ for all $k$ sufficiently large, and that $\nabla_{xx} f(x_*)$ is positive definite. Then the complete sequence of iterates $\{x_k\}$ converges to $x_*$, all iterations are eventually very successful, and the trust-region radius $\Delta_k$ is bounded away from zero.

    Idea: sufficient decrease implies that
    $$m_k(x_k) - m_k(x_k + s_k) \geq \kappa_{\rm mqd}\, \|s_k\|^2 > 0.$$
    Then $\rho_k \to 1$.

  • Second order: the eigen point

    Assume $0 > \lambda_k = \lambda_{\min}(H_k)$. Then define the eigen direction $u_k$ such that
    $$\langle u_k,\, g_k\rangle \leq 0, \qquad \|u_k\|_k = \Delta_k, \qquad \langle u_k,\, H_k u_k\rangle \leq \kappa_{\rm snc}\, \lambda_k\, \Delta_k^2.$$

    Minimize the model along $u_k$ to compute the eigen point:
    $$m_k(x_k^E) = m_k(x_k + t_k^E u_k) = \min_{t \in (0,1]}\, m_k(x_k + t\, u_k)$$

    [Figure: contour plot with the eigen direction]

  • Model decrease at the eigen point

    Suppose $0 > \lambda_k = \lambda_{\min}(H_k)$, $u_k$ is an eigen direction, and
    $$\|\nabla_{xx} m_k(x) - \nabla_{xx} m_k(y)\| \leq \kappa_{\rm lch}\, \|x - y\|$$
    for all $x, y \in B_k$. Then
    $$m_k(x_k) - m_k(x_k^E) \geq \kappa_{\rm sod}\, |\lambda_k| \min[\lambda_k^2,\, \Delta_k^2].$$
    (quadratic or general model)

    [Figure: model decrease along the eigen direction, quadratic and general cases]

  • Second order: convergence theorems

    $$\limsup_{k \to \infty}\, \lambda_{\min}[\nabla_{xx} f(x_k)] \geq 0.$$

    Suppose that $x_*$ is an isolated limit point of the sequence of iterates $\{x_k\}$. Then $x_*$ is a second-order critical point.

    Assume also that, for $\gamma_3 > 1$,
    $$\rho_k \geq \eta_2 \ \text{ and } \ \Delta_k \leq \Delta_{\max} \ \Rightarrow\ \Delta_{k+1} \in [\gamma_3 \Delta_k,\, \gamma_4 \Delta_k].$$
    Let $x_*$ be any limit point of the sequence of iterates. Then $x_*$ is a second-order critical point.

  • Different trust-region norms

    [Figure: trust-region shapes for different norms]

  • Using norms for scaling

    Idea: change the variables $S_k w = s$.

    Then
    $$m_k^S(x_k + w) \approx f(x_k + S_k w) \stackrel{\text{def}}{=} f^S(w), \qquad B_k^S = \{\, x_k + w \mid \|w\| \leq \Delta_k \,\},$$
    $$m_k^S(x_k) = f(x_k), \qquad g_k^S = \nabla_w f^S(0) = S_k^T \nabla_x f(x_k), \qquad H_k^S \approx \nabla_{ww} f^S(0) = S_k^T \nabla_{xx} f(x_k)\, S_k.$$
    Thus
    $$\begin{aligned} m_k^S(x_k + w) &= f(x_k) + \langle g_k^S,\, w\rangle + \tfrac12 \langle w,\, H_k^S w\rangle \\ &= f(x_k) + \langle S_k^T \nabla_x f(x_k),\, w\rangle + \tfrac12 \langle w,\, S_k^T H_k S_k w\rangle \\ &= f(x_k) + \langle \nabla_x f(x_k),\, S_k w\rangle + \tfrac12 \langle S_k w,\, H_k S_k w\rangle \\ &= f(x_k) + \langle \nabla_x f(x_k),\, s\rangle + \tfrac12 \langle s,\, H_k s\rangle = m_k(x_k + s). \end{aligned}$$

  • Scaling: the geometry

    [Figure: the geometry of scaling]

  • 2.4: Solving the subproblem

  • The subproblem

    Assume the Euclidean norm and a quadratic model (possibly non-convex); drop the index $k$:
    $$\min_{s \in \mathbb{R}^n}\ q(s) \equiv \langle g,\, s\rangle + \tfrac12 \langle s,\, Hs\rangle \quad \text{subject to} \quad \|s\|_2 \leq \Delta$$

  • Possible approaches

    exact minimization
    truncated conjugate gradients
    CG + Lanczos (GLTR)
    doglegs
    eigenvalue-based methods
    (projection methods)

  • The exact minimizer

    Any global minimizer of $q(s)$ subject to $\|s\|_2 \leq \Delta$ satisfies the equation
    $$H(\lambda^M)\, s^M = -g,$$
    where $H(\lambda^M) \stackrel{\text{def}}{=} H + \lambda^M I$ is positive semi-definite, $\lambda^M \geq 0$ and $\lambda^M (\|s^M\|_2 - \Delta) = 0$.

    If $H(\lambda^M)$ is positive definite, $s^M$ is unique.

    Note: $\lambda^M$ is the Lagrange multiplier.

  • The exact minimizer: a geometrical view

    [Figure: geometrical view of the exact minimizer]

  • The convex case

    [Figure: $\|s(\lambda)\|_2$ against $\lambda$ in the convex case, with the solution curve]

  • A nonconvex case

    [Figure: $\|s(\lambda)\|_2$ against $\lambda$ in a nonconvex case, with the pole at $-\lambda_1$]

  • The hard case: $\langle u_1,\, g\rangle = 0$

    [Figure: $\|s(\lambda)\|_2$ in the hard case: no root of $\|s(\lambda)\|_2 = \Delta$ larger than $-\lambda_1 = 2$; root near 2.2]

  • Near the hard case: $\langle u_1,\, g\rangle \to 0$

    [Figure: $\|s(\lambda)\|_2$ for $\langle u_1,\, g\rangle = 1,\ 0.1,\ 0.01,\ 0.00001$]

  • The secular equation

    Idea: consider the secular equation
    $$\phi(\lambda) \stackrel{\text{def}}{=} \frac{1}{\|s(\lambda)\|_2} - \frac{1}{\Delta} = 0.$$

    [Figure: $1/\|s(\lambda)\|_2$ for $\langle u_1,\, g\rangle = 1,\ 0.1,\ 0.01,\ 0.00001$]

    Then apply Newton's method to $\phi(\lambda) = 0$:
    $$\lambda^+ = \lambda - \phi(\lambda)/\phi'(\lambda).$$

  • The derivatives of $\phi(\lambda)$

    Suppose $g \neq 0$. Then, for $\lambda > -\lambda_1$:

    $\phi(\lambda)$ is strictly increasing and concave, and
    $$\phi'(\lambda) = -\frac{\langle s(\lambda),\, \nabla_\lambda s(\lambda)\rangle}{\|s(\lambda)\|_2^3}, \quad \text{where} \quad \nabla_\lambda s(\lambda) = -H(\lambda)^{-1} s(\lambda).$$

    Note: if $H(\lambda) = LL^T$ and $Lw = s(\lambda)$, then
    $$\langle s(\lambda),\, H(\lambda)^{-1} s(\lambda)\rangle = \langle s(\lambda),\, L^{-T} L^{-1} s(\lambda)\rangle = \|w\|^2.$$

  • Newton's method on the secular equation

    Algorithm 2.5: Exact trust-region solver

    Let $\lambda > -\lambda_1$ and $\Delta > 0$ be given.
    1. Factorize $H(\lambda) = LL^T$.
    2. Solve $LL^T s = -g$.
    3. Solve $Lw = s$.
    4. Replace $\lambda$ by
    $$\lambda + \left(\frac{\|s\|_2 - \Delta}{\Delta}\right) \left(\frac{\|s\|_2^2}{\|w\|_2^2}\right).$$

    But ... more complications due to:
    bracketing the root (initialization + update)
    termination rules
    may be preconditioned

    Moré (1978), Moré-Sorensen (1983), Dollar-Gould-Robinson (2009)
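    The core of steps 1-4 in code, assuming H + lam*I is positive definite at the current lam (a full solver adds the bracketing, termination and preconditioning safeguards mentioned above):

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve, solve_triangular

    def secular_newton_step(H, g, lam, delta):
        # One Newton step on phi(lam) = 1/||s(lam)||_2 - 1/delta.
        n = H.shape[0]
        L, low = cho_factor(H + lam * np.eye(n), lower=True)  # H(lam) = L L^T
        s = cho_solve((L, low), -g)                           # L L^T s = -g
        w = solve_triangular(L, s, lower=True)                # L w = s
        ns, nw = np.linalg.norm(s), np.linalg.norm(w)
        return lam + ((ns - delta) / delta) * (ns ** 2 / nw ** 2)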

  • Approximate solution by truncated CG

    Fact: CG never re-enters the $\ell_2$ trust region.

    [Figure: truncated-CG iterates and the trust region]

    May be preconditioned. Steihaug (1983), T. (1981)
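    A sketch of the truncated-CG (Steihaug-Toint) idea: run CG on the model and stop at the boundary or on negative curvature; the quadratic boundary-intersection helper is the standard construction:

    import numpy as np

    def to_boundary(s, p, delta):
        # positive tau with ||s + tau p||_2 = delta
        a, b, c = p @ p, 2.0 * (s @ p), s @ s - delta ** 2
        return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

    def truncated_cg(H, g, delta, tol=1e-8):
        # Approximately minimize <g,s> + (1/2)<s,Hs> s.t. ||s||_2 <= delta.
        n = len(g)
        s, r, p = np.zeros(n), g.astype(float).copy(), -g.astype(float)
        for _ in range(n):
            if np.linalg.norm(r) <= tol:
                break
            Hp = H @ p
            curv = p @ Hp
            if curv <= 0.0:                              # negative curvature
                return s + to_boundary(s, p, delta) * p
            alpha = (r @ r) / curv
            if np.linalg.norm(s + alpha * p) >= delta:   # would leave the region
                return s + to_boundary(s, p, delta) * p
            s = s + alpha * p
            r_new = r + alpha * Hp
            p = -r_new + ((r_new @ r_new) / (r @ r)) * p
            r = r_new
        return s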

  • Approximate solution by the GLTR

    ST might hit the boundary for a steepest-descent step ⇒ sometimes slow.

    Idea: solve the subproblem on the nested Krylov subspaces.

    Algorithm 2.6: Two-phase GLTR algorithm

    as long as interior: conjugate gradients
    on the boundary: Lanczos method + subproblem solution in Krylov space
    (smooth transition)

    Gould-Lucidi-Roma-T. (1999)

  • Doglegs

    Idea: use steepest descent and the full Newton step (requires convexity?).

    [Figure: the dogleg and double-dogleg curves from 0 through $s^C$, $s^D$ and $s^{DD}$ to the Newton step $s^N$, with the trust-region boundary]

    Powell (1970), Dennis-Mei (1979)
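    A sketch of the single-dogleg step for positive definite H (the double-dogleg variant inserts an extra knot between the Cauchy and Newton points):

    import numpy as np

    def dogleg_step(H, g, delta):
        # Dogleg step for min <g,s> + (1/2)<s,Hs> subject to ||s||_2 <= delta.
        sN = np.linalg.solve(H, -g)                  # full Newton step
        if np.linalg.norm(sN) <= delta:
            return sN
        sC = -((g @ g) / (g @ (H @ g))) * g          # unconstrained Cauchy step
        if np.linalg.norm(sC) >= delta:
            return -(delta / np.linalg.norm(g)) * g  # steepest descent to boundary
        d = sN - sC                                  # second leg of the dogleg
        a, b, c = d @ d, 2.0 * (sC @ d), sC @ sC - delta ** 2
        tau = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
        return sC + tau * d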

  • An eigenvalue approach

    Rewrite $(H + \lambda M)\, s = -g$ as
    $$\begin{pmatrix} H & g \end{pmatrix} \begin{pmatrix} s \\ 1 \end{pmatrix} = -\lambda\, M s,$$
    or (introducing the parameter $\mu$)
    $$\begin{pmatrix} H & g \\ g^T & \mu \end{pmatrix} \begin{pmatrix} s \\ 1 \end{pmatrix} = -\lambda(\mu) \begin{pmatrix} M & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} s \\ 1 \end{pmatrix};$$
    choose $\mu$ such that $\lambda \geq 0$, $H + \lambda M$ is positive semi-definite, and $\lambda\, (\|s^M\| - \Delta) = 0$.

    Rendl-Wolkowicz (1997), Rojas-Santos-Sorensen (1999)

  • Bibliography for lesson 2 (1)

    J. E. Dennis and H. H. W. Mei, Two New Unconstrained Optimization Algorithms Which Use Function and Gradient Values, Journal of Optimization Theory and Applications, 28(4):453-482, 1979.

    H. S. Dollar, N. I. M. Gould and D. P. Robinson, On solving trust-region and other regularised subproblems in optimization, Report RAL-TR-2009-003, Rutherford Appleton Laboratory, Chilton, UK, 2009.

    S. M. Goldfeldt, R. E. Quandt and H. F. Trotter, Maximization by quadratic hill-climbing, Econometrica, 34:541-551, 1966.

    N. I. M. Gould, S. Lucidi, M. Roma and Ph. L. Toint, Solving the trust-region subproblem using the Lanczos method, SIAM Journal on Optimization, 9(2):504-525, 1999.

    K. Levenberg, A Method for the Solution of Certain Problems in Least Squares, Quarterly Journal on Applied Mathematics, 2:164-168, 1944.

    D. Marquardt, An Algorithm for Least-Squares Estimation of Nonlinear Parameters, SIAM Journal on Applied Mathematics, 11:431-441, 1963.

    J. J. Moré, The Levenberg-Marquardt algorithm: implementation and theory, in Numerical Analysis, Dundee 1977 (A. Watson, ed.), Springer Verlag, Heidelberg, 1978.

    J. J. Moré, Recent developments in algorithms and software for trust region methods, in Mathematical Programming: The State of the Art (A. Bachem, M. Grötschel and B. Korte, eds.), Springer Verlag, Heidelberg, pp. 258-287, 1983.

    J. J. Moré and D. C. Sorensen, Computing a Trust Region Step, SIAM Journal on Scientific and Statistical Computing, 4(3):553-572, 1983.

  • Bibliography for lesson 2 (2)

    Methods for nonlinear least squares problems and convergence proofs,Proceedings of the Seminar on Tracking Programs and Orbit Determination, (J. Lorell and F. Yagi, eds.), Jet PropulsionLaboratory, Pasadena, USA, pp. 1-9, 1960.

    M. J. D. Powell,A New Algorithm for Unconstrained Optimization,in Nonlinear Programming (J. B. Rosen, O. L. Mangasarian and K. Ritter, eds.), Academic Press, London, pp. 31-65,1970.

    F. Rendl and H. Wolkowicz,

    A Semidefinite Framework for Trust Region Subproblems with Applications to Large Scale Minimization,Mathematical Programming, 77(2):273-299, 1997.

    M. Rojas, S. A. Santos and D. C. Sorensen,A new matrix-free algorithm for the large-scale trust-region subproblem,CAAM, Rice University, TR99-19, 1999.

    D. Winfield,Function and functional optimization by interpolation in data tables,Ph.D. Thesis, Harvard University, Cambridge, USA, 1969.

    Philippe Toint (Namur) April 2009 103 / 323

    Derivative free optimization, filters and other topics


    Lesson 3:

Derivative-free optimization, infinite dimensions and filters

    Philippe Toint (Namur) April 2009 104 / 323

    Derivative free optimization, filters and other topics Derivative free optimization


    3.1: Derivative-free optimization

    Philippe Toint (Namur) April 2009 105 / 323

    Derivative free optimization, filters and other topics Derivative free optimization

    An application of trust-regions: unconstrained DFO

    Consider the unconstrained problem


min_x f(x)

Gradient (and Hessian) of f(x) unavailable:

physical measurement

object code

typically small-scale (but not always. . . )

Derivative-free optimization (DFO): f(x) typically very costly

⇒ exploit each evaluation of f(x) to the utmost possible

⇒ considerable interest from practitioners

    Philippe Toint (Namur) April 2009 106 / 323

    Derivative free optimization, filters and other topics Derivative free optimization

    Interpolation methods for DFO

    Idea: Winfield (1973), Powell (1994)


    Until convergence:

Use the available function values to build a polynomial interpolation model m_k:

m_k(y_i) = f(y_i) for all y_i ∈ Y;

minimize the model in a trust region, yielding a new potentially good point;

compute a new function value.

Y = interpolation set = {points y_i at which f(y_i) is known}
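As a concrete sketch of the model-building step (illustrative, not the lecture's code): the interpolation conditions form a square linear system, and poisedness of Y is precisely the nonsingularity of the matrix M built below.

```python
import numpy as np
from itertools import combinations_with_replacement

def quad_basis(x):
    """Monomial basis of quadratics in R^n: 1, x_i, x_i * x_j."""
    n = len(x)
    quads = [x[i] * x[j] for i, j in combinations_with_replacement(range(n), 2)]
    return np.array([1.0, *x, *quads])

def build_model(Y, fY):
    """Coefficients c of the quadratic with m(y_i) = f(y_i) for all y_i in Y.

    Needs |Y| = (n+1)(n+2)/2 points and a nonsingular (poised) system."""
    M = np.array([quad_basis(y) for y in Y])
    return np.linalg.solve(M, np.asarray(fY, dtype=float))

def model_value(c, x):
    return float(c @ quad_basis(x))
```

For n = 2 this requires |Y| = 6 points, the case illustrated in the poisedness slides below.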

    Philippe Toint (Namur) April 2009 107 / 323


Derivative free optimization, filters and other topics Derivative free optimization

A naive trust-region method for DFO: illustration

[Figure: three successive frames of the naive interpolation-based trust-region iteration on a test surface plotted over roughly [-2, 3] x [-2, 3], with function values between 0 and 50.]

Philippe Toint (Namur) April 2009 108 / 323

    Derivative free optimization, filters and other topics Derivative free optimization

    Interpolation methods for DFO (2)

    http://find/
  • 8/10/2019 Advanced Algorithms in Nonlinear Optimization

    123/429

    To be considered:

poisedness of the interpolation set Y

choice of models (linear, quadratic, in between, beyond)

convergence theory

numerical performance

    Philippe Toint (Namur) April 2009 109 / 323

    Derivative free optimization, filters and other topics Derivative free optimization

    Poisedness

    Assume a quadratic model


m_k(x_k + s) = f_k + ⟨g_k, s⟩ + ½ ⟨s, H_k s⟩

Thus

p = 1 + n + ½ n(n+1) = ½ (n+1)(n+2)

parameters to determine ⇒ need p function values (|Y| = p)

Not sufficient!

⇒ need geometric conditions for the points in Y . . .

    Philippe Toint (Namur) April 2009 110 / 323

Derivative free optimization, filters and other topics Derivative free optimization

Poisedness: geometry with n = 2, p = 6

[Figure: six interpolation points in the plane with their function values plotted as a surface in R^3.]

With these 6 data points in R^3 . . .

[Figure: a quadratic interpolant through the six points.]

. . . is this the correct interpolation?

[Figure: a second, different quadratic through the same six points.]

. . . or this?

[Figure: a third quadratic through the same six points.]

. . . or this?

[Figure: the difference of two such interpolants.]

The difference . . . is zero on a quadratic curve containing Y!
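This non-uniqueness is easy to reproduce numerically: six points on a conic (here the unit circle) all satisfy the quadratic equation x² + y² − 1 = 0, so the interpolation matrix is singular. A small illustrative check:

```python
import numpy as np

def quad_basis2(x):                    # quadratic monomials in R^2
    return np.array([1.0, x[0], x[1], x[0]**2, x[0]*x[1], x[1]**2])

thetas = np.linspace(0.0, 2.0 * np.pi, 7)[:-1]   # 6 points on the unit circle
M = np.array([quad_basis2((np.cos(t), np.sin(t))) for t in thetas])
print(np.linalg.matrix_rank(M))        # 5, not 6: Y is not poised for quadratics
```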

    Philippe Toint (Namur) April 2009 111 / 323


    Derivative free optimization, filters and other topics Derivative free optimization

    Lagrange polynomials

Remarkable: replace y by y⁺ in Y:

δ(Y⁺) / δ(Y) = L(y⁺, y) is independent of the basis {φ_i(·)}_{i=1}^p

(here δ(Y) denotes the determinant of the interpolation matrix in the chosen basis), where

for all ŷ ∈ Y, L(ŷ, y) = 1 if ŷ = y and 0 if ŷ ≠ y

is the Lagrange fundamental polynomial.

Note: for quadratic interpolation, L(·, y) is a quadratic polynomial! Powell (1994)
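Taking δ(Y) to be the determinant of the interpolation matrix, the identity is easy to verify numerically; the point set and y⁺ below are hypothetical, chosen only so that Y is poised:

```python
import numpy as np

def quad_basis2(x):
    return np.array([1.0, x[0], x[1], x[0]**2, x[0]*x[1], x[1]**2])

Y = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]   # poised for quadratics
M = np.array([quad_basis2(y) for y in Y])

i, yplus = 2, (0.3, 0.7)
c = np.linalg.solve(M, np.eye(6)[i])     # coefficients of L(., y_i)
Mplus = M.copy()
Mplus[i] = quad_basis2(yplus)

# det(Y+)/det(Y) equals the Lagrange polynomial value L(y+, y_i)
print(np.linalg.det(Mplus) / np.linalg.det(M), c @ quad_basis2(yplus))
```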

    Philippe Toint (Namur) April 2009 113 / 323

    Derivative free optimization, filters and other topics Derivative free optimization

    Interpolation using Lagrange polynomials

Idea: use the Lagrange polynomials to define the (quadratic) interpolant by

m_k(x_k + s) = Σ_{y ∈ Y_k} f(y) L_k(x_k + s, y)

And then . . .

|f(x_k + s) − m_k(x_k + s)| ≤ κ Σ_{y ∈ Y_k} ‖x_k + s − y‖² |L_k(x_k + s, y)|

(for a constant κ depending only on the smoothness of f, not on s)

    Philippe Toint (Namur) April 2009 114 / 323

Derivative free optimization, filters and other topics Derivative free optimization

Interpolation using Lagrange polynomials: construction

[Figure sequence, one frame per slide:]

The original function . . .

. . . and the interpolation set

The first Lagrange polynomial

The second Lagrange polynomial

The third Lagrange polynomial

The fourth Lagrange polynomial

The fifth Lagrange polynomial

The sixth Lagrange polynomial

The final interpolating quadratic

Philippe Toint (Namur) April 2009 115 / 323

    Derivative free optimization, filters and other topics Derivative free optimization

    Other algorithmic ingredients

include a new point in the interpolation set

⇒ need to drop an existing interpolation point?

⇒ select which one to drop: make Y as poised as possible

Note: the model/function minimizer may produce bad geometry!! ⇒ geometry improvement procedure . . .

trust-region radius management

trust region = B_k = {x_k + s | ‖s‖ ≤ Δ_k}

standard: reduce Δ_k when no progress

DFO: more complicated! (could reduce Δ_k too fast and prevent convergence. . . )

⇒ verify that Y is poised before reducing Δ_k

    Philippe Toint (Namur) April 2009 116 / 323

    Derivative free optimization, filters and other topics Derivative free optimization

    Improving the geometry in a ball


[Figure: the ball of radius Δ_k around x_k]

attempt to reuse past points that are close to x_k

attempt to replace a distant point of Y

attempt to replace a close point of Y

good geometry for the current Δ_k ⇔ improvement impossible

    Philippe Toint (Namur) April 2009 117 / 323

    Derivative free optimization, filters and other topics Derivative free optimization

    Self-correction at unsuccessful iterations (1)

At iteration k, define the set of exchangeable far points:

F_k = { y ∈ Y_k | ‖y − x_k‖ > Δ_k and L_k(x_k + s_k, y) ≠ 0 }

and the set of exchangeable close points (for some Λ > 1):

C_k = { y ∈ Y_k \ {x_k} | ‖y − x_k‖ ≤ Δ_k and |L_k(x_k + s_k, y)| > Λ }
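In code, these sets could drive the choice of which point the trial x_k + s_k replaces, preferring far points over close ones; a hypothetical helper, where Lvals[i] stands for L_k(x_k + s_k, y_i) and Lambda for the threshold Λ above:

```python
import numpy as np

def choose_replacement(Y, Lvals, xk, delta, Lambda=1.5):
    """Index of the interpolation point to swap out for x_k + s_k, or None."""
    far = [i for i, y in enumerate(Y)
           if np.linalg.norm(np.asarray(y) - xk) > delta and Lvals[i] != 0.0]
    if far:                                   # F_k: exchangeable far points first
        return max(far, key=lambda i: abs(Lvals[i]))
    close = [i for i, y in enumerate(Y)
             if not np.allclose(np.asarray(y), xk)        # exclude x_k itself
             and np.linalg.norm(np.asarray(y) - xk) <= delta
             and abs(Lvals[i]) > Lambda]      # C_k: exchangeable close points
    if close:
        return max(close, key=lambda i: abs(Lvals[i]))
    return None
```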

    Philippe Toint (Namur) April 2009 118 / 323

    Derivative free optimization, filters and other topics Derivative free optimization

    Self-correction at unsuccessful iterations (2)

    Remarkably,

Whenever

iteration k is unsuccessful,

F_k = ∅, and

Δ_k is small w.r.t. ‖g_k‖,

then C_k ≠ ∅.

(an improvement of the geometry by a factor Λ is always possible at unsuccessful iterations when Δ_k is small and all exchangeable far points have been considered)

⇒ no need to reduce Δ_k forever!

    Philippe Toint (Namur) April 2009 119 / 323

    Derivative free optimization, filters and other topics Derivative free optimization

    Trust-region algorithm for DFO (1)

    Algorithm 3.1: TR for DFO

Step 0: Initialization. Given: x_0, Δ_0, Y_0 (⇒ L_0(·, y)). Set k = 0.

Step 1: Criticality test. [complicated and not discussed here]

Step 2: Solve the subproblem. Compute s_k that sufficiently reduces m_k(x_k + s) within the trust region.

Step 3: Evaluation. Compute f(x_k + s_k) and

ρ_k = [ f(x_k) − f(x_k + s_k) ] / [ m_k(x_k) − m_k(x_k + s_k) ].

Step 4: Define the next iterate and interpolation set. ⇐ the big question

Step 5: Update the Lagrange polynomials.
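To fix ideas, a deliberately naive instantiation of this loop with linear models: the set Y_k is rebuilt around x_k at every iteration, so the criticality test, geometry management and Lagrange updates all trivialize. Constants and names are illustrative, not the lecture's.

```python
import numpy as np

def dfo_tr(f, x0, delta=0.5, tol=1e-8, max_iter=500):
    """Naive DFO trust-region loop with linear interpolation models."""
    xk = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # interpolation set: x_k and x_k + delta * e_i  (n + 1 points)
        fk = f(xk)
        g = np.array([(f(xk + delta * e) - fk) / delta for e in np.eye(len(xk))])
        if np.linalg.norm(g) <= tol:
            break
        sk = -delta * g / np.linalg.norm(g)          # model minimizer on boundary
        rho = (fk - f(xk + sk)) / (delta * np.linalg.norm(g))
        if rho >= 0.1:                               # successful: accept, enlarge
            xk = xk + sk
            delta = min(2.0 * delta, 1.0)
        else:                                        # unsuccessful: shrink region
            delta = 0.5 * delta
    return xk
```

Calling dfo_tr on Rosenbrock's function, e.g. dfo_tr(lambda x: (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2, [-1.2, 1.0]), gives the flavour of the banana-function animation below, at the cost of n + 2 evaluations per iteration.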

    Philippe Toint (Namur) April 2009 120 / 323


    Derivative free optimization, filters and other topics Derivative free optimization

    Global convergence results

If the model is at least fully linear, then

lim inf_{k→∞} ‖∇_x f(x_k)‖ = lim inf_{k→∞} ‖g_k‖ = 0

Scheinberg and T. (2009)

With a more costly algorithm:

If the model is at least fully linear, then

lim_{k→∞} ‖∇_x f(x_k)‖ = lim_{k→∞} ‖g_k‖ = 0

If the model is at least fully quadratic, then the iterates converge to 2nd-order critical points

    Philippe Toint (Namur) April 2009 122 / 323

    Derivative free optimization, filters and other topics Derivative free optimization

    For an efficient numerical method. . .

Many more issues:

which Hessian approximation? (full vs diagonal or structured)

details of criticality tests difficult

details of numerically handling interpolation polynomials (Lagrange, Newton), reference shifts, . . .

good codes around: NEWUOA, DFO ⇒ efficient solvers

Powell (2008 and previously), Conn, Scheinberg and T. (1998), Conn, Scheinberg and Vicente (2008)

    Philippe Toint (Namur) April 2009 123 / 323

Derivative free optimization, filters and other topics Derivative free optimization

On the ever famous banana function. . .

[Figure: an animated sequence of frames (one per slide in the original) of the interpolation-based trust-region DFO method on Rosenbrock's banana function, plotted over roughly [0, 1.2] x [0, 1.5].]

Philippe Toint (Namur) April 2009 124 / 323