TRANSCRIPT
Global Optimization Software
Doron Pearl, Jonathan Li, Olesya Peshko, Xie Feng
What is global optimization?
Global optimization is aimed at finding the best solution of a constrained optimization problem which may also have various local optima.
General global optimization problem (GOP)
Given a bounded, robust set D in the real n-space R^n and a continuous function f: D → R, find
global min f(x), subject to the constraint x ∈ D.
Note: a robust set is the closure of its nonempty interior.
First, we have to tell you:
No single optimization package can solve all global optimization problems efficiently.
Two General Classes In Global Optimization
Deterministic: grid search, branch and bound.
Stochastic: simulated annealing, tabu search, genetic algorithms, statistical algorithms.
Deterministic class and software
Actually, we can further classify the deterministic class into two sub-classes:
- Explicit function required, such as BARON.
- Explicit function not required, such as LGO (Lipschitz Global Optimization).
Remarks:
1. Among present deterministic solvers, there are more solvers in the first class than in the second.
2. Even though LGO is regarded as solving the problem deterministically, the solution is not always guaranteed to be the deterministic global optimum.
3. There are further solvers in the first class that won't be discussed in detail, but they will be included in the comparison in later slides.
LGO Lipschitz Global Optimization
LGO addresses the model

$$\min f(x), \qquad x \in D = \{x \in D_0 : f_j(x) \le 0,\ j = 1,\dots,J\}.$$

$D_0 \subseteq R^n$ represents a 'simple' explicit constraint set: frequently, it is a finite n-dimensional interval or simplex, or $R^n$ itself.
Furthermore, the objective function and the constraint functions are Lipschitz-continuous on $D_0$; that is, they satisfy the relation

$$|f_j(x_1) - f_j(x_2)| \le L_j \|x_1 - x_2\|.$$
LGO Lipschitz Global Optimization
Three Key Components in the approach:
Lipschitz Continuous Function
Adaptive Partition Strategy
Branch and Bound
Lipschitz Continuous Function
With the Lipschitz continuity property

$$|f_j(x_1) - f_j(x_2)| \le L_j \|x_1 - x_2\|,$$

we can make the following observations about the function:
- The 'slope' is bounded with respect to the system input variables x.
- If a function is Lipschitz continuous on a compact domain, the existence of a bound on the function is guaranteed.
- On the other hand, without this property, one cannot provide a lower bound, on the sole basis of sample points and corresponding function values, after any finite number of function evaluations on D.
Remark: it is not necessary to compute L in global optimization, but its existence is a necessary condition for having a lower bound.
Lipschitz Continuous Function
For a Lipschitz continuous function, the more sample points we have, the more accurate an approximation of the lower bound we can obtain.
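This can be illustrated with a small sketch (the function name is illustrative, not from LGO): each sample (x_i, f(x_i)) induces the valid underestimator f(x_i) − L·|x − x_i|, and the pointwise maximum of these cones is a lower envelope whose minimum over [a, b] is a lower bound that can only improve as samples are added.

```python
# Sketch: Lipschitz-based lower bound on [a, b] from sample points.
# Assumes a known Lipschitz constant L; evaluated on a dense grid.

def lipschitz_lower_bound(samples, L, a, b, grid=10001):
    """Lower-bound min f over [a, b] via the sawtooth envelope
    max_i (f_i - L * |x - x_i|) built from (x_i, f_i) samples."""
    def envelope(x):
        return max(fx - L * abs(x - xs) for xs, fx in samples)
    pts = [a + (b - a) * k / (grid - 1) for k in range(grid)]
    return min(envelope(x) for x in pts)

f = lambda x: x * x                     # true minimum on [-1, 2] is 0
few = lipschitz_lower_bound([(x, f(x)) for x in (-1.0, 0.5, 2.0)], 4.0, -1.0, 2.0)
more = lipschitz_lower_bound(
    [(x, f(x)) for x in (-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0)], 4.0, -1.0, 2.0)
print(few <= more <= 0.0)               # more samples => tighter (larger) bound
```

With only three samples the envelope dips far below the true minimum; adding four more samples raises the certified lower bound toward 0.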
Adaptive partition strategy
Usually implemented on a relaxed feasible set, such as:
- Interval set: a < x < b (x, a, b are vectors). The strategy is to partition the interval into sub-intervals by bisection; in higher dimensions, the interval can be regarded as a box.
- Simplex set: the strategy is to partition the simplex into sub-simplices by cutting one vertex off at a time.
- Convex cone set: the strategy is to partition the cone into sub-cones.
Remark: as you may see, a partition usually should:
- create linear bound constraints for each partition;
- fulfill an "exhaustive search".
The choice of partition strategy usually depends on the quality of the relaxation, such as its tightness.
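A minimal sketch of the interval (box) strategy, bisecting along the widest coordinate (a common choice; the names are illustrative):

```python
# Sketch of box bisection: split a box along its widest coordinate.
# A box is a list of (low, high) pairs, one per dimension.

def bisect_box(box):
    """Split `box` into two sub-boxes along its widest side."""
    widths = [hi - lo for lo, hi in box]
    j = widths.index(max(widths))        # widest coordinate
    lo, hi = box[j]
    mid = 0.5 * (lo + hi)
    left, right = list(box), list(box)
    left[j] = (lo, mid)                  # each child keeps linear bound
    right[j] = (mid, hi)                 # constraints, as noted above
    return left, right

print(bisect_box([(0.0, 4.0), (0.0, 1.0)]))
# → ([(0.0, 2.0), (0.0, 1.0)], [(2.0, 4.0), (0.0, 1.0)])
```

Repeated bisection of the widest side is exhaustive: the diameter of every nested sub-box shrinks to zero.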
Example of computing L given relaxed bound constraints
Let

$$f(x) = \sum_{k=1}^{n} \left( \tfrac{1}{2} p_k x_k^2 + q_k x_k + r_k \right), \qquad P = \{x \in R^n : a_k \le x_k \le b_k,\ k = 1,\dots,n\},$$

with $p_k \ne 0$, $a_k < b_k$ for $k = 1, 2, \dots, n$. Then

$$L = \Big[ \sum_{k \in I_1} (p_k a_k + q_k)^2 + \sum_{k \in I_2} (p_k b_k + q_k)^2 \Big]^{1/2},$$

where

$$I_1 = \{k : -q_k / p_k \ge \tfrac{1}{2}(a_k + b_k)\}, \qquad I_2 = \{k : -q_k / p_k < \tfrac{1}{2}(a_k + b_k)\}.$$
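As a check, the formula can be evaluated in code: the k-th gradient component p_k x_k + q_k is largest in magnitude at whichever box endpoint lies farther from the vertex −q_k/p_k (a small sketch with an illustrative function name):

```python
import math

def separable_quadratic_L(p, q, a, b):
    """Lipschitz constant of f(x) = sum_k ((1/2) p_k x_k^2 + q_k x_k + r_k)
    on the box a_k <= x_k <= b_k (p_k != 0, a_k < b_k), per the formula above."""
    total = 0.0
    for pk, qk, ak, bk in zip(p, q, a, b):
        # |p_k x_k + q_k| peaks at the endpoint farther from the vertex
        # -q_k/p_k: at a_k if the vertex is right of the midpoint (set I1),
        # at b_k otherwise (set I2).
        if -qk / pk >= 0.5 * (ak + bk):
            total += (pk * ak + qk) ** 2
        else:
            total += (pk * bk + qk) ** 2
    return math.sqrt(total)

print(separable_quadratic_L([2.0], [0.0], [-1.0], [2.0]))  # 4.0 = max |2x| on [-1, 2]
```

For n = 1 with f(x) = x² on [−1, 2] this returns 4.0, matching the maximum gradient magnitude |2x| on the box.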
Branch and Bound
Branch literally means that the algorithm tries to partition the feasible region in some fashion.
Bound means that, while searching, we estimate the objective value using an upper bound and a lower bound.
Upper bound: in each feasible region, the local optimum found gives an upper bound, as does the function value at a randomly sampled point.
Lower bound: usually obtained from some approximation.
Branch and Bound
Flowchart of the algorithm:
1. Set up the domain D with simple explicit constraints.
2. Pick sample points x1, x2, ... and calculate f(x1), f(x2), ...
3. Do a local search; set the local optimum found as the upper bound and record it.
4. Compute the lower bound for the bounded area.
5. If upper bound = lower bound, stop.
6. Otherwise, partition the domain D.
7. Compute the lower bound for each partition; do a local search, update the upper bound, and record it.
8. If a partition's lower bound exceeds the latest upper bound, stop exploring that partition; otherwise, repeat from step 5 on the remaining partitions.
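The loop above can be sketched in code. This is a minimal one-dimensional illustration (not LGO's implementation) that uses the Lipschitz lower bound f(m) − L·w/2 on an interval of width w with midpoint m:

```python
import heapq

def lipschitz_bb(f, L, a, b, tol=1e-6, max_iter=200000):
    """Minimize f on [a, b]; return certified (lower, upper) bounds."""
    mid = 0.5 * (a + b)
    upper = f(mid)                               # incumbent upper bound
    heap = [(upper - L * 0.5 * (b - a), a, b)]   # intervals keyed by lower bound
    lower = heap[0][0]
    while heap and max_iter > 0:
        max_iter -= 1
        lower, lo, hi = heapq.heappop(heap)      # most promising interval
        if upper - lower <= tol:                 # bounds have met: stop
            break
        m = 0.5 * (lo + hi)
        for sub_lo, sub_hi in ((lo, m), (m, hi)):     # partition by bisection
            c = 0.5 * (sub_lo + sub_hi)
            upper = min(upper, f(c))             # sampling updates upper bound
            sub_lower = f(c) - L * 0.5 * (sub_hi - sub_lo)
            if sub_lower <= upper:               # else: prune this partition
                heapq.heappush(heap, (sub_lower, sub_lo, sub_hi))
    return lower, upper

lo, up = lipschitz_bb(lambda x: (x - 1.0) ** 2, 4.0, 0.0, 3.0)
print(lo <= 0.0 <= up, up - lo <= 1e-6)
```

The heap makes this a best-first search: the interval with the smallest lower bound is refined next, and any interval whose lower bound exceeds the incumbent upper bound is pruned, exactly as in step 8 above.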
Three approaches in LGO
LGO integrates a suite of robust and efficient global and local scope solvers. These include:
adaptive partition and search (branch-and-bound)
adaptive global random search (single & multi-start)
constrained local optimization (reduced gradient method)
Remark: the random-search options are also commonly used to handle black-box functions.
General global optimization model in LGO
x is a real n-vector (to describe feasible decisions)
a, b are finite, component-wise vector bounds imposed on x
f(x) is a continuous function (to describe the model objective)
g(x) is a continuous vector function (to describe the model constraints; the inequality sign is interpreted component-wise).
$$\min f(x) \quad \text{s.t.} \quad g(x) \le 0, \quad a \le x \le b$$
LGO interface
Library: LGO solver suite for C and Fortran compilers, with a text I/O interface, or embedded in a Windows GUI.
Spreadsheets: Excel Premium Solver Platform/LGO solver engine, in cooperation with Frontline Systems.
Modeling language: GAMS/LGO solver engine, in cooperation with the GAMS Development Corporation.
Integrated technical computing systems:
- AIMMS/LGO solver engine, in cooperation with Paragon Decision Technologies
- Global Optimization Toolbox for Maple, in cooperation with Maplesoft
- MPL/LGO solver engine, in cooperation with Maximal Software
- MathOptimizer for Mathematica, a native Mathematica product
- MathOptimizer Professional (LGO for Mathematica), in cooperation with Dr. Frank Kampas
- TOMLAB/LGO for Matlab, in cooperation with TOMLAB Optimization
LGO
LGO has been used to solve models with up to one thousand variables and constraints.
These packages were developed by J. D. Pintér, who, since completing his PhD in optimization (1982, Moscow State University), has become an internationally known expert in the field. One of his textbooks won an international award (the INFORMS Computing Society Prize for Research Excellence).
Further detail will be discussed in later slides.
LGO testing
In our numerical experiments described here, we have used LGO to solve a set of GAMS models based on the Handbook of Test Problems in Local and Global Optimization by Floudas et al. (1999). For brevity, we shall refer to the model collection studied as HTPLGO. The set of models considered is available from GLOBALLib (GAMS Global World, 2003).
GLOBALLib is a collection of nonlinear models that provides GO solver developers with a large and varied set of theoretical and practical test models.
The entire test set used consists of 117 models.
The test models included have up to 142 variables, 109 constraints, 729 non-zero and 567 nonlinear-non-zero model terms.
LGO test result
Figure 3. Efficiency profiles: all LGO solver modes are applied to GLOBALLib models.
Operational modes (for brevity we shall use "opmode"):
- opmode 0: local search from a given nominal solution, without a preceding global search mode (LS)
- opmode 1: global branch-and-bound search and local search (BB+LS)
- opmode 2: global adaptive random search and local search (GARS+LS)
- opmode 3: global multi-start random search and local search (MS+LS)
Using LGO
There are usually five stages in using LGO: problem definition, problem compilation, model parameters, model solution, and result analysis.
- Problem definition: define the functions.
- Problem compilation: link to the object and library files.
- Model parameters: set up the lower bounds, upper bounds, number of constraints, etc.
- Model solution: there is an automatic mode and an interactive mode.
  - Automatic mode: the program determines which of the four modules to use based on the input file.
  - Interactive mode: the user determines which modules to use, in which order, and the maximum search effort.
Price
GAMS/LGO: commercial $1,600; academic $320
Premium Solver Platform: $1,495
TOMLAB/LGO: commercial $1,600; academic $600
Some important fact
Continuity of the functions (objective and constraints) defining the global optimization model is sufficient to use the LGO software.
Naturally, in such cases only a statistical guarantee can be given for the global lower bound estimate. The lower bound generated by LGO is statistical in all cases, since it is based partially on pseudo-random sampling.
LGO can give the global optimum deterministically only on the basis of a deterministic L and deterministic bounds.
Comparison of complete global optimization solvers
Solvers being compared: we present test results for the global optimization systems BARON, COCOS, GlobSol, ICOS, LGO/GAMS, LINGO, OQNLP, Premium Solver, and, for comparison, the local solver MINOS. All tests were made on the COCONUT benchmarking suite.
Outline of test set:The test set from three libraries consists of 1322 models varying in dimension (number of variables) between 1 and over 1000, coded in the modeling language AMPL.
Library 1: GAMS Global Library; real-life global optimization problems with industrial relevance, though currently most problems on this site are without computational results.
Library 2: CUTE library; consists of global (and some local) optimization problems with nonempty feasible domains.
Library 3: EPFL library; consists of pure constraint satisfaction problems (constant objective function), almost all of them feasible.
Comparison of complete global optimization solvers(2)
Models excluded from the libraries:
1. Certain models that are difficult for testing, where the difficulty is unrelated to the solvers.
2. Models containing functions that are not supported by the ampl2dag converter.
3. Problems in Library 3 that actually contain an objective function.
4. Models showing strange behavior, which might be caused by bugs in the converter.
5. Models for which no solver can find an optimal solution.
Brief overview of special characteristic of other solvers
GlobSol and Premium Solver exploit interval methods.
ICOS is a pure constraint solver; it currently cannot handle models with an objective function.
COCOS contains many modules that can be combined to yield various combination strategies for global optimization.
Characteristic comparison
Important related details
All solvers are tested with the default options suggested by theproviders of the codes.
The timeout limit used was (scaled to a 1000 MHz machine) around 180 seconds of CPU time for models of size 1, 900 seconds for models of size 2, and 1800 seconds for models of size 3.
The solvers LGO and GlobSol required a bounded search region, and we bounded each variable between -1000 and 1000, except in a few cases where this led to a loss of the global optimum.
The reliability of claimed results is the most poorly documented aspect of current global optimization software.
Reliability
Performance
Note: different solvers have different stopping criteria, which should also be considered. For example, BARON and LINGO stop when time is up, while LGO and OQNLP stop based on certain statistics.
Final Remark
In a few cases, GlobSol and Premium Solver found solutions where BARON failed, which suggests that BARON would benefit from some of the advanced interval techniques implemented in GlobSol and Premium Solver.
However, GlobSol and Premium Solver are much less efficient in both time and solving capacity than BARON. To a large extent this may be due to the fact that both GlobSol and Premium Solver strive for mathematical rigor, resulting in a significant slowdown due to the need for rigorously validated techniques.
Reference
http://myweb.dal.ca/jdpinter/index.html (website of János D. Pintér, LGO's creator)
Global Optimization in Action (Continuous and Lipschitz Optimization: Algorithms, Implementations and Applications), by János D. Pintér
Introduction to Global Optimization, by Reiner Horst, Panos M. Pardalos and Nguyen V. Thoai
A Comparison of Complete Global Optimization Solvers, by Arnold Neumaier, Oleg Shcherbina, Waltraud Huyer and Tamás Vinkó, Mathematical Programming
http://www.mat.univie.ac.at/~neum/glopt.html (website maintained by Arnold Neumaier)
P.S. If you would like to look at the above two books, ask Prof. Tamás; he is generous to those who want to learn.
Global Optimization: BARON
Feng Xie
BARON Branch And Reduce Optimization Navigator
It derives its name from “its combining interval analysis and duality in its reduce arsenal with enhanced branch and bound concepts as it winds its way through the hills and valleys of complex optimization problems in search of global solutions” (N. V. Sahinidis, University of Illinois at Urbana-Champaign, Department of Chemical Engineering).
• Basically an improved branch and bound algorithm.
• Range reduction is the major feature of the branch and reduce algorithm.
Two range reduction techniques are used:
• Optimality-based.• Feasibility-based.
Branch and Reduce: Algorithm Overview

Relaxation

Sub-problem:

$$\min f(x) \quad \text{s.t.} \quad g(x) \le 0,\ x \in X$$

Relaxed problem (convex optimization):

$$\min \hat{f}(x) \quad \text{s.t.} \quad \hat{g}(x) \le 0,\ x \in X,$$

where $\hat{f}(x)$ is convex, $\hat{g}(x)$ is convex, and X is a box.

(Figure: objective plotted against the variable for both problems.)

Converting a non-convex objective function into a convex one
Relaxation – an Example
Problem Relaxed problem
Perturbation Function
Given a relaxed problem (R):

$$\min f(x) \quad \text{s.t.} \quad g(x) \le 0,\ x \in X,$$

the corresponding perturbed problem (R_y) is:

$$p(y) = \min f(x) \quad \text{s.t.} \quad g(x) \le y,\ x \in X.$$
Properties of p(y) :
• p(0) is the solution of problem (R);
• p(y) is non-increasing (bigger y, bigger feasible set for (Ry));
• p(y) is a convex function (proof ignored).
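These properties can be checked numerically on a toy instance (chosen for illustration, not from the slides): take f(x) = x², g(x) = 1 − x, X = [−2, 2], so that p(y) = max(1 − y, 0)².

```python
# Numeric sketch of the perturbation function p(y) for a toy problem:
# f(x) = x^2, g(x) = 1 - x, X = [-2, 2], computed by brute force on a grid.

def p(y, n=40001):
    xs = [-2.0 + 4.0 * k / (n - 1) for k in range(n)]
    feas = [x * x for x in xs if 1.0 - x <= y]       # points with g(x) <= y
    return min(feas) if feas else float("inf")       # inf if (R_y) infeasible

ps = [p(y) for y in (0.0, 0.5, 1.0, 2.0)]
print(ps)  # → [1.0, 0.25, 0.0, 0.0]
```

The output shows p(0) = 1 (the solution of (R)), that p is non-increasing in y, and midpoint convexity holds: p(0) + p(1) ≥ 2·p(0.5).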
Perturbation Function – Another View

Let G be the set {(y, z) : y = g(x), z = f(x) for some x in X}. Then p(y) is the lower envelope of G.

(Figure: G in the (y, z)-plane with its lower envelope p(y), which is non-increasing and convex; L = p(0) is the solution of (R), and p(y1) is the solution of (R_y1).)
Optimality-Based Range Reduction

A simple case: x_j ∈ [x_j^L, x_j^U] and the constraint x_j − x_j^U ≤ 0 is active.

(Figure: the set G with envelope p; U is the upper bound of the global minimum and L the lower bound; the original range [x_j^L, x_j^U] and the point κ_j* are marked on the x_j axis.)
Optimality-Based Range Reduction

However, the perturbation function p is not explicitly given.

The range of x_j, x_j ∈ [x_j^L, x_j^U], can be reduced to [κ_j*, x_j^U].

(Figure: the same picture with p unknown; the reduced range is marked.)
Optimality-Based Range Reduction

By nonlinear programming duality, the line passing through the optimum point and supporting G has slope −λ_j, where λ_j is the Lagrange multiplier corresponding to the constraint x_j − x_j^U ≤ 0 in the solution of the Lagrangian dual problem.

(Figure: the supporting line of G.)
Optimality-Based Range Reduction

Use the support line z = L − λ_j (x_j − x_j^U) as an underestimator of p. We can then reduce the range from [x_j^L, x_j^U] to [κ_j, x_j^U], where

$$\kappa_j = x_j^U - (U - L)/\lambda_j.$$

(Figure: the support line of G and the reduced range.)
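The reduction rule can be sketched numerically (illustrative helper name). With the numbers from the example on the next slides (L = −6.89, U = −6.66, λ = 0.2) it gives a new lower bound of about 4.85; the slides report 4.86, presumably from an unrounded multiplier.

```python
def reduce_upper_active(xL, xU, L, U, lam):
    """New range for x_j when x_j - xU <= 0 is active with multiplier lam > 0.
    The support line z = L - lam * (x_j - xU) underestimates p and exceeds
    the incumbent U wherever x_j < xU - (U - L) / lam, so that part is cut."""
    kappa = xU - (U - L) / lam
    return max(xL, kappa), xU           # never widen past the old lower bound

print(reduce_upper_active(0.0, 6.0, -6.89, -6.66, 0.2))
```

Note that a small duality gap (U close to L) and a large multiplier both shrink (U − L)/λ, producing a stronger reduction.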
Optimality-Based Range Reduction

The above range-reduction process can be extended to arbitrary constraints of the type g_i(x) ≤ 0.

It is called "optimality-based" because the new range is derived from the optimal solution of the relaxed problem.

Here μ and λ are the optimal dual multipliers of the nonlinear/linear constraints and the simple bound constraints, respectively.
Range Reduction – Example Continued
Relaxed problem:
• The optimum of the relaxed problem lies at (6, 0.89) with an objective function value of −6.89 = L. A local search starting from (6, 0.89) gives U = −6.66 at (6, 0.66).
• In the solution of the relaxed problem, x1 is at its maximum (the constraint x1 − 6 ≤ 0 is active) with dual Lagrange multiplier λ = 0.2. So the lower bound of x1 can be tightened to 6 − (U − L)/λ = 4.86. Similarly, the lower bound of x2 can be tightened to 0.66.
• The new bounds are 4.86 ≤ x1 ≤ 6 and 0.66 ≤ x2 ≤ 4.
• Reconstruct the relaxation with the new bounds (see next page).
Range Reduction – Example Continued
Relaxed problem on new bounds:
• Reconstruct the relaxation with the new bounds (the new feasible set is indicated by the blue contour).
• The optimum of the new relaxed problem lies at (6, 0.66) with an objective function value of −6.66. Thus the lower and upper bounds of the global optimum coincide, and the global optimum is reached (no branching is needed).
• If no range reduction were used, 4 branches would have to be explored before the global optimum could be found (with BFS traversal and bisection branching).
U=-6.66L=-6.89
Feasibility-Based Range Reduction
Feasibility-based range reduction is a process that tightens the bounds of problem variables by cutting off the infeasible portions.
Example: given

$$\sum_{j=1}^{n} a_{ij} x_j \le b_i, \quad i = 1,\dots,m \qquad \text{(linear constraints)}$$

$$x_j^L \le x_j \le x_j^U, \quad j = 1,\dots,n \qquad \text{(variable bounds)}$$

find tighter bounds for the variables; that is, find $\kappa_j^L$ and $\kappa_j^U$ such that

$$x_j^L \le \kappa_j^L \le x_j \le \kappa_j^U \le x_j^U \quad \text{for } j = 1,\dots,n.$$
Feasibility-Based Range Reduction

Best-effort range reduction through linear programming: for each variable $x_j$, $j = 1,\dots,n$, solve the two LPs

$$\min \pm x_j \quad \text{s.t.} \quad \sum_{j=1}^{n} a_{ij} x_j \le b_i, \quad i = 1,\dots,m.$$

• It gives the best (tightest) new ranges.
• But it is expensive (2n LPs to solve).

(Figure: best-effort range reduction in 2D.)
Feasibility-Based Range Reduction

"Poor man's linear programming" heuristic. Given any inequality $\sum_{j=1}^{n} a_{ij} x_j \le b_i$, single out $x_k$:

$$a_{ik} x_k \le b_i - \sum_{j \ne k} a_{ij} x_j \le b_i - \sum_{j \ne k} \min\{a_{ij} x_j^U,\ a_{ij} x_j^L\}.$$

So,

$$x_k \le \frac{1}{a_{ik}}\Big(b_i - \sum_{j \ne k} \min\{a_{ij} x_j^U,\ a_{ij} x_j^L\}\Big) \quad \text{if } a_{ik} > 0,$$

$$x_k \ge \frac{1}{a_{ik}}\Big(b_i - \sum_{j \ne k} \min\{a_{ij} x_j^U,\ a_{ij} x_j^L\}\Big) \quad \text{if } a_{ik} < 0.$$
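The propagation rule above can be sketched directly (illustrative function name); each x_k is isolated in turn, reusing bounds already tightened for earlier variables:

```python
# Sketch of the "poor man's LP" bound propagation for one constraint
# a . x <= b over box bounds lo <= x <= hi.

def poor_mans_lp(a, b, lo, hi):
    """Tighten [lo, hi] using the single inequality sum_j a[j]*x[j] <= b."""
    lo, hi = list(lo), list(hi)
    n = len(a)
    for k in range(n):
        if a[k] == 0:
            continue
        # Smallest possible contribution of the other terms over the box:
        rest = sum(min(a[j] * lo[j], a[j] * hi[j]) for j in range(n) if j != k)
        bound = (b - rest) / a[k]
        if a[k] > 0:
            hi[k] = min(hi[k], bound)   # a_k > 0: new upper bound on x_k
        else:
            lo[k] = max(lo[k], bound)   # a_k < 0: new lower bound on x_k
    return lo, hi

print(poor_mans_lp([1.0, 1.0], 1.0, [0.0, 0.0], [5.0, 5.0]))
# → ([0.0, 0.0], [1.0, 1.0])
```

For x1 + x2 ≤ 1 on [0, 5]², one pass shrinks both upper bounds from 5 to 1, at the cost of a single sum per variable instead of an LP solve.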
Feasibility-Based Range Reduction
Compared to the linear programming approach, "poor man's LP" is not guaranteed to give the maximum range reduction, but it is very cheap.
The best-effort feasibility-based range reduction is therefore only used in the preprocessing phase, on a one-time basis.
(Figures: cases in which "poor man's LP" does not improve the bounds at all, gives a suboptimal range reduction, or gives the optimal range reduction.)
Software Structure
BARON core-user interaction
High flexibility - The solver is highly customizable (by providing a certain number of user-written subroutines).
Software Structure
BARON specialized modules
• Separable Concave Programming (The objective function is the sum of concave functions.)
• Fractional Programming(The objective function is a fraction with linear functions as the numerator and denominator.)
• Mixed Integer Linear Programming
• Others
Availability
Available under the GAMS or AIMMS modeling languages.

Price list of related GAMS products (www.gams.com):

Product      Base Module   GAMS/BARON   GAMS/CPLEX   GAMS/MINOS   GAMS/SNOPT
Commercial   $3,200.00     $1,600.00    $6,000.00    $3,200.00    $3,200.00
Academic     $640.00       $320.00      $1,280.00    $640.00      $640.00

Also available through the NEOS Server in GAMS format.
(As stated in the user’s manual)BARON is available as a callable library where users can supply problem-specific subroutines (range reduction, branching, local search, …) to improve the performance.
Limits (BARON)
Convex relaxation• Need the knowledge of the functions to get a good relaxation.
• Perform poorly on black-box or “unexpected” functions, for which the analytical information is limited.
Purely deterministic
• It has to walk through a large number of branches before reaching the global optimum, resulting in slow convergence for certain problems, especially those with large bounds or no bounds at all.
• By comparison, solvers using statistical methods focus on the branches that contain the global optimum with high probability.
Limits (GAMS/BARON)
Variable and expression bounds: all nonlinear variables and expressions should be bounded below and above by finite numbers. If they are not, default bounds are given, which are usually large, and the global optimum is not guaranteed.
Allowable nonlinear functions: e^x, ln x, x^α, β^x, |x|, and x^y, where α, β ∈ R.
Trigonometric functions are not allowed.
Solver dependency• An LP solver is required in most cases (CPLEX, OSL, MINOS or SNOPT).• An NLP solver is optional for the purpose of optimality-based range reduction (MINOS or SNOPT).
Limits – An Example
A chemical equilibrium problem with 6 variables (Ba, SO4, BaOH, OH, HSO4, H) and 6 constraints. The optimal solution is -1.000.

min  Ba
s.t. Ba · SO4 = 1
     BaOH / (Ba · OH) = 4.8
     HSO4 / (SO4 · H) = 0.98
     a water-equilibrium constraint on H · OH,
     a mass balance Ba + BaOH = SO4 + HSO4,
     and a charge balance relating Ba, BaOH, H to SO4, HSO4, OH

Source: wall.gms from the GAMS model library
Limits – An Example
GAMS/BARON: branch-and-reduce algorithm; GAMS/LGO: multistart random sampling algorithm (default).
Hardware: Mobile AMD Sempron 3300+, 1.59 GHz, 896 MB RAM.
Unbounded intervals:

Solver   Time (s)   Solution found   Default bounds
BARON    186.280    -1.000           Inferred by preprocessor
LGO      0.313      -1.000           Inferred by preprocessor

Bounded intervals:

Bounds           [-1E7,1E7]   [-1E8,1E8]   [-1E9,1E9]   [-1E10,1E10]
BARON time (s)   0.532        0.640        13.532       127.953
BARON solution   -1.000       -1.000       -1.000       -1.000
LGO time (s)     0.312        81.687       0.172        0.203
LGO solution     1.000        1.000        0.996        0.996
References
BARON User’s Manual (Version 4.0).
http://archimedes.scs.uiuc.edu/baron
http://www.gams.com
“Global Optimization, Deterministic Approaches”, Reiner Horst, Hoang Tuy.
“Nonlinear Programming, Theory and Algorithms”, Mokhtar S. Bazaraa, Hanif D. Sherali, C. M. Shetty.
“Optimization Theory and Methods, Nonlinear Programming”, Wenyu Sun, Ya-Xiang Yuan.
“Convex Optimization”, Stephen Boyd, Lieven Vandenberghe.