Covariance Matrix Adaptation Evolution Strategy
Support Vector Machines
Self-Adaptive Surrogate-Assisted CMA-ES
New Surrogate-Assisted Search Control and
Restart Strategies for CMA-ES
Ilya Loshchilov, Marc Schoenauer, Michèle Sebag
TAO, INRIA − CNRS − Univ. Paris-Sud
Ilya Loshchilov, Marc Schoenauer, Michèle Sebag Surrogate and Restart Strategies for CMA-ES 1/ 45
This Talk
Part-I: Self-Adaptive Surrogate-Assisted CMA-ES
Full paper: ES-EP track, 11.30 a.m., Tuesday, July 10
BBOB papers: IPOP-saACM-ES and BIPOP-saACM-ES on the i) Noiseless and ii) Noisy Testbeds
Part-II: Alternative Restart Strategies for CMA-ES
Full paper: PPSN 2012
BBOB paper: NIPOP-aCMA-ES and NBIPOP-aCMA-ES on the Noiseless Testbed
Part-I
Part-I: Self-Adaptive Surrogate-Assisted CMA-ES
Motivations
Find Argmin {F : X ↦ R}
Context: ill-posed continuous optimization problems
Function F (fitness function) on X ⊂ Rd
Gradient not available or not useful
F available as an oracle (black box)
Black-box approaches: build {x_1, x_2, . . .} → Argmin(F)
+ Robust
− High computational cost: number of expensive function evaluations (e.g. CFD)
Surrogate-Assisted Optimization
Principle
Gather a training set E = {(x_i, F(x_i))}
Build F̂ from E (learn a surrogate model)
Use the surrogate model F̂ for some time:
Optimization: use F̂ instead of the true F in a standard algorithm
Filtering: select promising x_i based on F̂ in a population-based algorithm
Compute F(x_i) for some x_i
Update F̂; iterate
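The loop above can be sketched as a minimal Python toy. This is not the authors' implementation: `true_f` is an illustrative stand-in for the expensive black box, and a nearest-neighbour ranker stands in for the learned surrogate F̂.

```python
import random

def true_f(x):                          # expensive black box (toy stand-in)
    return sum(xi ** 2 for xi in x)

def optimize_with_surrogate(dim=2, budget=30, seed=1):
    rng = random.Random(seed)
    archive = []                        # training set E = {(x_i, F(x_i))}
    best = None
    for _ in range(budget):
        # sample candidates around the current best (crude search step)
        center = best[0] if best else [0.0] * dim
        cands = [[c + rng.gauss(0, 0.5) for c in center] for _ in range(10)]
        if archive:
            # Filtering: rank candidates with the surrogate F-hat; here the
            # surrogate is simply the value of the nearest archived point
            def f_hat(x):
                nearest = min(archive,
                              key=lambda e: sum((a - b) ** 2 for a, b in zip(e[0], x)))
                return nearest[1]
            cands.sort(key=f_hat)
        x = cands[0]
        y = true_f(x)                   # one true (expensive) evaluation
        archive.append((x, y))          # update the training set E
        if best is None or y < best[1]:
            best = (x, y)
    return best

best_x, best_y = optimize_with_surrogate()
```

Only one true evaluation is spent per iteration; the surrogate filters the other nine candidates for free, which is where the saving in expensive evaluations comes from.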
Content
1 Covariance Matrix Adaptation Evolution Strategy
2 Support Vector Machines
Support Vector Machine (SVM)
Rank-based SVM
3 Self-Adaptive Surrogate-Assisted CMA-ES
Algorithm
Results
(µ, λ)-Covariance Matrix Adaptation Evolution Strategy
Rank-µ Update
y_i ∼ N(0, C), C = I
x_i = m + σ y_i, σ = 1   (sampling of λ solutions)
C_µ = (1/µ) Σ y_{i:λ} y_{i:λ}^T   (estimating C from the best µ out of λ)
C ← (1 − c_µ) C + c_µ C_µ   (illustrated here with c_µ = 1)
m_new ← m + (1/µ) Σ y_{i:λ}   (new distribution)

Ruling principles: i) the adaptation increases the probability of successful steps appearing again; ii) C ≈ H^{−1}, the inverse Hessian on convex-quadratic functions.
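One generation of the rank-µ update can be sketched in NumPy. This is illustrative only: it assumes equal recombination weights and treats the learning rate c_µ as a free parameter (the defaults below are not CMA-ES's tuned values), and it omits step-size adaptation.

```python
import numpy as np

def rank_mu_update(m, sigma, C, f, lam=10, mu=5, c_mu=1.0, rng=None):
    """One generation of the rank-mu update with equal weights (sketch)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    # sample lambda solutions: y_i ~ N(0, C), x_i = m + sigma * y_i
    y = rng.multivariate_normal(np.zeros(len(m)), C, size=lam)
    x = m + sigma * y
    # rank by fitness and keep the best mu out of lambda
    order = np.argsort([f(xi) for xi in x])
    y_best = y[order[:mu]]
    # C_mu = (1/mu) * sum of y_{i:lambda} y_{i:lambda}^T
    C_mu = (y_best.T @ y_best) / mu
    C_new = (1 - c_mu) * C + c_mu * C_mu      # covariance update
    m_new = m + sigma * y_best.mean(axis=0)   # mean update
    return m_new, C_new
```

Repeating this update on a convex-quadratic function lets C align with the inverse Hessian, which is the second ruling principle above.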
Invariance: Guarantee for Generalization
Invariance properties of CMA-ES
Invariance to order-preserving transformations in function space (true for all comparison-based algorithms)
Translation and rotation invariance
thanks to C
[Figure: two 2-D plots illustrating the translation- and rotation-invariant sampling distribution]
CMA-ES is almost parameterless
Tuning on a small set of functions (Hansen & Ostermeier 2001)
Default values generalize to whole classes
Exception: population size for multi-modal functions (see the second part of this talk)
Support Vector Machine (SVM)
Rank-based SVM
Contents
1 Covariance Matrix Adaptation Evolution Strategy
2 Support Vector Machines
Support Vector Machine (SVM)
Rank-based SVM
3 Self-Adaptive Surrogate-Assisted CMA-ES
Algorithm
Results
Support Vector Machine for Classification: Linear Classifier
[Figure: linear classifier with margin 2/||w||, separating hyperplanes 〈w, x〉 − b = ±1 and support vectors]
Main Idea
Training data: D = {(x_i, y_i) | x_i ∈ R^d, y_i ∈ {−1, +1}}, i = 1 . . . n
〈w, x_i〉 − b ≥ +1 ⇒ y_i = +1;  〈w, x_i〉 − b ≤ −1 ⇒ y_i = −1
Optimization Problem: Primal Form
Minimize over {w, ξ}:  (1/2)||w||² + C Σ_{i=1}^n ξ_i
subject to: y_i(〈w, x_i〉 − b) ≥ 1 − ξ_i,  ξ_i ≥ 0
Support Vector Machine for Classification: Linear Classifier
Optimization Problem: Dual Form
Quadratic in the Lagrange multipliers:
Maximize over {α}:  Σ_{i=1}^n α_i − (1/2) Σ_{i,j=1}^n α_i α_j y_i y_j 〈x_i, x_j〉
subject to: 0 ≤ α_i ≤ C,  Σ_{i=1}^n α_i y_i = 0
Properties
Decision function: F̂(x) = sign(Σ_{i=1}^n α_i y_i 〈x_i, x〉 − b)
The dual form may be solved using a standard quadratic programming solver.
Support Vector Machine for Classification: Non-Linear Classifier
[Figure: the mapping Φ sends the inputs to a feature space where the separating hyperplanes 〈w, Φ(x)〉 − b = ±1 are linear, with margin 2/||w||]
Non-linear classification with the "Kernel trick"
Maximize over {α}:  Σ_{i=1}^n α_i − (1/2) Σ_{i,j=1}^n α_i α_j y_i y_j K(x_i, x_j)
subject to: 0 ≤ α_i ≤ C,  Σ_{i=1}^n α_i y_i = 0,
where K(x, x′) := 〈Φ(x), Φ(x′)〉 is the kernel function (Φ must be chosen such that K is positive semi-definite).
Decision function: F̂(x) = sign(Σ_{i=1}^n α_i y_i K(x_i, x) − b)
Support Vector Machine for Classification: Kernels
Gaussian or Radial Basis Function (RBF) kernel: K(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²))
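The RBF kernel can be written directly as a sanity check (plain Python sketch; `sigma` is the kernel bandwidth):

```python
import math

def rbf_kernel(xi, xj, sigma=1.0):
    """Gaussian (RBF) kernel: K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

Note the kernel equals 1 when the two points coincide and decays toward 0 as they move apart, at a rate controlled by σ.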
Regression to Value or Regression to Rank? Why Comparison-Based Surrogates?
Example
F(x) = x_1² + (x_1 + x_2)².
An efficient Evolutionary Algorithm (EA) with surrogate models may be 4.3 times faster on F(x),
but the same EA is only 2.4 times faster on G(x) = F(x)^{1/4}! (CMA-ES with quadratic meta-model, lmm-CMA-ES, on f_Schwefel in 2-D)
Comparison-based surrogate models → invariance to
rank-preserving transformations of F (x)!
Rank-based Support Vector Machine
Find F̂(x) which preserves the ordering of training/test points.
On the training set E = {x_i, i = 1 . . . n}, an expert gives preferences:
(x_{i_k} ≻ x_{j_k}), k = 1 . . . K   (an underconstrained regression problem)
[Figure: ranking function with level sets L(r_1), L(r_2) and normal direction w]
Order constraints (primal form):
Minimize  (1/2)||w||² + C Σ_{k=1}^K ξ_k
subject to: ∀k, 〈w, x_{i_k}〉 − 〈w, x_{j_k}〉 ≥ 1 − ξ_k,  ξ_k ≥ 0
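In the optimization setting the "expert" is the fitness function itself: evaluations induce the preference pairs. A small illustration; restricting to adjacent pairs in the full ranking (K = n − 1) is one common choice, assumed here for simplicity.

```python
def ordering_constraints(points, F):
    """Turn fitness evaluations into the K preference pairs (x_ik > x_jk)
    used by the ranking SVM: x_i precedes x_j iff F(x_i) < F(x_j)
    (minimization). Only adjacent pairs of the full ranking are kept."""
    ranked = sorted(range(len(points)), key=lambda i: F(points[i]))
    return [(ranked[k], ranked[k + 1]) for k in range(len(ranked) - 1)]
```

Each returned pair (i, j) yields one order constraint 〈w, x_i〉 − 〈w, x_j〉 ≥ 1 − ξ in the primal form above.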
Surrogate Models for CMA-ES
Using Rank-SVM
Build a global model using Rank-SVM
xi ≻ xj iff F(xi) < F(xj)
Kernel and parameters highly problem-dependent
ACM Algorithm
Use C from CMA-ES in the Gaussian kernel
I. Loshchilov, M. Schoenauer, M. Sebag (2010). "Comparison-based optimizers need comparison-based surrogates"
Model Learning: Non-Separable Ellipsoid Problem
K(x_i, x_j) = exp(−(x_i − x_j)^T (x_i − x_j) / (2σ²));   K_C(x_i, x_j) = exp(−(x_i − x_j)^T C^{−1} (x_i − x_j) / (2σ²))
Invariance to rotation of the search space thanks to C ≈ H−1 !
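A sketch of the covariance-transformed kernel K_C, together with a numerical check of the rotation invariance claimed above (the matrices and points in the check are arbitrary illustrative values):

```python
import numpy as np

def kernel_C(xi, xj, C_inv, sigma=1.0):
    """K_C(x_i, x_j) = exp(-(x_i - x_j)^T C^{-1} (x_i - x_j) / (2 sigma^2)):
    the RBF kernel measured in the metric defined by the covariance C."""
    d = np.asarray(xi) - np.asarray(xj)
    return float(np.exp(-d @ C_inv @ d / (2.0 * sigma ** 2)))
```

If the search space is rotated by R and C transforms accordingly to R C R^T, the kernel value is unchanged, since (Rd)^T (R C R^T)^{-1} (Rd) = d^T C^{-1} d.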
Algorithm
Results
Contents
1 Covariance Matrix Adaptation Evolution Strategy
2 Support Vector Machines
Support Vector Machine (SVM)
Rank-based SVM
3 Self-Adaptive Surrogate-Assisted CMA-ES
Algorithm
Results
Using the Surrogate Model: Direct Optimization of the Surrogate Model
Simple optimization loop:
optimize F for 1 generation, then optimize F̂ for n̂ generations.
[Figure: speedup vs. number of surrogate generations n̂ in dimension 10 for F1 Sphere, F6 AttractSector, F8 Rosenbrock, F10 RotEllipsoid, F11 Discus, F12 Cigar, F13 SharpRidge, F14 SumOfPow]
How to choose n̂?
Adaptation of Surrogate’s Life-Length
Test set: λ recently evaluated points
Model error: fraction of incorrectly ordered points
The number of generations is inversely proportional to the model error:
[Figure: surrogate life-length (number of generations) as a decreasing function of model error]
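A toy version of this adaptation rule. The pairwise ranking error is as described above; the exact mapping from error to life-length (here a clipped linear rule with an assumed maximum `n_max`) is an illustrative stand-in for the paper's schedule.

```python
def model_error(surrogate, test_points, F):
    """Fraction of incorrectly ordered pairs among the lambda most recently
    evaluated points (0 = perfect ranking, 0.5 = like random)."""
    pairs = [(i, j) for i in range(len(test_points))
             for j in range(i + 1, len(test_points))]
    wrong = sum(
        1 for i, j in pairs
        if (F(test_points[i]) < F(test_points[j]))
        != (surrogate(test_points[i]) < surrogate(test_points[j]))
    )
    return wrong / len(pairs)

def surrogate_lifetime(err, n_max=20):
    """Number of generations the surrogate is trusted: n_max at zero error,
    dropping linearly to 0 at error 0.5 (random ranking) or worse."""
    return max(0, round(n_max * (1.0 - 2.0 * err)))
```

A perfectly ranking surrogate is trusted for `n_max` generations; a surrogate no better than random is immediately discarded and the true F is used again.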
The Proposed s∗aACM Algorithm
[Figure: scheme of the algorithm, alternating generations on the true F with generations on the surrogate F̂]
Surrogate-assisted CMA-ES with online adaptation of model hyper-parameters.
Results
Self-adaptation of model hyper-parameters is better than the best offline settings! (IPOP-s∗aACM vs IPOP-aACM)
Improvements of the original CMA-ES lead to improvements of its surrogate-assisted version (IPOP-s∗aACM vs IPOP-s∗ACM)
[Figure: fitness vs. number of function evaluations on 10-D Rotated Ellipsoid and Rosenbrock for IPOP-CMA-ES, IPOP-aCMA-ES, IPOP-ACM-ES, IPOP-aACM-ES, IPOP-s∗ACM-ES and IPOP-s∗aACM-ES]
Results on the Black-Box Optimization Competition
BIPOP-s∗aACM and IPOP-s∗aACM (with restarts) on the 24 noiseless 20-dimensional functions
[Figure: ECDF of log10(ERT/dimension) over f1-f24 for all BBOB entrants; IPOP-saACM-ES and BIPOP-saACM-ES are closest to the best-2009 reference]
Noiseless case - 1
[Figure: ERT scaling with dimension (2-40), target 1e-08, for f1 Sphere, f2 Ellipsoid separable, f3 Rastrigin separable, f4 Skew Rastrigin-Bueche separable, f5 Linear slope, f6 Attractive sector; algorithms: BIPOP-CMA, BIPOP-saACM, IPOP-aCMA, IPOP-saACM]
Noiseless case - 2
[Figure: ERT scaling with dimension (2-40), target 1e-08, for f7 Step-ellipsoid, f8 Rosenbrock original, f9 Rosenbrock rotated, f10 Ellipsoid, f11 Discus, f12 Bent cigar]
Noiseless case - 3
[Figure: ERT scaling with dimension (2-40), target 1e-08, for f13 Sharp ridge, f14 Sum of different powers, f15 Rastrigin, f16 Weierstrass, f17 Schaffer F7 condition 10, f18 Schaffer F7 condition 1000]
Noiseless case - 4
[Figure: ERT scaling with dimension (2-40), target 1e-08, for f19 Griewank-Rosenbrock F8F2, f20 Schwefel x·sin(x), f21 Gallagher 101 peaks, f22 Gallagher 21 peaks, f23 Katsuuras, f24 Lunacek bi-Rastrigin; algorithms: BIPOP-CMA, BIPOP-saACM, IPOP-aCMA, IPOP-saACM]
Noisy case - 1
[Figure: ECDFs of log10(#f-evals/dimension) for IPOP-aCMA, IPOP-saACM and best 2009 on noisy function groups f101-106, f107-121, f122-130 and f101-130]
Noisy case - 2
[Figure: ERT scaling with dimension (2-40), target 1e-08, for f101-103 Sphere with moderate Gauss/uniform/Cauchy noise and f104-106 Rosenbrock with moderate Gauss/uniform/Cauchy noise; IPOP-aCMA vs. IPOP-saACM]
Time Complexity
[Figure: CPU cost per function evaluation (seconds) vs. dimension (5-40) for F1 Sphere, F8 Rosenbrock, F10 RotEllipsoid, F15 RotRastrigin]
CPU cost for IPOP-aACM-ES with fixed hyper-parameters; with self-adaptation it is about 10-20 times more expensive.
Machine Learning for Optimization: Discussion
s∗aACM is 2 to 4 times faster on 8 out of 24 noiseless functions.
Invariant to rank-preserving and orthogonal transformations of the search space: yes.
The computational complexity (the cost of the speed-up) is O(d³).
The source code is available online: https://sites.google.com/site/acmesgecco/
Open Questions
How to improve the search on multi-modal functions ?
What else from Machine Learning can improve the search ?
Original Restart Strategies: IPOP-CMA-ES and BIPOP-CMA-ES
Preliminary Analysis
New Restart Strategies: NIPOP-CMA-ES and NBIPOP-CMA-ES
Results
Part-II
Part-II: Alternative Restart Strategies for CMA-ES
Content
4 Original Restart Strategies: IPOP-CMA-ES and BIPOP-CMA-ES
5 Preliminary Analysis
6 New Restart Strategies: NIPOP-CMA-ES and NBIPOP-CMA-ES
7 Results
IPOP-CMA-ES
IPOP-CMA-ES: 1
1 Set population size λ_large = λ_default and initial step-size σ0_large = σ0_default.
2 Run/restart CMA-ES until a stopping criterion is reached.
3 If not happy, set λ_large = 2λ_large and go to step 2.
1 Anne Auger, Nikolaus Hansen (CEC 2005). "A Restart CMA Evolution Strategy With Increasing Population Size"
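The IPOP restart schedule can be sketched as follows, with `run_cma` standing in for a complete CMA-ES run (a hypothetical interface returning a success flag and the best fitness found):

```python
def ipop_restarts(run_cma, lam_default=10, max_restarts=5):
    """IPOP restart schedule (sketch): rerun with doubled population size
    until the inner run reports success or the restart budget runs out."""
    lam = lam_default
    history = []                       # (lambda, best_f) per run
    for _ in range(max_restarts + 1):
        happy, best_f = run_cma(lam)   # one full CMA-ES run at this lambda
        history.append((lam, best_f))
        if happy:
            break
        lam *= 2                       # lambda_large = 2 * lambda_large
    return history
```

Doubling λ at every restart trades local search speed for global exploration, which is what helps on multi-modal functions.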
BIPOP-CMA-ES
BIPOP-CMA-ES: 2
Regime-1 (large populations, IPOP part):
Each restart: λ_large = 2λ_large, σ0_large = σ0_default
Regime-2 (small populations):
Each restart: λ_small = ⌊λ_default ((1/2) λ_large/λ_default)^{U[0,1]²}⌋, σ0_small = σ0_default × 10^{−2U[0,1]},
where U[0,1] stands for the uniform distribution on [0, 1].
BIPOP-CMA-ES launches the first run with the default population size and initial step-size. At each restart, it selects the regime with fewer function evaluations used so far.
2 Nikolaus Hansen (GECCO BBOB 2009). "Benchmarking a BI-population CMA-ES on the BBOB-2009 function testbed"
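The small-population regime's hyper-parameter sampling can be transcribed directly from the formulas above (sketch; `rng` is any `random.Random` instance):

```python
import math
import random

def bipop_small_params(lam_default, lam_large, sigma_default, rng):
    """Sample the small-population regime parameters of BIPOP-CMA-ES:
    lambda_small = floor(lambda_default * (0.5 * lambda_large / lambda_default) ** (U[0,1]**2)),
    sigma0_small = sigma0_default * 10 ** (-2 * U[0,1])."""
    u1, u2 = rng.random(), rng.random()
    lam_small = math.floor(lam_default * (0.5 * lam_large / lam_default) ** (u1 ** 2))
    sigma_small = sigma_default * 10.0 ** (-2.0 * u2)
    return lam_small, sigma_small
```

So λ_small always lies between λ_default and half the current λ_large, and the initial step-size is drawn log-uniformly from two orders of magnitude below σ0_default.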
Preliminary Analysis
[Figure: success regions in the (λ/λ_default, σ/σ_default) plane for F15 Rotated Rastrigin 20-D and F22 Gallagher 21 peaks 20-D]
Legends indicate that the optimum up to precision f(x) = 10−10 is
found in 15 runs always (+), sometimes (⊕) or never (◦).
Preliminary Analysis
[Figure: success regions in the (λ/λ_default, σ/σ_default) plane for F23 Katsuuras 20-D and F24 Lunacek bi-Rastrigin 20-D]
Legends indicate that the optimum up to precision f(x) = 10−10 is
found in 15 runs always (+), sometimes (⊕) or never (◦).
NIPOP-CMA-ES (New IPOP-CMA-ES)
NIPOP-CMA-ES: 3
1 Set population size λ_large = λ_default and initial step-size σ0_large = σ0_default.
2 Run/restart CMA-ES until a stopping criterion is reached.
3 If not happy, set λ_large = 2λ_large, σ0 = σ0/ρ_σdec and go to step 2.
For ρ_σdec = 1.6 used in this study, NIPOP-CMA-ES reaches the lower bound (σ = 10^−2 σ_default) used by BIPOP-CMA-ES after 9 restarts.
3 Ilya Loshchilov, Marc Schoenauer, Michele Sebag (PPSN 2012). "Alternative Restart Strategies for CMA-ES"
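The resulting hyper-parameter schedule can be tabulated directly (sketch): λ doubles and σ is divided by ρ_σdec = 1.6 at each restart, so after 9 restarts σ has shrunk by a factor 1.6⁹ ≈ 69, near the 10⁻² lower bound.

```python
def nipop_schedule(lam_default, sigma_default, n_restarts, rho_sigma=1.6):
    """Hyper-parameter schedule of NIPOP-CMA-ES (sketch): each restart
    doubles lambda and divides the initial step-size by rho_sigma."""
    lam, sigma = lam_default, sigma_default
    schedule = [(lam, sigma)]          # run 0 uses the defaults
    for _ in range(n_restarts):
        lam *= 2                       # lambda_large = 2 * lambda_large
        sigma /= rho_sigma             # sigma0 = sigma0 / rho_sigma_dec
        schedule.append((lam, sigma))
    return schedule
```

Compared with IPOP, the only change is the shrinking σ0; with large λ the population still explores widely, while the smaller initial step-size lets restarts exploit more local structure.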
NBIPOP-CMA-ES (New BIPOP-CMA-ES)
NBIPOP-CMA-ES: 4
Regime-1 (large populations, NIPOP part):
Each restart: λ_large = 2λ_large, σ0 = σ0/ρ_σdec
Regime-2 (small populations):
Each restart: λ_small = λ_default, σ0_small = σ0_default × 10^{−2U[0,1]},
where U[0,1] stands for the uniform distribution on [0, 1].
Adaptation of the allowed budgets for regimes A and B:
if A found the best solution so far, then budget of A = ρ_budget × budget of B (ρ_budget = 2).
4 Ilya Loshchilov, Marc Schoenauer, Michele Sebag (PPSN 2012). "Alternative Restart Strategies for CMA-ES"
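One way to read the budget rule as code (an interpretation, not the paper's exact pseudo-code): the regime that produced the best solution so far is allowed ρ_budget times the other's budget, and the next restart goes to whichever regime is still within its allowance.

```python
def pick_regime(budget_used, best_found_by, rho_budget=2.0):
    """NBIPOP budget rule (sketch): the winning regime may spend
    rho_budget times the budget of the other; pick the winning regime
    while it stays within that allowance, otherwise the other one."""
    a_used, b_used = budget_used["large"], budget_used["small"]
    if best_found_by == "large":
        return "large" if a_used < rho_budget * b_used else "small"
    else:
        return "small" if b_used < rho_budget * a_used else "large"
```

This biases evaluations toward the regime that is currently paying off, while never starving the other regime completely.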
An illustration of the distribution of the λ and σ hyper-parameters
[Figure: hyper-parameters over 9 restarts in the (λ/λ_default, σ/σ_default) plane: IPOP-CMA-ES (◦), BIPOP-CMA-ES (◦ and · for 10 runs), NIPOP-CMA-ES (□) and NBIPOP-CMA-ES (□ and many △ at λ/λ_default = 1, σ/σ_default ∈ [10^−2, 10^0])]
On Multi-Modal Functions
[Figure: ERT scaling with dimension (2-40), target 1e-08, on multi-modal functions f15 Rastrigin, f16 Weierstrass, f21 Gallagher 101 peaks, f22 Gallagher 21 peaks, f23 Katsuuras, f24 Lunacek bi-Rastrigin; algorithms: BIPOP-aCMA, IPOP-aCMA, NBIPOP-aCMA, NIPOP-aCMA]
NIPOP-aCMA-ES on 20-dimensional F15 Rastrigin
[Figure: single run of NIPOP-aCMA-ES on 20-D F15 Rastrigin reaching f ≈ −1.17e−09; blue: abs(f), cyan: f − min(f), green: step-size σ, red: axis ratio]
ECDFs
Results for 40-D
[Figure: ECDFs of log10(#f-evals/dimension) in 40-D for IPOP-aCMA, NIPOP-aCMA, BIPOP-aCMA, NBIPOP-aCMA and best 2009, on f20-f24 and on f1-f24]
Restart Strategies for CMA-ES: Discussion
(Of course) It all depends on the benchmark functions we use.
Against Overfitting: Results confirmed on a real-world problem of
Interplanetary Trajectory Optimization (see PPSN2012 paper).
NIPOP is competitive with BIPOP while being simpler.
Decreasing the initial step-size is not a bad idea for IPOP.
Adaptation of the computational budgets of the two regimes works well.
Further Work
Surrogate-assisted search in the space of λ, σ hyper-parameters.
Thank you for your attention!
Questions?
Check the source code! It is available online:
Self-Adaptive Surrogate-Assisted CMA-ES: https://sites.google.com/site/acmesgecco/
NIPOP and NBIPOP: https://sites.google.com/site/ppsnbipop/