Black-box optimization
Support Vector Machines
Self-Adaptive Surrogate-Assisted CMA-ES
Self-Adaptive Surrogate-Assisted Covariance Matrix Adaptation Evolution Strategy
Ilya Loshchilov, Marc Schoenauer, Michèle Sebag
TAO
INRIA − CNRS − Univ. Paris-Sud
Ilya Loshchilov, Marc Schoenauer, Michèle Sebag Surrogate optimization: SVM for CMA 1/ 27
Motivations
Find Argmin {F : X → R}
Context: ill-posed continuous optimization problems
Function F (fitness function) on X ⊂ R^d
Gradient not available or not useful
F available as an oracle (black box)
Build {x1, x2, . . .} → Argmin(F)
Black-box approaches
+ Robust
− High computational cost: number of expensive function evaluations (e.g. CFD)
Surrogate-Assisted Optimization
Principle
Gather E = {(xi, F(xi))} (training set)
Build F̂ from E (learn surrogate model)
Use surrogate model F̂ for some time:
Optimization: use F̂ instead of the true F in a standard algorithm
Filtering: select promising xi based on F̂ in a population-based algorithm
Compute F(xi) for some xi
Update F̂; iterate
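The loop above can be sketched in a few lines of Python. Everything here is an illustrative stand-in: the true objective is a sphere function, the surrogate is a crude nearest-neighbour predictor (the talk uses a rank-SVM instead), and all constants are arbitrary. The point is the structure: learn F̂ from the archive, rank candidates with F̂, and spend true evaluations only on the promising ones.

```python
import random

def f_true(x):
    """Expensive black-box objective; a sphere function stands in here."""
    return sum(xi * xi for xi in x)

def fit_surrogate(archive):
    """Toy surrogate: predict F(x) as the value of the nearest archived point."""
    def f_hat(x):
        def d2(a, b):
            return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return min(archive, key=lambda p: d2(p[0], x))[1]
    return f_hat

def surrogate_assisted_search(dim=2, lam=10, mu=3, iters=30, seed=1):
    rng = random.Random(seed)
    m = [rng.uniform(-5, 5) for _ in range(dim)]
    archive = [(m, f_true(m))]                    # training set E = {(xi, F(xi))}
    true_evals = 1
    for _ in range(iters):
        f_hat = fit_surrogate(archive)            # learn surrogate model from E
        cand = [[mi + rng.gauss(0, 0.5) for mi in m] for _ in range(lam)]
        cand.sort(key=f_hat)                      # filtering: rank candidates by F-hat
        for x in cand[:mu]:                       # true F only on promising points
            archive.append((x, f_true(x)))
            true_evals += 1
        m = min(archive, key=lambda p: p[1])[0]   # recentre on the best point so far
    best = min(archive, key=lambda p: p[1])
    return best, true_evals

(best_x, best_y), evals = surrogate_assisted_search()
print(best_y, evals)   # 91 true evaluations instead of 1 + 30*10 = 301
```

With mu = 3 of lam = 10 candidates evaluated per iteration, the loop spends 91 expensive evaluations where a plain population-based search would spend 301.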
Surrogate-Assisted Optimization (cont.)
Issues
Learning
Hypothesis space (polynomials, neural nets, Gaussian processes, ...)
Selection of training set (prune, update, ...)
What is the learning quality indicator?
Interaction of Learning & Optimization modules
Schedule (when to relearn)
∗ How to use F̂ to support optimization search
∗∗ How to use search results to support learning F̂
This talk
∗ Using Covariance-Matrix Estimation within Support Vector
Machines
∗∗ Self-adaptation of surrogate model hyper-parameters
Content
1 Black-box optimization
Covariance Matrix Adaptation Evolution Strategy
2 Support Vector Machines
Support Vector Machine (SVM)
Rank-based SVM
3 Self-Adaptive Surrogate-Assisted CMA-ES
Algorithm
Results
Covariance Matrix Adaptation Evolution Strategy
(µ, λ)-Covariance Matrix Adaptation Evolution Strategy
Rank-µ Update
Sampling of λ solutions:
yi ∼ N(0, C), with C = I initially
xi = m + σ yi, with σ = 1

Calculating C from the best µ out of λ:
Cµ = (1/µ) Σi yi:λ yi:λ^T
C ← (1 − cµ) C + cµ Cµ, with learning rate cµ

New distribution:
mnew ← m + σ (1/µ) Σi yi:λ

Ruling principles: (i) the adaptation increases the probability of successful steps to appear again; (ii) C ≈ H^−1, the inverse Hessian of F.
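The sampling and rank-µ update above can be sketched in plain Python for the 2-D case. The learning rate c_mu and the population constants below are illustrative choices, not the tuned CMA-ES defaults, and step-size adaptation is omitted entirely:

```python
import math, random

def cholesky2(C):
    """Cholesky factor L (C = L L^T) of a 2x2 symmetric positive-definite matrix."""
    a, b, c = C[0][0], C[0][1], C[1][1]
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    return [[l11, 0.0], [l21, l22]]

def one_generation(f, m, sigma, C, lam=12, mu=3, c_mu=0.3, rng=random):
    """One (mu, lambda) generation with a rank-mu covariance update."""
    L = cholesky2(C)
    ys, xs = [], []
    for _ in range(lam):                          # sample lambda solutions y ~ N(0, C)
        z = [rng.gauss(0, 1), rng.gauss(0, 1)]
        y = [L[0][0] * z[0], L[1][0] * z[0] + L[1][1] * z[1]]
        ys.append(y)
        xs.append([m[0] + sigma * y[0], m[1] + sigma * y[1]])
    order = sorted(range(lam), key=lambda i: f(xs[i]))[:mu]   # best mu out of lambda
    # rank-mu estimate C_mu = (1/mu) * sum of y y^T over the selected steps
    Cmu = [[0.0, 0.0], [0.0, 0.0]]
    for i in order:
        for r in range(2):
            for s in range(2):
                Cmu[r][s] += ys[i][r] * ys[i][s] / mu
    C_new = [[(1 - c_mu) * C[r][s] + c_mu * Cmu[r][s] for s in range(2)]
             for r in range(2)]
    m_new = [m[d] + sigma * sum(ys[i][d] for i in order) / mu for d in range(2)]
    return m_new, C_new

rng = random.Random(0)
m, C = [3.0, 2.0], [[1.0, 0.0], [0.0, 1.0]]
sphere = lambda x: x[0] ** 2 + x[1] ** 2
for _ in range(40):
    m, C = one_generation(sphere, m, 0.3, C, rng=rng)
print(m, C)   # the mean drifts toward the optimum; C stays symmetric positive-definite
```

Because C_mu is a sum of outer products of the selected steps, the update keeps C symmetric and positive-definite while reshaping it toward the directions of recent success.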
Invariance: Guarantee for Generalization
Invariance properties of CMA-ES
Invariance to order-preserving transformations in function space (true for all comparison-based algorithms)
Translation and rotation invariance, thanks to C
CMA-ES is almost parameterless
Tuning on a small set of functions (Hansen & Ostermeier 2001)
Default values generalize to whole classes
Exception: population size for multi-modal functions
try restarts (see our PPSN’2012 paper)
State-of-the-art Results
BBOB – Black-Box Optimization Benchmarking
ACM-GECCO workshop, in 2009, 2010 and 2012
Set of 24 benchmark functions, dimensions 2 to 40
With known difficulties (ill-conditioning, non-separability, . . . )
Noisy and non-noisy versions
Competitors include
BFGS (Matlab version),
DFO (Derivative-Free Optimization, Powell 04)
Differential Evolution
Particle Swarm
Optimization
and many more
Support Vector Machine for Classification: Linear Classifier
[Figure: linear classifier with separating hyperplanes 〈w, x〉 − b = ±1, margin 2/‖w‖, and support vectors]
Main Idea
Training Data:
D = {(xi, yi) | xi ∈ R^d, yi ∈ {−1, +1}}, i = 1 . . . n
〈w, xi〉 − b ≥ +1 ⇒ yi = +1; 〈w, xi〉 − b ≤ −1 ⇒ yi = −1
Optimization Problem: Primal Form
Minimize over {w, ξ}: (1/2)‖w‖^2 + C Σi ξi
subject to: yi(〈w, xi〉 − b) ≥ 1 − ξi, ξi ≥ 0
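As a concrete (if naive) way to attack this primal problem, one can run subgradient descent on the equivalent hinge-loss objective (1/2)‖w‖^2 + C Σi max(0, 1 − yi(〈w, xi〉 − b)). The learning rate and epoch count below are arbitrary choices, and the talk's setting solves the dual with a QP solver instead; this is only a sketch of what the primal form optimizes:

```python
import random

def train_linear_svm(data, C=1.0, lr=0.01, epochs=200, seed=0):
    """Subgradient descent on (1/2)||w||^2 + C * sum_i max(0, 1 - y_i(<w,x_i> - b))."""
    rng = random.Random(seed)
    data = list(data)                 # avoid mutating the caller's list
    d = len(data[0][0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) - b)
            if margin < 1:            # hinge active: this point has slack xi_i > 0
                w = [wi - lr * (wi - C * y * xi) for wi, xi in zip(w, x)]
                b -= lr * C * y
            else:                     # only the regulariser (1/2)||w||^2 contributes
                w = [wi - lr * wi for wi in w]
    return w, b

# two well-separated clusters; the decision is sign(<w, x> - b)
data = [([2.0, 2.0], 1), ([3.0, 2.0], 1), ([2.0, 3.0], 1),
        ([-2.0, -2.0], -1), ([-3.0, -2.0], -1), ([-2.0, -3.0], -1)]
w, b = train_linear_svm(data)
print(w, b)
```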
Support Vector Machine for Classification: Linear Classifier
Optimization Problem: Dual Form
Quadratic in the Lagrange multipliers:
Maximize over {α}: Σi αi − (1/2) Σi,j αi αj yi yj 〈xi, xj〉
subject to: 0 ≤ αi ≤ C, Σi αi yi = 0
Properties
Decision Function: F̂(x) = sign(Σi αi yi 〈xi, x〉 − b)
The dual form may be solved using a standard quadratic programming solver.
Support Vector Machine for Classification: Non-Linear Classifier
[Figure: mapping Φ sends the data into a feature space where 〈w, Φ(x)〉 − b = ±1 defines a linear separator with margin 2/‖w‖; the decision boundary is non-linear in the input space]
Non-linear classification with the "Kernel trick"
Maximize over {α}: Σi αi − (1/2) Σi,j αi αj yi yj K(xi, xj)
subject to: 0 ≤ αi ≤ C, Σi αi yi = 0,
where K(x, x′) =def 〈Φ(x), Φ(x′)〉 is the kernel function (Φ must be chosen such that K is positive semi-definite)
Decision Function: F̂(x) = sign(Σi αi yi K(xi, x) − b)
Support Vector Machine for Classification: Non-Linear Classifier (Kernels)
Gaussian or Radial Basis Function (RBF): K(xi, xj) = exp(−‖xi − xj‖^2 / (2σ^2))
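A minimal implementation of the RBF kernel:

```python
import math

def rbf_kernel(xi, xj, sigma=1.0):
    """K(xi, xj) = exp(-||xi - xj||^2 / (2 sigma^2)); the minus sign in the
    exponent makes K(x, x) = 1 the maximum, decaying with distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-d2 / (2.0 * sigma ** 2))

print(rbf_kernel([0, 0], [0, 0]))   # 1.0
print(rbf_kernel([0, 0], [3, 4]))   # exp(-12.5), about 3.7e-06
```

The bandwidth σ controls how quickly similarity decays; it is one of the problem-dependent hyper-parameters the talk later self-adapts.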
Regression To Value or Regression To Rank? Why Comparison-Based Surrogates?
Example
F(x) = x1^2 + (x1 + x2)^2.
An efficient Evolutionary Algorithm (EA) with surrogate models may be 4.3 times faster on F(x).
But the same EA is only 2.4 times faster on G(x) = F(x)^{1/4}! (CMA-ES with quadratic meta-model (lmm-CMA-ES) on fSchwefel 2-D)
Comparison-based surrogate models → invariance to rank-preserving transformations of F(x)!
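This invariance is easy to check directly: a comparison-based selection step returns exactly the same points under F and under a rank-preserving transformation such as G = F^{1/4}. A small Python check on the example function:

```python
def select_best(points, f, mu):
    """Comparison-based selection: keep the mu best points under f."""
    return sorted(points, key=f)[:mu]

F = lambda x: x[0] ** 2 + (x[0] + x[1]) ** 2
G = lambda x: F(x) ** 0.25            # rank-preserving transformation of F

pts = [(1.0, -2.0), (0.5, 0.5), (-1.0, 1.2), (0.1, -0.1), (2.0, 0.0)]
print(select_best(pts, F, 2) == select_best(pts, G, 2))   # True
```

A value-regression surrogate, by contrast, would have to fit two very differently shaped landscapes, which is why its speed-up degrades on G.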
Rank-based Support Vector Machine
Find F̂ (x) which preserves the ordering of training/test points
On training set E = {xi, i = 1 . . . n}, an expert gives preferences:
(xik ≻ xjk), k = 1 . . . K
(underconstrained regression)
Order constraints, primal form:
Minimize (1/2)‖w‖^2 + C Σk ξk
subject to: ∀k, 〈w, xik〉 − 〈w, xjk〉 ≥ 1 − ξk, ξk ≥ 0
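A naive sketch of this ranking problem, again by subgradient descent on the hinge form of the order constraints. The linear model, learning rate, and epoch count are illustrative assumptions (the actual rank-SVM is kernelized and solved differently); preferred points receive the larger score 〈w, x〉:

```python
def train_rank_svm(points, prefs, C=1.0, lr=0.01, epochs=300):
    """Minimise (1/2)||w||^2 + C * sum_k max(0, 1 - (<w,x_ik> - <w,x_jk>)),
    the hinge relaxation of the order constraints <w,x_ik> - <w,x_jk> >= 1 - xi_k."""
    d = len(points[0])
    w = [0.0] * d
    for _ in range(epochs):
        for i, j in prefs:                 # preference pair: point i preferred to j
            diff = [a - b for a, b in zip(points[i], points[j])]
            margin = sum(wk * dk for wk, dk in zip(w, diff))
            if margin < 1:                 # violated pair: hinge is active
                w = [wk - lr * (wk - C * dk) for wk, dk in zip(w, diff)]
            else:                          # only the regulariser contributes
                w = [wk - lr * wk for wk in w]
    return w

points = [(3.0, 0.0), (2.0, 0.0), (1.0, 0.0)]
prefs = [(0, 1), (1, 2)]                   # point 0 ≻ point 1 ≻ point 2
w = train_rank_svm(points, prefs)
print([sum(wk * xk for wk, xk in zip(w, p)) for p in points])  # decreasing scores
```

Note that only pairwise order information enters the training, which is exactly what makes the resulting surrogate invariant to rank-preserving transformations of F.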
Surrogate Models for CMA-ES
Using Rank-SVM
Build a global model using Rank-SVM:
xi ≻ xj iff F(xi) < F(xj)
Kernel and parameters highly problem-dependent
ACM Algorithm
Use C from CMA-ES in the Gaussian kernel
I. Loshchilov, M. Schoenauer, M. Sebag (2010). "Comparison-based optimizers need comparison-based surrogates"
Model Learning: Non-Separable Ellipsoid Problem
K(xi, xj) = exp(−(xi − xj)^T (xi − xj) / (2σ^2));   KC(xi, xj) = exp(−(xi − xj)^T C^−1 (xi − xj) / (2σ^2))
Invariance to rotation of the search space thanks to C ≈ H−1 !
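The rotation invariance is easy to verify numerically in the 2-D case: if the points are rotated by R and C is replaced by R C R^T, the kernel value KC is unchanged, since (R d)^T (R C R^T)^−1 (R d) = d^T C^−1 d. A small check (the particular C, points, and angle are arbitrary):

```python
import math

def inv2(C):
    """Inverse of a 2x2 matrix."""
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    return [[C[1][1] / det, -C[0][1] / det],
            [-C[1][0] / det, C[0][0] / det]]

def kc(xi, xj, Cinv, sigma=1.0):
    """K_C(xi, xj) = exp(-(xi-xj)^T C^{-1} (xi-xj) / (2 sigma^2)), 2-D case."""
    d = [xi[0] - xj[0], xi[1] - xj[1]]
    q = (d[0] * (Cinv[0][0] * d[0] + Cinv[0][1] * d[1])
         + d[1] * (Cinv[1][0] * d[0] + Cinv[1][1] * d[1]))
    return math.exp(-q / (2.0 * sigma ** 2))

def rot(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]

# rotate both the points and the covariance matrix: the kernel value is unchanged
C = [[2.0, 0.5], [0.5, 1.0]]
xi, xj = [1.0, 2.0], [-0.3, 0.7]
R = rot(0.7)
Rt = [[R[0][0], R[1][0]], [R[0][1], R[1][1]]]
Cr = mat_mul(mat_mul(R, C), Rt)
print(kc(xi, xj, inv2(C)), kc(mat_vec(R, xi), mat_vec(R, xj), inv2(Cr)))
```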
Using the Surrogate Model: Direct Optimization of the Surrogate Model
Simple optimization loop: optimize F for 1 generation, then optimize F̂ for n̂ generations.
[Figure: speedup vs number of surrogate generations n̂ (0 to 20), dimension 10, for F1 Sphere, F6 AttractSector, F8 Rosenbrock, F10 RotEllipsoid, F11 Discus, F12 Cigar, F13 SharpRidge, F14 SumOfPow]
How to choose n̂?
Adaptation of Surrogate’s Life-Length
Test set: λ recently evaluated points
Model error: fraction of incorrectly ordered points
The number of generations n̂ is inversely proportional to the model error:
[Figure: n̂ as a decreasing function of the model error over the range 0 to 1.0]
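A sketch of this adaptation rule. The slides state the test set (the λ most recently evaluated points), the error measure (fraction of incorrectly ordered points), and the inverse relationship; the linear schedule and the constants n_max and err_max below are assumptions for illustration only:

```python
def model_error(f_hat, points, f_values):
    """Fraction of incorrectly ordered pairs among recently evaluated points."""
    n = len(points)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    bad = sum(1 for i, j in pairs
              if (f_hat(points[i]) < f_hat(points[j])) != (f_values[i] < f_values[j]))
    return bad / len(pairs)

def life_length(err, n_max=20, err_max=0.5):
    """Surrogate life-length n_hat: n_max generations for a perfect model,
    0 once the error reaches err_max (back to pure CMA-ES from then on)."""
    return max(0, round(n_max * (1 - min(err, err_max) / err_max)))

print(life_length(0.0), life_length(0.25), life_length(0.5))   # 20 10 0
```

A perfectly ranking surrogate is trusted for n_max generations; a surrogate no better than chance is not used at all until it is relearned.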
Sensitivity to Model Hyper-Parameters
The speed-up of surrogate-assisted CMA-ES is very sensitive to the number of training points!
[Figure: speedup vs number of training points (50 to 400), dimension 10, for F1 Sphere, F5 LinearSlope, F6 AttractSector, F8 Rosenbrock, F10 RotEllipsoid, F11 Discus, F12 Cigar, F13 SharpRidge, F14 SumOfPow, and their average]
The Proposed s∗aACM Algorithm
[Figure: schematic of the algorithm, alternating between the true F and the surrogate F̂]
Surrogate-assisted CMA-ES with online adaptation of model hyper-parameters.
Online Adaptation of Model Hyper-Parameters
[Figure: fitness and model hyper-parameters (Ntraining, Cbase, Cpow, csigma) vs number of function evaluations on F8 Rosenbrock 20-D]
Results
Self-adaptation of model hyper-parameters is better than the best offline settings! (IPOP-s∗aACM vs IPOP-aACM)
Improvements of the original CMA lead to improvements of its surrogate-assisted version (IPOP-s∗aACM vs IPOP-s∗ACM)
[Figure: fitness vs number of function evaluations on Rotated Ellipsoid 10-D and Rosenbrock 10-D for IPOP-CMA-ES, IPOP-aCMA-ES, IPOP-ACM-ES, IPOP-aACM-ES, IPOP-s∗ACM-ES and IPOP-s∗aACM-ES]
Results on Black-Box Optimization Competition
BIPOP-s∗aACM and IPOP-s∗aACM (with restarts) on 24 noiseless 20-dimensional functions
[Figure: BBOB empirical cumulative distribution on f1–f24, proportion of functions solved vs log10(ERT / dimension), comparing some 40 algorithms (RANDOMSEARCH, BFGS, NEWUOA, DE variants, PSO variants, IPOP-CMA-ES, BIPOP-CMA-ES, IPOP-saACM-ES, BIPOP-saACM-ES, ..., and the best-2009 reference)]
Machine Learning for Optimization: Discussion
s∗aACM is from 2 to 4 times faster on uni-modal problems.
Invariant to rank-preserving and orthogonal transformations of the search space: yes.
The computational complexity (the cost of the speed-up) is O(d³).
The source code is available online: https://sites.google.com/site/acmesgecco/
Open Questions
How to improve the search on multi-modal functions?
What else from Machine Learning can improve the search?
Thank you for your attention!
Questions?
Related Publications:
Joint work with Marc Schoenauer and Michèle Sebag:
"Alternative Restart Strategies for CMA-ES". Parallel Problem Solving from Nature (PPSN XII), September 2012.
"Black-box Optimization Benchmarking of IPOP-saACM-ES and BIPOP-saACM-ES on the BBOB-2012 Noiseless Testbed". GECCO'2012 BBOB Workshop, July 2012.
"Dominance-Based Pareto-Surrogate for Multi-Objective Optimization". Simulated Evolution and Learning (SEAL 2010), December 2010.
"Comparison-Based Optimizers Need Comparison-Based Surrogates". Parallel Problem Solving from Nature (PPSN XI), September 2010.
"A Pareto-Compliant Surrogate Approach for Multiobjective Optimization". GECCO'2010, July 2010.
"A Mono Surrogate for Multiobjective Optimization". GECCO'2010 EMO Workshop, July 2010.