![Page 1: Automatic Machine Learning (AutoML) and How To Speed It Upmetalearning-symposium.ml/files/hutter.pdf · Automatic Machine Learning (AutoML) and How To Speed It Up Frank Hutter Department](https://reader030.vdocuments.site/reader030/viewer/2022041015/5ec60955f93b2b072f30b7ab/html5/thumbnails/1.jpg)
Automatic Machine Learning (AutoML) and How To Speed It Up
Frank Hutter
Department of Computer Science
University of Freiburg, Germany
AutoML and Meta-Learning
Current deep learning practice
Expert chooses architecture & hyperparameters
Deep learning “end-to-end”
AutoML: true end-to-end learning
[Diagram labels: End-to-end learning; Meta-level learning & optimization; Learning box]
AutoML as Blackbox Optimization
f()
Blackbox optimization
Random search, evolutionary methods, reinforcement learning, …, Bayesian optimization
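As a minimal sketch of the blackbox view, the snippet below treats a toy function as the blackbox f and minimizes it by random search, the simplest of the methods listed. The objective `toy_validation_loss`, the two hyperparameters, and their ranges are illustrative assumptions, not taken from the talk.

```python
import math
import random


def toy_validation_loss(config):
    """Hypothetical stand-in for the blackbox f: maps a configuration to a loss."""
    # Pretend the best setting is learning_rate = 1e-2 and num_layers = 3.
    lr_term = (math.log10(config["learning_rate"]) + 2.0) ** 2
    depth_term = 0.1 * (config["num_layers"] - 3) ** 2
    return lr_term + depth_term


def sample_config(rng):
    """Draw one configuration uniformly from an assumed search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-5.0, 0.0),  # log-uniform in [1e-5, 1]
        "num_layers": rng.randint(1, 6),
    }


def random_search(objective, n_evaluations, seed=0):
    """Evaluate n_evaluations random configurations and return the best one."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_evaluations):
        config = sample_config(rng)
        loss = objective(config)  # the only access to f: query it and observe
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = random_search(toy_validation_loss, n_evaluations=200)
print(best_config, best_loss)
```

The optimizer never looks inside the objective, which is what makes the formulation a blackbox one; model-based methods such as Bayesian optimization replace the uniform sampling step with a model-guided choice.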
Effectiveness of Bayesian Optimization
Example: Optimizing a deep feedforward net on dataset adult, 7 hyperparameters
[Plot: random search vs. Bayesian optimization; annotations: no speedup, 20x speedup]
“Sometimes, BayesOpt is only twice as fast as Random Search”
• But sometimes it is dramatically faster
Effectiveness of Bayesian Optimization
Example: Optimizing CPLEX on combinatorial auctions (Regions 100), 76 hyperparameters
[Plot: random search vs. Bayesian optimization (SMAC); y-axis: loss (runtime of optimized solver); annotations: 20x speedup, 200x speedup]
Same Pattern Occurs in RL vs. Random Search
Figure taken from “Neural Architecture Search with Reinforcement Learning”, Zoph & Le
[Plot y-axis: improvement of RL vs. random search (perplexity)]
Up to 1200 function evaluations: RL not better than Random Search
Larger budgets: greater improvements
AutoML as Blackbox Optimization
f()
Blackbox optimization
Random search, evolutionary methods, reinforcement learning, …, Bayesian optimization
Too slow for big data
Ways to go beyond blackbox optimization
AutoML systems
Benchmark: AutoML Challenge
• Large-scale challenge run by ChaLearn & CodaLab
  – 17 months, 5 phases with 5 new datasets each (2015-2016)
  – 2 tracks: code submissions / Kaggle-like human track
• Code submissions: true end-to-end learning necessary
  – Get training data, learn model, make predictions for test data
  – 1 hour end-to-end
• 25 datasets from wide range of application areas
  – Already featurized
  – Inputs: features X, targets y
AutoML System 1: Auto-WEKA
[Thornton, Hutter, Hoos, Leyton-Brown, KDD 2013; Kotthoff et al, JMLR 2016]
[Diagram labels: Meta-level learning & optimization; WEKA]
– Parameterize ML framework: WEKA [Witten et al, 1999-current]
  • 27 base classifiers (with up to 10 hyperparameters each)
  • 2 ensemble methods; in total: 786 hyperparameters
– Optimize CV performance by Bayesian optimization (SMAC)
  • Only evaluate more folds for good configurations
– 5x speedups for 10-fold CV
Available in WEKA package manager; 400 downloads/week
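The idea of spending more cross-validation folds only on promising configurations can be sketched as follows. This is a simplified illustration, not SMAC's actual procedure: the rejection rule (stop as soon as the running mean loss exceeds the incumbent's) and the names `racing_cv` and `evaluate_fold` are assumptions made for the example.

```python
def racing_cv(evaluate_fold, n_folds, incumbent_mean):
    """Evaluate folds one at a time and stop early for poor configurations.

    evaluate_fold(i) is a hypothetical callback returning the loss on fold i.
    Returns (mean loss over the folds actually run, number of folds run).
    """
    losses = []
    for fold in range(n_folds):
        losses.append(evaluate_fold(fold))
        running_mean = sum(losses) / len(losses)
        # Simplified rejection rule: give up once the configuration looks
        # worse than the best configuration seen so far.
        if running_mean > incumbent_mean:
            break
    return sum(losses) / len(losses), len(losses)


# Made-up per-fold losses for one good and one poor configuration.
good = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.10, 0.12, 0.09, 0.10]
bad = [0.30, 0.28, 0.31, 0.29, 0.30, 0.32, 0.27, 0.30, 0.29, 0.31]
incumbent = 0.15

mean_good, folds_good = racing_cv(lambda i: good[i], 10, incumbent)
mean_bad, folds_bad = racing_cv(lambda i: bad[i], 10, incumbent)
print(folds_good, folds_bad)  # prints "10 1"
```

In this toy run the poor configuration is dropped after a single fold, while the good one receives all ten, which is where savings on 10-fold CV come from.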
AutoML System 2: Auto-sklearn
[Feurer, Klein, Eggensperger, Springenberg, Blum, Hutter; NIPS 2015]
[Diagram labels: Meta-level learning & optimization; Scikit-learn]
• Optimize CV performance by SMAC
  – Meta-learning to warmstart Bayesian optimization
    • Reasoning over different datasets
    • Dramatically speeds up the search (2 days → 1 hour)
  – Automated posthoc ensemble construction to combine the models Bayesian optimization evaluated
    • Efficiently re-uses its data; improves robustness
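One way to build a post hoc ensemble from models that were already evaluated is greedy forward selection on their stored validation predictions. The slide does not spell out the procedure, so the sketch below is an illustration under assumptions: models may be picked repeatedly (repeats act as weights), the metric is squared error, and the model names and predictions are made up.

```python
def greedy_ensemble(val_predictions, y_val, ensemble_size):
    """Greedily add models (with replacement) to minimize validation squared error.

    val_predictions: dict mapping model name -> list of predicted probabilities.
    y_val: list of 0/1 validation labels.
    Returns the list of chosen model names.
    """
    def mse(pred):
        return sum((p - y) ** 2 for p, y in zip(pred, y_val)) / len(y_val)

    chosen = []
    running_sum = [0.0] * len(y_val)
    for _ in range(ensemble_size):
        best_name, best_score = None, float("inf")
        for name, pred in val_predictions.items():
            # Averaged prediction if this model were added to the ensemble.
            candidate = [
                (s + p) / (len(chosen) + 1) for s, p in zip(running_sum, pred)
            ]
            score = mse(candidate)
            if score < best_score:
                best_name, best_score = name, score
        chosen.append(best_name)
        running_sum = [
            s + p for s, p in zip(running_sum, val_predictions[best_name])
        ]
    return chosen


y_val = [1, 0, 1, 0]
preds = {
    "model_a": [0.9, 0.4, 0.6, 0.1],
    "model_b": [0.6, 0.1, 0.9, 0.4],
    "model_c": [0.5, 0.5, 0.5, 0.5],
}
chosen = greedy_ensemble(preds, y_val, ensemble_size=2)
print(chosen)
```

No model is retrained here: the ensemble only reuses predictions that the search already paid for, and averaging two complementary models lowers the validation error below that of either one.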
Auto-sklearn: Ready for Prime Time
• Winning approach in the AutoML challenge
  – Auto-track: overall winner, 1st place in 3 phases, 2nd place in 1
    • Close competitor: variant of automatic statistician [Lloyd et al]
  – Human track: always in top-3 vs. 150 teams of human experts
  – Final two rounds: won both tracks
• Trivial to use:
https://github.com/automl/auto-sklearn
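A minimal sketch of what usage can look like, assuming the `auto-sklearn` package from the repository above is installed. The wrapper name `fit_and_predict` and the one-hour default budget (mirroring the challenge's 1 hour end-to-end limit) are choices made for this example.

```python
def fit_and_predict(X_train, y_train, X_test, time_limit_s=3600):
    """Fit auto-sklearn within a time budget and predict labels for X_test."""
    # Third-party dependency, imported lazily so that this function can be
    # defined even where auto-sklearn is not installed.
    import autosklearn.classification

    automl = autosklearn.classification.AutoSklearnClassifier(
        time_left_for_this_task=time_limit_s  # overall search budget in seconds
    )
    automl.fit(X_train, y_train)
    return automl.predict(X_test)
```

The classifier follows the scikit-learn `fit`/`predict` convention, so it can be dropped in wherever a scikit-learn estimator is expected.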
AutoML System 3: Auto-Net
[Diagram labels: Meta-level learning & optimization; Deep neural net]
• CV performance optimized by SMAC
• Joint optimization of:
  – Network architecture
  – Hyperparameters
Auto-Net in AutoML Challenge
[Mendoza, Klein, Feurer, Springenberg & Hutter, AutoML 2016]
• Featurized data → fully-connected network
  – Up to 5 layers (with 3 layer hyperparameters each)
  – 14 network hyperparameters, in total 29 hyperparameters
  – Optimized for 18h on 5 GPUs
• Auto-Net won several datasets against human experts
  – E.g., Alexis data set: 54491 data points, 5000 features, 18 classes
  – First automated deep learning system to win an ML competition data set against human experts
Using Cheap Approximations of the Blackbox
• Reasoning across subsets of the data
  – Up to 1000x speedups [Klein et al, AISTATS 2017]
• Reasoning across training epochs [Swersky et al, arXiv 2014] [Domhan et al, IJCAI 2015]
[Plots: panels with axes log(C) and log()]
Hyperband & Successive Halving
• Successive Halving [Jamieson & Talwalkar, AISTATS 2015]
  – Run N (=many) configurations for a small budget B
  – Iteratively: select best half of configurations and double their budget
• Hyperband [Li et al, ICLR 2017]
  – Calls Successive Halving iteratively with different tradeoffs of N and B
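The two procedures can be sketched in a few lines. This is an illustration under assumptions: the halving rate is fixed at 2 to match the description above (the published Hyperband uses a general reduction factor), and the search space `sample_config` and objective `toy_loss` are hypothetical stand-ins for training a model with a given budget.

```python
import math
import random


def successive_halving(configs, evaluate, min_budget, max_budget):
    """Evaluate all survivors, keep the best half, double the budget, repeat.

    evaluate(config, budget) returns a loss (lower is better).
    Returns (loss, config) for the best configuration at the last budget used.
    """
    survivors = list(configs)
    budget = min_budget
    best = None
    while budget <= max_budget:
        scored = sorted(
            ((evaluate(c, budget), c) for c in survivors), key=lambda pair: pair[0]
        )
        best = scored[0]
        survivors = [c for _, c in scored[: max(1, len(scored) // 2)]]
        budget *= 2
    return best


def hyperband(sample_config, evaluate, min_budget, max_budget, rng):
    """Run Successive Halving brackets that trade off N (configs) against B (budget)."""
    s_max = int(math.log2(max_budget / min_budget))
    best = None
    for s in range(s_max, -1, -1):
        # Large s: many configurations, small starting budget.
        # s = 0: few configurations, full budget from the start.
        n_configs = math.ceil((s_max + 1) * 2 ** s / (s + 1))
        start_budget = max_budget / 2 ** s
        configs = [sample_config(rng) for _ in range(n_configs)]
        result = successive_halving(configs, evaluate, start_budget, max_budget)
        if best is None or result[0] < best[0]:
            best = result
    return best


def sample_config(rng):
    """Hypothetical one-dimensional search space plus a per-config low-budget bias."""
    return {"x": rng.uniform(0.0, 1.0), "bias": rng.uniform(0.0, 0.1)}


def toy_loss(config, budget):
    """Hypothetical objective whose estimate gets more accurate with more budget."""
    return (config["x"] - 0.3) ** 2 + config["bias"] / budget


loss, config = hyperband(
    sample_config, toy_loss, min_budget=1, max_budget=16, rng=random.Random(0)
)
print(loss, config)
```

With budgets from 1 to 16, a single Successive Halving run on 16 configurations costs 16 + 8 + 4 + 2 + 1 = 31 evaluations, most of them cheap. Hyperband hedges against misleading low-budget results by also running brackets that start fewer configurations at larger budgets.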
Hyperband vs. Random Search
Auto-Net on dataset adult
[Plot annotations: 20x speedup; 3x speedup]
Biggest advantage: much improved anytime performance
Bayesian Optimization vs. Random Search
Auto-Net on dataset adult
[Plot annotations: no speedup (1x); 10x speedup]
Biggest advantage: much improved final performance
Combining Bayesian Optimization & Hyperband
[Falkner, Klein & Hutter, BayesOpt 2017]
Auto-Net on dataset adult
[Plot annotations: 20x speedup; 50x speedup]
Best of both worlds: strong anytime and final performance
Almost Linear Speedups By Parallelization
[Falkner, Klein & Hutter, BayesOpt 2017]
Auto-Net on dataset adult
[Plot annotation: 8 parallel workers, 7.5x speedup]
Tuning CNNs on a Budget: CIFAR-10
[Falkner, Klein & Hutter, BayesOpt 2017]
• Six design decisions
  – Depth, widening factor
  – Learning rate, batch size, weight decay, momentum
• Maximum budget per CNN run: 2 hours on a Titan X
  – Ran BO-HB for 12 hours on 10 GPUs
  – Result: 4% test error
• Maximum budget per CNN run: 3 hours on a Titan X
  – Ran BO-HB for 12 hours on 10 GPUs
  – Result: 3.5% test error
Neural Architecture Search on a Budget
[Elsken, Metzen & Hutter, MetaLearn 2017]
Result: architecture search in 12 hours on 1 GPU: 5.7% on CIFAR-10
Online Adaptation of Architecture & Hyperparams
Network morphisms [Chen et al, 2015; Wei et al, 2016; Cai et al, 2017]
Cosine annealing [Loshchilov & Hutter, 2017]
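For reference, a cosine annealing schedule with restarts can be sketched as below. The function name, the fixed cycle length `period`, and the example learning rates are assumptions for illustration; the cited work also allows the cycle length to grow after each restart, which this sketch omits.

```python
import math


def cosine_annealing_lr(epoch, lr_max, lr_min, period):
    """Learning rate at `epoch` for cosine cycles of fixed length `period`.

    The rate starts each cycle at lr_max, decays along a half cosine
    towards lr_min, and jumps back to lr_max at the next restart.
    """
    t_cur = epoch % period  # position inside the current cycle
    cosine = math.cos(math.pi * t_cur / period)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


schedule = [cosine_annealing_lr(e, lr_max=0.1, lr_min=0.0, period=10) for e in range(20)]
print([round(lr, 4) for lr in schedule])
```

Each restart returns the learning rate to its maximum, so a short cycle gives a usable model quickly and later cycles can refine it.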
Conclusion
• Bayesian optimization enables true end-to-end learning
  – Auto-WEKA, Auto-sklearn & Auto-Net
• Large speedups by going beyond blackbox optimization
  – Learning across datasets
  – Learning across data subsets & epochs
  – Combination of Hyperband and Bayesian optimization
  – Online adaptation of architectures & hyperparameters
• Links to code: http://automl.org
Thanks!
My fantastic team
Other collaborators
UBC: Chris Thornton, Holger Hoos, Kevin Leyton-Brown, Kevin Murphy
DeepMind: Ziyu Wang, Nando de Freitas
Bosch: Thomas Elsken, Jan Hendrik Metzen
MPI Tübingen: Philipp Hennig
Uni Freiburg: Tobias Springenberg, Robin Schirrmeister, Tonio Ball, Thomas Brox, Wolfram Burgard
Funding sources
EU project RobDREAM
I'm looking for more great postdocs!