Post on 20-Dec-2015
![Page 1: Machine Learning in 25 minutes or less And why the HotOS folks should care... Terran Lane Dept. of Computer Science University of New Mexico terran@cs.unm.edu](https://reader036.vdocuments.site/reader036/viewer/2022062407/56649d415503460f94a1bd54/html5/thumbnails/1.jpg)
Machine Learning in 25 minutes or less
And why the HotOS folks should care...
Terran Lane
Dept. of Computer Science
University of New Mexico
Machine learning is the study of algorithms or systems that improve their performance in response to experience.
The core ML problem
The World
- Network
- CPU
- Program memory footprint
- User activity
- Multi-process performance
The World ⇒ Sensors
- Latency; bandwidth
- Branches taken; cache misses
- Memory allocs; object age
- Keystroke rates; recent commands
- Process throughput; cache activity; synch delays
Sensors ⇒ X ⇒ Model f(X) ⇒ prediction
- Compression/redundancy rates
- Branch prediction
- Object lifetime
- Legitimate/hostile
- Normal/abnormal
Model f(X) ⇒ ŷ; Performance measure L(ŷ,y) compares ŷ with y ⇒ assessment
- accuracy (0/1 loss)
- squared error
- time-to-response
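As a rough illustration of what a performance measure L(ŷ,y) might look like in code, here is a minimal sketch of the first two losses listed above. The function names and the toy labels are assumptions made for this example; they do not come from the talk.

```python
def zero_one_loss(y_hat, y):
    """Fraction of predictions that differ from the true outputs (1 - accuracy)."""
    if len(y_hat) != len(y):
        raise ValueError("y_hat and y must have the same length")
    return sum(p != t for p, t in zip(y_hat, y)) / len(y)


def squared_error(y_hat, y):
    """Mean squared difference between numeric predictions and true outputs."""
    if len(y_hat) != len(y):
        raise ValueError("y_hat and y must have the same length")
    return sum((p - t) ** 2 for p, t in zip(y_hat, y)) / len(y)


# Hypothetical predictions for a legitimate/hostile classifier.
labels_hat = ["hostile", "normal", "normal", "hostile"]
labels_true = ["hostile", "normal", "hostile", "hostile"]
print(zero_one_loss(labels_hat, labels_true))   # one mistake out of four

# Hypothetical numeric predictions, e.g. object lifetimes.
print(squared_error([1.0, 2.0], [1.0, 4.0]))
```

Both functions take the whole list of predictions at once, which matches the later use of L on an entire test set.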
ŷ as control/response ⇒ Performance measure L(ŷ,X’) ⇒ assessment
- Correctness
- Stability
- Robustness
- Total system performance (throughput, latency, etc.)
No ŷ: Model f(X) ⇒ Performance measure ⇒ assessment
- ???
- Do you like the model?
- Does it make sense?
- Does it make you feel warm and fuzzy?
The ML job: find this (the model f(X))... so that this (the assessment) is as good as possible.
Types of learning
•Supervised
•Reinforcement learning
•Unsupervised
•Special cases:
•Semi-supervised
•Anomaly detection
•Behavioral cloning
•etc...
Supervised Learning
•Characteristics:
•Measure features/sensor values ⇒ X
•Want to predict system “output”, y
•Have some source of example (X,y) pairs
•System, human-labeling, etc.
•Have a well-defined performance criterion
Example supervised learners
•Discriminative: only produces a classifier
•Decision tree: fast; comprehensible models
•Support vector machine: high dim data; accurate
•Nearest-neighbor / k-nn: low-dim data; slow
•Neural net: special case of SVM
•Generative: produces complete probability model
•Naive Bayes: very simple; surprisingly accurate
•Bayesian network: powerful; descriptive; accurate
•Markov random field: closely related to BNs
•Meta-learners/ensemble methods: sets of models
•Boosting
•Bagging
•Winnow
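To make one of the learners above concrete, here is a minimal sketch of a k-nearest-neighbor classifier. The training points, labels, and the choice k=3 are hypothetical. The sketch also hints at why the slide calls k-nn slow: every prediction scans the full training set.

```python
import math
from collections import Counter


def knn_predict(train, x, k=3):
    """Classify x by majority vote among its k nearest training points.

    train: list of (features, label) pairs, features being equal-length tuples.
    """
    # Sort the whole training set by Euclidean distance to x (the slow part).
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D sensor readings labeled normal/abnormal.
train = [
    ((1.0, 1.0), "normal"), ((1.2, 0.9), "normal"), ((0.8, 1.1), "normal"),
    ((5.0, 5.0), "abnormal"), ((5.2, 4.8), "abnormal"), ((4.9, 5.1), "abnormal"),
]
print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (5.1, 5.0)))
```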
Key assumption #1
The train/test data reflect the same data distribution that will be experienced when the learned model is embedded in the performance system.
•System not changing over time
•Model doesn’t affect behavior of system
Key assumption #2
All data points are statistically independent.
•No linkage between “adjacent”/“successive” points
•No other process that is affecting data generation
Reinforcement learning
•Characteristics:
•Measure features of system ⇒ X
•Want to control sys. -- model outputs are “knobs”
•Can interact with system/simulation
•Have performance measure that recognizes “good” system behavior
•Don’t need to know “correct” control actions
Key criterion
•Are the sensor readings enough to completely characterize state of the system?
•Knowing X tells you everything relevant
•Yes:
•“Fully observable”
•Learning optimal performance fairly tractable (*)
•No (multiple system states produce same X):
•“Partially observable”
•Learning barely satisfactory performance incredibly difficult (PSPACE-complete. Or worse.)
RL: The good news
•It does everything that traditional control doesn’t!
•Stochasticity ok
•Don’t need a model
•Don’t need linearity
•Discrete time ok
•No messy ODEs or z transforms!
•Delay ok
RL: The bad news
•Low dimensions
•Discrete variables/features
•Need to know state space
•Convergence can be slow
•Glacial
•Optimal control can be intractable
Example RL
•Fully observable systems
•Q-learning
•SARSA
•Dyna
•E3
•Partially observable
•Reinforce
•Utile distinction memories
•Policy gradient methods
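As an illustration of the fully observable case, the following is a minimal sketch of tabular Q-learning. The environment is a made-up five-cell corridor with a reward at the right end, and the learning rate, discount, and exploration rate are assumptions chosen for the example, not values from the talk.

```python
import random

N_STATES = 5       # states 0..4; state 4 is the goal and ends the episode
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: reward 1.0 only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def greedy(q, state, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)      # explore
            else:
                action = greedy(q, state, rng)    # exploit
            nxt, reward, done = step(state, action)
            # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a').
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


q = q_learning()
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)  # learned action per non-goal state (1 = move right)
```

Note that the learner is only told how good the outcome was (the reward), never which action was "correct", and that the table `q` requires the state space to be known, small, and discrete.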
Key difference #1
Unlike supervised learning...
Distinct data points can be temporally correlated.
•Key parameter: how much history is necessary to characterize the system?
•Markov order
•1 time unit? 2? All of them?
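One way to picture the "how much history" question is to build features from the last k readings and check whether they determine what comes next. The sketch below does that on a made-up reading sequence; the helper names and the data are hypothetical.

```python
def history_features(readings, order):
    """Pair each reading with the tuple of the `order` readings preceding it."""
    if order < 1:
        raise ValueError("order must be at least 1")
    return [
        (tuple(readings[i - order:i]), readings[i])
        for i in range(order, len(readings))
    ]


def is_deterministic(pairs):
    """True if every observed history is always followed by the same reading."""
    seen = {}
    for history, nxt in pairs:
        if seen.setdefault(history, nxt) != nxt:
            return False
    return True


# Hypothetical sensor trace: after "a" comes either "b" or "c".
readings = list("abacabac")
print(is_deterministic(history_features(readings, 1)))  # one step of history
print(is_deterministic(history_features(readings, 2)))  # two steps of history
```

In this toy trace a single step of history is ambiguous, while two steps are enough to predict the next reading, so a second-order model would suffice here.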
Key difference #2
Unlike supervised learning...
Model is expected to influence behavior of system
•It’s a good thing...
References (partial)
•General:
•Mitchell, Machine Learning, McGraw-Hill, 1997.
•Duda, Hart, & Stork, Pattern Classification, Wiley, 2001.
•Hastie, Tibshirani, & Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2001.
•Software (general; mostly supervised):
•Weka: Data Mining Software in Java. http://www.cs.waikato.ac.nz/ml/weka/
References (partial)
•Decision trees:
•Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.
•Breiman et al., Classification & Regression Trees (CART), Wadsworth, 1984.
•Support vector machines:
•Burges, “A Tutorial on Support Vector Machines for Pattern Recognition”, Data Mining and Knowledge Discovery, 2(2), 1998.
•Software: SVMlight, http://svmlight.joachims.org/
References (partial)
•Reinforcement learning:
•Sutton & Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.
•Kaelbling, Littman, & Moore, “Reinforcement Learning: A Survey”, Journal of Artificial Intelligence Research, 4, 1996.
•Kaelbling, Littman, & Cassandra, “Planning and Acting in Partially Observable Stochastic Domains”, Artificial Intelligence, 101, 1998.
Thank you!
Questions?
ML keywords
•Learning
•Adaptive
•Self-tuning
•State estimation
•Parameter estimation
•Data mining
•Computational statistics
•Predictive modeling
•Pattern recognition
•etc...
The Learning Loop
The World ⇒ Sensors ⇒ X ⇒ Model f(X) ⇒ ŷ
Performance measure L(ŷ,y) compares ŷ with y ⇒ assessment
Generate “training” data ⇒ Learning module (given the performance measure) ⇒ f(X)
The training process
•Gather large set of “training data”
•D_train = [ (X_1,y_1), (X_2,y_2), ... , (X_n,y_n) ]
•Also large set of “testing” (eval; holdout) data
•D_eval = [ (X_1,y_1), ... , (X_m,y_m) ]
•Apply learner to train to get model
•f() = learn(D_train, L)
•Evaluate results on test set
•[ ŷ_test ] = f(X_test)
•assessment = L(ŷ_test, y_test)
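The steps above can be sketched end to end as follows. Everything specific here is an assumption for illustration: the data are synthetic, the learner is a simple one-dimensional threshold rule, and `learn` and `zero_one_loss` are hypothetical names that mirror the slide's notation rather than any real library.

```python
import random


def zero_one_loss(y_hat, y):
    """L(y_hat, y): fraction of predictions that are wrong."""
    return sum(p != t for p, t in zip(y_hat, y)) / len(y)


def learn(d_train, loss):
    """f = learn(D_train, L): pick the threshold with the lowest training loss.

    Candidate thresholds are the training X values; the model predicts x >= t.
    """
    xs = [x for x, _ in d_train]
    ys = [y for _, y in d_train]
    best_t = min(sorted(set(xs)), key=lambda t: loss([x >= t for x in xs], ys))
    return lambda x: x >= best_t


# Synthetic data (assumed for this sketch): the true output is x >= 0.6.
rng = random.Random(0)
data = [(x, x >= 0.6) for x in (rng.random() for _ in range(200))]
d_train, d_eval = data[:150], data[150:]   # D_train and held-out D_eval

f = learn(d_train, zero_one_loss)                        # apply learner to train
y_hat_test = [f(x) for x, _ in d_eval]                   # [y_hat_test] = f(X_test)
y_test = [y for _, y in d_eval]
assessment = zero_one_loss(y_hat_test, y_test)           # L(y_hat_test, y_test)
print(assessment)
```

Keeping the holdout set out of `learn` is the point of the split: the assessment then estimates performance on data the model has not seen, under the assumption that both sets come from the same distribution.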