A field guide to the machine learning zoo
Theodore Vasiloudis, SICS/KTH
FOSDEM
From idea to objective function
Formulating an ML problem
● Common aspects
  ○ Model (θ)
  ○ Data (D)
● Objective function: L(θ, D)
● Prior knowledge: r(θ)
● ML program: f(θ, D) = L(θ, D) + r(θ)
● ML algorithm: How to optimize f(θ, D)
Source: Xing (2015)
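The decomposition f(θ, D) = L(θ, D) + r(θ) can be sketched in Python as a function assembled from a loss and a regularizer. This is only an illustration: the squared-error loss, the L2 penalty, the value of `lam`, the toy data, and every function name below are assumptions that do not appear in the slides or in Xing (2015).

```python
from typing import Callable, Sequence, Tuple

Params = Sequence[float]
Dataset = Sequence[Tuple[Sequence[float], float]]


def make_ml_program(
    loss: Callable[[Params, Dataset], float],
    regularizer: Callable[[Params], float],
) -> Callable[[Params, Dataset], float]:
    """Build f(theta, D) = L(theta, D) + r(theta) from its two parts."""
    def f(theta: Params, data: Dataset) -> float:
        return loss(theta, data) + regularizer(theta)
    return f


def squared_error(theta: Params, data: Dataset) -> float:
    """Hypothetical L(theta, D): summed squared residuals of a linear model."""
    total = 0.0
    for x, y in data:
        prediction = sum(t * xi for t, xi in zip(theta, x))
        total += (prediction - y) ** 2
    return total


def l2_penalty(theta: Params, lam: float = 0.1) -> float:
    """Hypothetical r(theta): an L2 penalty that prefers small parameters."""
    return lam * sum(t * t for t in theta)


f = make_ml_program(squared_error, l2_penalty)

# Toy dataset of (features, target) pairs, made up for this example.
data = [((1.0, 2.0), 5.0), ((2.0, 0.0), 2.0)]
print(f((1.0, 2.0), data))  # zero loss on this data, so only the penalty remains
```

The ML algorithm is then whatever procedure searches for the θ that minimizes `f`; swapping the loss or the regularizer changes the ML program without changing that outer structure.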
Example: Improve retention at Twitter
● Goal: Reduce the churn of users on Twitter
● Assumption: Users churn because they don’t engage with the platform
● Idea: Increase retweets by promoting tweets that are more likely to be retweeted
● Data (D): Features and labels, $(x_i, y_i)$
● Model (θ): Logistic regression, parameters $w$
  ○ $p(y \mid x, w) = \mathrm{Bernoulli}(y \mid \mathrm{sigm}(w^\top x))$
● Objective function, L(θ, D): $\mathrm{NLL}(w) = \sum_i \log(1 + \exp(-y_i\, w^\top x_i))$, with labels $y_i \in \{-1, +1\}$
● Prior knowledge (Regularization): $r(w) = \lambda\, w^\top w$
● Algorithm: Gradient Descent
Warning: Notation abuse
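The pieces of this example (logistic regression, the NLL objective, the L2 regularizer, and gradient descent) can be put together in a short, dependency-free Python sketch. It assumes labels in {-1, +1}, which is the convention the log(1 + exp(-y wᵀx)) form of the NLL requires. The toy data, `lam`, `step`, `iters`, and all function names are illustrative assumptions, not values from the talk.

```python
import math


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def sigm(z):
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softplus(z):
    """log(1 + exp(z)), written to avoid overflow for large z."""
    return z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))


def objective(w, X, y, lam):
    """f(w) = sum_i log(1 + exp(-y_i w^T x_i)) + lam * w^T w."""
    nll = sum(softplus(-yi * dot(w, xi)) for xi, yi in zip(X, y))
    return nll + lam * dot(w, w)


def gradient(w, X, y, lam):
    """Gradient of the objective above with respect to w."""
    grad = [2.0 * lam * wi for wi in w]
    for xi, yi in zip(X, y):
        coeff = -yi * sigm(-yi * dot(w, xi))
        for j, xij in enumerate(xi):
            grad[j] += coeff * xij
    return grad


def gradient_descent(X, y, lam=0.1, step=0.1, iters=500):
    """Plain fixed-step gradient descent starting from w = 0."""
    w = [0.0] * len(X[0])
    for _ in range(iters):
        g = gradient(w, X, y, lam)
        w = [wi - step * gi for wi, gi in zip(w, g)]
    return w


# Toy data: each row is (bias term, one feature); labels are -1 or +1.
X = [(1.0, -2.0), (1.0, -1.0), (1.0, 1.0), (1.0, 2.0)]
y = [-1, -1, 1, 1]

w = gradient_descent(X, y)
print(w, objective(w, X, y, 0.1))
```

A fixed step size is used only to keep the sketch short; in practice the step size usually needs tuning or a line search.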
Data problems
● GIGO: Garbage In - Garbage Out
Data readiness
● Problem: “Data” as a concept is hard to reason about.
● Goal: Make the stakeholders aware of the state of the data at all stages.
Source: Lawrence (2017)
Data readiness
● Band C
  ○ Accessibility
● Band B
  ○ Representation and faithfulness
● Band A
  ○ Data in context
Source: Lawrence (2017)
Data readiness
● Band C
  ○ “How long will it take to bring our user data to C1 level?”
● Band B
  ○ “Until we know the collection process we can’t move the data to B1.”
● Band A
  ○ “We realized that we would need location data in order to have an A1 dataset.”
Source: Lawrence (2017)
Selecting algorithm & software: “Easy” choices
Selecting algorithms
An ML algorithm “farm”
Source: scikit-learn.org
The neural network zoo
Source: Asimov Institute (2016)
Selecting algorithms
● Always go for the simplest model you can afford
  ○ Your first model is more about getting the infrastructure right
  ○ Simple models are usually interpretable. Interpretable models are easier to debug.
  ○ Complex models erode boundaries
    ■ CACE principle: Changing Anything Changes Everything
Sources: Zinkevich (2017); Sculley et al. (2015)
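One way to act on "simplest model first" is to push a trivial baseline through the whole pipeline before any real model. The sketch below is an assumption about what such a baseline could look like, not something prescribed by the slides: a majority-class predictor with a `fit`/`predict` interface, plus an `accuracy` helper. The class name, function name, and toy labels are all hypothetical.

```python
from collections import Counter


class MajorityClassBaseline:
    """Predicts the most frequent training label and ignores the features."""

    def fit(self, X, y):
        # Remember the single most common label seen during training.
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data, made up for this example; the features are never inspected.
X_train, y_train = [[0.1], [0.4], [0.9], [0.3]], [0, 0, 1, 0]
X_test, y_test = [[0.2], [0.8], [0.5], [0.6]], [0, 1, 0, 0]

baseline = MajorityClassBaseline().fit(X_train, y_train)
print(accuracy(y_test, baseline.predict(X_test)))
```

Because the baseline has nothing to tune and is fully interpretable, any surprise in its score points at the data or the infrastructure rather than at the model, and later models have a concrete number to beat.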
Selecting software
The ML software zoo
[Figure: ML software logos, including Leaf]
Your model vs. the world
What are the problems with ML systems?
[Figure: two diagrams with components Data, ML Code, and Model; one labelled “Expectation”, the other “Reality”]
Source: Sculley et al. (2015)
Things to watch out for
● Data dependencies
  ○ Unstable dependencies
● Feedback loops
  ○ Direct
  ○ Indirect
Sources: Sculley et al. (2015) & Zinkevich (2017)
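An unstable data dependency means an input signal can change underneath the model. One possible guard, sketched here as an assumption rather than as the approach of the cited sources, is to store summary statistics of a feature at training time and flag incoming batches whose mean has moved too far. The function names, the `max_shift` threshold, and the toy numbers are all hypothetical.

```python
import statistics


def summarize(values):
    """Reference statistics for one feature, recorded at training time."""
    return {
        "mean": statistics.fmean(values),
        "stdev": statistics.pstdev(values),
    }


def has_drifted(reference, batch, max_shift=3.0):
    """True if the batch mean lies more than max_shift reference
    standard deviations away from the reference mean."""
    batch_mean = statistics.fmean(batch)
    # Fall back to a scale of 1.0 if the reference feature was constant.
    scale = reference["stdev"] or 1.0
    return abs(batch_mean - reference["mean"]) / scale > max_shift


# Toy numbers, made up for this example.
reference = summarize([1.0, 2.0, 3.0, 4.0, 5.0])
print(has_drifted(reference, [2.5, 3.5, 3.0]))     # batch close to the reference
print(has_drifted(reference, [30.0, 31.0, 29.0]))  # upstream signal has shifted
```

A check like this only catches shifts in the mean; it says nothing about feedback loops, where the model's own output influences the data it is later trained on.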
Bringing it all together
● Define your problem as optimizing your objective function using data
● Determine (and monitor) the readiness of your data
● Don't spend too much time at first choosing an ML framework/algorithm
● Worry much more about what happens when your model meets the world
Thank you.
Sources
● Google auto-replies: Shared photos, and text
● Silver et al. (2016): Mastering the game of Go
● Xing (2015): A new look at the system, algorithm and theory foundations of Distributed ML
● Lawrence (2017): Data readiness levels
● Asimov Institute (2016): The Neural Network Zoo
● Zinkevich (2017): Rules of Machine Learning - Best Practices for ML Engineering
● Sculley et al. (2015): Hidden Technical Debt in Machine Learning Systems