Devoxx Real-Time Learning
DESCRIPTION
An expanded description of real-time learning, including system designs, that Ted Dunning presented at Devoxx France in March 2013
TRANSCRIPT
![Page 1: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/1.jpg)
1©MapR Technologies - Confidential
Real-time Learning
![Page 2: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/2.jpg)
whoami – Ted Dunning
Chief Application Architect, MapR Technologies
Committer, member, Apache Software Foundation
– particularly Mahout, Zookeeper and Drill
(we’re hiring)
Contact me [email protected]@[email protected]@ted_dunning
![Page 3: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/3.jpg)
Slides and such (available late tonight):
– http://www.mapr.com/company/events/devoxx-3-29-2013
Hash tags: #mapr #devoxxfr
![Page 4: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/4.jpg)
Agenda
What is real-time learning?
A sample problem
Philosophy, statistics and the nature of the knowledge
A solution
System design
![Page 5: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/5.jpg)
What is Real-time Learning?
Training data arrives one record at a time
The system improves a mathematical model based on a small amount of training data
We retain at most a fixed amount of state
Each learning step takes O(1) time and memory
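To make the four properties above concrete, here is a minimal sketch of a learner that sees one record at a time, keeps a fixed amount of state and does O(1) work per record. The class name `OnlineConversionRate` and the choice of a running mean as the "model" are illustrative assumptions, not something from the talk.

```python
class OnlineConversionRate:
    """Running estimate of a conversion rate with fixed-size state (hypothetical example)."""

    def __init__(self):
        self.trials = 0   # number of records seen so far
        self.mean = 0.0   # current estimate of the conversion rate

    def update(self, converted):
        """Fold in one record in O(1) time; the record itself is not stored."""
        self.trials += 1
        self.mean += (float(converted) - self.mean) / self.trials
        return self.mean


model = OnlineConversionRate()
for outcome in [1, 0, 0, 1]:   # records arrive one at a time
    model.update(outcome)
```

The state is two numbers no matter how many records arrive, which is the sense in which "at most a fixed amount of state" is retained.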
![Page 6: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/6.jpg)
We have a product to sell … from a web-site
![Page 7: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/7.jpg)
What picture?
What tag-line?
What call to action?
![Page 8: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/8.jpg)
The Challenge
Design decisions affect probability of success
– Cheesy web-sites don’t even sell cheese
The best designers do better when allowed to fail
– Exploration juices creativity
But failing is expensive
– If only because we could have succeeded
– But also because offending or disappointing customers is bad
![Page 9: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/9.jpg)
A Quick Diversion
You see a coin
– What is the probability of heads?
– Could it be larger or smaller than that?
I flip the coin and, while it is in the air, ask again
I catch the coin and ask again
I look at the coin (and you don’t) and ask again
Why does the answer change?
– And did it ever have a single value?
![Page 10: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/10.jpg)
A Philosophical Conclusion
Probability as expressed by humans is subjective and depends on information and experience
![Page 11: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/11.jpg)
So now you understand Bayesian probability
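The coin story can be written down as a small calculation. The sketch below assumes a Beta prior over the probability of heads, a standard conjugate choice that the slides do not name; the function names are hypothetical. Each observed flip changes the state of knowledge, and the estimate changes with it.

```python
def update_beta(alpha, beta, heads):
    """Return the Beta posterior parameters after observing one flip."""
    return (alpha + 1, beta) if heads else (alpha, beta + 1)


def posterior_mean(alpha, beta):
    """Current best guess for the probability of heads."""
    return alpha / (alpha + beta)


alpha, beta = 1, 1                        # uniform prior: no information yet
for flip in [True, True, False, True]:    # information arrives
    alpha, beta = update_beta(alpha, beta, flip)

estimate = posterior_mean(alpha, beta)    # 4 / 6 after three heads and one tail
```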
![Page 12: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/12.jpg)
Another Quick Diversion
Let’s play a shell game
This is a special shell game
It costs you nothing to play
The pea has constant probability of being under each shell
(trust me)
How do you find the best shell?
How do you find it while maximizing the number of wins?
![Page 13: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/13.jpg)
Pause for short con-game
![Page 14: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/14.jpg)
Conclusions
Can you identify winners or losers without trying them out? No
Can you ever completely eliminate a shell with a bad streak? No
Should you keep trying apparent losers? Yes, but at a decreasing rate
![Page 15: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/15.jpg)
So now you understand multi-armed bandits
![Page 16: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/16.jpg)
Is there an optimum strategy?
![Page 17: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/17.jpg)
Thompson Sampling
Select each shell according to the probability that it is the best
The probability that it is the best can be computed using the posterior
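One way to see what "the probability that it is the best" means is to estimate it by Monte Carlo. This sketch assumes a Beta posterior with a uniform prior for each shell; `probability_best` and the win/loss counts are made up for illustration.

```python
import random


def probability_best(successes, failures, draws=20000, seed=0):
    """Estimate, for each shell, the posterior probability that it is the best one."""
    rng = random.Random(seed)
    wins = [0] * len(successes)
    for _ in range(draws):
        # One joint draw from the posteriors (assumed Beta with a uniform prior).
        samples = [rng.betavariate(s + 1, f + 1)
                   for s, f in zip(successes, failures)]
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]


# Hypothetical history: shell 1 has won 5 of 10 tries, the others far fewer.
probs = probability_best(successes=[2, 5, 1], failures=[8, 5, 9])
```

Note that the apparent losers still get a non-zero probability, which is why they keep being tried, only less and less often.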
But I promised a simple answer
![Page 18: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/18.jpg)
Thompson Sampling – Take 2
Sample θ
Pick i to maximize reward
Record result from using i
![Page 19: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/19.jpg)
Nearly Forgotten until Recently
Citations for Thompson sampling
![Page 20: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/20.jpg)
Bayesian Bandit for the Shells
Compute distributions based on data so far
Sample p_1, p_2 and p_3 from these distributions
Pick shell i where i = argmax_i p_i
Lemma 1: The probability of picking shell i will match the probability it is the best shell
Lemma 2: This is as good as it gets
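The steps on this slide fit in a few lines. The following is a sketch rather than the code used in the talk: it assumes Beta posteriors with a uniform prior, and the true win probabilities passed to `thompson_shell_game` are invented so that there is something to simulate.

```python
import random


def thompson_shell_game(true_probs, rounds=5000, seed=1):
    """Simulate the Bayesian bandit on shells with the given (hidden) win rates."""
    rng = random.Random(seed)
    wins = [0] * len(true_probs)
    losses = [0] * len(true_probs)
    for _ in range(rounds):
        # Compute distributions from data so far and sample one p_i per shell.
        samples = [rng.betavariate(w + 1, l + 1) for w, l in zip(wins, losses)]
        i = samples.index(max(samples))      # i = argmax_i p_i
        # Play shell i and record the result.
        if rng.random() < true_probs[i]:
            wins[i] += 1
        else:
            losses[i] += 1
    return wins, losses


wins, losses = thompson_shell_game([0.2, 0.5, 0.3])
plays = [w + l for w, l in zip(wins, losses)]   # how often each shell was tried
```

In a run like this, most plays should end up on the best shell while the others are still tried occasionally.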
![Page 21: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/21.jpg)
And it works!
![Page 22: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/22.jpg)
Video Demo
![Page 23: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/23.jpg)
The Basic Idea
We can encode a distribution by sampling
Sampling allows unification of exploration and exploitation
Can be extended to more general response models
![Page 24: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/24.jpg)
The Original Problem
[Figure labels: x1, x2, x3]
![Page 25: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/25.jpg)
Mathematical Statement
Logistic or probit regression
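One standard way to write such a model is shown below. This is a textbook form assumed here for illustration; the intercept θ_0 and the exact indexing are not taken from the slide.

```latex
P(\text{conversion} \mid x, \theta) = g\Big(\theta_0 + \sum_{i} \theta_i x_i\Big),
\qquad
g(z) = \frac{1}{1 + e^{-z}} \ \text{(logistic)}
\quad \text{or} \quad
g(z) = \Phi(z) \ \text{(probit)}
```

Here Φ is the standard normal cumulative distribution function and the x_i are the design variables.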
![Page 26: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/26.jpg)
Same Algorithm
Sample θ
Pick design x to maximize reward
![Page 27: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/27.jpg)
Context Variables
[Figure labels: x1, x2, x3]
y1=user.geo
y2=env.time
y3=env.day_of_week
y4=env.weekend
![Page 28: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/28.jpg)
Two Kinds of Variables
The web-site design – x1, x2, x3
– We can change these
– Different values give different web-site designs
The environment or context – y1, y2, y3, y4
– We can’t change these
– They can change themselves
Our model should include interactions between x and y
![Page 29: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/29.jpg)
Same Algorithm, More Greek Letters
Sample θ, π, φ
Pick design x to maximize reward, y’s are constant
This looks very fancy, but is actually pretty simple
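A sketch of why this is simple in practice: once one set of parameters has been sampled, choosing the design is a plain search over the possible x values with the context y held fixed. The assignment of Greek letters is an assumption here (θ for design weights, φ for design-by-context interactions); the context-only term, taken to be π, is left out because it is the same for every design and cannot change the argmax. Binary design variables and an independent Gaussian posterior are also assumptions.

```python
import itertools
import random


def sample_weights(means, stddevs, rng):
    """Draw one weight vector from an (assumed) independent Gaussian posterior."""
    return [rng.gauss(m, s) for m, s in zip(means, stddevs)]


def score(x, y, theta, phi):
    """Design-dependent part of the model: theta.x plus sum_ij phi[i][j] * x_i * y_j."""
    main = sum(t * xi for t, xi in zip(theta, x))
    inter = sum(phi[i][j] * x[i] * y[j]
                for i in range(len(x)) for j in range(len(y)))
    return main + inter


def pick_design(y, theta, phi, choices=(0, 1), n_design=3):
    """Simple sequential search over every design x for a fixed context y."""
    designs = itertools.product(choices, repeat=n_design)
    return max(designs, key=lambda x: score(x, y, theta, phi))


rng = random.Random(0)
theta = sample_weights([1.0, -1.0, 0.0], [0.1] * 3, rng)
phi = [sample_weights(row, [0.1] * 2, rng)
       for row in [[0.0, 0.0], [0.0, 3.0], [0.0, 0.0]]]
best_on_weekend = pick_design([0, 1], theta, phi)   # context: y = (weekday, weekend)
```

With these made-up numbers the second design element hurts on weekdays but helps on weekends, which is the kind of interaction between x and y the model is meant to capture.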
![Page 30: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/30.jpg)
Surprises
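We cannot record a non-conversion until we have waited long enough to rule out a conversion
We cannot record a conversion until we have waited the same length of time (otherwise conversions would reach the model sooner than non-conversions and bias it)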
Learning from conversions requires delay
We don’t have to wait very long
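One reading of these points, sketched as code: every impression is held for the same fixed window, and only when the window closes is it released to the learner as a conversion or a non-conversion. The class `ConversionLabeler`, its method names and the assumption of unique impression ids are all hypothetical.

```python
import heapq


class ConversionLabeler:
    """Hold each impression for a fixed window before emitting a training example."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.pending = {}     # impression id -> [design shown, converted flag]
        self.deadlines = []   # min-heap of (deadline, impression id)

    def impression(self, imp_id, design, now):
        self.pending[imp_id] = [design, 0]
        heapq.heappush(self.deadlines, (now + self.timeout, imp_id))

    def conversion(self, imp_id):
        """Mark a conversion; it is not released until the window closes."""
        if imp_id in self.pending:   # late conversions are ignored in this sketch
            self.pending[imp_id][1] = 1

    def expire(self, now):
        """Return (design, converted) examples whose window has closed."""
        examples = []
        while self.deadlines and self.deadlines[0][0] <= now:
            _, imp_id = heapq.heappop(self.deadlines)
            design, converted = self.pending.pop(imp_id)
            examples.append((design, converted))
        return examples


labeler = ConversionLabeler(timeout=30)
labeler.impression("a", "red", now=0)
labeler.impression("b", "blue", now=10)
labeler.conversion("b")
early = labeler.expire(now=29)    # nothing is old enough yet
first = labeler.expire(now=30)    # "a" closes as a non-conversion
second = labeler.expire(now=45)   # "b" closes as a conversion
```

The window length is a tuning choice; the slide's last point suggests it can be short.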
![Page 31: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/31.jpg)
![Page 32: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/32.jpg)
![Page 33: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/33.jpg)
![Page 34: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/34.jpg)
![Page 35: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/35.jpg)
Required Steps
Learn distribution of parameters from data
– Logistic regression or probit regression (can be on-line!)
– Need Bayesian learning algorithm
Sample from posterior distribution
– Generally included in Bayesian learning algorithm
Pick design
– Simple sequential search
Record data
Record data
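As a rough illustration of the first two steps, the sketch below keeps a diagonal Gaussian approximation to the posterior over logistic regression weights and updates it one record at a time. This is a crude heuristic chosen for brevity (a single Newton-like step per record with a diagonal covariance), not the algorithm from the talk; the class name, the prior and the synthetic data are all assumptions.

```python
import math
import random


class OnlineBayesLogistic:
    """Diagonal-Gaussian posterior over logistic regression weights (rough sketch)."""

    def __init__(self, n_features, prior_precision=1.0):
        self.mean = [0.0] * n_features
        self.precision = [prior_precision] * n_features

    def update(self, x, label):
        """Fold in one (x, label) record in O(n_features) time and memory."""
        z = sum(m * xi for m, xi in zip(self.mean, x))
        z = max(-30.0, min(30.0, z))             # keep exp() well-behaved
        p = 1.0 / (1.0 + math.exp(-z))
        for i, xi in enumerate(x):
            self.precision[i] += p * (1.0 - p) * xi * xi
            self.mean[i] += (label - p) * xi / self.precision[i]

    def sample(self, rng):
        """Draw one weight vector from the approximate posterior."""
        return [rng.gauss(m, 1.0 / math.sqrt(q))
                for m, q in zip(self.mean, self.precision)]


# Synthetic data (made up): intercept -1, and a design flag worth +2 on the logit.
rng = random.Random(42)
model = OnlineBayesLogistic(n_features=2)
for _ in range(2000):
    b = rng.choice([0.0, 1.0])
    p_true = 1.0 / (1.0 + math.exp(-(-1.0 + 2.0 * b)))
    model.update([1.0, b], 1 if rng.random() < p_true else 0)

theta = model.sample(rng)   # one posterior draw, ready for the "pick design" step
```

A production system would want a better-founded Bayesian learner; the point here is only that learning and sampling can both be done on-line with fixed state.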
![Page 36: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/36.jpg)
Required system design
![Page 37: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/37.jpg)
Hadoop is Not Very Real-time
[Figure: timeline along axis t ending at "now"; labels: "Fully processed", "Latest full period", "Unprocessed Data"; annotation: "Hadoop job takes this long for this data"]
![Page 38: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/38.jpg)
Real-time and Long-time together
[Figure: timeline along axis t ending at "now"; labels: "Hadoop works great back here", "Storm works here", "Blended view"]
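The blended view can be thought of as a merge of two sets of aggregates: what the batch system has fully processed, and what the real-time system has seen since then. The sketch below uses plain dictionaries of counts; `blended_counts` is a hypothetical helper, and it assumes the real-time counts cover only events after the batch cutoff so that nothing is counted twice.

```python
def blended_counts(batch_counts, realtime_counts):
    """Merge batch-layer counts with counts seen since the batch cutoff."""
    merged = dict(batch_counts)
    for key, count in realtime_counts.items():
        merged[key] = merged.get(key, 0) + count
    return merged


# Hypothetical conversion counts per design.
batch = {"design_a": 10, "design_b": 3}     # from the long-time (batch) side
recent = {"design_b": 2, "design_c": 1}     # from the real-time side
view = blended_counts(batch, recent)
```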
![Page 39: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/39.jpg)
Traditional Hadoop Design
Can use Kafka cluster to queue log lines
Can use Storm cluster to do real time learning
Can host web site on NAS
Can use Flume cluster to import data from Kafka to Hadoop
Can record long-term history on Hadoop Cluster
How many clusters?
![Page 40: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/40.jpg)
[Diagram components: Users, Web Site, Web Service, NAS, Kafka API, Kafka (three Kafka Cluster boxes), Storm, Design Targeting, Flume, Hadoop, HDFS Data]
![Page 41: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/41.jpg)
That is a lot of moving parts!
![Page 42: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/42.jpg)
Alternative Design
Can host log catcher on MapR via NFS
Storm can read data directly from queue
Can host web server directly on cluster
Only one cluster needed
– Total instance count drops by 3x
– Admin burden massively decreased
![Page 43: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/43.jpg)
[Diagram components: Users, http, Web-server, Catcher, Topic Queue, Storm, Web Data, MapR]
![Page 44: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/44.jpg)
You can do this yourself!
![Page 45: Devoxx Real-Time Learning](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b757ab4a7959380b8b4571/html5/thumbnails/45.jpg)
Contact Me!
We’re hiring at MapR in US and Europe
MapR software available for research use
Contact me at [email protected] or @ted_dunning
Share news with @apachemahout
Tweet #devoxxfr #mapr #mahout @ted_dunning