![Page 1: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/1.jpg)
Beyond Data: From User Information to Business Value
October, 2013
Xavier Amatriain
Director - Algorithms Engineering - Netflix
@xamat
![Page 2: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/2.jpg)
“In a simple Netlfix-style item recommender, we would simply apply some form of matrix factorization (i.e NMF)”
![Page 3: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/3.jpg)
From the Netflix Prize to today
2006 2013
![Page 4: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/4.jpg)
Everything is
Personalized
![Page 5: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/5.jpg)
Everything is personalized
Over 75% of what people watch comes from a recommendation
Ranking
![Page 6: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/6.jpg)
Top 10
Personalization awareness
Diversity
![Page 7: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/7.jpg)
But…
![Page 8: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/8.jpg)
Support for Recommendations
Social Support
![Page 9: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/9.jpg)
Genre Rows
![Page 10: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/10.jpg)
Similars
![Page 11: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/11.jpg)
EVERYTHING is a Recommendation
![Page 12: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/12.jpg)
Consumer (Data) Science
![Page 13: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/13.jpg)
Consumer (Data) Science
1. Start with a hypothesis:
   ■ Algorithm/feature/design X will increase member engagement with our service, and ultimately member retention
2. Design a test
   ■ Develop a solution or prototype
   ■ Think about dependent & independent variables, control, significance…
3. Execute the test
4. Let data speak for itself
![Page 14: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/14.jpg)
Offline/Online testing process
![Page 15: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/15.jpg)
Executing A/B tests
Measure differences in metrics across statistically identical populations that each experience a different algorithm.
■ Decisions on the product always data-driven
■ Overall Evaluation Criteria (OEC) = member retention
■ Use long-term metrics whenever possible
■ Short-term metrics can be informative and allow faster decisions
■ But, not always aligned with OEC
■ Significance and hypothesis testing (1000s of members and 2-20 cells)
■ A/B Tests allow testing many (radical) ideas at the same time (typically 100s of customer A/B tests running)
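As a minimal sketch of the significance step, the snippet below compares retention between two test cells with a two-proportion z-test. The choice of test, the function name `two_proportion_z_test`, and the member counts are illustrative assumptions; the slides do not say which statistical test is actually used.

```python
import math


def two_proportion_z_test(retained_a, n_a, retained_b, n_b):
    """Return (z, two-sided p-value) for the difference in retention rates.

    Hypothetical helper: cell A is the control, cell B the new algorithm.
    """
    p_a = retained_a / n_a
    p_b = retained_b / n_b
    # Pooled retention rate under the null hypothesis of no difference.
    pooled = (retained_a + retained_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: 5000 members per cell, 80% vs. 82% retained.
z, p = two_proportion_z_test(retained_a=4000, n_a=5000, retained_b=4100, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With many cells and many concurrent tests, a real analysis would also need to account for multiple comparisons, which this sketch leaves out.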
![Page 16: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/16.jpg)
Offline testing
■ Measure model performance, using (IR) metrics
■ Offline performance used as an indication to make informed decisions on follow-up A/B tests
■ A critical (and mostly unsolved) issue is how offline metrics can correlate with A/B test results.
■ Extremely important to define offline evaluation framework that maps to online OEC
■ e.g. How to create training/testing datasets may not be trivial
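One way to make the train/test point concrete is a split by time, so the model is evaluated only on behavior that happened after everything it was trained on. The sketch below pairs that with recall@k as an example IR metric. Both choices, the helper names, and the toy events are assumptions for illustration, not the evaluation framework described in the talk.

```python
def temporal_split(events, cutoff):
    """Split (user, item, timestamp) events into train/test at a cutoff time."""
    train = [e for e in events if e[2] < cutoff]
    test = [e for e in events if e[2] >= cutoff]
    return train, test


def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of the relevant items that appear in the top-k of a ranking."""
    if not relevant_items:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items)


# Toy play events for one user; timestamps are arbitrary integers.
events = [("u1", "a", 1), ("u1", "b", 2), ("u1", "c", 5), ("u1", "d", 6)]
train, test = temporal_split(events, cutoff=5)
held_out = {item for _, item, _ in test}

# Hypothetical ranking produced by a model trained on `train` only.
ranking = ["c", "x", "d", "y"]
print(recall_at_k(ranking, held_out, k=2))  # 0.5
```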
![Page 17: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/17.jpg)
Data & Models
![Page 18: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/18.jpg)
Big Data @Netflix
■ > 40M subscribers
■ Ratings: ~5M/day
■ Searches: >3M/day
■ Plays: > 50M/day
■ Streamed hours:
  ○ 5B hours in Q3 2013

Member Behavior
Geo-information
Time
Impressions
Device Info
Metadata
Social
Ratings
Demographics
![Page 19: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/19.jpg)
Smart Models
■ Regression models (Logistic, Linear, Elastic nets)
■ SVD & other MF models
■ Factorization Machines
■ Restricted Boltzmann Machines
■ Markov Chains & other graph models
■ Clustering (from k-means to HDP)
■ Deep ANN
■ LDA
■ Association Rules
■ GBDT/RF
■ …
![Page 20: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/20.jpg)
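SVD for Rating Prediction
■ User factor vectors $p_u \in \mathbb{R}^f$ and item factor vectors $q_v \in \mathbb{R}^f$
■ Baseline (bias) $b_{uv} = \mu + b_u + b_v$ (user & item deviation from average)
■ Predict rating as $\hat{r}_{uv} = b_{uv} + p_u^T q_v$
■ SVD++ (Koren et al.) asymmetric variation w. implicit feedback
  $\hat{r}_{uv} = b_{uv} + q_v^T \left( |R(u)|^{-1/2} \sum_{j \in R(u)} (r_{uj} - b_{uj})\, x_j + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)$
■ Where
  ■ $q_v, x_v, y_v \in \mathbb{R}^f$ are three item factor vectors
  ■ Users are not parametrized, but rather represented by:
    ■ R(u): items rated by user u & N(u): items for which the user has given implicit preference (e.g. rated/not rated)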
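The two prediction rules can be sketched in a few lines of plain Python. This assumes the usual biased matrix factorization form (baseline plus the dot product of user and item factors) and Koren's asymmetric form, in which the user vector is built from the items in R(u) and N(u) with a 1/sqrt(set size) normalization. Function names and numbers are hypothetical, and learning the parameters is out of scope here.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def baseline(mu, b_user, b_item):
    """b_uv: global average plus user and item deviations from it."""
    return mu + b_user + b_item


def predict_svd(mu, b_user, b_item, p_u, q_v):
    """Biased MF prediction: baseline + p_u . q_v."""
    return baseline(mu, b_user, b_item) + dot(p_u, q_v)


def predict_asymmetric(b_uv, q_v, rated, implicit):
    """Asymmetric variant: no explicit user vector.

    rated:    list of (r_uj - b_uj, x_j) pairs for items j in R(u)
    implicit: list of y_j vectors for items j in N(u)
    """
    f = len(q_v)
    user_vec = [0.0] * f
    if rated:
        norm = len(rated) ** -0.5
        for residual, x_j in rated:
            for k in range(f):
                user_vec[k] += norm * residual * x_j[k]
    if implicit:
        norm = len(implicit) ** -0.5
        for y_j in implicit:
            for k in range(f):
                user_vec[k] += norm * y_j[k]
    return b_uv + dot(q_v, user_vec)


# Illustrative values only, with f = 2 factors.
print(predict_svd(3.5, 0.2, -0.1, p_u=[1.0, 0.5], q_v=[0.4, 0.2]))
print(predict_asymmetric(3.6, [1.0, 0.0], rated=[(1.0, [0.5, 0.0])], implicit=[[0.5, 0.0]]))
```

Because the asymmetric form has no per-user parameters, a new user can be scored as soon as they have rated or interacted with a few items, without retraining.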
![Page 21: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/21.jpg)
Restricted Boltzmann Machines
■ Restrict the connectivity in ANN to make learning easier.
  ■ Only one layer of hidden units.
    ■ Although multiple layers are possible
  ■ No connections between hidden units.
    ■ Hidden units are independent given the visible states.
■ RBMs can be stacked to form Deep Belief Networks (DBN) – 4th generation of ANNs
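The conditional independence of hidden units is what keeps inference cheap: given the visible states, each hidden unit's activation probability can be computed on its own. A minimal sketch, assuming binary units and the standard RBM conditional p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i W_ij); the names `weights` and `hidden_bias` and the toy values are illustrative.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probabilities(visible, weights, hidden_bias):
    """p(h_j = 1 | v) for every hidden unit j.

    weights[i][j] connects visible unit i to hidden unit j. There are no
    hidden-to-hidden weights, so each probability depends only on `visible`.
    """
    return [
        sigmoid(hidden_bias[j] + sum(v * weights[i][j] for i, v in enumerate(visible)))
        for j in range(len(hidden_bias))
    ]


def sample_hidden(visible, weights, hidden_bias, rng):
    """Sample all hidden units independently, as one half of a Gibbs step."""
    probs = hidden_probabilities(visible, weights, hidden_bias)
    return [1 if rng.random() < p else 0 for p in probs]


# Toy RBM: 3 visible units, 2 hidden units.
weights = [[1.0, -1.0], [0.0, 0.0], [1.0, -1.0]]
visible = [1, 0, 1]
print(hidden_probabilities(visible, weights, hidden_bias=[0.0, 0.0]))
print(sample_hidden(visible, weights, [0.0, 0.0], random.Random(0)))
```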
![Page 22: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/22.jpg)
Ranking
■ Ranking = Scoring + Sorting + Filtering bags of movies for presentation to a user
■ Key algorithm, sorts titles in most contexts
■ Goal: Find the best possible ordering of a set of videos for a user within a specific context in real-time
■ Objective: maximize consumption & “enjoyment”
■ Factors
  ■ Accuracy
  ■ Novelty
  ■ Diversity
  ■ Freshness
  ■ Scalability
  ■ …
![Page 23: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/23.jpg)
Example: Two features, linear model
[Figure: videos plotted by Popularity and Predicted Rating, with the resulting Final Ranking]
Linear Model: f_rank(u,v) = w_1 p(v) + w_2 r(u,v) + b
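The two-feature linear model can be sketched directly: score each candidate video with f_rank(u,v) = w_1 p(v) + w_2 r(u,v) + b, then sort by score. The weights, video ids, and feature values below are made up for illustration; in practice the weights would be learned from data.

```python
def f_rank(popularity, predicted_rating, w1, w2, b):
    """Linear ranking score from popularity p(v) and predicted rating r(u,v)."""
    return w1 * popularity + w2 * predicted_rating + b


def rank_videos(candidates, w1, w2, b=0.0):
    """Order video ids by descending score.

    candidates maps a video id to its (p(v), r(u,v)) feature pair.
    """
    return sorted(
        candidates,
        key=lambda v: f_rank(candidates[v][0], candidates[v][1], w1, w2, b),
        reverse=True,
    )


# Hypothetical candidates for one user: (popularity, predicted rating).
candidates = {"A": (0.9, 3.0), "B": (0.2, 4.8), "C": (0.6, 4.0)}

print(rank_videos(candidates, w1=2.0, w2=0.5))  # ['A', 'C', 'B']
print(rank_videos(candidates, w1=0.0, w2=1.0))  # ['B', 'C', 'A']
```

Setting w1 to zero reduces the ranking to pure predicted rating, while a large w1 pushes popular titles to the top. The weights therefore control the trade-off between the two signals.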
![Page 24: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/24.jpg)
Example: Two features, linear model
[Figure: videos plotted by Popularity and Predicted Rating, with the resulting Final Ranking]
![Page 25: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/25.jpg)
Ranking
![Page 26: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/26.jpg)
Learning to Rank Approaches
■ ML problem: construct ranking model from training data
1. Pointwise (Ordinal regression, Logistic regression, SVM, GBDT, …)
   ■ Loss function defined on individual relevance judgment
2. Pairwise (RankSVM, RankBoost, RankNet, FRank…)
   ■ Loss function defined on pair-wise preferences
   ■ Goal: minimize number of inversions in ranking
3. Listwise
   ■ Indirect Loss Function (RankCosine, ListNet…)
   ■ Directly optimize IR measures (NDCG, MRR, FCP…)
     ■ Genetic Programming or Simulated Annealing
     ■ Use boosting to optimize NDCG (AdaRank)
     ■ Gradient descent on smoothed version (CLiMF, TFMAP, GAPfm @cikm13)
     ■ Iterative Coordinate Ascent (Direct Rank @kdd13)
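To make the pairwise and listwise objectives concrete, the sketch below computes two quantities on a ranked list of relevance labels: the number of inversions (what pairwise methods try to minimize) and NDCG (one of the IR measures listwise methods target). It uses the common exponential-gain, log2-discount definition of DCG, which is an assumption here; the slides do not fix a particular variant.

```python
import math


def count_inversions(relevances):
    """Pairs where a less relevant item is ranked above a more relevant one."""
    n = len(relevances)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if relevances[i] < relevances[j]
    )


def dcg(relevances):
    """Discounted cumulative gain with gain 2^rel - 1 and log2 position discount."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)
        for pos, rel in enumerate(relevances)
    )


def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Relevance labels listed in the order a model ranked the items.
perfect = [3, 2, 1]
reversed_order = [1, 2, 3]

print(count_inversions(perfect), ndcg(perfect))
print(count_inversions(reversed_order), round(ndcg(reversed_order), 3))
```

NDCG depends on the ordering only through sorting, so it is piecewise constant in the model parameters. This is why the direct-optimization approaches listed above rely on smoothed surrogates, boosting, or search procedures instead of plain gradient descent on the metric itself.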
![Page 27: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/27.jpg)
Other research questions we are working on
● Row selection
● Diversity
● Similarity
● Context-aware recommendations
● Explore/exploit
● Presentation bias correction
● Mood and session intent inference
● Unavailable Title Search
● ...
![Page 28: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/28.jpg)
More data or better models?
![Page 29: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/29.jpg)
More data or better models?
Really?
Anand Rajaraman: Former Stanford Prof. & Senior VP at Walmart
![Page 30: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/30.jpg)
Sometimes, it’s not about more data
More data or better models?
![Page 31: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/31.jpg)
[Banko and Brill, 2001]
Norvig: “Google does not have better Algorithms, only more Data”
Many features/ low-bias models
More data or better models?
![Page 32: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/32.jpg)
More data or better models?
Sometimes, it’s not about more data
![Page 33: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/33.jpg)
More data or better models? (crossed out)
![Page 34: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/34.jpg)
“Data without a sound approach = noise”
![Page 35: Cikm 2013 - Beyond Data From User Information to Business Value](https://reader033.vdocuments.site/reader033/viewer/2022051819/54c65a1c4a79594b538b4575/html5/thumbnails/35.jpg)
More data + Smarter models +
More accurate metrics + Better approaches
Lots of room for improvement!