Robson Motta | [email protected]
Machine Learning and Information Visualization for Optimizing Recommender Systems
312,000,000,000 (that is, 312 billion)
recommendations in 2014
Get to know our solutions
How to present the best recommendation for each client/context?
data
● products
● pageviews
● clicks
● buy orders
● etc.
preprocessing
processing
postprocessing
recommendations
Machine Learning
“All models are wrong, but some are useful”
(George E. P. Box)
Collaborative Filtering
1
Customers Who Bought This Item Also Bought, PaulsHealthBlog.com, 11.04.2014
user-based
10 5 7 0 2 3 4 1 ...
item-based
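The "customers who bought this item also bought" idea behind item-based collaborative filtering can be sketched with plain co-occurrence counts. This is a minimal illustration, not the production method: the function names `also_bought` and `recommend` and the sample orders are hypothetical, and raw counts are used where a real system would likely normalize them.

```python
from collections import Counter, defaultdict
from itertools import permutations


def also_bought(orders):
    """Count, for every item, how often each other item shares an order with it."""
    co_counts = defaultdict(Counter)
    for order in orders:
        # set() so an item repeated inside one order is counted once
        for a, b in permutations(set(order), 2):
            co_counts[a][b] += 1
    return co_counts


def recommend(co_counts, item, k=3):
    """Return up to k items most often bought together with `item`."""
    return [other for other, _ in co_counts[item].most_common(k)]


# Hypothetical orders, invented for illustration only
orders = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger"],
    ["phone", "case", "headphones"],
    ["laptop", "mouse"],
]
co_counts = also_bought(orders)
print(recommend(co_counts, "phone"))
```

Raw counts favor popular items, which is one of the challenges listed on the next slide.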
Challenges
● popular items
● outliers
● incompatible
● principal-accessory
● new items: ???
How do we guarantee quality to our clients?
● subjective evaluation: Visualization
● objective evaluation: Quality measures
● online evaluation: A/B test
● online optimization: Bandit
Multidimensional Projection (t-SNE technique)
Stability, purity and coverage measures
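The slide names the measures without defining them. As one possible reading of coverage, the sketch below computes the fraction of catalog items that appear in at least one recommendation list. The definition, the function name `catalog_coverage`, and the sample data are assumptions, not taken from the talk.

```python
def catalog_coverage(recommendations, catalog):
    """Fraction of catalog items that appear in at least one recommendation list."""
    recommended = set()
    for items in recommendations.values():
        recommended.update(items)
    return len(recommended & set(catalog)) / len(catalog)


# Hypothetical catalog and recommendation lists, for illustration only
catalog = ["a", "b", "c", "d", "e"]
recommendations = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a"]}
print(catalog_coverage(recommendations, catalog))
```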
Content-based Filtering
2
weight of term n within document d = (frequency of term n in document d) × (IDF factor of term n)
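The term weighting described above can be sketched in a few lines. The exact formula on the slide did not survive, so the IDF variant used here, log(N / document frequency), is an assumption, as are the function name `tf_idf` and the sample product descriptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # document frequency: number of documents that contain each term
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        tf = Counter(doc)  # raw frequency of each term in this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical product descriptions, already tokenized
documents = [
    ["red", "cotton", "shirt", "sale"],
    ["blue", "cotton", "shirt", "sale"],
    ["red", "leather", "shoes", "sale"],
]
for doc_weights in tf_idf(documents):
    print(doc_weights)
```

A term that occurs in every description ("sale" here) gets weight zero, while a term unique to one product ("leather") gets the highest weight.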
Clustering
3
… main issues
● the number of clusters
● false positives (pairs of products wrongly assigned to the same cluster)
● false negatives (pairs of products wrongly assigned to different clusters)
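The pairwise false positives and false negatives defined above can be counted directly when a reference grouping is available. This is a sketch under that assumption: the function name `pair_errors`, the product ids, and the reference categories are hypothetical.

```python
from itertools import combinations


def pair_errors(predicted, expected):
    """Count product pairs wrongly merged (false positives) or wrongly split (false negatives)."""
    false_pos = false_neg = 0
    for a, b in combinations(sorted(predicted), 2):
        same_pred = predicted[a] == predicted[b]
        same_true = expected[a] == expected[b]
        if same_pred and not same_true:
            false_pos += 1  # same cluster, different reference group
        elif same_true and not same_pred:
            false_neg += 1  # different clusters, same reference group
    return false_pos, false_neg


# Hypothetical cluster assignments and reference categories
predicted = {"tv1": 0, "tv2": 0, "sofa1": 0, "sofa2": 1}
expected = {"tv1": "tv", "tv2": "tv", "sofa1": "sofa", "sofa2": "sofa"}
print(pair_errors(predicted, expected))
```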
Classification
4
… main issues
● unbalanced classes
● unlabeled areas
Challenges
● popular items
● outliers
● incompatible
● principal-accessory
● new items: ???
Circular connected chart: alternatives
Circular connected chart: complements
Tabular information
A/B tests
+16% clicks
final result: 10 days, 95% significance
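The slide reports only the lift and the significance level, not the test used. One common way to check such a result is a two-proportion z-test, sketched below. The choice of test, the function name `two_proportion_z_test`, and the click counts are assumptions made for illustration.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the difference between two click rates."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value


# Hypothetical counts: variant B has 16% more clicks than control A
z, p_value = two_proportion_z_test(1000, 20000, 1160, 20000)
print(f"z={z:.2f}, p={p_value:.4f}, significant at 95%: {p_value < 0.05}")
```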
Multi-armed Bandit
5
Exploration-Exploitation trade-off
… case 1
algorithm 1
algorithm 2
…
algorithm N
… case 2
order 1
order 2
…
chance to be picked
user feedback: click
Bandit - Beta Distribution
http://www.distributome.org/js/sim/BetaSimulation.html
0 successes and 0 attempts
0 successes and 10 attempts
5 successes and 10 attempts
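The three cases above can be turned into Beta distributions with a few lines of arithmetic. The sketch assumes a uniform Beta(1, 1) prior, which the slides do not state. The function name `beta_posterior` is hypothetical.

```python
def beta_posterior(successes, attempts, prior_a=1.0, prior_b=1.0):
    """Return (a, b, mean, variance) of the Beta posterior over the click rate."""
    a = prior_a + successes               # prior plus observed successes
    b = prior_b + (attempts - successes)  # prior plus observed failures
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return a, b, mean, variance


for s, n in [(0, 0), (0, 10), (5, 10)]:
    a, b, mean, var = beta_posterior(s, n)
    print(f"{s} successes / {n} attempts: Beta({a:g}, {b:g}), "
          f"mean={mean:.3f}, sd={var ** 0.5:.3f}")
```

With no attempts the distribution is flat, so every click rate is equally plausible. More attempts narrow it around the observed rate.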
Bandit - Thompson Sampling
success and attempts: [(0, 10), (0, 7), (0, 7), (0, 6), (0, 4), (0, 3), (0, 4), (0, 3), (0, 0), (0, 0), ...
success and attempts: [(1, 44), (10, 398), (0, 66), (1, 57), (2, 25), (14, 324), (0, 3), (1, 46), ...
success and attempts: [(103, 1183), (64, 1138), (48, 900), (25, 524), (56, 527), (37, 546), (11, 216), …
success and attempts: [(143, 2227), (8, 299), (119, 1706), (28, 889), (146, 1288), (86, 1646), (63, 1272) ...
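Thompson Sampling over (success, attempts) pairs like the ones above can be sketched with the standard library alone. Each arm gets one draw from its Beta distribution and the arm with the highest draw is shown. The uniform Beta(1, 1) prior, the function names `thompson_pick` and `simulate`, and the true click rates in the simulation are assumptions for illustration.

```python
import random


def thompson_pick(stats, rng=random):
    """stats: list of (successes, attempts). Return the index of the arm to show."""
    # One draw per arm from Beta(1 + successes, 1 + failures)
    samples = [rng.betavariate(1 + s, 1 + (n - s)) for s, n in stats]
    return max(range(len(stats)), key=samples.__getitem__)


def simulate(true_rates, rounds, seed=0):
    """Run the bandit against hypothetical true click rates and return the final stats."""
    rng = random.Random(seed)
    stats = [(0, 0)] * len(true_rates)
    for _ in range(rounds):
        arm = thompson_pick(stats, rng)
        s, n = stats[arm]
        clicked = rng.random() < true_rates[arm]  # simulated user feedback
        stats[arm] = (s + int(clicked), n + 1)
    return stats


# First three arms from the slide: every arm can be picked, better arms more often
print(thompson_pick([(103, 1183), (64, 1138), (48, 900)]))

# Hypothetical true click rates, invented for the simulation
print(simulate([0.05, 0.5], rounds=2000))
```

Arms with few attempts have wide distributions and still win some draws (exploration), while arms with a good track record win most of them (exploitation).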
Bandit convergence
A/B tests
+3.5% purchases
final result: 25 days, 95% significance