living labs challenge workshop
Open Recommendation Platform
For Researchers and Developers
Living Labs Challenge Workshop
University of Amsterdam
June 6th, 2014
Torben Brodt
plista GmbH
-> http://orp.plista.com
-> http://living-labs.net/llc/
@torbenbrodt
Contents
1. what we built for ourselves
  ○ recommendation engine
2. how we built it
  ○ big data math
  ○ system architecture
3. application for “living labs”
  ○ for developers, researchers and geeks
What we built for ourselves
Recommendation Engine
Not just opening algorithms to partners, but opening our platform to algorithms.
where
● news websites
● below the article
● now in NL too!
different types
● content
● advertising
Visitors Publisher
What we built for ourselves
Recommendation Engine
Diagram labels: Visitors, Publisher, Request, Engine, Results
Context
Personalized
II
What we built for ourselves
Collaborative Filtering
Peter James
Peter and James have something in common: they both like football.
Term: User Similarity
What we built for ourselves
Collaborative Filtering
Peter James
Tennis will be recommended to Peter because James likes it too.
Item Recommendation from User Similarity
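The Peter and James example can be sketched as user-based collaborative filtering. This is a minimal illustration, not plista's implementation: the `likes` data, the extra user, the Jaccard similarity measure and the function names are all assumptions made for the example.

```python
# Hypothetical toy data; the slides only say that Peter and James
# both like football and that James also likes tennis.
likes = {
    "Peter": {"football"},
    "James": {"football", "tennis"},
    "Anna": {"chess"},
}


def user_similarity(a, b):
    """Jaccard similarity of two users' liked items (an assumed measure)."""
    union = likes[a] | likes[b]
    return len(likes[a] & likes[b]) / len(union) if union else 0.0


def recommend(user):
    """Score unseen items by the similarity of the users who like them."""
    scores = {}
    for other in likes:
        if other == user:
            continue
        sim = user_similarity(user, other)
        if sim == 0.0:
            continue  # users with nothing in common contribute nothing
        for item in likes[other] - likes[user]:
            scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("Peter"))
```

No domain knowledge is used here: the code never needs to know what football or tennis is, only who liked what.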
What we built for ourselves
Collaborative Filtering
● more data => more knowledge
● not needed:
  ○ domain knowledge
  ○ concrete user
  ○ concrete article
What we built for ourselves
More recommenders
Text Similarity
● article content matching recommendation content
● but which ads to present to political content?
Most Popular
● premise: what everybody likes is also good for me
etc ...
● e.g. public trends, social likes, wiki data, NLP, Matrix Fac.
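A text similarity recommender of the kind named above can be sketched with bag-of-words vectors and cosine similarity. The tokenizer, the similarity measure and the sample texts are assumptions chosen for illustration; the slides do not say which method plista uses.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(article, candidates):
    """Return the candidate text closest to the article."""
    vec = bag_of_words(article)
    return max(candidates, key=lambda c: cosine_similarity(vec, bag_of_words(c)))


# Hypothetical texts, made up for the example.
article = "new phone released with better camera"
candidates = [
    "football team wins the cup",
    "camera test of the new phone",
]
print(most_similar(article, candidates))
```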
What we built for ourselves
good recommendations for...
User happy!
Advertiser happy!
Publisher happy!
plista happy!
What we built for ourselves
What are the goals?
high number of...
● clicks
● attention
● orders
● engages/videos
● time on site
● page depth
bad good
What we built for ourselves
Who wants these goals?
Advertising Goal
  RWE Europe              500 +1
  IBM Germany             500
  Intel Austria           500
Recommenders Goal
  collaborative filtering 500 +1
  most popular            500
  text similarity         500
Content Goal
  new iphone su...        500 +1
  twitter buys p..        500
  google has seri.        500
used to A/B test our algorithms
What we built for ourselves
All goals have a context
Ad or Content or Recommender
...
...
...
● user agent > device > mobile
● IP address > geolocation
● referer > origin (search, direct)
● anonymous!
What we built for ourselves
All goals have a context
Questions the context can answer
● Which channel should show the advertising?
● Which publishers tend to click on semantic recommendations?
● Which geolocation is the right one for this content?
The algorithms give the answers to these questions.
What we built for ourselves
All goals have a context
● answers change each second
● bayesian bandit approach
temporary success?
No. 1 getting most
local minima?
✓ easy exploration
● minimum pre-testing
● no risk if recommender crashes
● "bad" code might find its context
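One common reading of a "bayesian bandit approach" is Thompson sampling with a Beta posterior per recommender. The sketch below assumes that reading; the class name, the simulated click rates and the seeds are hypothetical, and the slides do not confirm that plista uses exactly this scheme.

```python
import random


class BetaBandit:
    """Thompson sampling with a Beta(clicks + 1, misses + 1) posterior per arm."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        self.clicks = {arm: 0 for arm in arms}
        self.misses = {arm: 0 for arm in arms}

    def choose(self):
        """Draw a plausible click rate for each arm and pick the highest draw."""
        draws = {
            arm: self.rng.betavariate(self.clicks[arm] + 1, self.misses[arm] + 1)
            for arm in self.clicks
        }
        return max(draws, key=draws.get)

    def update(self, arm, clicked):
        """Record one impression and whether it was clicked."""
        if clicked:
            self.clicks[arm] += 1
        else:
            self.misses[arm] += 1


# Hypothetical click rates, used only to simulate user feedback.
true_rates = {
    "collaborative filtering": 0.20,
    "most popular": 0.05,
    "text similarity": 0.02,
}
bandit = BetaBandit(true_rates, seed=42)
feedback = random.Random(7)
served = {arm: 0 for arm in true_rates}
for _ in range(5000):
    arm = bandit.choose()
    served[arm] += 1
    bandit.update(arm, feedback.random() < true_rates[arm])
print(served)
```

Because every arm keeps a small chance of being drawn, a weak or new recommender still gets occasional traffic (easy exploration) while the current No. 1 receives most requests.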
How we built it?
numbers in short
● 5k recs per second
● 250 Mbit contextual data
● 100 items per second
quite some scaling issues
● big data math
● message bus
Technology Stack
Visitor
Events
● new articles
● delivered
● clicks
Message Bus
Subscribers
● algorithms
● payment
● etc
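The events-to-subscribers flow can be sketched as a tiny in-process publish/subscribe bus. This is only a stand-in to show the pattern: the `MessageBus` class, the topic names and the event fields are assumptions, and a production bus would be a separate networked service.

```python
from collections import defaultdict


class MessageBus:
    """In-process stand-in for a message bus: topics map to subscriber callbacks."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, event):
        """Deliver the event to every subscriber; a failing subscriber does not block the rest."""
        delivered = 0
        for callback in self.subscribers[topic]:
            try:
                callback(event)
                delivered += 1
            except Exception as exc:
                print(f"subscriber failed on {topic}: {exc}")
        return delivered


bus = MessageBus()
click_counts = defaultdict(int)
billing_log = []


def count_click(event):
    click_counts[event["item"]] += 1


bus.subscribe("clicks", count_click)         # an "algorithm" subscriber
bus.subscribe("clicks", billing_log.append)  # a "payment" subscriber
bus.publish("clicks", {"item": "article 1"})
bus.publish("clicks", {"item": "article 1"})
print(dict(click_counts), len(billing_log))
```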
How we built it?
Big Data Math
Article 1+1 10
Article 100 2+5
Art...
How we built it?
Big Data Math
number of
● clicks
● orders
● engages
● time on site
● money
What math do we need?
● Addition can solve most formulas
● with Logarithm also multiplications
● Real-Time Ready
  ○ atomic
  ○ fast
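The point about logarithms rests on the identity log(a·b) = log(a) + log(b): a product can be maintained as a running sum of logs, which only needs addition. A minimal sketch, with made-up input values:

```python
import math

# Hypothetical factors, made up for the example.
rates = [0.5, 0.2, 0.1]

# Multiplying directly ...
product = 1.0
for r in rates:
    product *= r

# ... gives the same result as adding logarithms and exponentiating once.
# A store that only offers atomic increments can therefore keep a
# running sum of logs instead of a running product.
log_sum = 0.0
for r in rates:
    log_sum += math.log(r)

print(product, math.exp(log_sum))
```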
How we built it?
Big Data Math
welt.de_201406
  new iphone su...   500 +1
  twitter buys p..   400
  google has seri... 300
ZINCRBY (WRITE) "welt.de_201406" "article 1" "1"
ZUNION (JOIN) "welt.de_201406" "geolocation:NL_201406"
ZREVRANGEBYSCORE (FETCH)
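The write, join and fetch steps can be tried without a Redis server using a plain-Python stand-in for sorted sets. The helper functions below only mimic the behaviour of the Redis commands they are named after and are assumptions for illustration. Note that the Redis command itself takes its arguments as `ZINCRBY key increment member`, and that `ZUNIONSTORE` sums scores by default.

```python
from collections import defaultdict

# key -> {member: score}; a plain-Python stand-in for Redis sorted sets.
store = defaultdict(dict)


def zincrby(key, increment, member):
    """Add increment to the member's score, starting from 0 if it is new."""
    store[key][member] = store[key].get(member, 0) + increment
    return store[key][member]


def zunionstore(dest, keys):
    """Store the union of the given sets in dest, summing scores per member."""
    merged = defaultdict(float)
    for key in keys:
        for member, score in store[key].items():
            merged[member] += score
    store[dest] = dict(merged)
    return len(merged)


def zrevrange(key, count):
    """Return the top `count` (member, score) pairs, highest score first."""
    ranked = sorted(store[key].items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:count]


# Hypothetical click events: (publisher key, geolocation key, article).
events = [
    ("welt.de_201406", "geolocation:NL_201406", "article 1"),
    ("welt.de_201406", "geolocation:NL_201406", "article 1"),
    ("welt.de_201406", "geolocation:DE_201406", "article 2"),
]
for publisher_key, geo_key, article in events:
    zincrby(publisher_key, 1, article)  # WRITE
    zincrby(geo_key, 1, article)

zunionstore("combined", ["welt.de_201406", "geolocation:NL_201406"])  # JOIN
print(zrevrange("combined", 2))  # FETCH
```

Each click is a single atomic increment per context key, and a ranking for a combined context is the sum of the per-context counters.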
Application for Living Labs
Your role in the ORP
● These are your visitors
● This is your data
● Assume this is open!
● This is your challenge
● Message Bus provides YOU with data
plista ORP master
YOU!
● Real-Time Results are provided by YOU
● ORP master will choose YOU
● User will see YOUR results
Application for Living Labs
YOU, a technology enthusiast
Try latest technologies
● Mahout implementation exists with Kornakapi
● what will be next? Oryx? MyMediaLite? LensKit? Predict.io?
we have strong open source connections
Application for Living Labs
YOU, a researcher
● test whether ideas work
● write papers
● we are at conferences!
  ○ SIGIR 2013
  ○ RecSys 2013
  ○ CLEF 2014
  ○ … 2015 ?
we have strong university cooperations
Application for Living Labs
YOU, a partner
● plista earns money with recommendations on publishers
● help us -> we help you
● weekly contest with 250 € prizes
http://contest.plista.com (currently in maintenance)
Application for Living Labs
YOU, a developer
● APIs in PHP and Java exist
● start your own using the API
Application for Living Labs
YOU, a developer
Your server is probably hosted by a university, plista or any cloud provider
Application for Living Labs
YOU, a developer
"message bus"
● event notifications
  ○ impression
  ○ click
● error notifications
● item updates
train model from it
Application for Living Labs
YOU, a developer
{ // json
  "type": "impression",
  "context": {
    "simple": {
      "27": 418, // publisher
      "14": 31721, // widget
      ...
    },
    "lists": {
      "10": [100, 101] // channel
    }
    ...
  }
}
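Reading such a message could look like the sketch below. It uses only the three field ids labelled on the slide (27 publisher, 14 widget, 10 channel); the elided parts are left out, and the `parse_event` helper and its output keys are hypothetical names. The full format is in the API specs at http://orp.plista.com.

```python
import json

# Example message shaped like the slide; the "..." parts are omitted.
raw = """
{
  "type": "impression",
  "context": {
    "simple": {"27": 418, "14": 31721},
    "lists": {"10": [100, 101]}
  }
}
"""

# Field ids as labelled on the slide.
PUBLISHER, WIDGET, CHANNEL = "27", "14", "10"


def parse_event(text):
    """Extract the labelled context fields from one message."""
    message = json.loads(text)
    context = message.get("context", {})
    simple = context.get("simple", {})
    lists = context.get("lists", {})
    return {
        "type": message.get("type"),
        "publisher": simple.get(PUBLISHER),
        "widget": simple.get(WIDGET),
        "channels": lists.get(CHANNEL, []),
    }


print(parse_event(raw))
```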
Application for Living Labs
YOU, a developer
Your response shown to real users
{ // json
  "recs": {
    "int": {
      "3": [13010630, 84799192]
      // 3 refers to content recommendations
    }
    ...
  }
}
Diagram labels: Real User, API, YOU, recs
api specs hosted at http://orp.plista.com
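Producing a response in the shape shown on the slide could be sketched as follows. Only the `recs` / `int` / `3` nesting comes from the slide; the `build_response` helper, its `limit` parameter and the third item id are hypothetical, and any fields hidden behind the "..." are not reproduced.

```python
import json

CONTENT_RECS = "3"  # per the slide, 3 refers to content recommendations


def build_response(item_ids, limit=2):
    """Wrap up to `limit` item ids in the response shape shown on the slide."""
    return json.dumps({"recs": {"int": {CONTENT_RECS: list(item_ids)[:limit]}}})


# Hypothetical ranked item ids produced by some model.
ranked = [13010630, 84799192, 99999999]
print(build_response(ranked))
```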
Application for Living Labs
quality is win win
● user, publisher, advertiser, plista
YOU can profit
● real user feedback
● real benchmark with others
Diagram labels: recs, Real User, YOU
Application for Living Labs
Overview
● 2012
  ○ Contest v1
● 2013 October
  ○ ACM RecSys “News Recommender Challenge”
● 2014 November
  ○ CLEF News Recommendation Evaluation Labs “newsreel”
Application for Living Labs
Challenge Numbers :)
● during recsys’13:
○ 571,744,114 impressions delivered by researchers
○ 23 registrations => 11 active teams
● news articles of ~13 publishers
● contextual data with ~50 attributes
● cross domain application
Application for Living Labs
Challenge Challenges :(
● what is the benchmark
  ○ click per impression?
  ○ absolute number of clicks?
  ○ absolute number weighted by time range?
● integration in real application is challenging
  ○ starting from scratch?
  ○ having runtime environment?
● papers better match offline data
  ○ here I can compare against previous work
  ○ are we working for papers or for passion?
● real users = real privacy issues?
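The benchmark question matters because the candidate metrics can rank teams differently. The sketch below compares clicks per impression with the absolute number of clicks; the team names and counts are made up for illustration and are not challenge results.

```python
# Hypothetical per-team counts, made up for the example.
teams = {
    "team A": {"impressions": 1000, "clicks": 20},
    "team B": {"impressions": 100000, "clicks": 500},
}


def click_through_rate(stats):
    """Clicks per impression; 0.0 when nothing was served."""
    return stats["clicks"] / stats["impressions"] if stats["impressions"] else 0.0


by_rate = max(teams, key=lambda t: click_through_rate(teams[t]))
by_clicks = max(teams, key=lambda t: teams[t]["clicks"])
print(by_rate, by_clicks)
```

Here the team with the higher click rate is not the team with the most clicks, so the choice of metric decides the winner.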
[email protected]
lnkd.in/MUXXuv
xing.com/profile/Torben_Brodt
www.plista.com
Open Recommendation Platform
http://orp.plista.com
@torbenbrodt @plista
questions?