Transcript
Page 1: Open recommendation platform

Open Recommendation Platform

ACM RecSys 2013, Hong Kong

Torben Brodt, plista GmbH

Keynote

International News Recommender Systems Workshop and Challenge

October 13th, 2013

Page 2: Open recommendation platform

where
● news websites
● below the article

different types
● content
● advertising

Where it’s coming from

Recommendations

Visitors Publisher

Page 3: Open recommendation platform

* the company I work for

Where it’s coming from

good recommendations for...

User happy!

Advertiser happy!

Publisher happy!

plista* happy!

Page 4: Open recommendation platform

Where it’s coming from

some years ago

Visitors Publisher Context

Recommendations

Collaborative Filtering

Page 5: Open recommendation platform

● well known algorithm
● more data means more knowledge

Where it’s coming from

one recommender

Collaborative Filtering

● time
● trust
● mainstream

Parameter Tuning

Page 6: Open recommendation platform

2008
● finished studies
● 1st publication
● plista was born

today
● 5k recs/second
● many publishers

Where it’s coming from

one recommender = good results

Page 7: Open recommendation platform

"use as many recommenders as possible!"

Where it’s coming from

netflix prize

Page 8: Open recommendation platform

Collaborative Filtering

Most Popular

Text Similarity

etc ...

Where it’s coming from

more recommenders

Page 9: Open recommendation platform

● we have one score
● lucky success? bad loss?
● we needed to keep track of the different recommenders

success: 0.31 %

understanding performance

lost in serendipity

Page 10: Open recommendation platform

number of
● clicks
● orders
● engagements
● time on site
● money

bad good

understanding performance

how to measure success


Page 11: Open recommendation platform

● features?
● big data math?
● counting!

for blending we just count floats

understanding performance

evaluation technology

Algo1 1+1 10

Algo2 100 2+5

Algo...

Page 12: Open recommendation platform

understanding performance

evaluation technology

impressions

collaborative filtering 500 +1

most popular 500

text similarity 500

ZINCRBY "impressions" 1 "collaborative_filtering"

ZREVRANGEBYSCORE "impressions" +inf -inf WITHSCORES
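The counting idea on this slide can be sketched without a Redis server. The following is a minimal in-memory stand-in, assuming one counter per recommender; the function names `record_impression` and `ranking` are made up for illustration, and a real deployment would issue the Redis commands shown above instead.

```python
from collections import Counter

# In-memory stand-in for the Redis sorted set "impressions".
impressions = Counter()


def record_impression(recommender: str, amount: float = 1.0) -> None:
    """Mimics ZINCRBY impressions <amount> <recommender>."""
    impressions[recommender] += amount


def ranking() -> list:
    """Mimics ZREVRANGEBYSCORE impressions +inf -inf WITHSCORES.

    Ties are broken in reverse lexicographic order of the member name,
    which is how Redis orders equal scores in a reversed range.
    """
    return sorted(impressions.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)


# Reproduce the table on the slide: 500 impressions each ...
for name in ("collaborative_filtering", "most_popular", "text_similarity"):
    record_impression(name, 500)

# ... and the "+1" for collaborative filtering.
record_impression("collaborative_filtering")

print(ranking())
```

Counting is cheap enough to run on every delivered recommendation, which is what makes the evaluation real-time.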

Page 13: Open recommendation platform

understanding performance

evaluation technology

impressions

collaborative filtering 500

most popular 500

text similarity 500

clicks

collaborative filtering 100

most popular 10

... 1

needs division

ZREVRANGEBYSCORE "clicks" +inf -inf WITHSCORES

ZREVRANGEBYSCORE "impressions" +inf -inf WITHSCORES
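The "needs division" step is a click-through rate per recommender: clicks divided by impressions. A minimal sketch, assuming both counters have already been read into plain dictionaries; the function name `click_through_rates` is hypothetical, and the elided third row of the clicks table is left out rather than guessed.

```python
def click_through_rates(clicks: dict, impressions: dict) -> list:
    """Return (recommender, clicks / impressions) pairs, best first."""
    rates = {}
    for name, shown in impressions.items():
        if shown > 0:  # avoid division by zero for recommenders never shown
            rates[name] = clicks.get(name, 0) / shown
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)


# Numbers taken from the slide.
impressions = {
    "collaborative_filtering": 500,
    "most_popular": 500,
    "text_similarity": 500,
}
clicks = {"collaborative_filtering": 100, "most_popular": 10}

ranked = click_through_rates(clicks, impressions)
for name, rate in ranked:
    print(f"{name}: {rate:.1%}")
```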

Page 14: Open recommendation platform

● CF is "always" the best recommender

● but "always" is just the average over all contexts

let's check by context!

understanding performance

evaluation results

success

Page 15: Open recommendation platform

● We like anonymization! The web itself already gives us a rich context
● URL + HTTP headers provide
  ○ user agent -> device -> mobile
  ○ IP address -> geolocation
  ○ referer -> origin (search, direct)

Context

Context
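How context attributes might be derived from HTTP headers can be sketched as follows. This is a rough heuristic under stated assumptions, not plista's implementation: the substring checks, the `SEARCH_HOSTS` list and the function name `derive_context` are all made up for illustration, and IP geolocation is omitted because it needs an external database.

```python
from urllib.parse import urlparse

# Illustrative list only; a real system would use a maintained one.
SEARCH_HOSTS = ("google.", "bing.", "duckduckgo.")
MOBILE_HINTS = ("mobile", "android", "iphone")


def derive_context(headers: dict) -> dict:
    """Map raw HTTP headers to coarse, anonymous context attributes."""
    user_agent = headers.get("User-Agent", "").lower()
    referer = headers.get("Referer", "")

    # user agent -> device
    device = "mobile" if any(h in user_agent for h in MOBILE_HINTS) else "desktop"

    # referer -> origin (search, direct, or anything else)
    if not referer:
        origin = "direct"
    else:
        host = urlparse(referer).netloc.lower()
        origin = "search" if any(s in host for s in SEARCH_HOSTS) else "other"

    return {"device": device, "origin": origin}


print(derive_context({
    "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) Mobile Safari",
    "Referer": "https://www.google.com/search?q=news",
}))
```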

Page 16: Open recommendation platform

consider a list of the best recommenders for each context attribute

a sorted list of what is relevant, ranked by
● clicks (content recs)
● price (advertising recs)

publisher = welt.de

collaborative filtering 689

most popular 420

text similarity 135

category = archive

text similarity 400

collaborative filtering 200

... 100

hour = 15

recent 80

collaborative filtering 10

... 5

Context

Context

Page 17: Open recommendation platform

publisher = welt.de

collaborative filtering 689

most popular 420

text similarity 135

weekday = sunday

collaborative filtering 400

most popular 200

... 100

category = archive

text similarity 200

collaborative filtering 10

... 5

ZUNIONSTORE clk 3 p:welt.de:clk w:sunday:clk c:archive:clk WEIGHTS 4 1 1

ZREVRANGEBYSCORE "clk" +inf -inf WITHSCORES

ZUNIONSTORE imp 3 p:welt.de:imp w:sunday:imp c:archive:imp WEIGHTS 4 1 1

ZREVRANGEBYSCORE "imp" +inf -inf WITHSCORES

Context

evaluation context
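The weighted union on this slide multiplies each context's scores by its weight and sums them per recommender. A minimal in-memory sketch of that arithmetic, assuming the numbers on the slide are click counts; the function name `weighted_union` is hypothetical, and the elided "..." rows are left out rather than guessed.

```python
from collections import Counter


def weighted_union(sets_with_weights) -> Counter:
    """Sum weight * score per member across all given score tables."""
    total = Counter()
    for scores, weight in sets_with_weights:
        for member, score in scores.items():
            total[member] += weight * score
    return total


# Per-context tables from the slide (named rows only).
publisher_welt = {"collaborative_filtering": 689, "most_popular": 420, "text_similarity": 135}
weekday_sunday = {"collaborative_filtering": 400, "most_popular": 200}
category_archive = {"text_similarity": 200, "collaborative_filtering": 10}

# Publisher weighted 4, weekday and category weighted 1, as on the slide.
clk = weighted_union([
    (publisher_welt, 4),
    (weekday_sunday, 1),
    (category_archive, 1),
])

for name, score in clk.most_common():
    print(name, score)
```

The same union is computed for impressions, and the click-through rate for the combined context is again clicks divided by impressions.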

Page 18: Open recommendation platform

Context can be used for optimization and targeting.

classical targeting is a limitation

Context

Targeting

Page 19: Open recommendation platform

Recommenders

collaborative filtering 500 +1

most popular 500

text similarity 500

Advertising

RWE Europe 500 +1

IBM Germany 500

Intel Austria 500

Onsite

new iphone su... 500 +1

twitter buys p.. 500

google has seri. 500


Context

Livecube

Page 20: Open recommendation platform

context

recap
● added another dimension

result

● better for news: Collaborative Filtering

● better for content: Text Similarity

Context

evaluation context

success


Page 21: Open recommendation platform

what did we get?

● possibly many recommenders

● know how to measure success

● technology to see success

now breathe!

Page 22: Open recommendation platform

● real-time evaluation technology exists

● to choose the best algorithm for the current context we need to learn: multi-armed Bayesian bandit

the ensemble

Page 23: Open recommendation platform

Data Science

“shuffle” exploration exploitation

temporary success?

No. 1 getting most

local minima?

Interested? Look for Ted Dunning + Bayesian Bandit
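One common form of a Bayesian bandit is Thompson sampling with a Beta posterior per arm. The sketch below assumes that form; the slides do not say which variant plista runs. The class name `BetaBandit` and the click-through rates in `TRUE_CTR` are invented for the simulation.

```python
import random


class BetaBandit:
    """Thompson sampling over recommenders with Beta(1, 1) priors."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # Per arm: [clicks, impressions]
        self.stats = {arm: [0, 0] for arm in arms}

    def choose(self) -> str:
        """Sample a plausible CTR per arm and pick the highest sample."""
        def sample(arm):
            clicks, shown = self.stats[arm]
            return self.rng.betavariate(1 + clicks, 1 + shown - clicks)
        return max(self.stats, key=sample)

    def update(self, arm: str, clicked: bool) -> None:
        """Record one impression and, if it happened, one click."""
        self.stats[arm][1] += 1
        if clicked:
            self.stats[arm][0] += 1


# Made-up click probabilities, used only to simulate user feedback.
TRUE_CTR = {
    "collaborative_filtering": 0.20,
    "most_popular": 0.02,
    "text_similarity": 0.05,
}

bandit = BetaBandit(TRUE_CTR, seed=42)
environment = random.Random(7)
pulls = {arm: 0 for arm in TRUE_CTR}

for _ in range(5000):
    arm = bandit.choose()
    pulls[arm] += 1
    bandit.update(arm, environment.random() < TRUE_CTR[arm])

print(pulls)
```

Exploration comes from the randomness of the samples: arms with little data have wide posteriors and still get chosen occasionally, while the current number one receives most of the traffic.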

Page 24: Open recommendation platform

● new total / avg is much better

● thx bandit
● thx ensemble

more research
● timeseries

✓ better results

time

success

Page 25: Open recommendation platform

✓ easy exploration

● tradeoff (money decision)
● between the price/time we “waste” in offline evaluation
● and the price we lose with bad recommendations

Page 26: Open recommendation platform

● minimum pre-testing
● no risk if a recommender crashes
● "bad" code might find its context

trial and error

Page 27: Open recommendation platform

● now plista developers can try ideas

● and allow researchers to do the same

collaboration

Page 28: Open recommendation platform

Ensemble is able to choose

big pool of algorithms

Collaborative Filtering

Most Popular

Text Similarity

Ensemble

BPR-Linear

WR-MF

SVD++

etc.

Research Algorithms

Page 30: Open recommendation platform

● first and only dataset in news context
  ○ millions of items
  ○ only relevant for short time
● dataset has many attributes !!
● many publishers have user intersection
  ○ regional
  ○ contextual
● real world !!!
  ○ you can guide the user
  ○ you don’t need to follow his route
● real time !!
  ○ This is industry, it has to be usable

researcher has idea

src http://userserve-ak.last.fm/serve/_/7291575/Wickie%2B4775745.jpg


Page 31: Open recommendation platform

... probably hosted by university, plista or any cloud provider?

... needs to start the server

Page 32: Open recommendation platform

"message bus"
● event notifications
  ○ impression
  ○ click
● error notifications
● item updates

train model from it

... api implementation
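A researcher's server has to accept these messages and route them by kind. The following is a minimal dispatcher sketch, not the official API: only the type strings "impression" and "click" appear in the slides, while "item_update", "error" and the "id" field are hypothetical placeholders. The real format is described in the specs at http://orp.plista.com.

```python
import json
from collections import Counter

event_counts = Counter()  # how many impressions / clicks were seen
items = {}                # latest known state per item


def handle_message(raw: str) -> str:
    """Route one JSON message and report how it was handled."""
    message = json.loads(raw)
    kind = message.get("type")

    if kind in ("impression", "click"):
        # Event notifications: this is the data a model is trained from.
        event_counts[kind] += 1
        return "event"
    if kind == "item_update":  # hypothetical type string
        items[message.get("id")] = message  # "id" is a hypothetical field
        return "item"
    if kind == "error":  # hypothetical type string
        print("error notification:", message)
        return "error"
    return "ignored"


handle_message('{"type": "click"}')
print(event_counts)
```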

Page 33: Open recommendation platform

{ // json
  "type": "impression",
  "context": {
    "simple": {
      "27": 418, // publisher
      "14": 31721, // widget
      ...
    },
    "lists": {
      "10": [100, 101] // channel
    }
    ...
  }
}

... package content

api specs hosted at http://orp.plista.com
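The numeric context keys can be translated into readable names before use. A small sketch, assuming only the three mappings annotated on the slide (27 = publisher, 14 = widget, 10 = channel); the function name `decode_context` is made up, and unknown keys are passed through unchanged.

```python
# Key meanings as annotated on the slide; the full list is in the API specs.
CONTEXT_NAMES = {"27": "publisher", "14": "widget", "10": "channel"}


def decode_context(message: dict) -> dict:
    """Flatten the "simple" and "lists" sections into named attributes."""
    context = message.get("context", {})
    decoded = {}
    for section in ("simple", "lists"):
        for key, value in context.get(section, {}).items():
            decoded[CONTEXT_NAMES.get(key, key)] = value
    return decoded


message = {
    "type": "impression",
    "context": {
        "simple": {"27": 418, "14": 31721},
        "lists": {"10": [100, 101]},
    },
}

print(decode_context(message))
```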

Page 34: Open recommendation platform

{ // json
  "type": "impression",
  "recs": ... // what was recommended
}

api specs hosted at http://orp.plista.com

... package content

Page 35: Open recommendation platform

{ // json
  "type": "click",
  "context": ... // will include the position
}

... package content

api specs hosted at http://orp.plista.com

Page 36: Open recommendation platform

recs

{ // json
  "recs": {
    "int": {
      "3": [13010630, 84799192] // 3 refers to content recommendations
    }
    ...
  }
}

generated by researchers to be shown to real users

API

Real User

Researcher

... reply to recommendation requests

api specs hosted at http://orp.plista.com
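Building the reply shown on this slide takes only a few lines. The sketch below reproduces just the fields visible in the slide; the function name `build_reply` is hypothetical, and any further fields the API may require are not covered here.

```python
import json


def build_reply(item_ids, rec_type: str = "3") -> str:
    """Serialize recommended item ids in the structure shown on the slide.

    rec_type "3" refers to content recommendations, per the slide.
    """
    return json.dumps({"recs": {"int": {rec_type: list(item_ids)}}})


reply = build_reply([13010630, 84799192])
print(reply)
```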

Page 37: Open recommendation platform

recs

Real User

Researcher

● happy user

● happy researcher

● happy plista

research can profit

● real user feedback

● real benchmark

quality is win win #2

Page 38: Open recommendation platform

use common frameworks

src http://en.wikipedia.org/wiki/Pac-Man

how to build fast system?

Page 39: Open recommendation platform

● no movies!

● news articles will outdate!

● visitors need the recs NOW

● => handle the data very fast

src http://static.comicvine.com/uploads/original/10/101435/2026520-flash.jpg

quick and fast

Page 40: Open recommendation platform

● fast web server

● fast network protocol

● fast message queue

● fast storage

or Apache Kafka

"send quickly" technologies


Page 41: Open recommendation platform

"real-time features feel better in a real-time world"

we don't need batch! see http://goo.gl/AJntul

our setup
● PHP, it's easy
● Redis, it's fast
● R, it's well known

comparison to plista

Page 42: Open recommendation platform

Overview

Publisher

Recommendations

Feedback

Collaborative Filtering

Most Popular

Text Similarity

etc.

Preferences

Ensemble

Visitors

Page 43: Open recommendation platform

● 2012
  ○ Contest v1
● 2013
  ○ ACM RecSys “News Recommender Challenge”
● 2014
  ○ CLEF News Recommendation Evaluation Labs “newsreel”

Overview

