Open Recommendation Platform


Post on 15-Jan-2015









  • 1. Open Recommendation Platform. ACM RecSys 2013, Hong Kong. Torben Brodt, plista GmbH. Keynote, International News Recommender Systems Workshop and Challenge, October 13th, 2013.

  • 2. Where it's coming from: Recommendations. Where: news websites, below the article. Between visitors and publisher. Different types: content, advertising.
  • 3. Where it's coming from: good recommendations for... User happy! Advertiser happy! Publisher happy! plista* happy! (* the company I am working for)
  • 4. Where it's coming from: some years ago. (Diagram labels: Recommendations, Context, Visitors, Publisher, Collaborative Filtering.)
  • 5. Where it's coming from: one recommender. Collaborative Filtering: well-known algorithm; more data means more knowledge. Parameter tuning: time, trust, mainstream.
  • 6. Where it's coming from: one recommender = good results. 2008: finished studies, 1st publication, plista was born. Today: 5k recs/second, many publishers.
  • 7. Where it's coming from: Netflix Prize. "use as many recommenders as possible!"
  • 8. Where it's coming from: more recommenders. Collaborative Filtering, Most Popular, Text Similarity, etc ...
  • 9. Understanding performance: lost in serendipity. We have one score (success: 0.31 %). Lucky success? Bad loss? We needed to keep track of the different recommenders.
  • 10. Understanding performance: how to measure success (bad vs. good): number of clicks, orders, engagements, time on site, money.
  • 11. Understanding performance: evaluation technology. (Figure: counter table for Algo1, Algo2, Algo... with increments such as +1 and +5.) Features? Big data? Math? Counting! For blending we just count floats.
  • 12. Understanding performance: evaluation technology. Impressions: collaborative filtering 500 +1, most popular 500, text similarity 500. Commands: ZINCRBY "impressions" 1 "collaborative_filtering"; ZREVRANGEBYSCORE "impressions"
  • 13. Understanding performance: evaluation technology. Clicks: collaborative filtering 100, most popular 10, ... 1 (ZREVRANGEBYSCORE "clicks"). Impressions: collaborative filtering 500, most popular 500, text similarity 500 (ZREVRANGEBYSCORE "impressions"). Success needs a division of clicks by impressions.
  • 14. Understanding performance: evaluation results (success). CF is "always" the best recommender, but "always" is just the average over all contexts. Let's check on context!
  • 15. Context: We like anonymization!
    We have a big context, provided by the web. URL + HTTP headers provide: user agent -> device -> mobile; IP address -> geolocation; referer -> origin (search, direct).
  • 16. Context: consider a list of the best recommenders in each context attribute: a sorted list of what is relevant, by clicks (content recs) or by price (advertising recs). Example context: category = archive, hour = 15, publisher = welt.de. (Figure: ranked recommender lists with counters per context attribute, including text similarity, collaborative filtering and most popular; individual values not recoverable.)
  • 17. Context: evaluation context. Counters per context attribute (publisher, weekday = sunday, category = archive) are combined with weights and then ranked. Clicks: ZUNION clk ... WEIGHTS 4 w:sunday:clk 1 c:archive:clk 1, then ZREVRANGEBYSCORE "clk". Impressions: ZUNION imp ... WEIGHTS 4 w:sunday:imp 1 c:archive:imp 1, then ZREVRANGEBYSCORE "imp". (Figure: per-context ranked lists of collaborative filtering, most popular and text similarity; individual values not recoverable.)
  • 18. Context: Targeting. Context can be used for optimization and targeting. Classical targeting is limitation.
  • 19. Context: Livecube. Counters per dimension. Advertising: RWE Europe 500 +1, IBM Germany 500, Intel Austria 500. Recommenders: collaborative filtering 500 +1, most popular 500, text similarity 500. Onsite: "new iphone su..." 500 +1, "twitter buys p.." 500, "google has seri." 500.
  • 20. Context: evaluation context (success). Recap: added another dimension, context. Result: better for news: Collaborative Filtering; better for content: Text Similarity.
  • 21. Now breathe! What did we get? Possibly many recommenders; we know how to measure success; technology to see success.
  • 22. The ensemble: real-time evaluation technology exists. To choose the best algorithm for the current context we need to learn: multi-armed Bayesian bandit.
  • 23. Data Science: shuffle; exploration vs. exploitation. No. 1 getting most? Temporary success? Local minima? Interested?
    Look for Ted Dunning + Bayesian Bandit.
  • 24. Better results (success over time): the new total / avg is much better. Thx bandit, thx ensemble. More research: time series.
  • 25. Easy exploration: a tradeoff (money decision) between the price/time we waste in offline evaluation and the price we lose with bad recommendations.
  • 26. Trial and error: minimum pre-testing; no risk if a recommender crashes; "bad" code might find its context.
  • 27. Collaboration: now plista developers can try ideas and allow researchers to do the same.
  • 28. Big pool of algorithms: the Ensemble is able to choose among Collaborative Filtering, Most Popular, Text Similarity, and Research Algorithms (BPR-Linear, WR-MF, SVD++, etc.).
  • 29. Researcher has idea
  • 30. Researcher has idea ... and only dataset. In the news context: millions of items, only relevant for a short time; dataset has many attributes !!; many publishers have user intersection; regional; contextual; real world !!! (you can guide the user, you don't need to follow his route); real time !! This is industry, it has to be usable.
  • 31. ... needs to start the server ... probably hosted by university, plista or any cloud provider?
  • 32. ... api implementation: "message bus" with event notifications (impression, click), error notifications, item updates. Train the model from it.
  • 33. ... package content (api specs hosted at http://orp.plista.com):
        { // json
          "type": "impression",
          "context": {
            "simple": {
              "27": 418,   // publisher
              "14": 31721, // widget
              ...
            },
            "lists": {
              "10": [100, 101] // channel
            }
            ...
        }
  • 34. ... package content (api specs hosted at http://orp.plista.com):
        { // json
          "type": "impression",
          "recs": ... // what was recommended
        }
  • 35. ... package content (api specs hosted at http://orp.plista.com):
        { // json
          "type": "click",
          "context": ... // will include the position
        }
  • 36. ... reply to recommendation requests (api specs hosted at http://orp.plista.com). Recs are generated by researchers to be shown to real users. (Diagram labels: Researcher, API, recs, Real User.)
        { // json
          "recs": {
            "int": {
              "3": [13010630, 84799192] // 3 refers to content recommendations
            }
            ...
        }
  • 37.
    Quality is win-win #2: happy user, happy researcher, happy plista. Research can profit from real user feedback: a real benchmark. (Diagram labels: Real User, recs, Researcher.)
  • 38. How to build a fast system? Use common frameworks.
  • 39. Quick and fast: no movies! News articles will outdate! Visitors need the recs NOW => handle the data very fast.
  • 40. "Send quickly" technologies: fast web server; fast network protocol; fast message queue, or Apache Kafka; fast storage.
  • 41. Comparison to plista: "real-time features feel better in a real-time world". Our setup: php, it's easy; redis, it's fast; r, it's well known. We don't need batch!
  • 42. Overview. (Diagram labels: Visitors, Publisher, Recommendations, Feedback, Preferences, Ensemble of Collaborative Filtering, Most Popular, Text Similarity, etc.)
  • 43. Overview: 2012 Contest v1; 2013 ACM RecSys News Recommender Challenge; 2014 CLEF News Recommendation Evaluation Labs "newsreel".
  • 44. Questions? Contact (Blog) News Recommender Challenge #RecSys @torbenbrodt @NRSws2013 @plista
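
The evaluation loop outlined on slides 12 to 23 (count impressions and clicks per recommender and per context attribute, combine the counters with weights, divide clicks by impressions, and let a Bayesian bandit choose) might be sketched roughly as follows. This is only an illustration under stated assumptions: plain Python dictionaries stand in for the Redis sorted sets, Beta-Bernoulli Thompson sampling is assumed as the bandit (the slides name the technique but give no details), and the class name, method names and counter values are hypothetical.

```python
import random
from collections import defaultdict


class CountingEvaluator:
    """Hypothetical in-memory stand-in for per-context sorted sets."""

    def __init__(self):
        # counts[kind][context_key][recommender] -> float counter
        self.counts = {
            "imp": defaultdict(lambda: defaultdict(float)),
            "clk": defaultdict(lambda: defaultdict(float)),
        }

    def record(self, kind, recommender, context_keys, amount=1.0):
        """Increment one counter per context key (the ZINCRBY step)."""
        for key in context_keys:
            self.counts[kind][key][recommender] += amount

    def weighted(self, kind, weights):
        """Weighted sum of counters across context keys (the weighted union step)."""
        total = defaultdict(float)
        for key, weight in weights.items():
            for recommender, value in self.counts[kind][key].items():
                total[recommender] += weight * value
        return total

    def ctr_ranking(self, weights):
        """Clicks divided by impressions, sorted best first."""
        imps = self.weighted("imp", weights)
        clks = self.weighted("clk", weights)
        rates = {r: clks.get(r, 0.0) / n for r, n in imps.items() if n > 0}
        return sorted(rates.items(), key=lambda item: item[1], reverse=True)

    def thompson_pick(self, candidates, weights, rng=random):
        """Assumed bandit: draw from Beta(1 + clicks, 1 + misses), take the max."""
        imps = self.weighted("imp", weights)
        clks = self.weighted("clk", weights)
        best, best_draw = None, -1.0
        for recommender in candidates:
            wins = clks.get(recommender, 0.0)
            misses = max(imps.get(recommender, 0.0) - wins, 0.0)
            draw = rng.betavariate(1.0 + wins, 1.0 + misses)
            if draw > best_draw:
                best, best_draw = recommender, draw
        return best


# Illustrative counters only; they are not measurements from the talk.
evaluator = CountingEvaluator()
context = ["w:sunday", "c:archive"]
for name, clicks in [("collaborative_filtering", 100),
                     ("most_popular", 10),
                     ("text_similarity", 1)]:
    evaluator.record("imp", name, context, 500)
    evaluator.record("clk", name, context, clicks)

weights = {"w:sunday": 4, "c:archive": 1}  # weights as on slide 17
print(evaluator.ctr_ranking(weights))
print(evaluator.thompson_pick(
    ["collaborative_filtering", "most_popular", "text_similarity"], weights))
```

Because every candidate gets a random draw, a recommender with few impressions still wins occasionally, which is the exploration side of the tradeoff on slide 25; recommenders with many impressions and few clicks are picked less and less often.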