Recommender systems evaluation: a 3D benchmark (presented at the RUE 2012 workshop at ACM RecSys 2012)
TRANSCRIPT
Recommender systems evaluation: a 3D benchmark
Alan Said1, Domonkos Tikk2, Yue Shi3, Martha Larson3, Klára Stumpf2, Paolo Cremonesi4
1: TU Berlin, 2: Gravity R&D, 3: TU Delft, 4: Politecnico di Milano/Moviri
Motivation
• Current recsys evaluation benchmarks are insufficient
– mostly focused on IR measures (RMSE, MAP@X, precision/recall)
– do not consider the needs of all stakeholders (users, content providers, recsys vendors)
– technological and business requirements are mostly overlooked
• 3D Recommender System Benchmarking Model
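The IR measures named on this slide can be made concrete with a minimal sketch. This is an illustrative toy implementation (hypothetical function names and data, not from any benchmark toolkit):

```python
import math

def rmse(predicted, actual):
    """Root mean squared error over {item: rating} dicts with shared keys."""
    errs = [(predicted[i] - actual[i]) ** 2 for i in actual]
    return math.sqrt(sum(errs) / len(errs))

def precision_recall_at_k(ranked, relevant, k):
    """Precision@k and recall@k of a ranked item list against a relevant set."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k, hits / len(relevant)

# Toy data: one user's predicted vs. true ratings, and one ranked list.
pred = {"a": 4.5, "b": 2.0, "c": 3.5}
true = {"a": 5.0, "b": 1.0, "c": 4.0}
error = rmse(pred, true)                                        # ~0.707
prec, rec = precision_recall_at_k(["a", "c", "b", "d"], {"a", "d"}, k=2)
```

Note how each measure looks only at prediction quality; none of them captures the business or technical dimensions the slides argue for.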
Recent benchmarks (1)
• pros:
– large scale
– very well organized
• cons:
– qualitative assessment of recommendation simplified to RMSE
– rating prediction (not ranking)
– no focus on direct business and technical parameters (scalability, robustness, reactivity)
Recent benchmarks (2)
• pros:
– constraints on training and response time
– real traffic (only planned)
– major driver: revenue increase
• cons:
– only business goals, but otherwise unclear optimization criteria
– user needs are neglected
– organization
Recent benchmarks (3)
• pros:
– availability of additional metadata (compared to KDD Cup 2011)
– not rating based (implicit feedback)
– ranking-based evaluation metric (MAP@500)
• cons:
– offline evaluation
– size does not matter anymore (lower interest)
– no business requirements or technical constraints
User requirements
• functional (quality-related)
– relevant, interesting, novel, diverse, serendipitous, context-aware, ethical, etc.
• non-functional (technology-related)
– real-time
– usability-related
Business requirements
• Business model
– for-profit: revenue stream
– non-profit: award driven (reputation, community building)
• KPIs depend on the application area
– revenue increase
– CTR
– raise awareness of content or service
Technical constraints
• data driven
– availability of user feedback (e.g. satellite TV)
• system driven
– hardware/software limitations (device-dependent)
• scalability
– typical response time
• robustness
Example
• VoD recommendation scenario (TV)
– user: easy content exploration, context-awareness (time, viewer identification)
– business: increase VoD sales & awareness (user base)
– technical: middleware, HW/SW of the provider, response time
Conclusion
• Recommendation tasks have many aspects that are typically overlooked
• The task defines the important user, business, and technical quality measures
– all of them must be fulfilled at a certain level
– trade-offs are usually required
• Proposal: our 3D evaluation concept enables more comprehensive evaluation