Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Denis Parra (Pitt), Alexandros Karatzoglou (TID), Xavier Amatriain (TID), Idil Yavuz (Pitt)
CARS 2011, October 23rd 2011

Upload: denis-parra-santander

Post on 15-Jan-2015


DESCRIPTION

Presentation at the CARS Workshop, held as part of the ACM Conference on Recommender Systems (RecSys) 2011 in Chicago.

TRANSCRIPT

Page 1: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Implicit Feedback Recommendation via Implicit-to-Explicit OLR Mapping

Denis Parra (Pitt), Alexandros Karatzoglou (TID), Xavier Amatriain (TID), Idil Yavuz (Pitt)

CARS 2011, October 23rd 2011

Page 2: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Outline

• Introduction
• Datasets
• Models
• Results
• Discussion
• Conclusion

Page 3: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

A Clearer Outline

• This presentation has, IMHO, 3 sections:
1. Good News
2. Not that Good News
3. Good News

• Which represent, respectively:
1. Results of the first study on last.fm, presented at UMAP 2011
2. Initial results of the study we present here
3. Expected results after analysis of 2. (once I finish my comps) – and your feedback!

Page 4: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Parra, Amatriain "Walk the Talk" 4

Introduction

• Most recommender system approaches rely on explicit information from the users, but…
• Explicit feedback: scarce (people are not especially eager to rate or to provide personal info)
• Implicit feedback: less scarce, but (Hu et al., 2008):
– There’s no negative feedback … and what if you watch a TV program just once or twice?
– Noisy … but explicit feedback is also noisy (Amatriain et al., 2009)
– Preference & Confidence … we aim to map the I.F. to preference (our main goal)
– Lack of evaluation metrics … if we can map I.F. to E.F., we can have a comparable evaluation

7/12/2011

Page 5: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Section I : Good News

• Last.fm User Study
• Linear regression results

Page 6: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Recalling the 1st study (1/5)

• Last.fm users (114 in total after filtering)
• For each user, we crawled all the albums they listened to, in order to send them a personalized survey

Page 7: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


Recalling the 1st study (2/5)

• What items should they rate? Item (album) sampling:
– Implicit Feedback (IF): playcount for a user on a given album. Changed to scale [1-3]; 3 means listened to more.
– Global Popularity (GP): global playcount for all users on a given album. Changed to scale [1-3]; 3 means listened to more.
– Recentness (R): time elapsed since the user played a given album. Changed to scale [1-3]; 3 means listened to more recently.
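The slides do not say how raw values were mapped onto the [1-3] scale. One plausible reading, sketched here purely as an assumption, is tercile binning per user: rank the values and assign the lowest third to 1, the middle third to 2, and the top third to 3. The function name `tercile_scale` and the sample playcounts are hypothetical.

```python
def tercile_scale(values):
    """Map raw values to a 1-3 scale by rank terciles (3 = highest third).

    Assumption: the deck only states that values were "changed to scale
    [1-3]"; tercile binning is one possible way to do that.
    Ties are broken by input order, so equal values can land in
    neighbouring bins.
    """
    n = len(values)
    if n == 0:
        return []
    # Indices of the values in ascending order.
    order = sorted(range(n), key=lambda i: values[i])
    scaled = [0] * n
    for rank, idx in enumerate(order):
        # rank * 3 // n is 0, 1 or 2; shift to 1, 2 or 3.
        scaled[idx] = rank * 3 // n + 1
    return scaled


# Hypothetical per-album playcounts for a single user.
playcounts = [2, 150, 37, 8, 64, 410]
print(tercile_scale(playcounts))
```

For recentness, where a smaller elapsed time should map to a larger score, the same function could be applied to the negated elapsed times.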


Page 8: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


Recalling the 1st study (3/5)

• Demographics Survey + Rating 100 albums


Page 9: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


Recalling the 1st study (4/5)

• Gender
• Age
• Country
• Hours per week spent on internet [int_hrs_per_week]
• Hours per week listening to music online [msc_hrs_per_week]
• Number of concerts per year [conc_per_year]
• Do you read specialized music blogs or magazines? [blogs_mag]
• Do you have experience evaluating music online? [rate_music]
• How frequently do you buy physical music records? [buy_records]
• How frequently do you buy music online? [buy_online]
• Do you prefer listening to single tracks, whole albums or either way? [track_or_CD]

Page 10: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Recalling the 1st study (5/5)

• Ratings were predicted with multiple linear regression, evaluated with RMSE.

• Results showed that implicit feedback (play count of the album by a specific user) and recentness (how recently an album was listened to) were important factors; global popularity had a weaker effect.

• Results also showed that listening style (whether the user preferred to listen to single tracks, CDs, or either) was an important factor, while the other survey variables were not.

Page 11: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

... but

• Linear Regression didn’t account for the nested nature of ratings (each user contributes many ratings)

• And ratings were treated as continuous, when they are actually ordinal.

User 1: 1 3 5 3 0 4 5 2 2 1 5 4 3 2
. . .
User n: 3 2 1 0 4 5 2 5 4 3 2 1 3 5

Page 12: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

So, Ordinal Logistic Regression!

• Actually Mixed-Effects Ordinal Multinomial Logistic Regression

• Mixed-effects: accounts for the nested nature of ratings

• We obtain a distribution over ratings (ordinal multinomial) for each (USER, ITEM) pair -> we predict the rating using the expected value.

• … And we can compare the inferred ratings with a method that directly uses implicit information (playcounts) to recommend (Hu, Koren et al., 2008)
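As a reminder of how the playcount-based baseline treats implicit data, here is a minimal sketch of the preference and confidence transforms described by Hu et al. (2008): a binary preference plus a confidence that grows with the playcount, either linearly or on a log scale. The default values for `alpha` and `eps` are illustrative assumptions, not settings taken from this study.

```python
import math


def preference(playcount):
    """Binary preference: 1 if the user played the item at all, else 0."""
    return 1 if playcount > 0 else 0


def confidence_linear(playcount, alpha=40.0):
    """Linear confidence, c = 1 + alpha * r (alpha is an assumed value)."""
    return 1.0 + alpha * playcount


def confidence_log(playcount, alpha=40.0, eps=1.0):
    """Log-scaled confidence, c = 1 + alpha * log(1 + r / eps).

    alpha and eps are assumed values chosen for illustration only.
    """
    return 1.0 + alpha * math.log(1.0 + playcount / eps)


# Hypothetical playcounts: confidence grows much more slowly on the log scale.
for r in (0, 1, 10, 100):
    print(r, preference(r), confidence_linear(r), round(confidence_log(r), 2))
```

The linear and log-scaled variants may correspond to the "Koren" and "KorenLog" rows in the backup results, but the slides do not state this.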

Page 13: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Model [to predict rating user x item]

• Model

• Predicted value
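The equations on this slide did not survive in the transcript. As a rough sketch of the idea only, assuming the standard cumulative-logit (proportional-odds) form with a per-user random intercept, the rating distribution and its expected value can be computed as below. The thresholds, the linear predictor `eta`, and the 0-5 rating levels are hypothetical values, not the fitted model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rating_distribution(eta, thresholds):
    """Category probabilities under a cumulative-logit model.

    Assumed form: P(rating <= k) = sigmoid(theta_k - eta), where eta is
    the linear predictor (fixed effects plus the user's random intercept)
    and thresholds theta_k are increasing.
    Returns len(thresholds) + 1 probabilities, one per rating level.
    """
    cumulative = [sigmoid(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(rating = k) = P(<= k) - P(<= k-1)
        previous = c
    return probs


def expected_rating(probs, levels):
    """Predicted rating as the expected value of the distribution."""
    return sum(level * p for level, p in zip(levels, probs))


# Hypothetical thresholds for six ordered levels (0 to 5).
thresholds = [-2.0, -1.0, 0.0, 1.0, 2.0]
levels = [0, 1, 2, 3, 4, 5]

fixed_part = 0.9        # hypothetical contribution of IF, recentness, etc.
user_intercept = -0.2   # hypothetical random effect for one user
probs = rating_distribution(fixed_part + user_intercept, thresholds)
print(probs, expected_rating(probs, levels))
```

Using the expected value rather than the most likely category yields a real-valued score, which is convenient for ranking items.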

Page 14: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Final MELR Model with 4 fixed effects

Page 15: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Section II : Not that Good News

• Datasets I and II
• Results measured as MAP and nDCG.
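For reference, the two ranking metrics can be sketched as follows. This uses the common definitions (average precision normalized by the number of relevant items, and DCG with a log2 position discount); the exact variants and relevance thresholds used in the study are not given in the slides, so treat these as assumptions.

```python
import math


def average_precision(ranked_items, relevant):
    """Average precision of one ranked list against a set of relevant items."""
    hits = 0
    total = 0.0
    for k, item in enumerate(ranked_items, start=1):
        if item in relevant:
            hits += 1
            total += hits / k  # precision at each relevant position
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevants):
    """MAP: mean of the per-user average precision values."""
    scores = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores) if scores else 0.0


def dcg(gains):
    """Discounted cumulative gain with a log2(position + 1) discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg(ranked_items, gain_by_item):
    """nDCG: DCG of the ranking divided by the DCG of the ideal ordering."""
    gains = [gain_by_item.get(item, 0.0) for item in ranked_items]
    ideal = sorted(gain_by_item.values(), reverse=True)[:len(ranked_items)]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical example: albums "a" and "c" are relevant for this user.
print(average_precision(["a", "b", "c", "d"], {"a", "c"}))
print(ndcg(["b", "a"], {"a": 3.0, "b": 1.0}))
```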

Page 16: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Datasets

• D1: users, albums, if, re, gp, ratings, demographics/consumption

• D2: users, albums, if, re, gp, NO RATINGS.

Page 17: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Experiments

• First step: build MELR model using D1
• For D1 and D2: split dataset into 5 parts to perform a 5-fold cross validation
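The 5-fold split can be sketched as below. The slides do not say whether folds were drawn over individual (user, album) records or over whole users; this sketch assumes a random split over records, and the function name and seed are hypothetical.

```python
import random


def k_fold_splits(records, k=5, seed=0):
    """Yield (train, test) pairs for k-fold cross validation.

    Assumption: folds are formed by shuffling individual records; the
    deck does not specify the unit of the split.
    """
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test = [records[i] for i in folds[held_out]]
        train = [records[i]
                 for f, fold in enumerate(folds) if f != held_out
                 for i in fold]
        yield train, test


# Hypothetical (user, album, playcount) records.
records = [("u%d" % (i % 4), "album%d" % i, i + 1) for i in range(23)]
for train, test in k_fold_splits(records):
    print(len(train), len(test))
```

Each record appears in exactly one test fold, so metrics averaged over the five folds cover the whole dataset once.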

Page 18: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Results

Page 19: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Section III: Expected Good News

• After Analyzing our data/process

Page 20: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Lessons / Challenges (1/2)

Problem / Challenge

1. Ground truth: Playcounts of albums or tracks?

2. Quantization of playcounts (implicit feedback), recentness, and overall number of listeners of an album (global popularity): [1-3] scale vs. raw playcounts

3. Defining relevance of recommended elements (to compare with the raw playcounts)

Page 21: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Lessons / Challenges (2/2)

Problem / Challenge

4. Additional/alternative metrics for evaluation [MAP and nDCG used in the paper]

5. New survey (in order to deal with the issues of identifying “actual” relevance in D2)

6. Significance of level-2 variables: track_or_CD (study 1 vs. 2, where concerts_per_year was significant)

Page 22: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

… so

• Lots of work to do [after my comps]
• Questions, Suggestions?

Page 23: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


Is this the end?


Page 24: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Thanks!

• Denis Parra

Page 25: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Backup slides

Page 26: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Complete Results for D2

Method       AVG(MAP)      AVG(nDCG)     SD(MAP)       SD(nDCG)
Koren        0.101428473   0.271841949   0.001504383   0.001906666
KorenLog     0.123444659   0.295368576   0.002105906   0.002239792
logit_3      0.12225702    0.294351155   0.000868738   0.001100289
popularity   0.01777665    0.136672774   0.000937768   0.00086512
linear_2     0.123417895   0.295026219   0.001830371   0.001554675
linear_3     0.122317675   0.294211465   0.001744534   0.00212961

Page 27: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Distribution of ratings

Page 28: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Actual Distribution of ratings

Page 29: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping

Some Intuition About the Results

• Distributions of ratings in both datasets

Page 30: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


4 Regression Analysis

• Including recentness increases R² by more than 10% [M1 -> M2]
• Including GP increases R², but not by much compared to RE + IF [M1 -> M3]
• Not including GP, but including the interaction between IF and RE, improves the variance of the DV explained by the regression model [M2 -> M4]

M1: implicit feedback
M2: implicit feedback & recentness
M3: implicit feedback, recentness, global popularity
M4: Interaction of implicit feedback & recentness

Page 31: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


4.1 Regression Analysis

• We tested conclusions of regression analysis by predicting the score, checking RMSE in 10-fold cross validation.

• Results of regression analysis are supported.

Model                                                    RMSE1    RMSE2
User average                                             1.5308   1.1051
M1: Implicit feedback                                    1.4206   1.0402
M2: Implicit feedback + recentness                       1.4136   1.034
M3: Implicit feedback + recentness + global popularity   1.4130   1.0338
M4: Interaction of Implicit feedback * recentness        1.4127   1.0332
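To make the "User average" baseline row concrete, here is a minimal sketch of how such a baseline and its RMSE could be computed: predict each rating as that user's mean training rating, falling back to the global mean for unseen users. The fallback rule and the sample data are assumptions; the slides do not describe the baseline in detail.

```python
import math
from collections import defaultdict


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def user_average_predictor(train):
    """Build a predictor from (user, rating) pairs.

    Assumption: users absent from the training data receive the global
    mean rating.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for user, rating in train:
        sums[user] += rating
        counts[user] += 1
    global_mean = sum(sums.values()) / sum(counts.values())

    def predict(user):
        n = counts.get(user, 0)
        return sums[user] / n if n else global_mean

    return predict


# Hypothetical ratings on the 0-5 scale.
train = [("u1", 4), ("u1", 2), ("u2", 5)]
test = [("u1", 3), ("u2", 4), ("u3", 1)]
predict = user_average_predictor(train)
print(rmse([predict(u) for u, _ in test], [r for _, r in test]))
```

In a cross-validation run, the predictor would be rebuilt on each training fold and the RMSE averaged over the held-out folds.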

Page 32: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


4.2 Regression Analysis – Track or Album

• Including this variable, which seemed to have an effect in the general analysis, helped to improve the accuracy of the model

Model                                                    Tracks   Tracks/Albums   Albums
User average                                             1.1833   1.1501          1.1306
M1: Implicit feedback                                    1.0417   1.0579          1.0257
M2: Implicit feedback + recentness                       1.0383   1.0512          1.0169
M3: Implicit feedback + recentness + global popularity   1.0386   1.0507          1.0159
M4: Interaction of Implicit feedback * recentness        1.0384   1.049           1.0159

Page 33: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping


Is this the end?
