Semantically-Enhanced Recommendation Algorithms

CCIA 2012. Victor Codina & Luigi Ceccaroni. Departament de Llenguatges i Sistemes Informàtics; Knowledge Engineering and Machine Learning Group; Health Informatics; Personalized Computational Medicine.


Page 1: Semantically-Enhanced Recommendation Algorithms

Semantically-Enhanced Recommendation Algorithms

CCIA 2012

Victor Codina & Luigi Ceccaroni

Departament de Llenguatges i Sistemes Informàtics

Knowledge Engineering and Machine Learning Group

Health Informatics

Personalized Computational Medicine

Page 2: Semantically-Enhanced Recommendation Algorithms

Semantically-Enhanced Recommendation Algorithms - Victor Codina & Luigi Ceccaroni 2

Page 3: Semantically-Enhanced Recommendation Algorithms

The value of recommendations

- Netflix: 2/3 of the movies rented are recommended
- Google News: 38% more clickthrough
- Amazon: 35% of sales come from recommendations

All these systems employ Collaborative Filtering (CF) as a main component.

Page 4: Semantically-Enhanced Recommendation Algorithms

But in most online services the CF approach does not work so well. Why?

- Usually: lack of data
- Other reasons: lack of context-awareness, domain-specific particularities

Page 5: Semantically-Enhanced Recommendation Algorithms

Outline

- Cold-start problem and existing solutions
- Proposed solution to overcome cold start
- Evaluation and results

Page 6: Semantically-Enhanced Recommendation Algorithms

Outline

- Cold-start problem and existing solutions
  - Cold-start problem
  - Hybrid recommenders
- Proposed solution to overcome cold start
- Evaluation and results

Page 7: Semantically-Enhanced Recommendation Algorithms

What is the cold-start problem?

- Narrow view: no ratings at all associated with items or users
- Wider view: few ratings associated

Cold-start scenarios:

                       Users: many ratings   Users: few ratings
Items: many ratings    Normal                New user
Items: few ratings     New item              New user & item
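The four scenarios above can be read as a simple classification rule. The sketch below is only an illustration: the function name and the cut-off `few` (how many ratings still count as "few") are assumptions, not values given in the slides.

```python
def cold_start_scenario(n_user_ratings: int, n_item_ratings: int, few: int = 5) -> str:
    """Classify a (user, item) pair into one of the four cold-start scenarios.

    `few` is a hypothetical cut-off: fewer ratings than this counts as "few".
    With few=1 the function corresponds to the narrow view (no ratings at all).
    """
    new_user = n_user_ratings < few
    new_item = n_item_ratings < few
    if new_user and new_item:
        return "new user & item"
    if new_user:
        return "new user"
    if new_item:
        return "new item"
    return "normal"


for counts in [(239, 120), (2, 120), (239, 0), (0, 0)]:
    print(counts, "->", cold_start_scenario(*counts))
```

Setting `few=1` gives the narrow view of the problem; larger values give the wider view.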

Page 8: Semantically-Enhanced Recommendation Algorithms

Typical solution: hybrid recommender combining CF with content-based filtering

[Diagram: hybrid recommenders and the cold-start scenarios they address (new item, new user)]

- Past solution: traditional content-based filtering + collaborative filtering
  - Limitation: lack of understanding and exploitation of domain semantics
- More recent solution: semantically-enhanced content-based filtering + collaborative filtering
  - Limitation: the need for domain ontologies describing explicit metadata relations

Page 9: Semantically-Enhanced Recommendation Algorithms

Outline

- Cold-start problem and existing solutions
- Proposed solution to overcome cold start
  - Acquisition of implicit semantics
  - Methods for semantics exploitation
- Evaluation and results

Page 10: Semantically-Enhanced Recommendation Algorithms

Acquisition of implicit domain semantics

Implicit semantics = semantic similarities among item attributes extracted from Vector Space Models (VSMs)

Distributional hypothesis: “words that share similar contexts share similar meaning”

[Diagram: acquisition pipeline]

Attribute-by-context matrix (weights w_{a,c}; the context is either items or users) -> matrix transformation (SVD, conditional probabilities) -> similarity measure (cosine, Jaccard) -> attribute semantic similarities
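As a minimal sketch of the last two steps of this pipeline, the snippet below computes cosine similarities between the rows of an attribute-by-context matrix, with no matrix transformation applied. The toy tag-by-item data and all function names are made up for illustration; they are not taken from the slides.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equally long weight vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an attribute with no context has no measurable similarity
    return dot / (norm_u * norm_v)


def attribute_similarities(matrix):
    """matrix maps each attribute to its weights w_{a,c}, one per context."""
    return {a: {b: cosine(va, vb) for b, vb in matrix.items()}
            for a, va in matrix.items()}


def top_k(sims, attribute, k):
    """The k attributes most similar to `attribute`, excluding itself."""
    others = [(b, s) for b, s in sims[attribute].items() if b != attribute]
    return sorted(others, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical item-based context: 1 if the tag describes the item, else 0.
tag_by_item = {
    "sci-fi": [1, 1, 1, 0],
    "space":  [1, 1, 0, 0],
    "aliens": [1, 0, 0, 0],
    "comedy": [0, 0, 0, 1],
}
sims = attribute_similarities(tag_by_item)
print(top_k(sims, "sci-fi", 2))
```

Swapping the columns from items to users (for example, how strongly each user is interested in each tag) would give the user-based variant with the same code.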

Page 11: Semantically-Enhanced Recommendation Algorithms

Semantic similarities are context-dependent

- Item-based: similarity is measured in terms of how many items are similarly described by both attributes
- User-based: similarity is measured in terms of how many users are similarly interested in both attributes

Example: top-5 tags similar to “Sci-Fi”, calculated using cosine similarity without matrix transformation

User-based                Item-based
Scifi      0.79598457     Scifi      0.48631117
future     0.6889696      aliens     0.42508063
space      0.65459067     dystopia   0.34769687
aliens     0.6110453      space      0.32580933
robots     0.59465224     future     0.27470198

Page 12: Semantically-Enhanced Recommendation Algorithms

Exploitation of implicit semantics in content-based filtering

[Diagram: content-based filtering pipeline]

- User modeling: the ratings r_{u,i} of user u and the item-by-attribute matrix w_{i,a} (attribute relevance in [0,1]) are fed to a user modeling technique, which produces the user interests w_{u,a} (degree of interest in [-1,1])
- Prediction generation: vector-based matching of the user interests of u against the attributes w_{i,a} of item i yields a score

Points where implicit semantics are exploited:

1. Profile expansion: the user interests are turned into expanded user interests before matching
2. Semantic matching: applied at the matching step

Page 13: Semantically-Enhanced Recommendation Algorithms

Method 1: User profile expansion by constrained spreading activation

Attribute semantic similarities (similarities can be symmetric or not depending on the similarity measure used):

      a1    a2    a3    a4    a5
a1    1     0.5   0.2   0     0.3
a2    0.5   1     0.3   0     0.1
a3    0.2   0.3   1     0.7   0.8
a4    0     0     0.7   1     0
a5    0.3   0.1   0.8   0     1

Method hyper-parameters:

- fan-out threshold = 0.25
- activation threshold = 0.25
- max. expansion levels = 1

                                   a1     a2    a3     a4   a5
User interests [-1,1]              0      0.5   -0.1   0    0
Expanded user interests [-1,1]     0.25   0.5   0.05   0    0

a2 is the activated node. Activation spreads from a2 to a1 (similarity 0.5), which becomes a new interest, and to a3 (similarity 0.3), whose weight is updated.
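The numbers in this example are consistent with a simple propagation rule: a1 receives 0.5 × 0.5 = 0.25 and a3 becomes -0.1 + 0.5 × 0.3 = 0.05. The sketch below implements that reading. The exact roles of the thresholds (activation threshold applied to the absolute interest, fan-out threshold applied to the similarity), the clipping to [-1, 1], and all names are assumptions inferred from the slide's figures, not a definitive account of the authors' algorithm.

```python
def expand_profile(interests, sims, activation_threshold=0.25,
                   fan_out_threshold=0.25, max_levels=1):
    """Spread interest from strongly weighted attributes to similar ones."""
    expanded = dict(interests)
    # Assumption: a node is activated when |interest| reaches the threshold.
    frontier = [a for a, w in interests.items() if abs(w) >= activation_threshold]
    visited = set(frontier)
    for _ in range(max_levels):
        snapshot = dict(expanded)  # weights as they were when the level started
        for source in frontier:
            for target, sim in sims[source].items():
                # Assumption: only sufficiently similar neighbours are reached.
                if target == source or sim < fan_out_threshold:
                    continue
                expanded[target] = expanded.get(target, 0.0) + snapshot[source] * sim
        # Newly strong attributes would spread at the next level, if any.
        frontier = [a for a, w in expanded.items()
                    if a not in visited and abs(w) >= activation_threshold]
        visited.update(frontier)
    # Assumption: keep weights inside the [-1, 1] interest range.
    return {a: max(-1.0, min(1.0, w)) for a, w in expanded.items()}


names = ["a1", "a2", "a3", "a4", "a5"]
matrix = [
    [1.0, 0.5, 0.2, 0.0, 0.3],
    [0.5, 1.0, 0.3, 0.0, 0.1],
    [0.2, 0.3, 1.0, 0.7, 0.8],
    [0.0, 0.0, 0.7, 1.0, 0.0],
    [0.3, 0.1, 0.8, 0.0, 1.0],
]
sims = {a: dict(zip(names, row)) for a, row in zip(names, matrix)}
interests = dict(zip(names, [0.0, 0.5, -0.1, 0.0, 0.0]))

expanded = expand_profile(interests, sims)
print({a: round(w, 3) for a, w in expanded.items()})
```

With the slide's hyper-parameters this yields the expanded profile shown above (0.25, 0.5, 0.05, 0, 0), up to floating-point rounding.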

Page 14: Semantically-Enhanced Recommendation Algorithms

Method 2: Prediction generation by pair-wise semantic matching strategies

Attribute semantic similarities (similarities can be symmetric or not depending on the similarity measure used):

      a1    a2    a3    a4    a5
a1    1     0.5   0.2   0     0.3
a2    0.5   1     0.3   0     0.1
a3    0.2   0.3   1     0.7   0.8
a4    0     0     0.7   1     0
a5    0.3   0.1   0.8   0     1

                           a1   a2    a3     a4   a5
User interests [-1,1]      0    0.5   -0.1   0    0
Item attributes [0,1]      0    0.3   0      0    0.7

Method hyper-parameter:

- similarity threshold = 0.05

Results (using the product as aggregation function):

- Direct matching (approach: vector-based matching; pair a2-a2, similarity 1): 0.15
- Best-pairs matching (adds pair a3-a5, similarity 0.8): 0.15 - 0.056 = 0.094
- All-pairs matching (adds pairs a3-a2, similarity 0.3, and a2-a5, similarity 0.1): 0.15 - 0.009 + 0.035 - 0.056 = 0.12
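The three results can be reproduced by multiplying, for each matched pair, the user's interest, the item's attribute relevance and the pair's similarity, then summing. The sketch below follows that reading. The pair-selection rules (for best-pairs, each user attribute is matched with its most similar item attribute; for all-pairs, with every item attribute above the threshold) are assumptions inferred from the slide's arithmetic, and the function names are hypothetical.

```python
def direct_matching(interests, item):
    """Plain vector-based matching: only identical attributes are paired."""
    return sum(w * item.get(a, 0.0) for a, w in interests.items())


def best_pairs_matching(interests, item, sims, threshold=0.05):
    """Pair each user attribute with its single most similar item attribute."""
    score = 0.0
    for a, w in interests.items():
        if w == 0:
            continue
        candidates = [(sims[a][b], r) for b, r in item.items()
                      if r > 0 and sims[a][b] >= threshold]
        if candidates:
            sim, relevance = max(candidates, key=lambda pair: pair[0])
            score += w * relevance * sim
    return score


def all_pairs_matching(interests, item, sims, threshold=0.05):
    """Pair each user attribute with every sufficiently similar item attribute."""
    return sum(w * r * sims[a][b]
               for a, w in interests.items() if w != 0
               for b, r in item.items() if r > 0 and sims[a][b] >= threshold)


names = ["a1", "a2", "a3", "a4", "a5"]
matrix = [
    [1.0, 0.5, 0.2, 0.0, 0.3],
    [0.5, 1.0, 0.3, 0.0, 0.1],
    [0.2, 0.3, 1.0, 0.7, 0.8],
    [0.0, 0.0, 0.7, 1.0, 0.0],
    [0.3, 0.1, 0.8, 0.0, 1.0],
]
sims = {a: dict(zip(names, row)) for a, row in zip(names, matrix)}
interests = dict(zip(names, [0.0, 0.5, -0.1, 0.0, 0.0]))
item = dict(zip(names, [0.0, 0.3, 0.0, 0.0, 0.7]))

print(round(direct_matching(interests, item), 3))
print(round(best_pairs_matching(interests, item, sims), 3))
print(round(all_pairs_matching(interests, item, sims), 3))
```

On the slide's data these give 0.15, 0.094 and 0.12 respectively, up to floating-point rounding.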

Page 15: Semantically-Enhanced Recommendation Algorithms

Outline

- Cold-start problem and existing solutions
- Proposed solution to overcome cold start
- Evaluation and results
  - MovieLens data set
  - Experimental results

Page 16: Semantically-Enhanced Recommendation Algorithms

Offline experimentation with a MovieLens data set extended with movie metadata

Data set statistics after pruning unusual attribute values and movies with few attributes:

Users                       2113
Movies                      1646
Attributes                  4 (genres, directors, actors and tags)
Attribute values            2886
Ratings per user on avg.    239
Rating density              14%
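As a quick consistency check, the rating density can be recomputed from the other statistics, assuming density means the number of ratings divided by the number of user-movie pairs. The average of 239 ratings per user is a rounded figure, so the result is only approximate.

```python
users = 2113
movies = 1646
avg_ratings_per_user = 239  # rounded average reported in the statistics

ratings = users * avg_ratings_per_user  # approximate total number of ratings
density = ratings / (users * movies)    # same as avg_ratings_per_user / movies

print(f"approx. ratings: {ratings}")
print(f"density: {density:.1%}")
```

The result is about 14.5%, in line with the reported density of 14%.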

Page 17: Semantically-Enhanced Recommendation Algorithms

Evaluation of methods for semantics exploitation

- Baseline = Traditional CB using hybrid user modeling technique
- Expansion-CB = CSA-same + User-based + raw frequencies
- Matching-CB = Best-pairs-same + User-based + Forbes-Zhu method
- BPR-MF = CF based on matrix factorization optimized for ranking

Page 18: Semantically-Enhanced Recommendation Algorithms

Conclusions

- The cold-start problem can be very critical
  - Above all in systems with small databases
- Existing solutions have some limitations
  - Traditional CB cannot solve the new-user scenario
  - Semantically-enhanced CB requires domain ontologies to work
- Exploitation of implicit semantics can be a good alternative to overcome the cold-start problem
  - User-based semantics is more effective than item-based
  - The best-pairs semantic matching method is more effective than the profile expansion based on spreading activation

Page 19: Semantically-Enhanced Recommendation Algorithms

Future work

- Experimenting with data sets of different domains
  - Million Song data set
- Extending the study of Vector Space Models
  - Probabilistic similarity measures (e.g. Kullback-Leibler)
- Applying the same approach to enhance the cold-start performance of context-aware recommenders
  - Implicit semantics of contextual conditions can also be acquired from user data
  - Similarly, pair-wise semantic strategies can be employed to enhance contextual user modeling