Collaborative Filtering via Euclidean Embedding
M. Khoshneshin and W. Street, Proc. of ACM RecSys, pp. 87-94, 2010


TRANSCRIPT

Page 1:

Collaborative Filtering via Euclidean Embedding

M. Khoshneshin and W. Street, Proc. of ACM RecSys, pp. 87-94, 2010

Page 2:

Introduction

Recommendation Systems

Suggest items based on user preferences

Recommendation Approaches: Content-based

• Items are recommended based on a user profile and product information

Collaborative Filtering

• Use similarity to recommend items that were liked by similar users, i.e., recommendation is based on the rating history of the system

• Predict unknown ratings so that users can be given suggestions based on items with a high expected rating


Page 3:

Challenges

Existing approaches are better suited to static settings; incorporating new data into these models is not a trivial task

Recommendations are based on the best predicted ratings; however, predicting ratings is computationally expensive on large datasets

Solution

Euclidean embedding (EE) method for collaborative filtering

Users and items are embedded in a unified Euclidean space

The distance between a user and an item is inversely related to the rating: the closer an item is to a user, the higher the predicted rating

Page 4:

Euclidean Embedding (EE)

Advantages of EE

More intuitively understandable for humans, allowing useful visualizations

Allows very efficient recommendation query implementation

Facilitates online implementation requirements, e.g., mapping new users/items


Page 5:

Related Work

Neighborhood/Memory-based CF Algorithms

Item-based or user-based

KNN associates to each user/item its set of nearest neighbors (NNs) and predicts a user’s rating on an item using the ratings of its NNs

Utilize the entire DB of user preferences when computing recommendations

Model-based CF Algorithms

Matrix Factorization & Non-Negative Matrix Factorization

Compute a model of the preference data & use it to produce recommendations

Find patterns based on training on a subset of the DB

Page 6:

Collaborative Filtering (CF)

Given N users and M items

In a model-based approach for CF

• The model is trained based on known ratings (training set) so that the prediction error is minimized

• Root mean squared error (RMSE) is a popular error function

• The objective function of a model-based CF (i.e., matrix factorization) approach is defined as

min Σ_{u,i} w_ui (r_ui - r̂_ui)²

where r_ui is the rating of user u for item i, r̂_ui is the prediction of the model for the rating of u for i, and w_ui is 1 if r_ui is known, and 0 otherwise

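For concreteness, a minimal sketch of this weighted squared-error objective and the corresponding RMSE in plain NumPy (the dense matrices R, R_hat, and the indicator W are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def weighted_squared_error(R, R_hat, W):
    """sum_{u,i} w_ui * (r_ui - r_hat_ui)^2, counting known ratings only."""
    return np.sum(W * (R - R_hat) ** 2)

def rmse(R, R_hat, W):
    """Root mean squared error restricted to the known ratings."""
    return np.sqrt(weighted_squared_error(R, R_hat, W) / W.sum())
```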

Page 7:

CF via Matrix Factorization

CF via EE is similar to CF via matrix factorization (MF)

The predicted rating r̂_ui via MF is computed as

r̂_ui = μ + b_u + b_i + p_u q_i'

• μ is the overall average of all ratings

• b_u is the deviation of user u from the average

• b_i is the deviation of item i from the average

• p_u and q_i are the user-factor and item-factor vectors in a D-dimensional space, respectively, and p_u q_i' is the dot product of p_u and q_i

A higher p_u q_i' means u likes i more than average

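A minimal sketch of this prediction rule (NumPy; argument names are illustrative):

```python
import numpy as np

def predict_mf(mu, b_u, b_i, p_u, q_i):
    """MF prediction: global average + user bias + item bias + dot(p_u, q_i)."""
    return mu + b_u + b_i + np.dot(p_u, q_i)
```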

Page 8:

CF via Matrix Factorization

A gradient descent approach is used to solve CF problems with a highly sparse data matrix. The goal is to minimize the following objective function

min Σ_{(u,i): r_ui known} (r_ui - μ - b_u - b_i - p_u q_i')² + λ (b_u² + b_i² + ||p_u||² + ||q_i||²)

• where the λ(·) term avoids overfitting the magnitude of the parameters, and λ is an algorithmic parameter

The gradient descent updates for each known rating r_ui, with e_ui = r_ui - r̂_ui the current error for rating r_ui and γ the step size of the algorithm, are

b_u ← b_u + γ (e_ui - λ b_u)
b_i ← b_i + γ (e_ui - λ b_i)
p_u ← p_u + γ (e_ui q_i - λ p_u)
q_i ← q_i + γ (e_ui p_u - λ q_i)

• there are T steps, i.e., one per known rating, to go through all ratings in the training dataset
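A sketch of one such stochastic gradient descent pass over the known ratings (NumPy; the default λ and γ values are illustrative, and constant factors are folded into the step size as is conventional):

```python
import numpy as np

def sgd_epoch_mf(ratings, mu, b_u, b_i, P, Q, lam=0.005, gamma=0.005):
    """One pass of T steps, one per known rating (u, i, r_ui), for regularized MF."""
    for u, i, r in ratings:
        e = r - (mu + b_u[u] + b_i[i] + P[u] @ Q[i])      # current error for r_ui
        b_u[u] += gamma * (e - lam * b_u[u])
        b_i[i] += gamma * (e - lam * b_i[i])
        P[u], Q[i] = (P[u] + gamma * (e * Q[i] - lam * P[u]),
                      Q[i] + gamma * (e * P[u] - lam * Q[i]))
    return b_u, b_i, P, Q
```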

Page 9:

CF via Euclidean Embedding

All items & users are embedded in a unified Euclidean space

The characteristics of each person/item are defined by its location

If an item is close to a user in a unified space, its characteristics are attractive for the user


A user is expected to like an item which is close in the space

Page 10:

CF via Euclidean Embedding

The predicted rating r̂_ui via EE is computed as

r̂_ui = μ + b_u + b_i - (x_u - y_i)(x_u - y_i)'

x_u and y_i are the point vectors of user u and item i in a D-dimensional Euclidean space

(x_u - y_i)(x_u - y_i)' is the squared Euclidean distance

• The squared Euclidean distance is computationally cheaper than the Euclidean distance itself, while the accuracy remains the same

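A minimal sketch of the EE prediction rule (NumPy; a smaller distance yields a higher predicted rating):

```python
import numpy as np

def predict_ee(mu, b_u, b_i, x_u, y_i):
    """EE prediction: baseline minus the squared Euclidean distance ||x_u - y_i||^2."""
    d = x_u - y_i
    return mu + b_u + b_i - np.dot(d, d)
```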

Page 11:

CF via Euclidean Embedding

EE is a supervised learning approach

The training phase involves finding the location of each item and user so as to minimize a loss function

• EE modifies the MF objective function (Page 8) by replacing the dot product p_u q_i' with the negative squared distance -(x_u - y_i)(x_u - y_i)'

Gradient descent is used to minimize the EE objective function; the updates in each step follow from the gradient with respect to b_u, b_i, x_u, and y_i, with γ again denoting the step size (see the sketch below)
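A sketch of the resulting stochastic gradient updates, obtained by differentiating the squared error through the squared-distance term (NumPy; the λ and γ defaults are illustrative, and the constant factors may be folded differently in the paper):

```python
import numpy as np

def sgd_epoch_ee(ratings, mu, b_u, b_i, X, Y, lam=0.005, gamma=0.005):
    """One pass over the known ratings (u, i, r_ui) for the EE objective.
    d r_hat / d x_u = -2 (x_u - y_i), so a positive error pulls x_u toward y_i."""
    for u, i, r in ratings:
        d = X[u] - Y[i]
        e = r - (mu + b_u[u] + b_i[i] - d @ d)            # current error for r_ui
        b_u[u] += gamma * (e - lam * b_u[u])
        b_i[i] += gamma * (e - lam * b_i[i])
        X[u], Y[i] = (X[u] + gamma * (-2 * e * d - lam * X[u]),
                      Y[i] + gamma * (2 * e * d - lam * Y[i]))
    return b_u, b_i, X, Y
```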

Page 12:

CF via Euclidean Embedding

Time Complexity
• Training, Prediction, Recommendation
• Prediction takes O(D), where D is the dimension of the space
• Recommendation is a K-nearest-neighbor search: O(K-Nearest Neighbor) = O(N²)

Visualization
1. Implement CF via EE in a high-dimensional space
2. Select the top K items for an active user
3. Embed the user, the selected items, and some favorite items in a 2-dimensional space via multi-dimensional scaling (MDS), using distances from the high-dimensional space of step 1 (a sketch follows this list)
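A possible sketch of step 3, assuming scikit-learn's metric MDS with precomputed dissimilarities (`points` stacks the user point and the selected item points from the high-dimensional EE space):

```python
import numpy as np
from sklearn.manifold import MDS

def embed_2d(points):
    """Embed high-dimensional EE points into 2-D via MDS on their pairwise distances."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))              # high-D pairwise distances
    mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
    return mds.fit_transform(dist)                        # 2-D coordinates for plotting
```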

Page 13:

CF via Euclidean Embedding

Example. Using a low-dimensional unified user-item space, it is possible to represent items to users via a graphical interface


Figure: representing items close to a user, alongside the movies he has already liked, to assist him in selection

Page 14:

CF via Euclidean Embedding

Fast recommendation generation

The mapped space allows candidate retrieval via a neighborhood search

• The smaller the distance, the more desirable an item will be

Figure: the search space for a query user. EE searches among the K nearest neighbors, while MF explores a much larger space

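A brute-force sketch of the neighborhood query described above (NumPy; in practice a spatial index such as a k-d tree makes the search sub-linear):

```python
import numpy as np

def recommend_knn(x_u, Y, k=10, exclude=()):
    """Return the indices of the k items whose points are nearest to user point x_u."""
    d2 = np.sum((Y - x_u) ** 2, axis=1)     # squared distances to all item points
    d2[list(exclude)] = np.inf              # skip items the user has already rated
    return np.argsort(d2)[:k]
```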

Page 15:

CF via Euclidean Embedding

Incorporating new users and items

For a new user or item, there are D + 1 unknown values
• D for the vector (p or q in MF; x or y in EE) and 1 for the scalar b

Active learning may be used by a recommender by asking new users to provide their favorite items

• Since the point vectors of the items in the space are known, and a new user is probably very close to his favorite items in the EE space, the user vector x_u can be estimated as

x_u = (1 / |S_u|) Σ_{i ∈ S_u} y_i

where S_u is the set of items that the new user u has selected as his favorites and |S_u| is the number of selected items
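A minimal sketch of this averaging heuristic (NumPy; `Y` holds the item points and `favorites` the indices of the new user's favorite items):

```python
import numpy as np

def map_new_user(Y, favorites):
    """Estimate the new user's point as the mean of the points of the favorite items."""
    return Y[list(favorites)].mean(axis=0)
```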

Page 16:

Experimental Results

Datasets used

Netflix dataset consists of 17,770 movies, ~480,000 users, and ~100,000,000 ratings

• Dimension D = 50, regularization parameter = 0.005, & step size = 0.005

MovieLens dataset consists of 1,682 movies, 943 users, and 100,000 ratings

• Dimension D = 50, regularization parameter = 0.03, & step size = 0.005


Page 17:

Experimental Results

Learning curve

Test RMSE of EE & MF in each iteration of the gradient descent algorithm for five different folds

MF is more prone to overfitting, since its error increases faster after it passes the optimal point


Page 18:

Experimental Results

Dimension, accuracy, and time

EE & MF give similar results in 5, 25, and 50 dimensions

Precision & recall: ratings of 4 & 5 are considered desirable


EE performs better than MF

Page 19:

Experimental Results

Visualization

For a typical user, the top n movies are selected based on EE with D = n dimensions

In the EE picture, items are embedded based on the “taste” of the active user, while in the MDS picture the embedding is based on the “tastes” of all users


Page 20:

Experimental Results

Generating Fast Recommendations

Generating new recommendations for a user using EE can be treated as a kNN search problem in a Euclidean space

The table reports generating the top-10 recommendations for all users

• D(imension) = 50

• In MF & EE, an exhaustive search was applied, whereas for EE-KNN, 100 candidate movies were first selected for each user


Search time decreases significantly

Page 21:

Experimental Results

New Users

New users can be quickly mapped in the existing space

MFa & EEa implement averaging for new users, whereas EEp represents the precision/recall values for the regular setting when the users are not new
