Demystifying Recommendation Systems

About Rumman

• Senior Data Scientist and Instructor at Metis
• Practicing Data Scientist
• Find me on Twitter @ruchowdh
• Visit my website at rummanchowdhury.com
• Check out my jobs page
• …and my blog

http://rummanchowdhury.com

About Metis

• Data Science Bootcamp

• Part of Kaplan

• Accredited by ACCET

• 12 weeks, full-time, including 60 hours of online pre-work

• Evening and weekend training courses

• Third party financing options

• $3,000 scholarship for women, underrepresented minority groups, and veterans or members of the U.S. military

Overview

• What is a recommendation engine?
• What are the types of recommendation systems?
• What are the drawbacks of the most common recommendation engines, and how do I deal with them?
• How do I fine-tune my model?

What are recommendation systems?

Automated systems that seek to predict whether a given item (product, event, movie, song, etc.) will be desirable to a user.

Or, in more data science-y terms: they predict what a user’s rating will be for items that the user has not yet reviewed.

Where does a recommendation system lie in the space of data science and analytics?

• Descriptive
  • Averages, percentages, etc.
  • Explains events after or while they happen

• Predictive
  • Uses modeling of past behavior to make predictions about the future

• Prescriptive
  • Informed decisions about which actions to take, based on data

How do I pick the best kind of recommender system for my data?

• What is your existing data?
• How quickly does your inventory change?
• How much information can you get on a user? (explicit and implicit)
• Does your model need to scale well?

What are the kinds of recommendation systems?

What are the kinds of recommender systems?

• Search (knowledge-based)
  • Pros: items will be close matches to expressed needs; no cold-start issues
  • Cons: static; requires manual tagging; does not work well with very similar or rapidly changing inventories
  • Example: Amazon’s basic search

• Content-based
  • Items are mapped into an item-feature space based on their characteristics, and recommendations are based on specified characteristics
  • Pros: easier comparison between items
  • Cons: cold-start problem; needs good content descriptions; needs item ratings
  • Example: search for ‘ai’ vs ‘AI’, ‘mit’ vs ‘MIT’

• Collaborative filtering: based on user and item similarities
  • Pros: can provide less-obvious matches
  • Cons: cold-start problem for new users and new items; requires feedback ratings

Limitations, or, Ask yourself, do you really need a recommendation engine?

• Recommendation systems have to update immediately.
• You need a sufficiently inexpensive model and the bandwidth to return results fast.

• You have more information than you think:
  • existing item popularity
  • geography based on IP address
  • cookies

How does Content-Based recommendation work?

• Users and items are represented by vectors in a feature space
• Approaches:
  • Map users and items to the same feature space, then compute the distance between a user and an item.

Example: Content-Based Recommendation

Features = (big box office, aimed at kids, famous actors)

Items (movies):
Finding Nemo = (5, 5, 2)
Mission Impossible = (3, -5, 5)
Jiro Dreams of Sushi = (-4, -5, -5)

Predicted ratings*:
Finding Nemo: (-3*5 + 2*5 + 2*2) = -1
Mission Impossible: (-3*3 - 2*5 + 2*5) = -9
Jiro Dreams of Sushi: (3*4 - 2*5 - 2*5) = -8

* Ratings for a user with a described preference of (-3, 2, 2) for these features
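As a minimal sketch, the predicted rating in this example can be computed as the dot product of the user's preference vector and each movie's feature vector. The function and variable names are illustrative and not from the slides.

```python
# Feature order: (big box office, aimed at kids, famous actors)
movies = {
    "Finding Nemo": (5, 5, 2),
    "Mission Impossible": (3, -5, 5),
    "Jiro Dreams of Sushi": (-4, -5, -5),
}
user_preferences = (-3, 2, 2)


def predicted_rating(preferences, features):
    """Dot product of a user's preference vector and an item's feature vector."""
    return sum(p * f for p, f in zip(preferences, features))


scores = {
    title: predicted_rating(user_preferences, features)
    for title, features in movies.items()
}

# Rank movies from highest to lowest predicted rating.
ranked = sorted(scores, key=scores.get, reverse=True)

for title in ranked:
    print(title, scores[title])
```

The ranking, rather than the raw score, is what a recommender would typically use here, since the scores are not on a fixed rating scale.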

How does Content Based Recommendation work?

• Another option is to create features from user+item pairs and use an algorithm, such as a classifier, to predict like/dislike.

• Each user/item pair has a labeled outcome, such as purchased/not purchased. You can train a model to predict purchase behavior.
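The pair-based approach could look roughly like the sketch below. It assumes, purely for illustration, that each user and item already has a small feature vector, builds pair features as element-wise products, and fits a tiny logistic regression by gradient descent. All names and data are hypothetical; in practice you would more likely reach for an existing classifier library.

```python
import math

# Hypothetical user preference and item feature vectors over (action, documentary).
users = {"ann": (1.0, -1.0), "bob": (-1.0, 1.0)}
items = {"action_movie": (1.0, 0.0), "documentary": (0.0, 1.0)}

# Labeled outcome per user/item pair: 1 = purchased, 0 = not purchased.
observed = [
    ("ann", "action_movie", 1),
    ("ann", "documentary", 0),
    ("bob", "action_movie", 0),
    ("bob", "documentary", 1),
]


def pair_features(user, item):
    """Interaction features: element-wise product of user and item vectors."""
    return [u * i for u, i in zip(users[user], items[item])]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, epochs=500, learning_rate=0.5):
    """Fit logistic regression weights with batch gradient descent."""
    n_features = len(pair_features(rows[0][0], rows[0][1]))
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for user, item, label in rows:
            x = pair_features(user, item)
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - label
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def purchase_probability(model, user, item):
    """Predicted probability that the user purchases the item."""
    weights, bias = model
    x = pair_features(user, item)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


model = train(observed)
print(round(purchase_probability(model, "ann", "action_movie"), 2))
print(round(purchase_probability(model, "bob", "action_movie"), 2))
```

The element-wise product is only one way to build pair features; concatenating user and item features, or adding context such as time of day, are equally plausible choices.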

How does Collaborative Filtering work?

• Collaborative filtering refers to a family of methods for predicting ratings. Instead of describing users and items in terms of a feature space, these methods use only the existing user-item ratings themselves.

• In this case, our dataset is a ratings matrix whose rows correspond to users and whose columns correspond to items.
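A ratings matrix of this shape can be sketched in a few lines. The users, ratings, and the choice of `None` for a missing rating are assumptions made for illustration.

```python
# Hypothetical (user, item, rating) triples; a real system would load these
# from logs or a database.
ratings = [
    ("alice", "Finding Nemo", 5),
    ("alice", "Mission Impossible", 3),
    ("bob", "Mission Impossible", 4),
    ("carol", "Jiro Dreams of Sushi", 5),
    ("carol", "Finding Nemo", 2),
]

users = sorted({user for user, _, _ in ratings})
items = sorted({item for _, item, _ in ratings})
row_of = {user: r for r, user in enumerate(users)}
col_of = {item: c for c, item in enumerate(items)}

# Rows are users, columns are items; None marks a rating that was never given.
matrix = [[None] * len(items) for _ in users]
for user, item, rating in ratings:
    matrix[row_of[user]][col_of[item]] = rating

print(items)
for user, row in zip(users, matrix):
    print(user, row)
```

Most cells stay empty even in this tiny example, which is why real systems usually store the matrix in a sparse format rather than as nested lists.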

Example: Netflix movie recommendations

How does collaborative filtering work?

• Method 1: Item-based CF, a.k.a. neighborhood methods or memory-based CF
  • Ratings data are used to create an item-item similarity matrix.
  • Recommendations are made based on the items most similar to those a user has already rated highly.

• This method does not scale well.
  • Why? You need a fully populated matrix of item-item similarity. This doesn’t work well if you have a lot of items or if your items change a lot.
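The item-based method can be sketched as follows. This is a simplified illustration with hypothetical data: it uses cosine similarity between item columns, treats missing ratings as 0, and predicts a rating as the similarity-weighted average of the user's own ratings. Real systems usually handle missing values and normalization more carefully.

```python
import math

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "D": 2},
    "carol": {"A": 1, "C": 5, "D": 4},
    "dave": {"A": 5, "B": 5},
}
user_order = sorted(ratings)
items = sorted({item for user_ratings in ratings.values() for item in user_ratings})


def item_vector(item):
    """One column of the ratings matrix; missing ratings are treated as 0."""
    return [ratings[user].get(item, 0) for user in user_order]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Full item-item similarity matrix: its size grows quadratically with the catalog.
similarity = {
    i: {j: cosine(item_vector(i), item_vector(j)) for j in items} for i in items
}


def predict(user, candidate):
    """Similarity-weighted average of the user's own ratings (None if no signal)."""
    rated = ratings[user]
    total = sum(similarity[candidate][seen] for seen in rated)
    if total == 0:
        return None
    return sum(similarity[candidate][seen] * r for seen, r in rated.items()) / total


def recommend(user, top_n=2):
    """Rank the items the user has not rated by predicted rating."""
    scores = {}
    for item in items:
        if item not in ratings[user]:
            score = predict(user, item)
            if score is not None:
                scores[item] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("alice"))
```

The `similarity` dictionary holds one entry per pair of items, which is the scaling cost described above: every new or changed item means recomputing a row and a column.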

How does CF work?

• Method 2: Model-based CF uses matrix decomposition via singular value decomposition (SVD) to reduce dimensionality and extract latent variables.

• We express users and items in terms of these variables.
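To make the idea of latent variables concrete, here is a small sketch. Note that it is not a true SVD: it swaps in matrix factorization trained by stochastic gradient descent on the observed ratings only, which is a common stand-in when most of the matrix is missing. The data, factor count, and hyperparameters are all assumptions.

```python
import random

# Hypothetical observed ratings as (user index, item index, rating).
observed = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 2),
    (2, 0, 1), (2, 2, 5), (2, 3, 4),
    (3, 0, 5), (3, 1, 5),
]
n_users, n_items, n_factors = 4, 4, 2

rng = random.Random(0)  # fixed seed so the sketch is reproducible
user_factors = [[rng.uniform(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
item_factors = [[rng.uniform(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]


def predict(u, i):
    """Predicted rating: dot product of the user's and item's latent vectors."""
    return sum(p * q for p, q in zip(user_factors[u], item_factors[i]))


def rmse():
    """Root mean squared error over the observed ratings."""
    return (sum((r - predict(u, i)) ** 2 for u, i, r in observed) / len(observed)) ** 0.5


rmse_before = rmse()
learning_rate, regularization = 0.01, 0.02
for _ in range(2000):
    for u, i, r in observed:
        error = r - predict(u, i)
        for k in range(n_factors):
            p, q = user_factors[u][k], item_factors[i][k]
            # Move both latent vectors to shrink the error, with a small penalty
            # on large values to limit overfitting.
            user_factors[u][k] += learning_rate * (error * q - regularization * p)
            item_factors[i][k] += learning_rate * (error * p - regularization * q)
rmse_after = rmse()

print(f"RMSE on observed ratings: {rmse_before:.2f} -> {rmse_after:.2f}")
print(f"Predicted rating for user 3, item 2: {predict(3, 2):.2f}")
```

Each user and each item ends up as a vector of `n_factors` numbers, and any user-item pair can be scored, including pairs with no rating.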

Why is model-based CF preferred?

• Scalable, flexible, accurate, domain independent, and requires no explicit information.

What are the drawbacks, and how can I address them?

Let’s discuss the drawbacks

• Cold-start problem!
• Data is typically very sparse
• Need granularity in your data

Drawback: Cold Start problem

• Build an initial profile based on implicit data, then evolve it based on explicit feedback as it comes in.
• Use content-based information to ease cold-start and data sparsity problems; this is sometimes called a ‘hybrid’ filtering method.
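One way to picture a hybrid setup is as a fallback chain: use collaborative filtering when a user has enough ratings, content-based scores when only implicit signals exist, and plain popularity otherwise. The sketch below assumes the scores have already been computed elsewhere; every name, score, and the threshold are hypothetical.

```python
# Hypothetical inputs: item popularity counts and per-user explicit ratings.
item_popularity = {"A": 120, "B": 95, "C": 40, "D": 5}
user_ratings = {"returning_user": {"A": 5, "C": 2}, "new_user": {}}

# Hypothetical precomputed scores: content-based scores built from implicit
# signals, and collaborative-filtering scores built from explicit ratings.
content_scores = {"new_user": {"B": 0.2, "C": 0.9, "D": 0.4}}
cf_scores = {"returning_user": {"B": 4.6, "D": 3.1}}

MIN_RATINGS_FOR_CF = 2  # assumed threshold; tune it for your data


def recommend(user, top_n=2):
    """Fall back from CF to content-based scores to popularity as data thins out."""
    rated = user_ratings.get(user, {})
    if len(rated) >= MIN_RATINGS_FOR_CF and user in cf_scores:
        scores = cf_scores[user]
    elif user in content_scores:
        scores = content_scores[user]
    else:
        scores = item_popularity
    candidates = [item for item in scores if item not in rated]
    return sorted(candidates, key=scores.get, reverse=True)[:top_n]


print(recommend("returning_user"))
print(recommend("new_user"))
print(recommend("anonymous_visitor"))
```

A production hybrid would more likely blend the sources with weights than switch between them outright, but the switch makes the cold-start logic easy to see.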

Drawback: Sparsity of Data

• In the famous Netflix Prize dataset, ~99% of possible ratings were missing.
• Data is skewed and sparse
  • or, most people don’t rate a lot and most items aren’t rated
  • the items that are rated are often rated constantly
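Before modeling, it can help to measure sparsity and skew directly. The sketch below uses a made-up catalog to compute the share of missing ratings and to count how ratings concentrate on a few items.

```python
from collections import Counter

# Hypothetical catalog and user base; only a handful of pairs have ratings.
all_users = ["u1", "u2", "u3", "u4", "u5"]
all_items = ["A", "B", "C", "D", "E", "F"]
rated_pairs = [
    ("u1", "A"), ("u2", "A"), ("u3", "A"), ("u4", "A"),
    ("u1", "B"), ("u5", "C"),
]

# Sparsity: the share of cells in the user-item matrix that have no rating.
possible = len(all_users) * len(all_items)
sparsity = 1 - len(rated_pairs) / possible

# Skew: how ratings concentrate on a few items and a few users.
ratings_per_item = Counter(item for _, item in rated_pairs)
ratings_per_user = Counter(user for user, _ in rated_pairs)
never_rated = [item for item in all_items if ratings_per_item[item] == 0]

print(f"sparsity: {sparsity:.0%}")
print("most rated items:", ratings_per_item.most_common(2))
print("never rated:", never_rated)
```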

Drawback: Granularity of data

• Traditional model-based CF works well for non-binary data (e.g., a 5-star rating). It doesn’t work well for binary data (e.g., click/no click, purchased/did not purchase).

• You will need to tweak your measurements of item similarities

Quick overview of measurement

• Non-binary ratings:
  • Pearson correlation coefficient
  • Euclidean distance
  • Manhattan distance

• Binary ratings:
  • Jaccard similarity
  • Cosine similarity
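For reference, here is a compact sketch of the five measures listed above, written with the standard library only. The example vectors are made up; for Jaccard, each item is represented as the set of users who interacted with it.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length rating vectors."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom else 0.0


def euclidean(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences; smaller means more similar."""
    return sum(abs(x - y) for x, y in zip(a, b))


def jaccard(set_a, set_b):
    """Size of the overlap divided by size of the union of two sets."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine of the angle between two vectors; works on 0/1 vectors too."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / denom if denom else 0.0


# Non-binary example: two users' star ratings for the same four items.
print(pearson([5, 3, 4, 4], [3, 1, 2, 3]))
print(euclidean([5, 3, 4, 4], [3, 1, 2, 3]))
print(manhattan([5, 3, 4, 4], [3, 1, 2, 3]))

# Binary example: which users purchased each of two items.
print(jaccard({"ann", "bob", "carol"}, {"bob", "carol", "dave"}))
print(cosine([1, 1, 0], [1, 0, 1]))
```

Pearson, Jaccard, and cosine are similarities (higher means closer), while Euclidean and Manhattan are distances (lower means closer), so they cannot be swapped without flipping the ranking.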

How do I refine my model?

Normalization

• Some items are rated significantly higher (e.g., blockbuster movies, Oscar winners)
• Some users rate lower (or higher) than the norm
• Ratings can change over time

Normalization

• Need to offset per user
• Need to offset per item

• Ex: Mean rating across all users for item x is some value. How does it differ from the mean rating across all items? How does my rating differ from the mean rating of that item?
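The per-user and per-item offsets can be sketched as a simple baseline: global mean plus a user offset plus an item offset. Here each offset is assumed to be the difference between that user's (or item's) mean rating and the global mean, which is one simple choice among several. The data is hypothetical.

```python
from collections import defaultdict

# Hypothetical (user, item, rating) triples.
ratings = [
    ("alice", "A", 5), ("alice", "B", 3),
    ("bob", "A", 4), ("bob", "B", 2),
    ("carol", "A", 3), ("carol", "B", 1),
]


def mean(values):
    return sum(values) / len(values)


global_mean = mean([r for _, _, r in ratings])

by_user, by_item = defaultdict(list), defaultdict(list)
for user, item, rating in ratings:
    by_user[user].append(rating)
    by_item[item].append(rating)

# How far each user and each item sits from the global mean.
user_offset = {user: mean(rs) - global_mean for user, rs in by_user.items()}
item_offset = {item: mean(rs) - global_mean for item, rs in by_item.items()}


def baseline(user, item):
    """Expected rating from the global mean plus user and item offsets."""
    return global_mean + user_offset[user] + item_offset[item]


# Normalized rating: what remains after removing user and item tendencies.
normalized = {(u, i): r - baseline(u, i) for u, i, r in ratings}

print(global_mean, user_offset, item_offset)
print(normalized)
```

The normalized values are what a similarity measure or latent factor model would then be trained on, so that a generous rater and a harsh rater with the same tastes look alike.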

Capturing data trends

• Rating distributions:
  • Ratings aren’t random; they follow a distribution. Model this distribution.

• Feature importance: You can regress on your feature vectors to understand which values impact ratings
• Feature generation: Characterize your users and create one-hot features (this can save a lot of time and help with cold-start problems)

Temporal factors

• There can be an upward trend in ratings over time
• Seasonal shifts due to holidays, awards, etc.
• Anchoring (i.e., rating an item based on a previous iteration or version of that item)

Conclusions

• Think about your data, your capabilities, and your needs prior to creating a recommendation system
• Consider the pros and cons of each type
• Refine your model thoughtfully

Questions? www.rummanchowdhury.com

@ruchowdh

