music recommendations at spotify - meetupfiles.meetup.com/1516886/recommendations at spotify...

Post on 31-Mar-2020

TRANSCRIPT

Page 1: Music recommendations at Spotify - Meetupfiles.meetup.com/1516886/Recommendations at Spotify v4.pdf · Music recommendations at Spotify Erik Bernhardsson erikbern@spotify.com. Spotify

Music recommendations at Spotify

Erik Bernhardsson erikbern@spotify.com

Spotify

- Launched in 2009

- Available in 17 countries

- 20M active users, 5M paying subscribers

- Peak at 5k tracks/s, 1M logged in users

- 20M tracks

Some applications

Recommendation stuff at Spotify

- Related artists:

Recommendation stuff at Spotify, cont…

More!

How can we find music?

Recommendations

- Manual classification

- Feature extraction

- Social media analysis, web scraping, metadata based

- Collaborative filtering

Pandora & Music Genome Project

- Classifies tracks in terms of 400 attributes

- Each track takes 20-30 minutes to classify

- A distance function finds similar tracks

- “Subtle use of strings”

- “Epic buildup”

- “Acid Jazz roots”

- “Beats made for dancing”

- “Trippy soundscapes”

- “Great trombone solo”

- …
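The attribute-plus-distance idea can be sketched in a few lines. This is a toy illustration and not Pandora's actual method: the three attribute scores per track, the track names, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math

# Hypothetical attribute scores per track (the real system uses ~400 attributes).
tracks = {
    "track_a": [0.9, 0.1, 0.4],
    "track_b": [0.8, 0.2, 0.5],
    "track_c": [0.1, 0.9, 0.0],
}

def distance(u, v):
    """Euclidean distance between two attribute vectors (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def most_similar(name):
    """Return the other track with the smallest distance to `name`."""
    others = (t for t in tracks if t != name)
    return min(others, key=lambda t: distance(tracks[name], tracks[t]))
```

Calling `most_similar("track_a")` compares `track_a` against every other track and returns the nearest one.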

Scraping the web is another approach

Feature extraction

Collaborative filtering

Idea:

- If two movies x, y get similar ratings then they are probably similar

- If a lot of users all listen to tracks x, y, z, then those tracks are probably similar
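A minimal sketch of that idea: count, for every pair of tracks, how many users listened to both. The listening histories below are made up, and raw co-occurrence counting is only the simplest possible reading of the slide.

```python
from collections import Counter
from itertools import combinations

# Hypothetical listening histories: user -> set of tracks played.
listens = {
    "u1": {"x", "y", "z"},
    "u2": {"x", "y"},
    "u3": {"y", "z"},
    "u4": {"x", "y", "w"},
}

# Count how many users listened to both tracks of each pair.
co_counts = Counter()
for played in listens.values():
    # Sorting makes (a, b) and (b, a) land on the same key.
    for a, b in combinations(sorted(played), 2):
        co_counts[(a, b)] += 1
```

Pairs with high counts are candidates for "similar" tracks; in practice the counts would need normalizing so that very popular tracks do not dominate.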

Collaborative filtering

Get data

… lots of data

Aggregate data

Throw away temporal information and just look at the number of times
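A sketch of the aggregation step, assuming the raw data is a log of (user, track, timestamp) events. The log format and the names are hypothetical; the slide does not show the real schema.

```python
from collections import Counter

# Hypothetical play log: (user, track, unix_timestamp).
play_log = [
    ("alice", "t1", 1000),
    ("alice", "t1", 2000),
    ("alice", "t2", 3000),
    ("bob", "t1", 1500),
]

# Drop the timestamp and count plays per (user, track) pair.
play_counts = Counter((user, track) for user, track, _ts in play_log)
```

Each key of `play_counts` corresponds to one nonzero cell of the user-by-track matrix.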

OK, so now we have a big matrix

… very big matrix

Throw out all the temporal data:

Supervised collaborative filtering is pretty much matrix completion

Supervised learning: Matrix completion

Supervised: evaluating rec quality

Unsupervised learning

- Trying to estimate the density

- i.e. predict probability of future events

Try to predict the future given the past

How can we find similar items?

We can calculate a correlation coefficient as an item similarity

- Use something like Pearson, Jaccard, …
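Both measures can be computed directly from the columns of the count matrix. The sketch below uses a made-up 4-user, 3-item matrix; Pearson is computed on the raw counts, and Jaccard on the sets of users with at least one play, which is one reasonable way (an assumption) to turn counts into sets.

```python
import math

# Hypothetical play counts: one row per user, one column per item.
counts = [
    [5, 3, 0],
    [4, 2, 0],
    [0, 0, 7],
    [1, 1, 6],
]

def column(j):
    """All users' counts for item j."""
    return [row[j] for row in counts]

def pearson(i, j):
    """Pearson correlation between the count columns of items i and j."""
    x, y = column(i), column(j)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def jaccard(i, j):
    """Jaccard index of the sets of users who played items i and j."""
    users_i = {u for u, c in enumerate(column(i)) if c > 0}
    users_j = {u for u, c in enumerate(column(j)) if c > 0}
    return len(users_i & users_j) / len(users_i | users_j)
```

Here items 0 and 1 are played by the same users and come out as highly similar under both measures, while items 0 and 2 do not.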

Amazon did this for “customers who bought this also bought”

- US patent 7113917

Parallelization is hard though

Can speed this up using various LSH tricks

- Twitter: Dimension Independent Similarity Computation (DISCO)
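As one concrete example of an LSH-style trick, here is a minimal MinHash sketch for estimating Jaccard similarity between the sets of users (or tracks) attached to two items. This is plain MinHash, not the DISCO algorithm named above, and the use of seeded MD5 as the hash family and the default of 64 hashes are assumptions made for the illustration.

```python
import hashlib

def minhash_signature(items, num_hashes=64):
    """One minimum hash value per seeded hash function; `items` must be non-empty."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int(hashlib.md5(f"{seed}:{item}".encode()).hexdigest(), 16)
            for item in items
        ))
    return signature

def estimated_jaccard(sig_a, sig_b):
    """Fraction of hash functions whose minima agree; estimates the Jaccard index."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

The signatures have a fixed length regardless of set size, so comparing two items costs `num_hashes` comparisons instead of a full set intersection.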

Are there other approaches?

Natural Language Processing has a lot of similar problems

…matrix factorization is one idea

Matrix factorization

Matrix factorization

- Want to get user vectors and item vectors

- Assume f latent factors (dimensions) for each user/item
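In terms of shapes, this means one f-dimensional vector per user and one per item, with the big matrix approximated by their product. A small sketch with toy sizes and random vectors (the sizes and the names `user_vectors` and `item_vectors` are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, f = 4, 6, 2  # toy sizes; f is the number of latent factors

user_vectors = rng.normal(size=(n_users, f))  # one f-dimensional row per user
item_vectors = rng.normal(size=(n_items, f))  # one f-dimensional row per item

# The model's approximation of the full user-by-item matrix.
approx = user_vectors @ item_vectors.T
```

Storing the two factor matrices takes (n_users + n_items) * f numbers instead of n_users * n_items, which is the point when the original matrix is very big.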

Probabilistic Latent Semantic Analysis (PLSA)

- Hofmann, 1999

- Also called PLSI

PLSA, cont.

+ a bunch of constraints:

PLSA, cont.

Optimization problem: maximize log-likelihood
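The slide's formulas did not survive in this transcript. For reference, the standard symmetric PLSA formulation, written here for users $u$ and items $i$, is given below. The symbols ($z$ for the latent class, $n_{ui}$ for the number of times user $u$ played item $i$) are assumptions and may differ from the notation used in the talk.

```latex
P(u, i) = \sum_{z} P(z)\, P(u \mid z)\, P(i \mid z),
\qquad
\log L = \sum_{u, i} n_{ui} \log P(u, i)
```

The constraints are that $P(z)$, $P(u \mid z)$ and $P(i \mid z)$ are nonnegative and each sums to one over its first argument.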

PLSA, cont.

“Collaborative Filtering for Implicit Feedback Datasets”

- Hu, Koren, Volinsky (2008)

“Collaborative Filtering for Implicit Feedback Datasets”, cont.
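A compact sketch of the alternating least squares scheme from that paper, as I understand it: play counts r become a binary preference p (1 if r > 0) and a confidence c = 1 + alpha * r, and each side's vectors are solved in closed form while the other side is held fixed. The dense loops, the regularization value, and the function names are assumptions for illustration; this is not Spotify's implementation and it would not scale as written.

```python
import numpy as np

def als_half_step(fixed, counts, alpha=40.0, reg=0.1):
    """Solve for one side's vectors while the other side (`fixed`) is held constant.

    `counts` has one row per vector being solved and one column per row of `fixed`.
    """
    f = fixed.shape[1]
    solved = np.zeros((counts.shape[0], f))
    for u, r in enumerate(counts):
        p = (r > 0).astype(float)          # binary preference
        c = 1.0 + alpha * r                # confidence weight
        weighted = fixed.T * c             # scales column i of fixed.T by c[i]
        A = weighted @ fixed + reg * np.eye(f)
        b = weighted @ p
        solved[u] = np.linalg.solve(A, b)  # regularized weighted least squares
    return solved

def implicit_als(counts, f=2, iterations=10, seed=0):
    """Alternate between solving user vectors and item vectors."""
    rng = np.random.default_rng(seed)
    users = rng.normal(scale=0.1, size=(counts.shape[0], f))
    items = rng.normal(scale=0.1, size=(counts.shape[1], f))
    for _ in range(iterations):
        users = als_half_step(items, counts)
        items = als_half_step(users, counts.T)
    return users, items
```

Because each half-step is an independent linear solve per user (or per item), the work splits naturally across machines.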

Here is another method we use

What happens each iteration

- Assign all latent vectors small random values

- Perform gradient ascent to optimize log-likelihood

Calculate derivative and do gradient ascent

- Assign all latent vectors small random values

- Perform gradient ascent to optimize log-likelihood
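The two steps can be sketched as follows. The model formula from the slide was not preserved in this transcript, so the sketch assumes a softmax model in which the probability of user u playing item i is proportional to exp(a_u · b_i); that choice, the learning rate, and the step count are assumptions for illustration rather than a description of the method used in the talk.

```python
import numpy as np

def log_likelihood(counts, users, items):
    """Sum over observed plays of log P(item | user) under the assumed softmax model."""
    logits = users @ items.T
    log_z = np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float((counts * (logits - log_z)).sum())

def fit(counts, f=2, steps=200, lr=0.01, seed=0):
    rng = np.random.default_rng(seed)
    # Assign all latent vectors small random values.
    users = rng.normal(scale=0.01, size=(counts.shape[0], f))
    items = rng.normal(scale=0.01, size=(counts.shape[1], f))
    totals = counts.sum(axis=1, keepdims=True)  # total plays per user
    for _ in range(steps):
        logits = users @ items.T
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        residual = counts - totals * probs      # observed minus expected plays
        grad_users = residual @ items           # d(log-likelihood) / d(users)
        grad_items = residual.T @ users         # d(log-likelihood) / d(items)
        users = users + lr * grad_users         # ascent: step along the gradient
        items = items + lr * grad_items
    return users, items
```

Comparing `log_likelihood` before and after `fit` is a quick check that the ascent is actually improving the objective.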

2D iteration example

Vectors are pretty nice because things are now super fast

- User-item score is a dot product:

- Item-item similarity score is a cosine similarity:

- Both cases have trivial complexity in the number of factors f:
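Both scores are one-liners once the vectors exist. A minimal sketch with NumPy (the function names are illustrative); each call touches every one of the f components once, so the cost grows linearly with f.

```python
import numpy as np

def user_item_score(user_vec, item_vec):
    """Dot product of a user vector and an item vector."""
    return float(np.dot(user_vec, item_vec))

def item_similarity(item_a, item_b):
    """Cosine of the angle between two item vectors."""
    norms = np.linalg.norm(item_a) * np.linalg.norm(item_b)
    return float(np.dot(item_a, item_b) / norms)
```

Cosine similarity ignores vector length, so two items pointing in the same direction score 1 even if one is far more popular than the other.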

Example: item similarity as a cosine of vectors

Two dimensional example for tracks

We can rank all tracks by the user’s vector
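Ranking is then a matrix-vector product followed by a sort. A minimal sketch, assuming the track vectors are stacked as rows of one array (`rank_tracks` is an illustrative name):

```python
import numpy as np

def rank_tracks(user_vec, track_vecs):
    """Return track indices ordered from highest to lowest dot-product score."""
    scores = track_vecs @ user_vec
    return np.argsort(-scores)
```

For a catalogue of millions of tracks a full sort is wasteful when only the top few are needed; a partial sort or an approximate nearest-neighbour index would be the usual next step.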

So how do we implement this?

Hadoop at Spotify

One iteration of a matrix factorization algorithm

“Google News personalization: scalable online collaborative filtering”

So now we've solved the problem of recommendations, right?

Actually what we really want is to apply it to other domains

Radio

- Artist radio: find related tracks

- Optimize ensemble model based on skip/thumbs data

Learning from feedback is actually pretty hard

A/B testing

More applications!!!

Last but not least: we’re hiring!

Thank you