spil games konrad

Post on 16-Apr-2017

Category:

Data & Analytics


Going data-driven: Learnings from building a real-time recommender system

Konrad Burnik

September 21, 2016

Spil Games – Leading cross-platform publisher

Web Portals

Spil Games - Web portal stats

• Portfolio of approximately 16,000 games

• 100 million monthly active users

• Channels: Family, Teen, Girls, Men

• Device Type: Desktop or Mobile

Spil Games getting ready for the Big Data world

This looks great! So, what's our first ML project?

Just look around ...

Example: Distribution of Games within labels at spelletjes.nl

Long tail you got there!

"The" widget for recommendations

Goals and challenges

• Provide better content for the users

• Optimize recommendations for business value

• Provide recommendations for new users

• Learn to use the new Spark infrastructure for solving all of the above

Overview of the recommender system

• The infrastructure (before and after)

• Two key components of the new recommender

• Ephemeral (effectively solving the cold-start problem)

• Collaborative Filtering

Spil Games Recommender Infrastructure (before)

Spil Games Recommender Infrastructure (after)

[Diagram; recoverable labels: Streaming, MLlib]

Ephemeral Recommender

• For users who have some activity

• In particular, we wish to target users who came to the portals and played just a few games

Ephemeral Recommender (challenges)

• What data can we use besides activity?

• How do we keep track of users?

• How do we quickly generate the recommendation lists?

Ephemeral Recommender (key features)

• The ephemeral recommender is game-similarity based

• Exploiting the long tail

• We also show games that have more business value for Spil Games, for example games with a sufficient lifetime value

• Processing 800-1500 events per second
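
The game-similarity idea above can be sketched roughly as follows. This is a minimal illustration, not the system described in the talk: the catalogue, the label sets, the use of Jaccard similarity, and the `BUSINESS_VALUE` multiplier are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical catalogue: game id -> set of labels (all names are made up).
GAME_LABELS = {
    "racer": {"action", "cars"},
    "blocks": {"puzzle"},
    "maze_run": {"action", "puzzle"},
    "shooter": {"action"},
    "dress_up": {"fashion"},
}

# Hypothetical business-value multiplier per game (1.0 means neutral).
BUSINESS_VALUE = {"maze_run": 1.2}


def jaccard(a, b):
    """Jaccard similarity of two label sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(played, top_n=3):
    """Rank unplayed games by their summed similarity to the played games."""
    scores = defaultdict(float)
    for candidate, labels in GAME_LABELS.items():
        if candidate in played:
            continue
        for game in played:
            scores[candidate] += jaccard(labels, GAME_LABELS[game])
        # Assumed boost for games with more business value.
        scores[candidate] *= BUSINESS_VALUE.get(candidate, 1.0)
    # Highest score first; ties are broken by game id for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [game for game, score in ranked[:top_n] if score > 0]
```

With this toy catalogue, `recommend(["racer", "blocks"])` returns `["maze_run", "shooter"]`. Because the score depends only on the few games played in the current session, a list can be produced for a user with almost no history.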

Example:

[Animated diagram; recoverable labels: "Action", "Puzzle", "+1" increments, "Streaming", "For You"]

Collaborative Filtering

• For users who have a history of activity

• Proven to work at companies such as Amazon, Netflix, …

Collaborative Filtering in general

[Figure: rating matrix with known entries (*) and missing entries (?)]

Can we predict the empty places?

[Figure: the same matrix with all entries filled in]

Great! But how do we get the highest ratings out?
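
Both questions can be illustrated with a small, self-contained sketch. It factorizes a toy rating matrix with plain stochastic gradient descent and then picks the highest predicted ratings among the games a user has not seen. This is not MLlib (whose collaborative filtering is based on alternating least squares); the ratings, the function names, and the hyperparameters are assumptions chosen only for illustration.

```python
import heapq
import random

# Toy ratings: (user, game) -> score. None of these values come from the talk.
RATINGS = {
    ("u1", "g1"): 5.0, ("u1", "g2"): 4.0,
    ("u2", "g1"): 4.0, ("u2", "g2"): 5.0, ("u2", "g3"): 1.0,
    ("u3", "g3"): 5.0, ("u3", "g4"): 4.0,
    ("u4", "g3"): 4.0, ("u4", "g4"): 5.0, ("u4", "g1"): 1.0,
}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rmse(ratings, user_f, game_f):
    """Root mean squared error over the known entries."""
    errors = [(r - dot(user_f[u], game_f[g])) ** 2 for (u, g), r in ratings.items()]
    return (sum(errors) / len(errors)) ** 0.5


def factorize(ratings, rank=2, epochs=500, lr=0.02, reg=0.05, seed=0):
    """Learn one small factor vector per user and per game with SGD."""
    rng = random.Random(seed)
    users = sorted({u for u, _ in ratings})
    games = sorted({g for _, g in ratings})
    user_f = {u: [rng.uniform(0.1, 1.0) for _ in range(rank)] for u in users}
    game_f = {g: [rng.uniform(0.1, 1.0) for _ in range(rank)] for g in games}
    for _ in range(epochs):
        for (u, g), r in ratings.items():
            err = r - dot(user_f[u], game_f[g])
            for k in range(rank):
                pu, qg = user_f[u][k], game_f[g][k]
                user_f[u][k] += lr * (err * qg - reg * pu)
                game_f[g][k] += lr * (err * pu - reg * qg)
    return user_f, game_f


def top_unseen(user, ratings, user_f, game_f, n=2):
    """Fill in the user's empty places and return the n highest predictions."""
    seen = {g for (u, g) in ratings if u == user}
    preds = {g: dot(user_f[user], f) for g, f in game_f.items() if g not in seen}
    return heapq.nlargest(n, preds, key=preds.get)
```

A possible usage: `user_f, game_f = factorize(RATINGS)` followed by `top_unseen("u1", RATINGS, user_f, game_f)`. Predicting a missing entry is a dot product of two factor vectors, so "getting the highest ratings out" reduces to a top-n selection over the unseen games.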

Collaborative Filtering in MLlib

[Image obtained from databricks.com]

Collaborative Filtering (challenges)

• How do we aggregate the activity data?

• How do we score the data and scale it?

• Which users do we run the model on?

• How do we efficiently extract the recommendations from the model?

Collaborative Filtering recommender (key features)

• Every hour we aggregate the user activity of the last hour (~1.5 to 5 million rows); this takes about 2 minutes

• Calculating the model based on a month of scored and scaled pre-aggregated activity takes about 1 hour

• We run the model only for users who were active in the last 5 hours

• Extracting the recommendations takes about 30 minutes with an optimized approach
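
The aggregate, score, scale, and filter steps listed above might look roughly like the sketch below. The talk does not say how activity is scored or scaled, so the log-based scoring, the event fields (`user`, `game`, `ts`), and the function names are assumptions; only the 5-hour activity window comes from the slides.

```python
import math
from collections import Counter


def aggregate(events):
    """Count plays per (user, game) pair for one batch of events."""
    return Counter((e["user"], e["game"]) for e in events)


def score_and_scale(counts):
    """Dampen heavy players with log1p, then scale scores into (0, 1]."""
    if not counts:
        return {}
    raw = {pair: math.log1p(n) for pair, n in counts.items()}
    top = max(raw.values())
    return {pair: value / top for pair, value in raw.items()}


def active_users(events, now, window_hours=5):
    """Users with at least one event inside the recent window (in seconds)."""
    cutoff = now - window_hours * 3600
    return {e["user"] for e in events if e["ts"] >= cutoff}


# Hypothetical events; timestamps are seconds since an arbitrary start.
EVENTS = [
    {"user": "a", "game": "g1", "ts": 9 * 3600},
    {"user": "a", "game": "g1", "ts": 9 * 3600 + 60},
    {"user": "a", "game": "g1", "ts": 9 * 3600 + 120},
    {"user": "b", "game": "g2", "ts": 1 * 3600},
]

# Hourly aggregates can be summed into a longer window with Counter addition.
month = aggregate(EVENTS[:2]) + aggregate(EVENTS[2:])
scores = score_and_scale(month)
recent = active_users(EVENTS, now=10 * 3600)
```

Pre-aggregating per hour keeps each batch small, and restricting the model run to recently active users limits how many recommendation lists have to be extracted.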

Data amounts processed by CF

# total records

          Family       Teens        Girls        Men
Desktop   68 894 434   16 070 864   31 285 329   679 565
Mobile    2 532 549    404 934      1 276 879    2 249

# distinct users

          Family       Teens        Girls        Men
Desktop   16 127 074   5 254 646    5 022 497    357 721
Mobile    1 035 520    221 192      397 091      1 240

# distinct games

          Family       Teens        Girls        Men
Desktop   15 078       11 764       7 736        3 171
Mobile    3 151        5 532        1 792        467

Results

• A deployment system is in place for developing Spark apps

• Gained knowledge of using Spark infrastructure

• Gained knowledge of inner workings of recommenders as well as some related cutting-edge research

• Significantly improved the CTR of the "For You" widget in the two months the recommender has been live

What have we learned?

• Giving recommendations is hard!

• Simple solutions often work best

• Exploring the long tail is a good thing for diversification

• Spark is not as simple as the hype suggests; you often need to tweak a lot!

Thank you for your attention!

Contact: https://nl.linkedin.com/in/konrad-burnik
