big data paris 2011 is cool florian douetteau

Big Data + Social + Games @Is Cool 16/03/2012

Upload: iscoolent

Post on 27-May-2015


DESCRIPTION

Big data, HADOOP, Florian Douetteau, isCool Entertainment, BI, SQL, Big Data Congress

TRANSCRIPT

Page 1

Big Data + Social + Games @Is Cool

16/03/2012


Page 2

Who is IsCool Entertainment?

Social game publisher based in Paris, France

#1 French publisher in terms of audience (450k Daily Active Users) & revenue

2.8 million fans

80 employees

€9.1 million revenue in 2010

4 live applications on Facebook

Florian Douetteau CTO @fdouetteau

Agenda

• What do we do? Social gaming

• What kind of (big) analytics do we do? Lots

• How do we do it? Hadoop, Python, R, Tableau, Gephi and stuff…

Page 3

Is Cool Games

IsCool, Delirious Collectible Game

Absolute Solitaire, The best solitaire game available online

Temple Of Mahjong, Collect, Play, Exchange

Belote Multijoueur, Play, Win, Meet

Page 4

Games & Virtual Goods

Play the Game & Gain some virtual goods

Play again & Gain more

Collaborate with other players & Gain More

….

Possibly buy: to grow quicker, to help others

Page 5

Virtual Goods, Virtual Economy

Virtual goods must not be too easy to get: the game would not be fun, and there would be no monetization.

Virtual goods must not be too hard to get: people would churn out of frustration.

Virtual goods can usually be traded between players.

Virtual and actual “price” of a good

Let’s trade 1 watch against 3 hammers

Page 6

Why is this Big Data?

Number of object transactions per day:

NYSE      3,600,000,000
IsCool    2,150,000,000
Nasdaq    1,600,000,000
Nikkei    1,500,000,000
Footsie     860,000,000
CAC 40      142,500,000

9.8 TB of data to analyze

18 million user-generated actions per day, 7 billion per year.
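As a quick sanity check on the figures above, the yearly total follows from the daily count, and the comparison with the stock exchanges can be sorted directly. The numbers are the ones quoted on the slide; the variable names are illustrative only.

```python
# Daily transaction counts as quoted on the slide.
transactions_per_day = {
    "NYSE": 3_600_000_000,
    "IsCool": 2_150_000_000,
    "Nasdaq": 1_600_000_000,
    "Nikkei": 1_500_000_000,
    "Footsie": 860_000_000,
    "CAC 40": 142_500_000,
}

# Rank venues by daily volume, largest first.
ranking = sorted(transactions_per_day, key=transactions_per_day.get, reverse=True)

# 18 million user-generated actions per day, extrapolated to a 365-day year.
actions_per_day = 18_000_000
actions_per_year = actions_per_day * 365  # about 6.6 billion, rounded to 7 on the slide

print(ranking)
print(f"{actions_per_year:,}")
```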

Page 7

The Real Big Data Challenge

Collaborate for collective insights

data scientist?

what metrics?

Realtime?

Game Designer Perspective : Nice Charts ?

Programmers’ Perspective : Log Files & Work ?

Business Guy Perspective: Revenue Forecast ?

BI Veteran: Schema Definition ?

Page 8

Specifics of Game Analytics

Virtual Goods: we are the factory AND the shop, and most of the products are free.

Social Networks: network effects are key.

Games: the product changes EVERY day! Sudden wave of unexpected players from Guatemala! People try to cheat!

Page 9

Use Case 1 : Understanding Users

1: Defining engagement

Tenure length

Visit frequency

Virality

Paying user conversion

ARPPU

Score

Use of feature A,B,C…

Key drivers??? Traffic
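To make a few of these engagement measures concrete, here is a minimal sketch that derives tenure length, visit frequency, paying-user conversion and ARPPU from an event log. The log layout, the user names and the exact definitions (for example, tenure as the span between first and last visit) are assumptions for illustration; the slides do not say how IsCool defined them.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (user_id, day of visit, euros spent that day).
events = [
    ("alice", date(2012, 3, 1), 0.0),
    ("alice", date(2012, 3, 2), 4.99),
    ("alice", date(2012, 3, 8), 0.0),
    ("bob",   date(2012, 3, 1), 0.0),
    ("bob",   date(2012, 3, 3), 0.0),
    ("carol", date(2012, 3, 5), 9.99),
]

def engagement_metrics(events):
    """Return per-user metrics and a population-level summary."""
    days = defaultdict(set)
    spend = defaultdict(float)
    for user, day, amount in events:
        days[user].add(day)
        spend[user] += amount

    per_user = {}
    for user, visited in days.items():
        # Tenure: number of days from first to last visit, inclusive.
        tenure = (max(visited) - min(visited)).days + 1
        per_user[user] = {
            "tenure_days": tenure,
            "visit_frequency": len(visited) / tenure,  # share of days with a visit
            "revenue": spend[user],
        }

    payers = [u for u, m in per_user.items() if m["revenue"] > 0]
    total_revenue = sum(per_user[u]["revenue"] for u in payers)
    summary = {
        # Share of users who paid at least once.
        "paying_conversion": len(payers) / len(per_user),
        # ARPPU: average revenue per paying user.
        "arppu": total_revenue / len(payers) if payers else 0.0,
    }
    return per_user, summary

per_user, summary = engagement_metrics(events)
print(per_user["alice"], summary)
```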

Page 10

Case Study 1 - Segment User Behaviours

2: Describing engagement patterns: Running a segment analysis
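The slide does not say which segmentation method was used. As one common option, the sketch below runs a plain k-means over two hypothetical per-player features (visits per week, euros spent per month). The data, the feature choice and the value of k are all assumptions; in practice the features would be scaled first.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on numeric tuples, with deterministic farthest-point init."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical players: (visits per week, euros spent per month).
players = [
    (1, 0), (2, 0), (1, 1),   # casual
    (6, 0), (7, 1), (6, 2),   # regular, non-paying
    (7, 20), (6, 25),         # regular, paying
]

labels, centroids = kmeans(players, k=3)
print(labels)
```

Each label identifies a segment; describing a segment then amounts to reading its centroid (for instance, "about 6.5 visits a week, about 22 euros a month").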

Page 11

Use Case 2 : Understanding Users as a whole

10 Million Nodes

Around 1 000 Billion Edges

How does the graph evolve in time ?

What are the communities?

Page 12

Understanding Users as a Whole

A very large community

Some mid size communities

Lots of small clusters (mostly 2 players)
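A size distribution like this one (one large group, many pairs) can be illustrated with connected components, used here as a crude stand-in for community detection. The sketch uses a union-find structure over a hypothetical friendship edge list; it is not the method from the talk, and a modularity-based tool such as Gephi would split the large component further. At 10 million nodes the same idea would need a distributed or on-disk implementation.

```python
from collections import Counter

def component_sizes(edges):
    """Return connected-component sizes, largest first, using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    sizes = Counter(find(node) for node in list(parent))
    return sorted(sizes.values(), reverse=True)

# Hypothetical "played together" edges between player ids.
friendships = [(1, 2), (2, 3), (3, 4), (4, 5), (6, 7), (8, 9), (10, 11)]

sizes = component_sizes(friendships)
print(sizes)
```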

Page 13

Use Case 3: Analyze the Long-Term Effects of a Feature

A/B tests: some features can be A/B tested… and some cannot! How to measure the uplift?

Are players using the new feature… more engaged? Generating more virality? Etc.

Complexity: multiple variables to observe (other features, history)
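For the features that can be A/B tested, the uplift on a binary outcome (say, "came back within 7 days") can be estimated with a two-proportion z-test. This is a minimal sketch under that assumption, with made-up counts; it ignores the confounders the slide warns about (other features, player history), and it does not cover features that cannot be A/B tested.

```python
import math

def uplift(control_users, control_conversions, test_users, test_conversions):
    """Relative uplift of test over control, with a two-sided z-test p-value."""
    p_control = control_conversions / control_users
    p_test = test_conversions / test_users

    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (control_conversions + test_conversions) / (control_users + test_users)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_users + 1 / test_users))

    z = (p_test - p_control) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {
        "relative_uplift": (p_test - p_control) / p_control,
        "z": z,
        "p_value": p_value,
    }

# Hypothetical experiment: 5% of control players return, 6% of test players.
result = uplift(10_000, 500, 10_000, 600)
print(result)
```

The function assumes both groups are non-empty and that the pooled rate is strictly between 0 and 1; otherwise the standard error is zero and the test is undefined.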

Page 14

… How

over the last 3 years

• Tools changed

• Scale changed

• Focus changed

Analyzing the Offer

• Online Analytics Platform

• Commercial / Open Source ETL

• Commercial BI Visualization Software

• Commercial / Open Source databases (column stores)

• …

Page 15

What we learned

Diversity

• There's no Hadoop+R Magic (Expertise, Entry Costs, Maintenance)

• There’s no XYZ Magical Product

Relativity

• Windows / Linux ? Cloud or on-premise ?

• Do you have internal data mining experts (yes/no) ?

• Do you have internal scalability experts (yes/no) ?

• What is the _real_ budget? 0K? 10K? 100K? 1000K?

Superficiality

• Ability to display is more important than the result.

Page 16

Mixed Approach

SaaS Analytics Platforms

• For common business metrics (virality, traffic, engagement)

• Corporate-level visibility

• Day-to-day

Internal Datawarehousing

• Detailed business metrics

• Virtual economy modeling

• Long-term behaviours

• Business-level visibility

• Week-to-week

Datamining tools

• Ad-hoc analytics

• Graph analytics

Page 17

Datawarehouse for the Big Data era

Hadoop/Hive (through Amazon’s Elastic MapReduce)

• Used to reduce the amount of information: 10 GB a day => 1 GB a day

• High cost of development for "business" related processing
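The reduction step (raw logs in, compact daily aggregates out) can be sketched in a few lines. The log format below is hypothetical and the code runs on a single machine; it only illustrates the map-then-sum pattern that a Hive GROUP BY would run at scale, not the actual IsCool job.

```python
from collections import Counter

# Hypothetical raw log lines: "<timestamp> <user> <action> <item>".
raw_log = [
    "2012-03-01T10:00:01 alice trade watch",
    "2012-03-01T10:00:05 bob trade hammer",
    "2012-03-01T11:30:00 carol trade hammer",
    "2012-03-02T09:15:00 alice gift hammer",
    "2012-03-02T09:16:00 bob trade watch",
]

def reduce_log(lines):
    """Collapse individual events into per-day counts by action and item."""
    counts = Counter()
    for line in lines:
        timestamp, _user, action, item = line.split()
        day = timestamp[:10]               # "map": derive the grouping key
        counts[(day, action, item)] += 1   # "reduce": sum events per key
    return counts

daily = reduce_log(raw_log)
for key, count in sorted(daily.items()):
    print(key, count)
```

Dropping the user and the time of day is what shrinks the data: the output grows with the number of distinct (day, action, item) keys rather than with the number of events.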

Open Source ETL (PyBabe)

• Pure Python ETL

• Good integration with AWS/ S3

• Easy to integrate in our development environment

Columnar Database (InfiniDB, Open Source)

• Free (as in beer)

• Good performance for analytics tasks on a few hundred million rows (SELECT … GROUP BY … ORDER BY …)

• Limited features and performance compared to commercial column stores
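The query shape mentioned above (SELECT … GROUP BY … ORDER BY …) looks like this on a toy table. SQLite stands in for the column store purely to keep the example self-contained; the table name, columns and rows are hypothetical.

```python
import sqlite3

# In-memory database with a small, made-up table of traded items.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (day TEXT, item TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [
        ("2012-03-01", "watch", 1),
        ("2012-03-01", "hammer", 3),
        ("2012-03-02", "hammer", 2),
        ("2012-03-02", "watch", 1),
        ("2012-03-02", "nail", 1),
    ],
)

# Total quantity traded per item, most traded first.
rows = conn.execute(
    "SELECT item, SUM(quantity) AS total "
    "FROM trades GROUP BY item ORDER BY total DESC"
).fetchall()
conn.close()

print(rows)
```

A column store suits this kind of query because it only has to read the columns named in the query (here `item` and `quantity`), not whole rows.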

Dashboarding (Tableau Software)

• + Direct connection to the database

• + Excel-fan business users can use it with no training!

Page 18

Questions ?