riot games - player focused pipeline - stampedecon 2015
#StampedeCon 2015 - Riot Games
BUILDING A PLAYER FOCUSED DATA PIPELINE
RYAN TABORA (@ryantabora)
SEAN MALONEY (@sean_seannery)
WHO WE ARE
SEAN MALONEY, ENGINEER
SMALONEY@RIOTGAMES.COM
@SEAN_SEANNERY
WORKING ON RIOT’S ETL TOOLS
FAVORITE ACTIVITY: ATTEMPTING TO GROW FACIAL HAIR BUT FAILING MISERABLY
WHO WE ARE
RYAN TABORA, ENGINEER
RTABORA@RIOTGAMES.COM
@RYANTABORA
WORKING ON RIOT’S INGESTION PIPELINE
FAVORITE ACTIVITY: EATING MAC + CHEESE WHILE LISTENING TO DEATH METAL
AGENDA
RIOT GAMES SCALE
OUR DATA PLATFORM (THEN)
5 THINGS YOU NEED
OUR DATA PLATFORM (NOW)
3 THINGS WE STILL NEED (AND YOU MAY WANT ALSO)
LEAGUE OF LEGENDS STATS
MORE THAN 7.5 MILLION PEAK CONCURRENT PLAYERS
MORE THAN 67 MILLION MONTHLY ACTIVE PLAYERS
MORE THAN 27 MILLION DAILY ACTIVE PLAYERS
STATS RELEASED JANUARY 2014
A.K.A. 5 THINGS WE DIDN’T HAVE
Auditing: ETLs can use queries with custom injected data.
Ad-Hoc Data Requests: Extend with new connection types and custom ETLs easily.
Self-Service Architecture: The big data team is small. We can’t manage all the ETLs ourselves.
Support Multiple Datacenters: One task will execute on different database servers around the world.
Multiple Data Access Patterns: Extend with new connection types and custom ETLs easily.
SELF SERVICE: HOW?
User Documentation: No one likes doing it, but it helps a lot.
Onboard training: Get new coworkers in-the-know.
Familiar Protocols: Use REST or RPC so developers are on the same page.
Focus on UX: Your tools need to be easy for non-technical people to use.
YOUR ETL TOOL SHOULD...
Templating: ETLs can use queries with custom injected data.
Scale Horizontally: As the data grows, the tool should be able to handle it.
Empower Users: The big data team is small. We can’t manage all the ETLs ourselves.
Support One ETL - Many Sources: One task will execute on different database servers around the world.
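The "Templating" and "One ETL - Many Sources" points can be sketched roughly as follows. This is only an illustration in Python, not the tool described in the talk: the query, the region names, the table naming scheme, and the `render_etl_queries` helper are all hypothetical.

```python
from string import Template

# Hypothetical query template; $region and $day are injected per run.
QUERY_TEMPLATE = Template(
    "SELECT player_id, COUNT(*) AS games "
    "FROM games_$region WHERE game_date = '$day' GROUP BY player_id"
)

# Hypothetical region list; a real deployment would load this from config.
REGIONS = ["na", "euw", "kr"]


def render_etl_queries(template, regions, day):
    """Render one query per region from a single ETL definition."""
    # substitute() raises KeyError if a placeholder is left unfilled,
    # so a broken template fails before any query is sent anywhere.
    return {
        region: template.substitute(region=region, day=day)
        for region in regions
    }


queries = render_etl_queries(QUERY_TEMPLATE, REGIONS, "2015-07-15")
for region, sql in queries.items():
    print(region, "->", sql)
```

Plain string substitution keeps the sketch short, but it is only safe for values the ETL author controls. Anything typed in by an end user would be better passed as a bound query parameter.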
Distributed ETL Software written in Ruby.
Candidate for Riot open sourcing
Same ETL applied to multiple regions / datacenters
Self-Service UI with SQL query templating.
Warehouse Auditing Service Platform (WASP)
REST micro-service built with Java and Docker.
Reports and visualizations we can use to find problems.
Source and target comparison.
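A source and target comparison can be as simple as checking that both sides hold the same number of rows. The sketch below is an assumption about what such a check might look like, not the auditing service's actual logic: the `audit_table` and `row_count` helpers are hypothetical, and two in-memory SQLite databases stand in for the transactional source and the warehouse.

```python
import sqlite3


def row_count(conn, table):
    """Count rows in a table. The table name is trusted input in this sketch."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def audit_table(source, target, table):
    """Compare source and target row counts and report the result."""
    src = row_count(source, table)
    tgt = row_count(target, table)
    return {"table": table, "source": src, "target": tgt, "match": src == tgt}


# Two in-memory databases stand in for the source system and the warehouse.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn in (source, target):
    conn.execute("CREATE TABLE games (id INTEGER PRIMARY KEY)")

source.executemany("INSERT INTO games (id) VALUES (?)", [(1,), (2,), (3,)])
# The target is missing one row, simulating a lossy load.
target.executemany("INSERT INTO games (id) VALUES (?)", [(1,), (2,)])

report = audit_table(source, target, "games")
print(report)
```

Row counts catch dropped or duplicated rows but not corrupted values. A stricter audit could compare per-column checksums in the same way.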
PointDataService
Returns one row in less than one second
Java web service
Simple abstraction, backed by DynamoDB
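The "simple abstraction" idea, a service that returns a single row by key while hiding the storage backend, might look something like the sketch below. It is written in Python for brevity even though the slides describe a Java web service. The class names, method names, and the in-memory store standing in for DynamoDB are all hypothetical.

```python
class InMemoryStore:
    """Stand-in for a key-value backend such as DynamoDB (hypothetical interface)."""

    def __init__(self):
        self._items = {}

    def put(self, table, key, item):
        # Store a copy so later changes by the caller do not leak in.
        self._items[(table, key)] = dict(item)

    def get(self, table, key):
        return self._items.get((table, key))


class PointLookupService:
    """Return a single row by primary key, hiding the backend from callers."""

    def __init__(self, store):
        self._store = store

    def get_row(self, table, key):
        row = self._store.get(table, key)
        if row is None:
            raise KeyError(f"no row in {table!r} for key {key!r}")
        return row


store = InMemoryStore()
store.put("players", 42, {"player_id": 42, "region": "na"})
service = PointLookupService(store)
print(service.get_row("players", 42))
```

Because callers only see `get_row`, the backend could be swapped without changing client code, which is one plausible reason to keep the abstraction this small.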
PointDataService
Full duplicate of the transactional data copied to DynamoDB
Data load powered by Fuetl and ad-hoc EMR cluster
Audited by WASP
TO THE CLOUD!
Easily scale our resources: Both vertically (metastore) and horizontally (clusters).
Support intensive ad-hoc tasks: We can spin up temporary dedicated clusters for big projects.
We own our infrastructure: Before, the game servers team got all the love.
Can now join our data! One task will execute on different database servers around the world.