netflix viewing data architecture evolution - qcon 2014
DESCRIPTION
Netflix's architecture for viewing data has evolved as streaming usage has grown. Each generation was designed for the next order of magnitude and was informed by learnings from the previous one. From SQL to NoSQL, from data center to cloud, from proprietary to open source, look inside to learn how this system has evolved. (From a talk given at QConSF 2014.)
TRANSCRIPT
![Page 1: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/1.jpg)
![Page 2: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/2.jpg)
Please read the notes associated with each slide for the full context of the presentation.
![Page 3: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/3.jpg)
Who am I?
Philip Fisher-Ogden
• Director of Engineering @ Netflix
• Playback Services (making “click play” work)
• 6 years @ Netflix, from 10 servers to 10,000s
![Page 4: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/4.jpg)
Story
Netflix streaming – 2007 to present
![Page 5: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/5.jpg)
Device Growth
2007: 1 device
2008: 10s of devices
2009: 10s of devices
2010: 100s of devices
2011+: 1000+ devices
![Page 6: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/6.jpg)
Experience Evolution
![Page 7: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/7.jpg)
Subscribers & Viewing
53M global subscribers
50 countries
>2 billion hours viewed per month
![Page 8: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/8.jpg)
Improved Personalization
Better Experience
Viewing
Virtuous Cycle
![Page 9: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/9.jpg)
Viewing Data
Who, What, When, Where, How Long
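As a rough illustration of the five dimensions on this slide, a single viewing record could be modeled as below. This is only a sketch: the field names, types, and the choice of device type for "where" are assumptions made for illustration, not the schema used in the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ViewingRecord:
    """Hypothetical record shape; every field name here is an assumption."""
    account_id: str       # who
    title_id: str         # what
    started_at: datetime  # when
    device_type: str      # where (assumed to mean the device; could also be a country)
    watched: timedelta    # how long


record = ViewingRecord(
    account_id="account-123",
    title_id="title-456",
    started_at=datetime(2014, 11, 3, 20, 15, tzinfo=timezone.utc),
    device_type="smart-tv",
    watched=timedelta(minutes=42),
)

# Duration in seconds, e.g. for aggregating hours viewed.
watched_seconds = record.watched.total_seconds()
print(watched_seconds)  # 2520.0
```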
![Page 10: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/10.jpg)
Real time data use cases
What have I watched?
![Page 11: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/11.jpg)
Real time data use cases
Where was I at?
![Page 12: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/12.jpg)
Real time data use cases
What else am I watching?
![Page 13: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/13.jpg)
Session Analytics
![Page 14: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/14.jpg)
Session Analytics
![Page 15: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/15.jpg)
Active Sessions
Last Position
Viewing History
Data Feed
Generic Architecture
Start Stop
Collect
Process
Stream State
Session Summary
Event Stream
Provide
![Page 16: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/16.jpg)
Architecture Evolution
• Different generations
• Pain points & learnings
• Re-architecture motivations
![Page 17: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/17.jpg)
Real Time Data
2007 2008 2009 2010 2011 2012 2013 2014 Future
SQL
NoSQL
Caching (redis, memcached)
![Page 18: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/18.jpg)
Real Time Data – gen 1
2007 2008 2009 2010 2011 2012 2013 2014 Future
SQL
NoSQL
Caching (redis, memcached)
![Page 19: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/19.jpg)
Real Time Data – gen 1
Start Stop
Sessions
Logs / Events
History / Position
SQL
![Page 20: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/20.jpg)
Real Time Data – gen 1 pain points
• Scalability
– DB scaled up not out
• Event Data Analytics
– ad hoc
• Fixed schema
![Page 21: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/21.jpg)
Real Time Data – gen 2
2007 2008 2009 2010 2011 2012 2013 2014 Future
SQL
NoSQL
Caching (redis, memcached)
![Page 22: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/22.jpg)
Real Time Data – gen 2 motivations
• Scalability
– Scale out not up
• Flexible schema
– Key/value attributes
• Service oriented
![Page 23: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/23.jpg)
Real Time Data – gen 2
Start Stop
NoSQL
50 data partitions
Viewing Service
![Page 24: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/24.jpg)
Real Time Data – gen 2 pain points
• Scale out
– Resharding was painful
• Performance
– Hot spots
• Disaster Recovery
– SimpleDB had no backups
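One common reason resharding hurts with a fixed number of partitions can be shown with a small sketch. It assumes keys are mapped to partitions by hashing modulo the partition count; the slides mention 50 data partitions but do not say how keys were assigned, so `partition_for` and the key format are hypothetical.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition by hashing modulo the partition count (assumed scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def fraction_moved(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose partition changes when the partition count changes."""
    moved = sum(
        1 for k in keys if partition_for(k, old_n) != partition_for(k, new_n)
    )
    return moved / len(keys)


# Hypothetical account keys; growing from 50 to 51 partitions.
keys = [f"account-{i}" for i in range(10_000)]
moved = fraction_moved(keys, 50, 51)
print(f"{moved:.0%} of keys change partition")
```

Under this assumed scheme, adding a single partition relocates nearly every key, so growing the store means migrating almost all of the data. Stores that use consistent hashing avoid this by moving only a small share of keys when a node is added.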
![Page 25: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/25.jpg)
Real Time Data – gen 3
2007 2008 2009 2010 2011 2012 2013 2014 Future
SQL
NoSQL
Caching (redis, memcached)
![Page 26: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/26.jpg)
Real Time Data – gen 3 landscape
• Cassandra 0.6
• Before SSDs in AWS
• Netflix in 1 AWS region
![Page 27: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/27.jpg)
Real Time Data – gen 3 motivations
• Order of magnitude increase in requests
• Scalability
– Actually scale out rather than up
![Page 28: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/28.jpg)
Real Time Data – gen 3
Viewing Service
Stateful Tier (nodes 0, 1, …, n-2, n-1)
Active Sessions
Latest Positions
View Summary
Stateless Tier (fallback)
Sessions
Viewing History
Memcached
![Page 29: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/29.jpg)
Real Time Data – gen 3 writes
Viewing Service
Stateful Tier (nodes 0, 1, …, n-2, n-1)
Start
Stop
![Page 30: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/30.jpg)
Real Time Data – gen 3 writes
Viewing Service
Stateful Tier (nodes 0, 1, …, n-2, n-1)
Active Sessions
Latest Positions
View Summary
Start
Stop
![Page 31: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/31.jpg)
Real Time Data – gen 3 writes
Viewing Service
Stateful Tier (nodes 0, 1, …, n-2, n-1)
Active Sessions
Latest Positions
View Summary
Start
Stop
update
![Page 32: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/32.jpg)
Real Time Data – gen 3 writes
Viewing Service
Stateful Tier (nodes 0, 1, …, n-2, n-1)
Active Sessions
Latest Positions
View Summary
Start
Stop
snapshot
Sessions
![Page 33: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/33.jpg)
Real Time Data – gen 3 writes
Viewing Service
Stateful Tier (nodes 0, 1, …, n-2, n-1)
Active Sessions
Latest Positions
View Summary
Start
Stop
Viewing History
Memcached
![Page 34: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/34.jpg)
Real Time Data – gen 3 reads
Viewing Service
Stateful Tier
What have I watched?
Viewing History
Memcached
View Summary
![Page 35: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/35.jpg)
Real Time Data – gen 3 reads
Viewing Service
Stateful Tier
Latest Positions
Where was I at?
Viewing History
Stateless Tier (fallback)
Memcached
![Page 36: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/36.jpg)
Real Time Data – gen 3 reads
Viewing Service
Stateful Tier
What else am I watching?
Active Sessions
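The gen 3 write and read slides can be approximated with the sketch below: each account is owned by one stateful node that keeps latest positions in memory, writes are also persisted to a durable store, and reads fall back to that store when the owning node is unavailable. The routing function, the health flag, the write-through on every update, and all class and method names are assumptions for illustration; the slides only name the tiers and their data.

```python
import zlib


class StatefulNode:
    """One node of the stateful tier, holding latest positions in memory."""

    def __init__(self):
        self.latest_positions = {}  # (account_id, title_id) -> position in seconds
        self.healthy = True


class ViewingService:
    """Hypothetical router over n stateful nodes with a durable fallback."""

    def __init__(self, num_nodes: int):
        self.nodes = [StatefulNode() for _ in range(num_nodes)]
        # Stands in for the persisted viewing history read by the stateless tier.
        self.durable_store = {}

    def _node_for(self, account_id: str) -> StatefulNode:
        # Assumed routing: stable hash of the account id modulo the node count.
        index = zlib.crc32(account_id.encode("utf-8")) % len(self.nodes)
        return self.nodes[index]

    def record_position(self, account_id: str, title_id: str, position: int) -> None:
        """Write path: update in-memory state, then persist the value."""
        key = (account_id, title_id)
        self._node_for(account_id).latest_positions[key] = position
        self.durable_store[key] = position

    def latest_position(self, account_id: str, title_id: str):
        """Read path: prefer the stateful node, fall back to the durable store."""
        key = (account_id, title_id)
        node = self._node_for(account_id)
        if node.healthy and key in node.latest_positions:
            return node.latest_positions[key]
        return self.durable_store.get(key)


service = ViewingService(num_nodes=4)
service.record_position("account-123", "title-456", 1800)
from_stateful = service.latest_position("account-123", "title-456")

# Simulate losing the owning node; the read is served by the fallback path.
service._node_for("account-123").healthy = False
from_fallback = service.latest_position("account-123", "title-456")
print(from_stateful, from_fallback)  # 1800 1800
```

Because each account maps to exactly one node in this sketch, a very active account or an unlucky hash distribution concentrates load on that node.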
![Page 37: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/37.jpg)
gen 3 - Requests Scale

| Operation | Scale |
| --- | --- |
| Create (start streaming) | 1,000s per second |
| Update (heartbeat, close) | 100,000s per second |
| Append (session events/logs) | 10,000s per second |
| Read viewing history | 10,000s per second |
| Read latest position | 100,000s per second |
![Page 38: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/38.jpg)
gen 3 – Cluster Scale

| Cluster | Scale |
| --- | --- |
| Cassandra Viewing History | ~100 hi1.4xl nodes, ~48 TB total space used |
| Viewing Service Stateful Tier | ~1700 r3.2xl nodes, 50 GB heap memory per node |
| Memcached | ~450 r3.2xl/xl nodes, ~8 TB memory used |
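A quick back-of-the-envelope calculation puts the cluster figures above side by side. It uses the approximate numbers from the slide and assumes decimal units (1 TB = 1000 GB); the per-node averages are derived here and are not stated in the talk.

```python
# Approximate figures taken from the slide.
cassandra_nodes, cassandra_tb = 100, 48
stateful_nodes, heap_gb_per_node = 1700, 50
memcached_nodes, memcached_tb = 450, 8

GB_PER_TB = 1000  # assumption: decimal units

# Aggregate heap held by the stateful tier.
stateful_heap_tb = stateful_nodes * heap_gb_per_node / GB_PER_TB

# Average footprint per node in the other two clusters.
cassandra_tb_per_node = cassandra_tb / cassandra_nodes
memcached_gb_per_node = memcached_tb * GB_PER_TB / memcached_nodes

print(f"Stateful tier heap: ~{stateful_heap_tb:.0f} TB in total")
print(f"Cassandra: ~{cassandra_tb_per_node:.2f} TB per node")
print(f"Memcached: ~{memcached_gb_per_node:.0f} GB per node")
```

By these numbers the stateful tier holds roughly 85 TB of heap in aggregate, more than the Cassandra cluster's total space used.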
![Page 39: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/39.jpg)
Real Time Data – gen 3 pain points
• Stateful tier
– Hot spots
– Multi-region complexity
• Monolithic service
• read-modify-write poorly suited for memcached
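The slide does not spell out why read-modify-write fits a cache like memcached poorly. One common reason is that two writers reading the same value and writing back blindly can lose an update. The sketch below shows that with a small in-memory stand-in whose `gets`/`cas` methods are modeled on memcached's versioned check-and-set commands; `FakeCache` and the key name are hypothetical and this is not a real client library.

```python
class FakeCache:
    """In-memory stand-in for a cache with versioned check-and-set."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def gets(self, key):
        """Return (value, version); version 0 means the key is absent."""
        return self._data.get(key, (None, 0))

    def set(self, key, value):
        """Unconditional write: always succeeds and bumps the version."""
        _, version = self._data.get(key, (None, 0))
        self._data[key] = (value, version + 1)

    def cas(self, key, value, version):
        """Write only if nobody has written since `version` was read."""
        _, current = self._data.get(key, (None, 0))
        if current != version:
            return False
        self._data[key] = (value, current + 1)
        return True


key = "history:account-123"
cache = FakeCache()

# Blind read-modify-write: both writers read, then both write back.
cache.set(key, ["title-1"])
a_value, _ = cache.gets(key)
b_value, _ = cache.gets(key)
cache.set(key, a_value + ["title-2"])
cache.set(key, b_value + ["title-3"])
blind_result = cache.gets(key)[0]  # writer A's append has been lost

# Same interleaving with check-and-set: the stale writer is rejected.
cache.set(key, ["title-1"])
a_value, a_version = cache.gets(key)
b_value, b_version = cache.gets(key)
a_ok = cache.cas(key, a_value + ["title-2"], a_version)
b_ok = cache.cas(key, b_value + ["title-3"], b_version)

# Writer B must re-read and retry to apply its change.
b_value, b_version = cache.gets(key)
b_retry_ok = cache.cas(key, b_value + ["title-3"], b_version)
cas_result = cache.gets(key)[0]

print(blind_result)  # ['title-1', 'title-3']
print(cas_result)    # ['title-1', 'title-2', 'title-3']
```

Check-and-set keeps the data correct, but every conflict costs an extra round trip and retry, which adds up at high update rates.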
![Page 40: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/40.jpg)
Real Time Data – gen 3 learnings
• Distributed stateful systems are hard
– Go stateless, use C*/memcached/redis…
• Decompose into microservices
![Page 41: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/41.jpg)
Real Time Data – gen 4
Viewing Service
Stateful Tier (nodes 0, 1, …, n-2, n-1)
Active Sessions
Latest Positions
View Summary
Stateless Tier (fallback)
Viewing History
Sessions
Memcached
![Page 42: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/42.jpg)
Real Time Data – gen 4
Stream State/Event Collectors
Stateless Microservices
Data Processors
Data Services
Data Feeds
![Page 43: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/43.jpg)
Real Time Data – gen 4
Viewing History
Session State
Session Positions
Session Events
Data Tiers
redis
redis
![Page 44: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/44.jpg)
Session Analytics
• Summarize detailed event data
• Non-real time, but near real time
• Some shared logic with real time
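Summarizing detailed event data might look like the sketch below, which folds a stream of per-session events into one summary per session. The event fields (`session_id`, `type`, `position_sec`) and the summary shape are assumptions for illustration, not the format used by the system in the talk.

```python
from collections import defaultdict

# Hypothetical detailed events for two playback sessions.
events = [
    {"session_id": "s1", "type": "start", "position_sec": 0},
    {"session_id": "s1", "type": "heartbeat", "position_sec": 60},
    {"session_id": "s2", "type": "start", "position_sec": 300},
    {"session_id": "s1", "type": "heartbeat", "position_sec": 120},
    {"session_id": "s1", "type": "stop", "position_sec": 150},
    {"session_id": "s2", "type": "heartbeat", "position_sec": 360},
]


def summarize(events):
    """Fold detailed events into one summary dict per session."""
    summaries = defaultdict(
        lambda: {"event_count": 0, "first_position": None,
                 "last_position": None, "closed": False}
    )
    for event in events:
        summary = summaries[event["session_id"]]
        summary["event_count"] += 1
        if summary["first_position"] is None:
            summary["first_position"] = event["position_sec"]
        summary["last_position"] = event["position_sec"]
        if event["type"] == "stop":
            summary["closed"] = True
    return dict(summaries)


summaries = summarize(events)
print(summaries["s1"])
# {'event_count': 4, 'first_position': 0, 'last_position': 150, 'closed': True}
```

The same fold could run in a periodic batch job or incrementally as events arrive, which is one way logic could be shared between near-real-time and real-time paths.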
![Page 45: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/45.jpg)
Session Analytics - Processing
2007 2008 2009 2010 2011 2012 2013 2014 Future
Custom Service (Java on AWS)
Mantis
Batch
Near Real-Time
Stream Processing
![Page 46: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/46.jpg)
Session Analytics - Storage
2007 2008 2009 2010 2011 2012 2013 2014 Future
Batch
Near Real-Time
Stream Processing
![Page 47: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/47.jpg)
Session Analytics – gen 1
• Storage • Processing
Sessions
Logs
![Page 48: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/48.jpg)
Session Analytics – gen 1 pain points
• MapReduce good for batch
– Not for near real time
• Complexity
– Code in 2 systems / frameworks
– Operational burden of 2 systems
![Page 49: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/49.jpg)
Session Analytics – gen 2
• Storage • Processing
Session Events & Logs
Java
![Page 50: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/50.jpg)
Session Analytics – gen 2 learnings
• Reduced complexity
– shared code and ops
• Batch still available
• New bottleneck
– harder to extend logic
![Page 51: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/51.jpg)
Session Analytics – gen 3 (*)
• Storage • Processing
Mantis
Storm
Samza
Spark Streaming
Stream Processing Frameworks
![Page 52: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/52.jpg)
Takeaways
• Polyglot Persistence
– One size fits all doesn’t fit all
• Strong opinions, loosely held
– Design for long term, but be open to redesigns
![Page 53: Netflix viewing data architecture evolution - QCon 2014](https://reader030.vdocuments.site/reader030/viewer/2022020101/5598fb681a28ab75718b4659/html5/thumbnails/53.jpg)
Thanks!
@philip_pfo