MesosCon 2017 - Streaming Data Pipelines on Mesos
TRANSCRIPT
Streaming Data Pipelines on Mesos: Lessons Learned
Dean Wampler, Ph.D. VP of Fast Data Engineering, Lightbend
2
Themes
• Modern data applications:
  • What makes them different?
• Mesos as a platform
• Specific insights
3
Why do we care? Lightbend Fast Data Platform
4
[Diagram: Lightbend Fast Data Platform. Labels: Microservices, Streaming, Machine Learning, Management, Data Back Plane, Storage, Deployment]
5
6
Why not Hadoop?
7
[Diagram: Hadoop cluster. Labels: YARN; HDFS; MR job #1, MR job #2; Spark job #1, Spark job #2; Flume, Sqoop; DBs; Slave Node (Disk, Disk, Disk; Node Mgr; Data Node); Master (Resource Manager; Name Node)]
8
What’s Not to Like?
• YARN limitations
• Batch orientation
• Microservices?
9
YARN Limitations
• Only understands running jobs like Spark, MapReduce
• Can’t even run HDFS services!
10
YARN Limitations
• Slow adoption of container standards
• Too coarse-grained
11
Batch Orientation
• Streaming services more ad hoc
• Must manage C*, Kafka, etc. separately
12
Microservices?
• Hadoop wants to own the whole cluster
• No mixed application workloads
13
Why Mesos?
14
15
16
17
18
[Diagram: Spark on Mesos. Labels: master node with Mesos Master; Spark Framework with Scheduler and Spark Driver; Node … Node, each with a Mesos Executor hosting a Spark Executor that runs several tasks. Driver code shown on the slide:]
object MyApp { def main() { val sc = new SparkContext(…) … }}
19
Streaming - Characteristics
• Continuous processing
• Variable lifespans
• Resilience
• Scalability
20
Continuous Processing
• Never-ending input and output
• Dynamic scaling on demand (more in a moment…)
• CNI support
21
Variable Lifespans
• Apps live minutes to months!
• Mesos can deploy and manage:
  • Very short-lived containers
  • Long-running services
22
Resilience
• Mesos’ built-in fault tolerance features make app resilience easier to implement.
• …
23
Resilience
• …
• But stateful streaming services need to manage state persistence for recovery
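To make the state-persistence point concrete, here is a minimal sketch of a stateful operator that checkpoints its state and restores it after a restart. Everything in it is an assumption for illustration: the `CheckpointedCounter` name, the running-count state, and the tab-separated checkpoint file are hypothetical and not taken from the talk or from any Mesos API. Keys are assumed to contain no tabs or newlines.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical stateful operator: a running count per key.
// The checkpoint format (one "key<TAB>count" line per entry) is an
// assumption made for this sketch only.
object CheckpointedCounter {
  type State = Map[String, Long]

  // Fold one event into the state.
  def update(state: State, key: String): State =
    state.updated(key, state.getOrElse(key, 0L) + 1L)

  // Persist the state so a restarted task can resume from it.
  def save(state: State, file: Path): Unit = {
    val lines = state.map { case (k, v) => s"$k\t$v" }.mkString("\n")
    Files.write(file, lines.getBytes(StandardCharsets.UTF_8))
  }

  // Restore the last checkpoint, or start empty if there is none.
  def restore(file: Path): State =
    if (!Files.exists(file)) Map.empty
    else {
      val text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8)
      text.split("\n").filter(_.nonEmpty).map { line =>
        val parts = line.split("\t", 2)
        parts(0) -> parts(1).toLong
      }.toMap
    }

  def main(args: Array[String]): Unit = {
    val file = Files.createTempFile("counter", ".ckpt")
    val before = List("a", "b", "a").foldLeft(restore(file))(update)
    save(before, file) // checkpoint, then simulate a crash and restart
    val after = List("b").foldLeft(restore(file))(update)
    println(after)
  }
}
```

A real service would write checkpoints to durable storage that survives the loss of a node, rather than to a local temporary file.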
24
Scalability
• Mesos’ flexible scheduling model supports dynamic scaling on demand
25
Characteristics of Particular Streaming Apps
26
Streaming Tradeoffs (1/4)
• Low latency? How low?
• High volume? How high?
27
Streaming Tradeoffs (2/4)
• Which kinds of data processing & analytics are required?
  • SQL?
  • Machine learning?
  • Simple filtering & transformations?
28
Streaming Tradeoffs (3/4)
• How will this processing be done?
  • Individual processing of events?
  • Bulk processing of records?
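The difference between individual and bulk processing can be sketched with plain Scala iterators. This is an illustration only: `ProcessingStyles`, the integer events, and the fixed batch size are assumptions, and no particular streaming engine is implied.

```scala
object ProcessingStyles {
  // Per-event: handle each record as it arrives and emit a running total.
  def perEvent(events: Iterator[Int]): Iterator[Int] =
    events.scanLeft(0)(_ + _).drop(1)

  // Bulk: collect records into fixed-size batches and emit one sum per batch.
  // The last batch may be smaller than batchSize.
  def bulk(events: Iterator[Int], batchSize: Int): Iterator[Int] =
    events.grouped(batchSize).map(_.sum)

  def main(args: Array[String]): Unit = {
    println(perEvent(Iterator(1, 2, 3, 4, 5)).toList)
    println(bulk(Iterator(1, 2, 3, 4, 5), 2).toList)
  }
}
```

Per-event processing produces an output for every input, which favors latency; bulk processing amortizes overhead across a batch, which favors throughput.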
29
Streaming Tradeoffs (4/4)
• Which tools and data sources/sinks must interoperate with your streaming tool?
30
Characteristics of Streaming: How Several Tools Line Up
• Low latency
• Med. volume
• ETL, “tables”
• Data flow or per event
31
• Low latency
• Med. volume
• Complex flows
• Complex Event Processing
32
33
[Diagram: microservices built from actors. Labels: Microservice (×4); Event (×5); Event Router Actor; Service Actor 1 (SA11, SA12, SA13); Service Actor 2 (SA21, SA22, SA23); …]
• Akka Streams (AS) and Kafka Streams (KS) are libraries (with some services)
• Run the apps as microservices
• Med. latency
• High volume
• Data flows, SQL
• En masse processing
34
• Low latency
• High volume
• Data flows, correctness
• En masse processing
35
36
• Spark and Flink run services to which you submit “jobs”
• Jobs are partitioned into “tasks”
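As a rough illustration of a job being partitioned into tasks, the sketch below splits a word-count "job" into independent partitions and runs each one as a task on a thread pool. It does not use Spark or Flink; `JobAsTasks`, the partitioning rule, and the use of Scala `Future`s as stand-ins for executor tasks are all assumptions made for this example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object JobAsTasks {
  // A "job": count the words across all input lines.
  // Each partition of the input becomes one independent "task".
  def runJob(lines: Seq[String], numTasks: Int): Int = {
    val partitionSize =
      math.max(1, math.ceil(lines.size.toDouble / numTasks).toInt)
    val tasks: Seq[Future[Int]] =
      lines.grouped(partitionSize).toSeq.map { partition =>
        Future(partition.map(_.split("\\s+").count(_.nonEmpty)).sum)
      }
    // Wait for every task, then combine the partial results.
    Await.result(Future.sequence(tasks), 10.seconds).sum
  }

  def main(args: Array[String]): Unit =
    println(runJob(Seq("a b", "c", "d e f"), numTasks = 2))
}
```

In a real engine the tasks would be scheduled onto executors running on different cluster nodes rather than onto local threads.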
37
Architecture Trends…
38
[Diagram labels: Big Data, Services]
Architectures Trend
39
[Diagram labels: Big Data, Services, Microservices and Fast Data]
New Architecture
40
• Single responsibilities.
• Easy to evolve.
• Fits the fine-grained model of Mesos very well.
[Diagram: example microservices. Labels: Browse, REST, Account, Orders, Shopping Cart, API Gateway, Inventory, …]
41
• Continuous processing
• Variable lifespans
• Resilience
• Scalability
42
Synergies…
• Each data stream app (or μservice):
  • has one responsibility
  • ingests unending data (or messages)
43
Synergies…
• Each data stream app (or μservice):
  • must operate asynchronously
  • must offer never-ending service
44
Synergies…
• So, both:
  1. have similar design problems
  2. are dominated by data (at least eventually for microservices)
45
What’s Still Needed?
46
State
• Stateful streaming apps use ad hoc mechanisms to persist state for resilience