apache flink overview at sf spark and friends
TRANSCRIPT
![Page 1: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/1.jpg)
IntroducingApache Flink™
@StephanEwen
![Page 2: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/2.jpg)
Flink’s Recent History
April 2014 April 2015Dec 2014
Top Level Project Graduation
0.7
0.6
0.5
0.9
0.9-m1
![Page 3: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/3.jpg)
What is Apache Flink?
3
Gelly
Table
ML
SA
MO
A
DataSet (Java/Scala) DataStream (Java/Scala)
Had
oop M
/R
Local Remote YARN Tez Embedded
Data
flow
Data
flow
(W
iP)
MR
QL
Table
Casc
ad
ing
(W
iP)
Streaming dataflow runtime
Zep
pelin
A Top-Level project of the Apache Software Foundation
![Page 4: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/4.jpg)
Program compilation
4
case class Path (from: Long, to: Long)val tc = edges.iterate(10) { paths: DataSet[Path] => val next = paths .join(edges) .where("to") .equalTo("from") { (path, edge) => Path(path.from, edge.to) } .union(paths) .distinct() next }
Optimizer
Type extraction
stack
Task schedulin
g
Dataflow metadat
a
Pre-flight (Client)
Master Workers
Data Sourceorders.tbl
Filter
Map DataSourcelineitem.tbl
JoinHybrid Hash
buildHT probe
hash-part [0] hash-part [0]
GroupRed
sort
forward
Program
Dataflow GraphIndependent of batch or streaming job
deployoperators
trackintermediate
results
![Page 5: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/5.jpg)
Native workload support
5
Flink
Streaming topologies
Long batch pipelines
Machine Learning at scale
How can an engine natively support all these workloads?And what does "native" mean?
Graph Analysis
Low latency
resource utilization iterative algorithms
Mutable state
![Page 6: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/6.jpg)
E.g.: Non-native iterations
6
Step Step Step Step Step
Client
for (int i = 0; i < maxIterations; i++) {// Execute MapReduce job
}
![Page 7: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/7.jpg)
E.g.: Non-native streaming
7
streamdiscretizer
Job Job Job Job
while (true) { // get next few records // issue batch job}
Data Stream
![Page 8: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/8.jpg)
Native workload support
8
Flink
Streaming topologies
Long batchpipelines
Machine Learning at scale
How can an engine natively support all these workloads?And what does "native" mean?
Graph Analysis
Low latency
resource utilization iterative algorithms
Mutable state
![Page 9: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/9.jpg)
Ingredients for “native” support1. Execute everything as streams
Pipelined execution, backpressure or buffered, push/pull model
2. Special code paths for batchAutomatic job optimization, fault tolerance
3. Allow some iterative (cyclic) dataflows
4. Allow some mutable state
5. Operate on managed memoryMake data processing on the JVM robust
9
![Page 10: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/10.jpg)
10
Stream processing in Flink
![Page 11: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/11.jpg)
Stream platform architecture
11
- Gather and backup streams- Offer streams for
consumption- Provide stream recovery
- Analyze and correlate streams- Create derived streams and state- Provide these to downstream
systems
Server logs
Trxnlogs
Sensorlogs
Downstreamsystems
![Page 12: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/12.jpg)
What is a stream processor?
1. Pipelining2. Stream replay
3. Operator state4. Backup and restore
5. High-level APIs6. Integration with batch
7. High availability8. Scale-in and scale-out
12
Basics
State
App development
Large deployments
See http://data-artisans.com/stream-processing-with-flink.html
![Page 13: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/13.jpg)
Pipelining
13
Basic building block to “keep the data moving”
Note: pipelined systems do not usually transfer individual tuples, but buffers that batch several tuples!
![Page 14: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/14.jpg)
Operator state User-defined state
• Flink transformations (map/reduce/etc) are long-running operators, feel free to keep around objects
• Hooks to include in system's checkpoint
Windowed streams• Time, count, data-driven windows• Managed by the system (currently WiP)
14
![Page 15: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/15.jpg)
Streaming fault tolerance Ensure that operators see all events
• “At least once”• Solved by replaying a stream from a checkpoint, e.g.,
from a past Kafka offset
Ensure that operators do not perform duplicate updates to their state• “Exactly once”• Several solutions
15
![Page 16: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/16.jpg)
Exactly once approaches Discretized streams (Spark Streaming)
• Treat streaming as a series of small atomic computations• “Fast track” to fault tolerance, but does not separate
application logic (semantics) from recovery
MillWheel (Google Cloud Dataflow)• State update and derived events committed as atomic
transaction to a high-throughput transactional store• Needs a very high-throughput transactional store
Chandy-Lamport distributed snapshots (Flink)
16
![Page 17: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/17.jpg)
Distributed snapshots in Flink
Super-impose checkpointing mechanism on execution instead of using execution as the
checkpointing mechanism17
![Page 18: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/18.jpg)
18
JobManagerRegister checkpointbarrier on master
Replay will start from here
![Page 19: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/19.jpg)
19
JobManagerBarriers “push” prior events (assumes in-order delivery in individual channels) Operator checkpointing
starting
Operator checkpointing finished
Operator checkpointing in progress
![Page 20: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/20.jpg)
20
JobManager Operator checkpointing takes snapshot of state after data prior to barrier have updated the state. Checkpoints currently synchronous, WiP for incremental and asynchronous
State backup
Pluggable mechanism. Currently either JobManager (for small state) or file system (HDFS/Tachyon). WiP for in-memory grids
![Page 21: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/21.jpg)
21
JobManager
Operators with many inputs need to wait for all barriers to pass before they checkpoint their state
![Page 22: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/22.jpg)
22
JobManager
State snapshots at sinks signal successful end of this checkpoint
At failure, recover last checkpointed state and restart sources from last barrier guarantees at least once
State backup
![Page 23: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/23.jpg)
Benefits of Flink’s approach Data processing does not block
• Can checkpoint at any interval you like to balance overhead/recovery time
Separates business logic from recovery• Checkpointing interval is a config parameter, not a variable in the
program (as in discretization)
Can support richer windows• Session windows, event time, etc
Best of all worlds: true streaming latency, exactly-once semantics, and low overhead for recovery
23
![Page 24: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/24.jpg)
DataStream API
24
case class Word (word: String, frequency: Int)
val lines: DataStream[String] = env.fromSocketStream(...)
lines.flatMap {line => line.split(" ") .map(word => Word(word,1))} .window(Time.of(5,SECONDS)).every(Time.of(1,SECONDS)) .groupBy("word").sum("frequency") .print()
val lines: DataSet[String] = env.readTextFile(...)
lines.flatMap {line => line.split(" ") .map(word => Word(word,1))} .groupBy("word").sum("frequency") .print()
DataSet API (batch):
DataStream API (streaming):
![Page 25: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/25.jpg)
Roadmap Short-term (3-6 months)
• Graduate DataStream API from beta• Fully managed window and user-defined state with pluggable
backends• Table API for streams (towards StreamSQL)
Long-term (6+ months)• Highly available master• Dynamic scale in/out• FlinkML and Gelly for streams• Full batch + stream unification
25
![Page 26: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/26.jpg)
Batch processingBatch on Streaming
26
![Page 27: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/27.jpg)
27
Batch Pipelines
![Page 28: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/28.jpg)
Batch on Streaming Batch programs are a special kind of streaming
program
28
Infinite Streams Finite Streams
Stream Windows Global View
PipelinedData Exchange
Pipelined or Blocking Exchange
Streaming Programs Batch Programs
![Page 29: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/29.jpg)
29
Batch Pipelines
Data exchange (shuffle / broadcast)is mostly streamed
Some operators block (e.g. sorts / hash tables)
![Page 30: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/30.jpg)
30
Operators Execution Overlaps
![Page 31: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/31.jpg)
31
Memory Management
![Page 32: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/32.jpg)
Memory Management
32
![Page 33: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/33.jpg)
Smooth out-of-core performance
33More at: http://flink.apache.org/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html
Blue bars are in-memory, orange bars (partially) out-of-core
![Page 34: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/34.jpg)
Other features of FlinkThere is more…
34
![Page 35: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/35.jpg)
35
More Engine Features
Automatic Optimization /Static Code Analysis
Closed Loop Iterations
StatefulIterations
DataSource
orders.tbl
Filter
Map DataSource
lineitem.tbl
JoinHybrid Hash
buildHT
probe
broadcast
forward
Combine
GroupRed
sort
DataSource
orders.tbl
Filter
Map DataSource
lineitem.tbl
JoinHybrid Hash
buildHT
probe
hash-part [0] hash-part [0]
hash-part [0,1]
GroupRed
sort
forward
![Page 36: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/36.jpg)
Closing
36
![Page 37: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/37.jpg)
Apache Flink: community
37
![Page 38: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/38.jpg)
I Flink, do you?
38
If you find this exciting,
get involved and start a discussion on Flink‘s mailing list,
or stay tuned by
subscribing to [email protected],following flink.apache.org/blog, and
@ApacheFlink on Twitter
![Page 39: Apache Flink Overview at SF Spark and Friends](https://reader036.vdocuments.site/reader036/viewer/2022081506/55c52887bb61ebec158b4626/html5/thumbnails/39.jpg)
39
flink-forward.org
Bay Area Flink meetupTomorrow