Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)
![Page 1: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/1.jpg)
Stream Processing with
Apache Flink
Robert Metzger
@rmetzger_
Flink.tw | Apache Flink Taiwan User Group, July 19, 2016
![Page 2: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/2.jpg)
Talk overview
• My take on the stream processing space, and how it changes the way we think about data
• Discussion of unique building blocks of Flink
• Benchmarking Flink, by extending a benchmark from Yahoo!
![Page 3: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/3.jpg)
Apache Flink
Apache Flink is an open source stream processing framework
• Low latency
• High throughput
• Stateful
• Distributed
Developed at the Apache Software Foundation, 1.1.0 release available soon, used in production
![Page 4: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/4.jpg)
Entering the streaming era
![Page 5: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/5.jpg)
Streaming is the biggest change in data infrastructure since Hadoop
![Page 6: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/6.jpg)
1. Radically simplified infrastructure
2. Do more with your data, faster
3. Can completely subsume batch
![Page 7: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/7.jpg)
Traditional data processing
Log analysis example using a batch processor
[Diagram: web servers write logs -> periodic (custom) or continuous (Flume) ingestion into HDFS / S3 -> periodic batch job(s) for log analysis, triggered by a job scheduler (Oozie) -> serving layer]
![Page 8: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/8.jpg)
Traditional data processing
[Diagram: web servers write logs -> periodic (custom) or continuous (Flume) ingestion into HDFS / S3 -> batch log analysis job(s) run every 2 hrs by a job scheduler (Oozie) -> serving layer]
Latency from log event to serving layer is usually in the range of hours
![Page 9: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/9.jpg)
Data processing without stream processor
[Diagram: web server logs -> HDFS / S3 -> batch job(s) for log analysis]
This architecture is a hand-crafted micro-batch model
Batch interval: ~2 hours
[Chart: approaches ordered by latency, from hours through minutes and seconds down to milliseconds: manually triggered periodic batch job, batch processor with micro-batches, stream processor]
![Page 10: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/10.jpg)
Downsides of stream processing with a batch engine
Very high latency (hours)
Complex architecture required:
• Periodic job scheduler (e.g. Oozie)
• Data loading into HDFS (e.g. Flume)
• Batch processor
• (When using the “lambda architecture”: a stream processor)
All these components need to be implemented and maintained
Backpressure: How does the pipeline handle load spikes?
![Page 11: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/11.jpg)
Log event analysis using a stream processor
Stream processors make it possible to analyze events with sub-second latency.
[Diagram: web servers -> high-throughput publish/subscribe bus -> stream processor -> serving layer]
Web servers: forward events immediately to the pub/sub bus
Publish/subscribe bus options:
• Apache Kafka
• Amazon Kinesis
• MapR Streams
• Google Cloud Pub/Sub
Stream processor: process events in real time & update serving layer. Options:
• Apache Flink
• Apache Beam
• Apache Samza
![Page 12: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/12.jpg)
Real-world data is produced in a continuous fashion.
New systems like Flink and Kafka embrace the streaming nature of data.
[Diagram: web server -> Kafka topic -> stream processor]
![Page 13: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/13.jpg)
What do we need for replacing the “batch stack”?
[Diagram: web servers -> high-throughput publish/subscribe bus -> stream processor -> serving layer]
Web servers: forward events immediately to the pub/sub bus
Publish/subscribe bus options:
• Apache Kafka
• Amazon Kinesis
• MapR Streams
• Google Cloud Pub/Sub
Stream processor: process events in real time & update serving layer. Options:
• Apache Flink
• Google Cloud Dataflow
Requirements for the stream processor:
• Low latency
• High throughput
• State handling
• Windowing / Out of order events
• Fault tolerance and correctness
![Page 14: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/14.jpg)
Apache Flink stack
[Diagram: the Apache Flink stack]
• Libraries and compatibility layers: Gelly, Table, ML, SAMOA, Apache Beam, Hadoop M/R, Cascading, Storm API, Zeppelin, CEP
• Core APIs: DataSet (Java/Scala), DataStream (Java/Scala)
• Runtime: Streaming dataflow runtime
• Deployment: Local, Cluster, YARN
![Page 15: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/15.jpg)
Needed for the use case
[Diagram: the Apache Flink stack (libraries, DataSet and DataStream APIs, streaming dataflow runtime, Local / Cluster / YARN deployment), highlighting the components needed for the use case]
![Page 16: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/16.jpg)
Windowing / Out of order events
![Page 17: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/17.jpg)
Building windows from a stream
“Number of visitors in the last 5 minutes per country”
[Diagram: web server -> Kafka topic -> stream processor]
// create stream from Kafka source (KafkaConsumer is the slide's placeholder for a concrete Kafka source)
DataStream<LogEvent> stream = env.addSource(new KafkaConsumer());
// group by country; keyBy returns a KeyedStream, which is what offers timeWindow()
KeyedStream<LogEvent, Tuple> keyedStream = stream.keyBy("country");
// window of size 5 minutes
keyedStream.timeWindow(Time.minutes(5))
    // do operations per window
    .apply(new CountPerWindowFunction());
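To make the windowing step concrete without a Flink dependency, here is a minimal plain-Java sketch of what "count per country in 5-minute tumbling windows" computes. It is an illustration under assumptions, not Flink's implementation: the `TumblingWindowCounts` class, its `LogEvent` stand-in, and the `windowStart/country` key format are all hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; this does not use the Flink API.
class TumblingWindowCounts {
    static final long WINDOW_MILLIS = 5 * 60 * 1000L; // 5-minute tumbling windows

    // Hypothetical stand-in for the LogEvent type on the slide.
    static final class LogEvent {
        final String country;
        final long timestampMillis; // assumed to be non-negative

        LogEvent(String country, long timestampMillis) {
            this.country = country;
            this.timestampMillis = timestampMillis;
        }
    }

    // Start of the tumbling window that contains the given timestamp.
    static long windowStart(long timestampMillis) {
        return timestampMillis - (timestampMillis % WINDOW_MILLIS);
    }

    // Counts events per "windowStart/country" key: grouping by country
    // corresponds to keyBy, grouping by window start to the window assignment.
    static Map<String, Long> countPerWindow(List<LogEvent> events) {
        Map<String, Long> counts = new TreeMap<>();
        for (LogEvent e : events) {
            String key = windowStart(e.timestampMillis) + "/" + e.country;
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<LogEvent> events = Arrays.asList(
                new LogEvent("UK", 0L),
                new LogEvent("TW", 10_000L),
                new LogEvent("UK", 299_999L),   // still in the first window
                new LogEvent("UK", 300_000L));  // first event of the second window
        System.out.println(countPerWindow(events));
    }
}
```

Unlike this batch-style loop, a stream processor keeps these counts as state and emits each window's result once the window is complete.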
![Page 18: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/18.jpg)
Building windows: Execution
[Diagram: job plan (Kafka source -> group by country -> window operator) next to its parallel execution on the cluster (source instances S, window operator instances W), along a time axis]
// window of size 5 minutes
keyedStream.timeWindow(Time.minutes(5));
![Page 19: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/19.jpg)
Window types in Flink
• Tumbling windows
• Sliding windows
• Custom windows with window assigners, triggers and evictors
Further reading: http://flink.apache.org/news/2015/12/04/Introducing-windows.html
![Page 20: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/20.jpg)
Time-based windows
[Diagram: a stream of events, each carrying the time of the event and the event data]
{ "accessTime": "1457002134", "userId": "1337", "userLocation": "UK" }
Windows are created based on the real-world time when the event occurred
![Page 21: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/21.jpg)
A look at the reality of time
[Diagram: events flow through Kafka to the stream processor; network delays and out-of-sync clocks reorder them, e.g. event times 33 11 21 15 9]
Events arrive out of order in the system
Use-case specific low watermarks for time tracking
Example: window between 0 and 15; the watermark 15 is a guarantee that no event with time <= 15 will arrive afterwards
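The watermark guarantee can be sketched in a few lines of plain Java. This is a simplified illustration, not Flink's window operator: the `WatermarkWindow` class is hypothetical, the window bounds are treated as inclusive to match the "window between 0 and 15" wording, and late events are simply ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of an event-time window that fires on a watermark.
class WatermarkWindow {
    private final long start; // inclusive
    private final long end;   // inclusive (assumption, matching "between 0 and 15")
    private final List<Long> buffered = new ArrayList<>();
    private boolean fired = false;

    WatermarkWindow(long start, long end) {
        this.start = start;
        this.end = end;
    }

    // Buffers the event if its event time falls into the window.
    // Arrival order does not matter; only the event time is checked.
    void onEvent(long eventTime) {
        if (!fired && eventTime >= start && eventTime <= end) {
            buffered.add(eventTime);
        }
    }

    // A watermark w promises that no event with time <= w arrives afterwards.
    // Returns the window contents once the watermark reaches the window end,
    // and null while the window cannot be evaluated yet (or has already fired).
    List<Long> onWatermark(long watermark) {
        if (!fired && watermark >= end) {
            fired = true;
            return new ArrayList<>(buffered);
        }
        return null;
    }

    public static void main(String[] args) {
        WatermarkWindow window = new WatermarkWindow(0, 15);
        for (long t : new long[] {33, 11, 21, 15, 9}) { // out-of-order arrival
            window.onEvent(t);
        }
        System.out.println(window.onWatermark(10)); // null: events <= 15 may still arrive
        System.out.println(window.onWatermark(15)); // [11, 15, 9]
    }
}
```

The window result is held back until the watermark arrives, which is the latency cost of deterministic event-time results.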
![Page 22: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/22.jpg)
Time characteristics in Apache Flink
Event Time
• Users have to specify an event-time extractor + watermark emitter
• Results are deterministic, but with latency
Processing Time
• System time is used when evaluating windows
• Low latency
Ingestion Time
• Flink assigns current system time at the sources
Pluggable, without window code changes
![Page 23: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/23.jpg)
State handling
![Page 24: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/24.jpg)
State in streaming
Where do we store the elements from our windows?
In stateless systems, an external state store (e.g. Redis) is needed.
[Diagram: source instances (S) feed window operator instances (W) along a time axis]
Elements in windows are state
![Page 25: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/25.jpg)
Managed state in Flink
• Flink automatically backs up and restores state
• State can be larger than the available memory
• State backends: (embedded) RocksDB, heap memory
[Diagram: web server -> Kafka -> stream processor (Flink): operator with windows (large state) -> local state backend -> periodic backup / recovery to and from a distributed file system]
![Page 26: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/26.jpg)
Managing the state
How can we operate such a pipeline 24x7?
Losing state (by stopping the system) would require a replay of past events
We need a way to store the state somewhere!
[Diagram: web server -> Kafka topic -> stream processor]
![Page 27: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/27.jpg)
Savepoints: Versioning state
Savepoint: Create an addressable copy of a job’s current state.
Restart a job from any savepoint.
> flink savepoint <JobID>
> hdfs:///flink-savepoints/2
> flink run -s hdfs:///flink-savepoints/2 <jar>
[Diagram: the savepoint is written to HDFS and read back from HDFS when the job is restarted]
Further reading: http://data-artisans.com/how-apache-flink-enables-new-streaming-applications/
![Page 28: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/28.jpg)
Fault tolerance and correctness
![Page 29: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/29.jpg)
Fault tolerance in streaming
How do we ensure the results (number of visitors) are always correct?
Failures should not lead to data loss or incorrect results
[Diagram: web server -> Kafka topic -> stream processor]
![Page 30: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/30.jpg)
Fault tolerance in streaming
At least once: ensure all operators see all events
• Storm: Replay stream in failure case (acking of individual records)
Exactly once: ensure that operators do not perform duplicate updates to their state
• Flink: Distributed Snapshots
• Spark: Micro-batches on batch runtime
![Page 31: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/31.jpg)
Flink’s Distributed Snapshots
Lightweight approach of storing the state of all operators without pausing the execution
Implemented using barriers flowing through the topology
[Diagram: a barrier in the data stream; records before the barrier are part of the snapshot, records after the barrier are not in the snapshot]
Further reading: http://blog.acolyer.org/2015/08/19/asynchronous-distributed-snapshots-for-distributed-dataflows/
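The barrier idea can be illustrated for a single operator with a single input in plain Java. This sketch is an assumption-laden simplification: the `BarrierSnapshotSketch` class and its `BARRIER` marker are hypothetical, and it leaves out what makes the real algorithm distributed (aligning barriers across multiple inputs and forwarding them downstream).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of snapshotting operator state when a barrier arrives.
class BarrierSnapshotSketch {
    static final String BARRIER = "|barrier|"; // hypothetical in-band marker

    private final Map<String, Integer> counts = new HashMap<>();          // operator state
    private final List<Map<String, Integer>> snapshots = new ArrayList<>(); // stored snapshots

    // Handles one stream element: either a record to count or a barrier.
    void process(String element) {
        if (BARRIER.equals(element)) {
            // Everything seen before the barrier is part of this snapshot.
            // The state is copied, so processing can continue right away.
            snapshots.add(new HashMap<>(counts));
        } else {
            counts.merge(element, 1, Integer::sum);
        }
    }

    Map<String, Integer> currentState() {
        return counts;
    }

    List<Map<String, Integer>> snapshots() {
        return snapshots;
    }

    public static void main(String[] args) {
        BarrierSnapshotSketch operator = new BarrierSnapshotSketch();
        for (String element : new String[] {"a", "b", "a", BARRIER, "a", "c"}) {
            operator.process(element);
        }
        System.out.println("snapshot: " + operator.snapshots().get(0));
        System.out.println("current:  " + operator.currentState());
    }
}
```

Records after the barrier change only the live state, so the stored snapshot stays consistent with the stream position marked by the barrier.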
![Page 32: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/32.jpg)
Wrap-up: Log processing example
How to do something with the data? Windowing
How does the system handle large windows? Managed state
How do we operate such a system 24x7? Savepoints
How to ensure correct results across failures? Checkpoints, Master HA
[Diagram: web server -> Kafka topic -> stream processor]
![Page 33: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/33.jpg)
Performance: Low Latency & High Throughput
![Page 34: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/34.jpg)
Performance: Introduction
Performance always depends on your own use cases, so test it yourself!
We based our experiments on a recent benchmark published by Yahoo!
They benchmarked Storm, Spark Streaming and Flink with a production use case (counting ad impressions)
Full Yahoo! article: https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
![Page 35: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/35.jpg)
Yahoo! Benchmark
Count ad impressions grouped by campaign
Compute aggregates over a 10 second window
Emit current value of window aggregates to Redis every second for query
![Page 36: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/36.jpg)
Yahoo’s Results
“Storm […] and Flink […] show sub-second latencies at relatively high throughputs with Storm having the lowest 99th percentile latency. Spark streaming 1.5.1 supports high throughputs, but at a relatively higher latency.” (Quote from the blog post’s executive summary)
![Page 37: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/37.jpg)
Extending the benchmark
Benchmark stops at Storm’s throughput limits. Where is Flink’s limit?
How will Flink’s own window implementation perform compared to Yahoo’s “state in Redis windowing” approach?
![Page 38: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/38.jpg)
Windowing with state in Redis
[Diagram: KafkaConsumer -> map() -> filter() -> group -> windowing & caching code -> realtime queries]
![Page 39: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/39.jpg)
Rewrite to use Flink’s own window
[Diagram: KafkaConsumer -> map() -> filter() -> group -> Flink event time windows -> realtime queries]
![Page 40: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/40.jpg)
Results after rewrite
[Bar chart: throughput in msgs/sec for Storm and Flink, axis from 0 to 3,750,000; labeled value: 400k msgs/sec]
![Page 41: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/41.jpg)
Can we even go further?
[Diagram: KafkaConsumer -> map() -> filter() -> group -> Flink event time windows]
Network link to Kafka cluster is bottleneck! (1 GigE)
Solution: Move data generator into job (10 GigE)
[Diagram: Data Generator -> map() -> filter() -> group -> Flink event time windows]
![Page 42: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/42.jpg)
Results without network bottleneck
[Bar chart: throughput in msgs/sec for Storm, Flink, and Flink (10 GigE), axis from 0 to 16,000,000; labeled values: 400k msgs/sec, 3m msgs/sec, and 15m msgs/sec (10 GigE end-to-end)]
![Page 43: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/43.jpg)
Benchmark summary
Flink achieves throughput of 15 million messages/second on 10 machines
35x higher throughput compared to Storm (80x compared to Yahoo’s runs)
Flink ran with exactly once guarantees, Storm with at least once.
Read the full report: http://data-artisans.com/extending-the-yahoo-streaming-benchmark/
![Page 44: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/44.jpg)
Closing
![Page 45: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/45.jpg)
What makes Flink flink?
[Diagram: feature map]
• True streaming: low latency, high throughput, well-behaved flow control (back pressure)
• Event time: make more sense of data, works on real-time and historic data
• Stateful streaming: globally consistent savepoints, exactly-once semantics for fault tolerance, windows & user-defined state
• APIs and libraries: flexible windows (time, count, session, roll-your-own), Complex Event Processing
![Page 46: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/46.jpg)
Questions?
Ask now!
eMail: [email protected]
Twitter: @rmetzger_
Follow: @ApacheFlink
Read: flink.apache.org/blog, data-artisans.com/blog/
Mailing lists: (news | user | dev)@flink.apache.org
![Page 48: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/48.jpg)
We are hiring!
data-artisans.com/careers
![Page 49: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/49.jpg)
Appendix
![Page 50: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/50.jpg)
Roadmap 2016
SQL / StreamSQL
CEP Library
Managed Operator State
Dynamic Scaling
Miscellaneous
![Page 51: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/51.jpg)
Miscellaneous
Support for Apache Mesos
Security
• Over-the-wire encryption of RPC (Akka) and data transfers (Netty)
More connectors
• Apache Cassandra
• Amazon Kinesis
Enhance metrics
• Throughput / Latencies
• Backpressure monitoring
• Spilling / Out of Core
![Page 52: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/52.jpg)
Fault Tolerance and correctness
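How can we ensure the state is always in sync with the events?
[Diagram: a topology in which each operator keeps an event counter as state, ending in a final operator; the counters shown are 4, 3, 4 and 2]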
![Page 53: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/53.jpg)
Naïve state checkpointing approach
[Diagram in three steps, for operators a, b, c and d]
• Process some records: the counters go from 0, 0, 0, 0 to 1, 1, 2, 2
• Stop everything, store state: operator state a = 1, b = 1, c = 2, d = 2
• Continue processing …
![Page 54: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/54.jpg)
Distributed Snapshots
[Diagram in three steps]
• Initial state: counters 0, 0, 0, 0
• Start processing: counters 1, 1, 0, 0
• Trigger checkpoint: counters 1, 1, 0, 0; operator state stored so far: a = 1, b = 1
![Page 55: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/55.jpg)
Distributed Snapshots
[Diagram in two steps]
• Barrier flows with events: counters 2, 1, 2, 0; operator state stored so far: a = 1, b = 1, c = 2
• Checkpoint completed: counters 2, 1, 2, 2; operator state: a = 1, b = 1, c = 2, d = 2
Valid snapshot without stopping the topology
Multiple checkpoints can be in-flight
Complete, consistent state snapshot
![Page 56: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/56.jpg)
Analysis of naïve approach
• Introduces latency
• Reduces throughput
Can we create a correct snapshot while keeping the job running?
Yes! By creating a distributed snapshot
![Page 57: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/57.jpg)
Handling Backpressure
Operator not able to process incoming data immediately
Slow down upstream operators
Backpressure might occur when:
• Operators create checkpoints
• Windows are evaluated
• Operators depend on external resources
• JVMs do Garbage Collection
![Page 58: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/58.jpg)
Handling Backpressure
[Diagram: senders and receivers exchange full and empty buffers, via network transfer (Netty) or local buffer exchange (when sender and receiver are on the same machine)]
• Sender does not have any empty buffers available: Slowdown
• Data sources slow down pulling data from their underlying system (Kafka or similar queues)
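The buffer-based slowdown can be sketched with a fixed pool of buffers in plain Java. This is a toy model under assumptions, not Flink's network stack: the `BufferPoolSketch` class and its methods are hypothetical, and one buffer carries exactly one string.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Plain-Java illustration of backpressure through a bounded buffer pool.
class BufferPoolSketch {
    private int emptyBuffers;                                 // buffers available to the sender
    private final Deque<String> inFlight = new ArrayDeque<>(); // full buffers awaiting the receiver

    BufferPoolSketch(int numBuffers) {
        this.emptyBuffers = numBuffers;
    }

    // Sender side: takes an empty buffer and fills it.
    // Returns false when no empty buffer is available, i.e. the sender must slow down.
    boolean trySend(String data) {
        if (emptyBuffers == 0) {
            return false;
        }
        emptyBuffers--;
        inFlight.addLast(data);
        return true;
    }

    // Receiver side: consumes the oldest full buffer and recycles it into the pool.
    // Returns null when nothing is in flight.
    String receive() {
        String data = inFlight.pollFirst();
        if (data != null) {
            emptyBuffers++;
        }
        return data;
    }

    public static void main(String[] args) {
        BufferPoolSketch pool = new BufferPoolSketch(2);
        System.out.println(pool.trySend("a")); // true
        System.out.println(pool.trySend("b")); // true
        System.out.println(pool.trySend("c")); // false: receiver is behind, sender is throttled
        System.out.println(pool.receive());    // a
        System.out.println(pool.trySend("c")); // true: a buffer was recycled
    }
}
```

Because the pool is bounded, a slow receiver throttles the sender without any explicit rate-control message.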
![Page 59: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/59.jpg)
How do latency and throughput affect each other?
[Diagram: senders -> receivers; send buffer when full or on timeout]
• High throughput by batching events in network buffers
• Filling the buffers introduces latency
• Configurable buffer timeout
Setup: 30 machines, one repartition step
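The "send buffer when full or timeout" rule can be sketched in plain Java with explicit timestamps, which keeps the example deterministic. The `TimeoutBuffer` class, its `tick` method, and the millisecond clock passed in by the caller are assumptions for illustration; a real runtime would drive the timeout from its own timer.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of a buffer that is shipped when full or after a timeout.
class TimeoutBuffer {
    private final int capacity;
    private final long timeoutMillis;
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> sent = new ArrayList<>(); // shipped buffers, in order
    private long firstRecordTime; // arrival time of the oldest buffered record

    TimeoutBuffer(int capacity, long timeoutMillis) {
        this.capacity = capacity;
        this.timeoutMillis = timeoutMillis;
    }

    // Adds a record; a full buffer is shipped immediately (throughput path).
    void add(String record, long nowMillis) {
        if (buffer.isEmpty()) {
            firstRecordTime = nowMillis;
        }
        buffer.add(record);
        if (buffer.size() == capacity) {
            flush();
        }
    }

    // Called periodically; ships a partially filled buffer once the oldest
    // record has waited for the timeout (latency bound).
    void tick(long nowMillis) {
        if (!buffer.isEmpty() && nowMillis - firstRecordTime >= timeoutMillis) {
            flush();
        }
    }

    private void flush() {
        sent.add(new ArrayList<>(buffer));
        buffer.clear();
    }

    List<List<String>> sent() {
        return sent;
    }

    public static void main(String[] args) {
        TimeoutBuffer out = new TimeoutBuffer(3, 100);
        out.add("a", 0);
        out.add("b", 1);
        out.add("c", 2);  // buffer full: shipped
        out.add("d", 10);
        out.tick(50);     // timeout not reached yet
        out.tick(110);    // "d" has waited 100 ms: shipped
        System.out.println(out.sent()); // [[a, b, c], [d]]
    }
}
```

A larger timeout favors fuller buffers and throughput; a smaller one bounds how long a record can wait, which is the tradeoff the slide describes.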
![Page 60: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/60.jpg)
Aggregate throughput for stream record grouping
[Bar chart: aggregate throughput in elements per second, axis from 0 to 100,000,000, for Flink with no fault tolerance, Flink with exactly once, Storm with no fault tolerance, and Storm with at least once]
Labeled values:
• Aggregate throughput of 83 million elements per second
• 8.6 million elements/s
• 309k elements/s
Flink achieves 260x higher throughput with fault tolerance
Setup: 30 machines, 120 cores, Google Compute
![Page 61: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/61.jpg)
Performance: Summary
[Diagram: continuous streaming + latency-bound buffering + distributed snapshots -> high throughput & low latency]
With configurable throughput/latency tradeoff
![Page 62: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/62.jpg)
The building blocks: Summary
Windowing / Out of order events
• Tumbling / sliding windows
• Event time / processing time
• Low watermarks for out of order events
State handling
• Managed operator state for backup/recovery
• Large state with RocksDB
• Savepoints for operations
Fault tolerance and correctness
• Exactly-once semantics for managed operator state
• Lightweight, asynchronous distributed snapshotting algorithm
Low latency, high throughput
• Efficient, pipelined runtime
• No per-record operations
• Tunable latency / throughput tradeoff
• Async checkpoints
![Page 63: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/63.jpg)
Low Watermarks
We periodically send low-watermarks through the system to indicate the progression of event time.
[Diagram: out-of-order event stream with interleaved low watermarks; extracted values: 33 11 28 21 15 958]
Guarantee that no event with time <= 5 will arrive afterwards
Window between 0 and 15
Window is evaluated when watermarks arrive
For more details: “MillWheel: Fault-Tolerant Stream Processing at Internet Scale” by T. Akidau et al.
![Page 64: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/64.jpg)
Low Watermarks
[Diagram: an operator with multiple inputs; extracted label: “Operator 35”]
Operators with multiple inputs always forward the lowest watermark
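The "forward the lowest watermark" rule is small enough to sketch in plain Java. The `MinWatermarkTracker` class is a hypothetical illustration, not a Flink class; it assumes inputs are identified by index and that an input that has not yet sent a watermark counts as `Long.MIN_VALUE`.

```java
import java.util.Arrays;

// Plain-Java illustration of combining watermarks from several inputs.
class MinWatermarkTracker {
    private final long[] inputWatermarks;

    MinWatermarkTracker(int numInputs) {
        inputWatermarks = new long[numInputs];
        // An input that has sent no watermark yet holds back the operator.
        Arrays.fill(inputWatermarks, Long.MIN_VALUE);
    }

    // Records a watermark from one input and returns the operator's watermark,
    // which is the minimum over all inputs. Watermarks per input never move backwards.
    long onWatermark(int input, long watermark) {
        inputWatermarks[input] = Math.max(inputWatermarks[input], watermark);
        long min = Long.MAX_VALUE;
        for (long w : inputWatermarks) {
            min = Math.min(min, w);
        }
        return min;
    }

    public static void main(String[] args) {
        MinWatermarkTracker tracker = new MinWatermarkTracker(2);
        System.out.println(tracker.onWatermark(0, 5) == Long.MIN_VALUE); // true: input 1 is silent
        System.out.println(tracker.onWatermark(1, 3)); // 3
        System.out.println(tracker.onWatermark(1, 8)); // 5
        System.out.println(tracker.onWatermark(0, 9)); // 8
    }
}
```

Taking the minimum is what keeps the guarantee intact: the operator can only promise "no earlier events" once every input has promised it.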
![Page 65: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/65.jpg)
Bouygues Telecom
![Page 66: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/66.jpg)
Bouygues Telecom
![Page 67: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/67.jpg)
Bouygues Telecom
![Page 68: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/68.jpg)
Capital One
![Page 69: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/69.jpg)
Fault Tolerance in streaming
Failure with “at least once”: replay
[Diagram: restore from counters 4, 3, 4, 2; final result 7, 5, 9, 7]
![Page 70: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/70.jpg)
Fault Tolerance in streaming
Failure with “exactly once”: state restore
[Diagram: restore from counters 1, 1, 2, 2; final result 4, 3, 7, 7]
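The difference between the two failure modes can be shown with a single counting operator in plain Java. This is a deliberately simplified model, not how Storm or Flink are implemented: the `ReplaySketch` class and its parameters are hypothetical, and the "at least once" branch assumes the counter state survives the failure while the source replays from the last checkpoint.

```java
// Plain-Java illustration of replay with and without restoring operator state.
class ReplaySketch {
    // Counts totalEvents events. A checkpoint is taken after checkpointAt events,
    // and a failure happens after failAfter events have been processed.
    // Precondition: 0 <= checkpointAt <= failAfter <= totalEvents.
    static int countWithFailure(int totalEvents, int checkpointAt, int failAfter,
                                boolean restoreState) {
        int count = 0;
        int checkpointedCount = 0;
        for (int i = 0; i < failAfter; i++) {
            count++;
            if (i + 1 == checkpointAt) {
                checkpointedCount = count; // state stored together with the stream position
            }
        }
        // Failure: the source rewinds to the checkpointed position.
        if (restoreState) {
            // Exactly once: the state is rolled back to the same position.
            count = checkpointedCount;
        }
        // Otherwise (at least once) the state keeps the updates already applied,
        // so replayed events are counted a second time.
        for (int i = checkpointAt; i < totalEvents; i++) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWithFailure(10, 4, 7, true));  // 10
        System.out.println(countWithFailure(10, 4, 7, false)); // 13
    }
}
```

With state restore the result equals the number of events; with replay alone, the three events between checkpoint and failure are counted twice.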
![Page 71: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/71.jpg)
Latency in stream record grouping
[Diagram: Data Generator -> Receiver (throughput / latency measure)]
• Measure time for a record to travel from source to sink
[Bar chart: median latency, axis from 0 to 30, for Flink with no fault tolerance, Flink with exactly once, and Storm with at least once; labeled values: 25 ms and 1 ms]
[Bar chart: 99th percentile latency, axis from 0 to 60, for the same three configurations; labeled value: 50 ms]
![Page 72: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/72.jpg)
Savepoints: Simplifying Operations
Streaming jobs usually run 24x7 (unlike batch).
• Application bug fixes: Replay your job from a certain point in time (savepoint)
• Flink bug fixes
• Maintenance and system migration
• What-If simulations: Run different implementations of your code against a savepoint
![Page 73: Stream Processing with Apache Flink (Flink.tw Meetup 2016/07/19)](https://reader035.vdocuments.site/reader035/viewer/2022062901/58f2a6921a28ab13398b45a3/html5/thumbnails/73.jpg)
Pipelining
Basic building block to “keep the data moving”
• Low latency
• Operators push data forward
• Data shipping as buffers, not tuple-wise
• Natural handling of back-pressure