![Page 1: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/1.jpg)
Paris Carbone - KTH Royal Institute of Technology
Stephan Ewen - data Artisans
Gyula Fóra - King Digital Entertainment Ltd
Seif Haridi - KTH Royal Institute of Technology
Stefan Richter - data Artisans
Kostas Tzoumas - data Artisans
State Management in Apache Flink®
Consistent Stateful Distributed Stream Processing
@vldb17
![Page 2: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/2.jpg)
Overview
• The Apache Flink System Architecture
• Pipelined Consistent Snapshots
• Operations with Snapshots
• Large Scale Deployments and Evaluation
![Page 3: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/3.jpg)
The Apache Flink Framework
• Libraries: SQL, Table, CEP, Graphs, ML
• Core API: DataStream, DataSet
• Runner: Dataflow Runtime
• Setup: Cluster, Backend, Metrics
![Page 4: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/4.jpg)
Distributed Architecture
• Client: optimised logical graph
• Job Manager: scheduling, state partitioning, snapshot coordination
• Zookeeper: passive failover, snapshot metadata
• Task Managers: memory management, local snapshot execution, flow control
• Task Managers run physical long-running tasks with locally managed state
• External Snapshot Store (e.g., HDFS): partial snapshots
![Page 12: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/12.jpg)
Snapshots
1. End-to-End Guarantees
2. Reconfiguration
3. Version Control
4. Isolation
![Page 14: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/14.jpg)
Stateful Processing
• Tasks: invoke per input record
• Tasks read/write managed state through logical operations (collections)
• Local State Backend carries out the physical operations:
  • In-Memory (Heap)
  • Embedded Off-heap + Disk Key-Value Store (RocksDB)
• state = f(input)
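A minimal sketch of the split between logical and physical state operations, under stated assumptions: the `StateBackend` interface, `HeapBackend`, and `CountingTask` are hypothetical names for illustration and are not Flink's API. The task is invoked once per record and only touches state through the backend, so the resulting state is a function of the input seen so far.

```python
from typing import Dict, Iterable, Protocol


class StateBackend(Protocol):
    """Physical operations a local state backend provides (hypothetical interface)."""

    def get(self, key: str) -> int: ...
    def put(self, key: str, value: int) -> None: ...


class HeapBackend:
    """In-memory backend: state lives in a plain dict on the heap."""

    def __init__(self) -> None:
        self._store: Dict[str, int] = {}

    def get(self, key: str) -> int:
        return self._store.get(key, 0)

    def put(self, key: str, value: int) -> None:
        self._store[key] = value


class CountingTask:
    """A task invoked once per input record; it reaches state only via the backend."""

    def __init__(self, backend: StateBackend) -> None:
        self.backend = backend

    def process(self, record: str) -> int:
        # Logical operation: read the count for this key, increment, write it back.
        count = self.backend.get(record) + 1
        self.backend.put(record, count)
        return count


def run(records: Iterable[str]) -> HeapBackend:
    backend = HeapBackend()
    task = CountingTask(backend)
    for record in records:
        task.process(record)
    return backend


final_state = run(["bob", "alice", "bob"])
print(final_state.get("bob"), final_state.get("alice"))  # prints "2 1"
```

An out-of-core backend could implement the same two methods on top of an embedded key-value store without any change to the task.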
![Page 19: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/19.jpg)
• A stream processor with input streams and local states
• Divide computation into epochs
• Capture all local states after completing an epoch
• Can roll back input and state to a captured point in the past
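The epoch idea can be sketched on a single process, assuming a replayable input (a list indexed by offset) and a fixed epoch length. `EPOCH_SIZE`, `process_with_snapshots`, and the `fail_at` parameter are illustrative names, and the failure is simulated.

```python
import copy
from typing import Dict, List, Tuple

EPOCH_SIZE = 2  # records per epoch (illustrative choice)


def process_with_snapshots(
    records: List[str], fail_at: int = -1
) -> Tuple[Dict[str, int], int]:
    """Count records per key, capturing (state, input offset) after each epoch.

    If fail_at >= 0, one failure is simulated just before that record is
    processed, and the processor rolls back to the last captured point.
    """
    state: Dict[str, int] = {}
    snapshot = (copy.deepcopy(state), 0)  # captured state and the offset it reflects
    offset = 0
    rollbacks = 0
    while offset < len(records):
        if offset == fail_at and rollbacks == 0:
            # Roll back both the state and the input position.
            state, offset = copy.deepcopy(snapshot[0]), snapshot[1]
            rollbacks += 1
            continue
        key = records[offset]
        state[key] = state.get(key, 0) + 1
        offset += 1
        if offset % EPOCH_SIZE == 0:
            # Epoch complete: capture the local state together with the offset.
            snapshot = (copy.deepcopy(state), offset)
    return state, rollbacks


records = ["a", "b", "a", "c", "b"]
clean_state, _ = process_with_snapshots(records)
recovered_state, rollbacks = process_with_snapshots(records, fail_at=3)
print(clean_state == recovered_state, rollbacks)  # prints "True 1"
```

Because state and input offset are captured together, the record processed after the last snapshot is replayed once and counted once.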
![Page 25: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/25.jpg)
A Synchronous Approach
• master: drain epoch 1, copy states to the Snapshot Store
• master: drain epoch 2, copy states to the Snapshot Store
![Page 32: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/32.jpg)
Synchronous Snapshots
• In use: Storm Trident and Spark Streaming
• A conservative approach, equivalent to batching
• Can cause unnecessary latency (master coordination)
• Processing is no longer continuous
• Forces many tasks to be idle
• Instead, in Apache Flink snapshots are pipelined
![Page 33: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/33.jpg)
Pipelined Snapshots
• Insert markers
• Epoch alignment
• Async state copy to the Snapshot Store
• Example dataflow with tasks A, B, C, D, E: states reach the Snapshot Store in the order B, A, C, D, E
• Snapshot completes
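Epoch alignment at a task with several inputs can be sketched as follows. This is a simplified simulation, not Flink's implementation: the `MARKER` encoding and the `AligningTask` class are assumptions, records that arrive behind a marker are buffered in the task, and only one marker per channel is handled at a time.

```python
from collections import deque
from typing import Deque, Dict, List, Union

MARKER = "MARKER"  # epoch marker injected at the sources (illustrative encoding)
Item = Union[int, str]


class AligningTask:
    """Sums incoming integers; snapshots the sum once markers arrived on all inputs."""

    def __init__(self, num_inputs: int) -> None:
        self.num_inputs = num_inputs
        self.total = 0
        self.blocked: Dict[int, Deque[int]] = {}  # channel -> records held during alignment
        self.snapshots: List[int] = []
        self.output: List[Item] = []

    def receive(self, channel: int, item: Item) -> None:
        if item == MARKER:
            # Start or continue alignment: stop consuming from this channel.
            self.blocked[channel] = deque()
            if len(self.blocked) == self.num_inputs:
                self._complete_alignment()
        elif channel in self.blocked:
            # The record belongs to the next epoch: hold it until alignment completes.
            self.blocked[channel].append(item)
        else:
            self._process(item)

    def _process(self, record: int) -> None:
        self.total += record
        self.output.append(self.total)

    def _complete_alignment(self) -> None:
        # Every pre-marker record is reflected in the state: capture it, forward the marker.
        self.snapshots.append(self.total)
        self.output.append(MARKER)
        buffered, self.blocked = self.blocked, {}
        for queue in buffered.values():
            for record in queue:
                self._process(record)


task = AligningTask(num_inputs=2)
arrivals = [(0, 1), (0, MARKER), (0, 10), (1, 2), (1, MARKER), (1, 20)]
for channel, item in arrivals:
    task.receive(channel, item)
print(task.snapshots, task.total)  # prints "[3] 33"
```

The snapshot holds 3 (records 1 and 2), even though record 10 arrived before the second marker. Dropping the buffering branch would give the weaker at-least-once variant.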
![Page 43: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/43.jpg)
Pipelined Snapshots (cycles)
Problem: we cannot wait indefinitely for records in cycles
Solution: log the in-flight records within a cycle as part of the snapshot, and replay them upon recovery.
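A rough sketch of logging in-flight records at the entry task of a cycle, under assumptions: `LoopHead`, its two input methods, and the integer-sum state are hypothetical and only illustrate the idea. The task snapshots on the first marker without waiting for the cycle to drain, and logs feedback records until the marker returns over the back-edge.

```python
from typing import List, Optional, Tuple

MARKER = "MARKER"  # epoch marker (illustrative encoding)


class LoopHead:
    """Entry task of a cycle: one regular input and one feedback (back-edge) input."""

    def __init__(self) -> None:
        self.state = 0
        self.logging = False
        self.captured_state = 0
        self.in_flight_log: List[int] = []
        self.snapshot: Optional[Tuple[int, List[int]]] = None

    def on_regular_input(self, item) -> None:
        if item == MARKER:
            # Do not wait for the cycle to drain: capture state now, start logging feedback.
            self.captured_state = self.state
            self.in_flight_log = []
            self.logging = True
        else:
            self.state += item

    def on_feedback(self, item) -> None:
        if item == MARKER:
            # The marker travelled around the cycle: all in-flight records are logged.
            self.snapshot = (self.captured_state, list(self.in_flight_log))
            self.logging = False
        else:
            if self.logging:
                self.in_flight_log.append(item)  # pre-snapshot record still in the cycle
            self.state += item

    def recover(self) -> None:
        # Restore the captured state, then replay the logged in-flight records.
        assert self.snapshot is not None
        self.state, log = self.snapshot
        for record in log:
            self.state += record


head = LoopHead()
head.on_regular_input(5)
head.on_regular_input(MARKER)
head.on_feedback(3)       # in flight when the snapshot started: logged
head.on_feedback(MARKER)  # snapshot of the cycle is now complete
head.on_feedback(4)       # belongs to the next epoch: not logged
head.recover()
print(head.snapshot, head.state)  # prints "(5, [3]) 8"
```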
![Page 46: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/46.jpg)
Technique Highlights
• Offers exactly-once processing guarantees
• Issued periodically or externally by the user
• Naturally respects flow control mechanisms
• Channel state logging limited to cycles only
• Multiple epoch snapshots can be pipelined
• Can offer weaker at-least-once processing guarantees by simply dropping alignment (weaker guarantee vs. no alignment cost)
![Page 47: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/47.jpg)
Snapshot Usages
1. End-to-End Guarantees
2. Reconfiguration
3. Version Control
4. Isolation
![Page 48: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/48.jpg)
Exactly-Once: Input and Processing
Important Assumptions
• Input streams are persisted with offset indexes (e.g., Kafka, Kinesis)
• Data Channels are FIFO and reliable (no loss)
Each epoch either completes or repeats
![Page 49: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/49.jpg)
Exactly-Once Output
• Idempotency ~ repeated operations can be tolerated after recovery/rollback (works for mutable stores).
• Transactional Processing ~ requires a two-phase coordination. A snapshot completion eventually leads to an external commit (e.g., Flink’s HDFS RollingSink*)
• Epoch states: epoch n (in-progress), epoch n-1 (pending), epoch n-2 (pending), epoch n-3 (committed)
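The two-phase coordination can be sketched with an in-memory sink. `TransactionalSink` and its callback names are assumptions for illustration and do not reproduce the RollingSink implementation. Output moves from in-progress to pending when an epoch's snapshot is taken, and is committed only once that snapshot completes.

```python
from typing import Dict, List


class TransactionalSink:
    """Buffers output per epoch and commits it only after the epoch's snapshot completes."""

    def __init__(self) -> None:
        self.in_progress: List[str] = []
        self.pending: Dict[int, List[str]] = {}  # epoch -> output awaiting commit
        self.committed: List[str] = []  # stands in for the external system

    def write(self, record: str) -> None:
        self.in_progress.append(record)

    def on_snapshot(self, epoch: int) -> None:
        # Phase 1 (pre-commit): the epoch's output becomes pending.
        self.pending[epoch] = self.in_progress
        self.in_progress = []

    def on_snapshot_complete(self, epoch: int) -> None:
        # Phase 2 (commit): the snapshot is durable, publish pending epochs up to it.
        for e in sorted(k for k in self.pending if k <= epoch):
            self.committed.extend(self.pending.pop(e))

    def on_rollback(self, last_completed_epoch: int) -> None:
        # Discard output of epochs that will be repeated after recovery.
        self.in_progress = []
        for e in [k for k in self.pending if k > last_completed_epoch]:
            del self.pending[e]


sink = TransactionalSink()
sink.write("a"); sink.write("b")
sink.on_snapshot(1)
sink.write("c")
sink.on_snapshot_complete(1)   # "a" and "b" are committed
sink.on_snapshot(2)
sink.write("d")
sink.on_rollback(1)            # failure: epoch 2 and later output is discarded
sink.write("c"); sink.on_snapshot(2)   # epoch 2 repeats
sink.on_snapshot_complete(2)
print(sink.committed)  # prints "['a', 'b', 'c']"
```

Record "c" is written twice by the pipeline but reaches the external system once.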
![Page 51: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/51.jpg)
Dataflow Reconfiguration
• Running pipeline: snap-1, snap-2
• stop
• change parallelism
• Reconfigured pipeline: snap-3, …
Problem: How is state repartitioned from a snapshot?
![Page 56: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/56.jpg)
Reconfiguration: The Issue
Snapshot contents: 0x100: bob … … … … 0x449: alice
• Case I (full scan): Scan Remote Storage for Responsible Keys. Too slow.
• Case II: Include Key Locations in Snapshot Metadata (bob: 0x100, carol: 0x344, …; alice: 0x449, chuck: 0x630, …). Too much metadata.
![Page 61: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/61.jpg)
Reconfiguration: Key Groups
Pre-partition state in hash(K) space, into key-groups (e.g., bob, …, alice)
• Snapshot Metadata: Contains a reference per stored Key-Group (less metadata)
• Reconfiguration: Contiguous key-group allocation to available tasks (less IO)
Note: the number of key groups controls the trade-off between the metadata to keep and reconfiguration speed
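A small sketch of key-group assignment, under assumptions: `NUM_KEY_GROUPS = 8` is an illustrative value, `zlib.crc32` stands in for whatever hash function the system uses, and the range formula is one simple way to obtain contiguous allocations. Keys are hashed into a fixed number of key groups, and each task owns a contiguous range of groups at any parallelism.

```python
import zlib
from typing import Dict, List

NUM_KEY_GROUPS = 8  # fixed for the lifetime of the job (illustrative value)


def key_group_of(key: str) -> int:
    """Map a key into hash(K) space, then into one of the key groups."""
    return zlib.crc32(key.encode("utf-8")) % NUM_KEY_GROUPS


def task_of(key_group: int, parallelism: int) -> int:
    """Assign contiguous ranges of key groups to tasks (parallelism <= NUM_KEY_GROUPS)."""
    return key_group * parallelism // NUM_KEY_GROUPS


def assignment(parallelism: int) -> Dict[int, List[int]]:
    """Key groups owned by each task at the given parallelism."""
    owned: Dict[int, List[int]] = {task: [] for task in range(parallelism)}
    for group in range(NUM_KEY_GROUPS):
        owned[task_of(group, parallelism)].append(group)
    return owned


print(assignment(2))  # {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
print(assignment(3))  # {0: [0, 1, 2], 1: [3, 4, 5], 2: [6, 7]}
```

When parallelism changes from 2 to 3, the key group of a key does not change. Each new task only needs the snapshot references for its own range of groups, which is why no full scan or per-key index is required.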
![Page 65: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/65.jpg)
Version Control
• Pipeline v.1
• fork and update: Pipeline v.2, Pipeline v.3
![Page 72: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/72.jpg)
Isolation Levels
External query: select from facebook.userID, clients.name … inner join clients on …
• read-committed (snapshot)
• read-uncommitted (dirty read on latest state)
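The two isolation levels can be illustrated with a toy state holder. `QueryableState` and its methods are hypothetical and not a Flink API. A read-committed query is served from the last completed snapshot, while a read-uncommitted query sees the latest live state.

```python
import copy
from typing import Dict


class QueryableState:
    """Live operator state plus the last completed snapshot of it."""

    def __init__(self) -> None:
        self.live: Dict[str, int] = {}
        self.committed: Dict[str, int] = {}

    def update(self, key: str, value: int) -> None:
        self.live[key] = value

    def complete_snapshot(self) -> None:
        # The committed view changes only when a snapshot completes.
        self.committed = copy.deepcopy(self.live)

    def read(self, key: str, isolation: str = "read-committed") -> int:
        if isolation == "read-committed":
            source = self.committed
        elif isolation == "read-uncommitted":
            source = self.live  # dirty read on the latest state
        else:
            raise ValueError(f"unknown isolation level: {isolation}")
        return source.get(key, 0)


state = QueryableState()
state.update("bob", 1)
state.complete_snapshot()
state.update("bob", 2)  # not yet covered by a completed snapshot
print(state.read("bob"), state.read("bob", "read-uncommitted"))  # prints "1 2"
```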
![Page 74: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/74.jpg)
Large Scale Deployment at King
[Figure: Total Snapshotting Time (sec) vs. Global State Size (GB). Total time per snapshot (alignment + async copies), ~runtime overhead.]
[Figure: Total Alignment Time (msec) vs. Parallelism (30, 50, 70) for PROC, WIN, OUT. Alignment cost.]
• #shuffles (keyby)
• parallelism
![Page 80: State Management in Apache Flink : Consistent Stateful Distributed Stream Processing](https://reader036.vdocuments.site/reader036/viewer/2022062311/5aac21b87f8b9adb278b48ab/html5/thumbnails/80.jpg)
Teaser: More paper highlights
• We can use the same technique to coordinate externally managed state with snapshots.
• Epoch markers can act as on-the-fly reconfiguration points.
• Internals of asynchronous and incremental snapshots.