Apache Kafka Lightning Talk
DESCRIPTION
This short deck provides a high-level overview of how Apache Kafka works under the covers. It covers logs, topics, partitions, consumer groups, and replication from a conceptual perspective.
TRANSCRIPT
Apache Kafka
Jeff Kunkle
Nov 22, 2013
Tuesday, December 3, 13
What is Kafka?
Kafka is a distributed, partitioned, replicated commit log service.
... a non-JMS messaging system with a unique design.
Key Concepts
1 Topics, Logs, and Partitions
2 Consumer Groups
3 Replication
Topics and Logs
[Diagram: a topic split into partition 0, partition 1, and partition 2. Each partition is a log of numbered messages ordered from old to new (1 to 7, 1 to 6, and 1 to 5 respectively). Writes are added at the new end.]
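The partitioned log on this slide can be sketched in a few lines. This is a minimal illustration, assuming a topic is simply a list of append-only logs, one per partition. The `Topic` class and its `append` method are hypothetical names for this example and are not Kafka's API.

```python
class Topic:
    """Toy in-memory topic: one append-only list (log) per partition."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        """Write a message at the new end of a partition's log.

        Returns the zero-based position of the message in that log.
        """
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1


topic = Topic(num_partitions=3)
first = topic.append(0, "a")    # oldest message in partition 0
second = topic.append(0, "b")   # newer message, written after "a"
other = topic.append(1, "c")    # partition 1 keeps its own ordering
print(topic.partitions)
```

Ordering is kept only within a single partition; nothing in this sketch orders messages across partitions.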
[Diagram: Producers. P1 and P2 publish messages to partition 0, partition 1, and partition 2. Successive frames are labeled "bulk publishing" and "parallel publishing".]
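Bulk and parallel publishing can be sketched roughly as follows. The slides do not say how a producer picks a partition, so the key-hash partitioner here is an assumption, and `choose_partition` and `publish_bulk` are hypothetical helpers rather than Kafka client calls.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3
partitions = [[] for _ in range(NUM_PARTITIONS)]  # toy logs, one per partition


def choose_partition(key):
    """Assumed partitioner: stable hash of the key modulo the partition count."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def publish_bulk(messages):
    """Group (key, value) pairs by partition, then append each group in one write."""
    batches = defaultdict(list)
    for key, value in messages:
        batches[choose_partition(key)].append(value)
    for partition, batch in batches.items():
        partitions[partition].extend(batch)  # one bulk append per partition
    return {partition: len(batch) for partition, batch in batches.items()}


# Two producers (P1, P2) publish independently; here they simply run one after the other.
sent_p1 = publish_bulk([("user-1", "a"), ("user-2", "b"), ("user-1", "c")])
sent_p2 = publish_bulk([("user-3", "d")])
print(partitions)
```

Because each partition is a separate log, producers writing to different partitions do not have to coordinate with each other, which is what makes the publishing parallel.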
[Diagram: Consumers. Consumer group 1 and consumer group 2, made up of consumers C1 to C4, read messages from partition 0, partition 1, and partition 2. Successive frames are labeled "bulk consumption" and "parallel consumption".]
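One way to picture consumer groups: every group reads all partitions, and within a group each partition is read by exactly one consumer. The sketch below assumes a simple round-robin split, which is for illustration only and is not necessarily the strategy Kafka uses. The group membership and the `assign_partitions` helper are also assumptions.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer in the group (round-robin)."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2]
# Which consumers belong to which group is an assumption made for this example.
groups = {
    "consumer group 1": ["C1", "C2"],
    "consumer group 2": ["C3", "C4"],
}
assignments = {
    name: assign_partitions(partitions, members)
    for name, members in groups.items()
}
print(assignments)
```

Each group ends up covering partitions 0, 1, and 2 on its own, so both groups see every message, while the consumers inside a group share the work.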
Replication
[Diagram: server A, server B, and server C. Each server is the master for its own log: A master (all reads and writes), B master (all reads and writes), C master (all reads and writes). The logs hold messages A1 to A7, B1 to B6, and C1 to C5. Successive frames show copies of each log filling in message by message as replication proceeds. Labels P1 and P2 also appear.]
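The master arrangement on the replication slides can be sketched as a toy model. It assumes one master copy per partition that takes all reads and writes, with follower copies on the other servers catching up afterwards. The `ReplicatedPartition` class and the choice of which servers hold followers are hypothetical.

```python
class ReplicatedPartition:
    """Toy partition: one master copy for all reads and writes, plus follower copies."""

    def __init__(self, master, followers):
        self.master = master
        self.logs = {server: [] for server in [master] + followers}

    def write(self, message):
        """Writes go to the master's log only."""
        self.logs[self.master].append(message)

    def read(self, offset):
        """Reads are served from the master's log only."""
        return self.logs[self.master][offset]

    def replicate(self):
        """Followers copy any messages they are missing from the master, in order."""
        master_log = self.logs[self.master]
        for server, log in self.logs.items():
            if server != self.master:
                log.extend(master_log[len(log):])


partition_a = ReplicatedPartition(master="server A", followers=["server B", "server C"])
for message in ["A1", "A2", "A3"]:
    partition_a.write(message)
behind = list(partition_a.logs["server B"])  # followers lag until they replicate
partition_a.replicate()
print(partition_a.logs)
```

This sketch leaves out failure handling entirely; it only shows that followers hold the same messages in the same order as the master once they have caught up.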
1 Messages in a partition are assigned a sequential id number, referred to as the offset.
2 Consumers keep track of their consumed offset in the partition. Kafka doesn’t maintain any metadata about what has been consumed.
3 Kafka maintains all messages for a configurable time, regardless of whether they’ve been consumed.
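These three points can be put together in a small sketch. It assumes timestamps are passed in explicitly so the example is deterministic. `RetainedLog` and its methods are hypothetical names, not Kafka's API, and the 60-second retention is an arbitrary value.

```python
class RetainedLog:
    """Toy partition log that keeps messages for a fixed time, consumed or not."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.next_offset = 0
        self.entries = []  # (offset, timestamp, message), oldest first

    def append(self, message, now):
        """Assign the next sequential offset to the message and store it."""
        offset = self.next_offset
        self.entries.append((offset, now, message))
        self.next_offset += 1
        return offset

    def expire(self, now):
        """Drop entries older than the retention window; consumption plays no part."""
        self.entries = [e for e in self.entries if now - e[1] <= self.retention_seconds]

    def read_from(self, offset):
        """Return (offset, message) pairs at or after the given offset."""
        return [(o, m) for o, _, m in self.entries if o >= offset]


log = RetainedLog(retention_seconds=60)
log.append("a", now=0)
log.append("b", now=30)
log.append("c", now=50)

consumer_offset = 0                  # the consumer, not the log, remembers this
batch = log.read_from(consumer_offset)
consumer_offset = batch[-1][0] + 1   # advance past the last message read

log.expire(now=70)                   # "a" is now older than 60 seconds
remaining = log.read_from(0)
print(batch, consumer_offset, remaining)
```

Since the log never records who has read what, a consumer can rewind by lowering its own offset, as long as the messages are still within the retention window.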
Alternatives
http://queues.io
Questions?