![Page 1: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/1.jpg)
How we used Kafka to scale our Database Infrastructure
Basavaiah Thambara (Basu), Staff Site Reliability Engineer
( https://www.linkedin.com/in/basavaiaht )
![Page 2: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/2.jpg)
Today’s agenda
Introduction to Espresso
Espresso - Replication
Espresso with MySQL Replication
Espresso with Kafka Replication
Advantages of Using Kafka
How Kafka Based Replication Works
Conclusion & References
![Page 3: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/3.jpg)
Espresso
Document store MySQL RDBMS & k-v Stores
Consistency & Partition tolerance
![Page 4: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/4.jpg)
Espresso : Features
Multi-colo writes
Bulk import export
Secondary Indexing
Schema Evolution
Change data capture
![Page 5: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/5.jpg)
Espresso : Use Cases
LinkedIn Profiles
LinkedIn Invitations
LinkedIn InMails, etc.
![Page 6: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/6.jpg)
Espresso : Current Scale
O(100) Clusters
O(10K) Servers
O(100) Databases
O(PB) Data
O(M) Peak QPS
![Page 7: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/7.jpg)
Espresso : Basic Architecture
● Client/Application
● Router
● Helix
● Zookeeper
● Storage node
![Page 8: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/8.jpg)
Espresso : Replication Requirements
Read Scaling
Backups
High Availability
Disaster Recovery
Multi-colo writes
![Page 9: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/9.jpg)
Espresso : Local Replication
● MySQL Replication
● 3 Copies
● Per Node Replication
● Node Failure
[Diagram: Legacy architecture with master-slave replication. Nodes 1-3 each hold partitions P1 P2 P3, Nodes 4-6 each hold P4 P5 P6, and Nodes N-2 to N each hold Q1 Q2 Q3.]
![Page 10: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/10.jpg)
Espresso : Cross Colo Replication (Legacy)
● Databus
● Data Replicator
● Colo failure
[Diagram: Online Data Center and Remote Data Center, each with Client, Router, three Storage Nodes (each with an API Server), Databus, and Data Replicator.]
![Page 11: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/11.jpg)
Limitations : Per Instance Replication
Poor Resource Utilization
Cross Colo Replication (Legacy)
![Page 12: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/12.jpg)
Limitations : Per Instance Replication
● Databus
■ tightly coupled to storage node
■ operational complexity
■ Uses SSD, higher cost to serve
● Cluster expansion is painful
■ Lots of manual steps
■ Needs databus expansion
■ Requires downtime
[Diagram: Online Data Center and Remote Data Center, each with Client, Router, three Storage Nodes (each with an API Server), Databus, and Data Replicator.]
![Page 13: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/13.jpg)
Limitations: Per Instance Replication
● Upon master failure, single node gets traffic
● Human intervention to bring up slaves
● Slave-less situation might lead to outage
[Diagram: Master Failure and Slaves Failure scenarios on Nodes 1-3 hosting partitions P1, P2, P3, with master and slave roles marked and failed nodes shown as Offline.]
![Page 14: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/14.jpg)
Espresso : Replication Using Kafka
● Per partition replication
● Flexible partition placement
● Every node serves traffic
● Data replicator uses kafka
New architecture
![Page 15: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/15.jpg)
Advantages: Per Partition Replication
1. Better resource utilization.
![Page 16: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/16.jpg)
Advantages: Per Partition Replication
1. Better resource utilization.
Resource Utilization (Legacy)
Resource Utilization ( )
![Page 17: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/17.jpg)
Advantages: Per Partition Replication
1. Better resource utilization.
2. Easy cluster expansion.
![Page 18: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/18.jpg)
Cluster Expansion
Initial cluster state with 12 partitions, 3 storage nodes, replication factor=3
![Page 19: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/19.jpg)
Cluster Expansion
Adding a node: Helix sends the Offline-to-Slave transition to the new node
![Page 20: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/20.jpg)
Cluster Expansion
Once partitions on the new node are ready, transfer ownership and drop the old copies
![Page 21: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/21.jpg)
Cluster Expansion
Cluster state after expansion with 12 partitions, 4 storage nodes, replication factor=3
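The expansion steps on these slides can be sketched roughly as below. This is a toy model, not Helix's actual rebalancer: the `place`, `load`, and `expand` helpers and the node names `n1`..`n4` are hypothetical, and the "bootstrap on the new node, then drop the old copy" step is reduced to swapping an owner in a list.

```python
def place(partitions, nodes, replicas):
    """Round-robin placement: replica i of partition p goes to node (p + i) mod N."""
    return {p: [nodes[(p + i) % len(nodes)] for i in range(replicas)]
            for p in range(partitions)}

def load(placement):
    """Count how many partition replicas each node hosts."""
    counts = {}
    for owners in placement.values():
        for node in owners:
            counts[node] = counts.get(node, 0) + 1
    return counts

def expand(placement, new_node):
    """Move one replica at a time from the busiest owner to new_node until balanced."""
    result = {p: list(owners) for p, owners in placement.items()}
    counts = load(result)
    counts[new_node] = 0
    target = sum(counts.values()) // len(counts)
    moves = []
    for p in sorted(result):
        if counts[new_node] >= target:
            break
        owners = result[p]
        donor = max(owners, key=lambda n: counts[n])
        if new_node in owners or counts[donor] <= target:
            continue
        # New copy is brought up on new_node first, then the donor's copy is dropped.
        owners[owners.index(donor)] = new_node
        counts[donor] -= 1
        counts[new_node] += 1
        moves.append((p, donor, new_node))
    return result, moves

before = place(12, ["n1", "n2", "n3"], 3)
after, moves = expand(before, "n4")
print(load(before))  # 12 replicas on each of the 3 nodes
print(load(after))   # 9 replicas on each of the 4 nodes
```

Only the replicas that move to the new node are touched; the remaining copies stay where they are, which is the property that makes per-partition expansion cheaper than re-cloning whole instances.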
![Page 22: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/22.jpg)
Advantages : Per Partition Replication
Node failure
■ parallel mastership handoff
■ parallel restore of slaves
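One way to picture why per-partition placement allows mastership handoff to be spread out is the sketch below. It is only an illustration under assumptions: the first entry in each owner list is treated as the master, and the "pick the surviving replica with the fewest masters" rule is a stand-in, not the policy Helix actually applies.

```python
def masters_by_node(placement):
    """Count partitions mastered by each node (first owner is the master)."""
    counts = {}
    for owners in placement.values():
        counts[owners[0]] = counts.get(owners[0], 0) + 1
    return counts

def fail_node(placement, failed):
    """Drop the failed node and promote a surviving replica for each orphaned partition."""
    counts = masters_by_node(placement)
    counts.pop(failed, None)
    result = {}
    for p, owners in sorted(placement.items()):
        survivors = [n for n in owners if n != failed]
        if owners[0] == failed:
            # Promote the least-loaded surviving replica to master.
            new_master = min(survivors, key=lambda n: counts.get(n, 0))
            counts[new_master] = counts.get(new_master, 0) + 1
            survivors.remove(new_master)
            survivors.insert(0, new_master)
        result[p] = survivors
    return result

nodes = ["n1", "n2", "n3", "n4"]
# 12 partitions, replication factor 3, replicas spread round-robin over 4 nodes.
placement = {p: [nodes[(p + i) % 4] for i in range(3)] for p in range(12)}
after_failure = fail_node(placement, "n1")
print(masters_by_node(placement))
print(masters_by_node(after_failure))
```

In this toy layout the three partitions mastered by `n1` are taken over by two different survivors rather than by a single node, and each handoff is independent of the others.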
![Page 23: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/23.jpg)
Advantages: Per Partition Replication
1. Better resource utilization.
2. Easy cluster expansion.
3. No human intervention.
![Page 24: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/24.jpg)
Advantages: Per Partition Replication
4. Databus complexity eliminated.
5. Cost savings.
6. Single platform.
■ Internal replication
■ Cross colo replication
■ Change capture for nearline
![Page 25: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/25.jpg)
Implementing Kafka based replication
1. Requirements
2. Solution
● Broker and producer config
● Implement
![Page 26: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/26.jpg)
Implementing Kafka based replication
1. Requirements
● Guaranteed Delivery
● Exactly Once (sort of)
● In-Order
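A common way to get "exactly once (sort of)" on top of guaranteed, in-order delivery is at-least-once delivery plus an idempotent apply on the consumer. The sketch below assumes that reading of the slide; the `Replica` class and the integer sequence numbers are hypothetical stand-ins for whatever identifier the real system attaches to each change.

```python
class Replica:
    """Toy consumer that applies each change at most once, in order."""

    def __init__(self):
        self.last_applied = 0   # highest sequence number already applied
        self.rows = {}

    def apply(self, seq, key, value):
        if seq <= self.last_applied:
            return False        # duplicate delivery (for example after a retry): skip
        if seq != self.last_applied + 1:
            raise ValueError(f"gap: expected {self.last_applied + 1}, got {seq}")
        self.rows[key] = value
        self.last_applied = seq
        return True

# Message 2 is delivered twice, as can happen when a producer retries.
stream = [(1, "a", "v1"), (2, "b", "v2"), (2, "b", "v2"), (3, "a", "v3")]
replica = Replica()
applied = sum(replica.apply(*message) for message in stream)
print(applied, replica.rows)
```

Duplicates are dropped by comparing against the last applied sequence number, so a redelivered message has no effect, while a gap is treated as an error because it would violate the in-order requirement.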
![Page 27: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/27.jpg)
Implementing Kafka based replication
2. Solution
● Broker and producer config
● Implement
Broker config
● Kafka broker config
■ replication factor = 3
■ min.isr = 2
■ Disabled unclean leader elections
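For reference, the broker settings above could be written with Kafka's property names roughly as follows. The mapping from the slide's shorthand to `default.replication.factor`, `min.insync.replicas`, and `unclean.leader.election.enable` is an assumption (the replication factor may well have been set per topic instead), and the helper function is hypothetical.

```python
# Assumed mapping of the slide's shorthand to Kafka property names.
BROKER_CONFIG = {
    "default.replication.factor": 3,          # "replication factor = 3"
    "min.insync.replicas": 2,                 # "min.isr = 2"
    "unclean.leader.election.enable": False,  # "Disabled unclean leader elections"
}

def tolerated_broker_failures(config):
    """Replicas that can drop out while acks=all writes are still accepted."""
    return config["default.replication.factor"] - config["min.insync.replicas"]

tolerated = tolerated_broker_failures(BROKER_CONFIG)
print(tolerated)
```

With three replicas and a minimum in-sync set of two, one replica can be lost without rejecting writes, and disabling unclean leader election prevents an out-of-sync replica from becoming leader and silently discarding acknowledged messages.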
![Page 28: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/28.jpg)
Implementing Kafka based replication
2. Solution
● Broker and producer config
● Implement
Producer Config
● acks = “all”
● Infinite retries
● block.on.buffer.full = true
● max.in.flight.requests.per.connection = 1
● linger.ms = 0
● on non-retryable exception
■ destroy producer
■ create new producer
■ resume from last checkpoint
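The "destroy producer, create new producer, resume from last checkpoint" handling can be sketched as below. Nothing here talks to Kafka: `FlakyProducer`, `FatalSendError`, and `replicate` are hypothetical stand-ins, the config dictionary only restates the slide's settings (with "infinite retries" approximated by a large integer), and checkpointing after every message is a simplification.

```python
# The slide's producer settings, restated for illustration only.
PRODUCER_CONFIG = {
    "acks": "all",
    "retries": 2**31 - 1,  # assumption: "infinite" approximated by max int
    "block.on.buffer.full": True,
    "max.in.flight.requests.per.connection": 1,
    "linger.ms": 0,
}

class FatalSendError(Exception):
    """Stand-in for a non-retryable producer exception."""

class FlakyProducer:
    """Hypothetical producer that fails once at a chosen sequence number."""
    def __init__(self, log, fail_at=None):
        self.log = log
        self.fail_at = fail_at

    def send(self, seq, payload):
        if seq == self.fail_at:
            raise FatalSendError(seq)
        self.log.append((seq, payload))

    def close(self):
        pass

def replicate(events, make_producer):
    """Send events in order; on a fatal error, rebuild the producer and resume."""
    checkpoint = 0                      # last sequence number known to be sent
    producer = make_producer()
    restarts = 0
    while checkpoint < len(events):
        seq = checkpoint + 1
        try:
            producer.send(seq, events[seq - 1])
            checkpoint = seq
        except FatalSendError:
            producer.close()            # destroy producer
            producer = make_producer()  # create new producer
            restarts += 1               # loop resumes from the last checkpoint
    return restarts

log = []
failures = iter([3])  # the first producer fails when sending sequence 3

def make_producer():
    return FlakyProducer(log, fail_at=next(failures, None))

restarts = replicate(["a", "b", "c", "d"], make_producer)
print(restarts, log)
```

Limiting in-flight requests to one keeps retries from reordering messages. If checkpoints were taken less often than every message, resuming would resend some messages, so the consumer side would need to discard duplicates.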
![Page 29: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/29.jpg)
Global Transaction Identifier
● Global transaction identifier (GTID)
● Unique
![Page 30: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/30.jpg)
Replication flow
![Page 31: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/31.jpg)
Message protocol
MySQL MySQL
![Page 32: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/32.jpg)
Message protocol - Mastership Handoff
MySQL MySQL
![Page 33: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/33.jpg)
Message protocol - Mastership Handoff
MySQL MySQL
![Page 34: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/34.jpg)
Message protocol - Mastership Handoff
MySQL MySQL
![Page 35: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/35.jpg)
Checkpointing - Producer
MySQL MySQL
![Page 36: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/36.jpg)
Checkpointing - Producer ...
MySQL MySQL
![Page 37: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/37.jpg)
Checkpointing - Consumer
MySQL MySQL
3:101@2
![Page 38: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/38.jpg)
Producer Failure
MySQL MySQL
3:101@2
![Page 39: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/39.jpg)
Producer Failure...
MySQL MySQL
![Page 40: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/40.jpg)
Producer Failure...
MySQL MySQL
![Page 41: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/41.jpg)
Producer Failure...
MySQL MySQL
3:103@6
![Page 42: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/42.jpg)
Producer Failure...
MySQL
3:103@6
MySQL
![Page 43: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/43.jpg)
Producer Failure...
MySQL MySQL
3:103@6
![Page 44: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/44.jpg)
Producer Failure...
MySQL
3:103@6
MySQL
![Page 45: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/45.jpg)
Zombie Writes
MySQL MySQL
![Page 46: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/46.jpg)
Zombie Writes...
MySQL MySQL
![Page 47: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/47.jpg)
Zombie Writes...
MySQL MySQL
![Page 48: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/48.jpg)
Zombie Writes...
MySQL MySQL
![Page 49: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/49.jpg)
Zombie Writes...
MySQL MySQL
![Page 50: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/50.jpg)
Conclusion
● LinkedIn leveraged Kafka to scale Espresso
● Kafka helped unify data pipelines
● Reduced operational complexity
● Saved $$$
![Page 51: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/51.jpg)
References
1. https://engineering.linkedin.com/espresso/introducing-espresso-linkedins-hot-new-distributed-document-store
2. https://engineering.linkedin.com/blog/2016/04/kafka-ecosystem-at-linkedin
3. https://www.slideshare.net/ConfluentInc/espresso-database-replication-with-kafka-tom-quiggle
4. https://www.slideshare.net/JiangjieQin/no-data-loss-pipeline-with-apache-kafka-49753844
![Page 52: How we used Kafka to scale our Database Infrastructure...Today’s agenda Introduction to Espresso Espresso - Replication Espresso with MySQL Replication Espresso with Kafka Replication](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec6b4c47965b564650c51d3/html5/thumbnails/52.jpg)
Q&A?