![Page 1: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/1.jpg)
Apache BookKeeper
A High Performance and Low Latency Storage Service
@sijieg (Sijie Guo, Twitter)
@jvjujjuri (JV, Salesforce)
![Page 2: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/2.jpg)
Hello!
I am Sijie Guo
- PMC Chair of Apache BookKeeper
- Co-creator of Apache DistributedLog
- Twitter Messaging/Pub-Sub Team
- Yahoo! R&D Beijing
![Page 3: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/3.jpg)
Challenges in Distributed Systems
![Page 4: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/4.jpg)
Expect Failures
up to 10% annual failure rates for disks/servers
![Page 5: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/5.jpg)
Symptoms
![Page 6: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/6.jpg)
Problem 1: Not Available
![Page 7: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/7.jpg)
Problem 1: Not Available
![Page 8: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/8.jpg)
Problem 2: Inconsistencies
![Page 9: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/9.jpg)
CAP
![Page 10: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/10.jpg)
More Issues
![Page 11: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/11.jpg)
Problem 3: Split Brain
[Diagram: Two Writers, both labeled Writer A, each issuing Write A’]
![Page 12: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/12.jpg)
Problem 4: Failure Detection
[Diagram: nodes A, B, and C]
![Page 13: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/13.jpg)
Problem 5: Recovery
[Diagram: nodes A, B, and C; labels: Recovery Protocol, Consistency]
![Page 14: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/14.jpg)
Solutions
![Page 15: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/15.jpg)
Overview
Enter Apache BookKeeper
![Page 16: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/16.jpg)
BookKeeper - Durable Storage
A Durable Storage Optimized for Immutable Data
Serve as a building block for reliable systems
Commodity Hardware
Durability
Replication, Consistency, Recovery
Client Library
![Page 17: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/17.jpg)
Immutable Data Abstraction
![Page 18: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/18.jpg)
Ledger
◉ Segment
◉ Block / Object
◉ Append-Only File
◉ ...
![Page 19: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/19.jpg)
Guarantees
◉ If an entry has been acknowledged, it must be readable
◉ If an entry is read once, it must always be readable
![Page 20: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/20.jpg)
History
◉ Initial Use Case - Hadoop NameNode HA
◉ 2008: Open sourced as a ZooKeeper contrib
◉ 2011: Sub-Project of ZooKeeper
◉ 2012: Yahoo! Push Notification
◉ 2012~Now: DistributedLog, Pulsar, Majordodo
◉ 2015~Now: Salesforce Distributed Store
![Page 21: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/21.jpg)
Inside of Apache BookKeeper
Details
![Page 22: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/22.jpg)
Architecture
[Diagram components: APP, Client, Bookie (×3), Metadata Store, Ledger]
![Page 23: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/23.jpg)
Reliable Writes
◉ Store checksum along with entry
◉ Fsync entries before responding
◉ Ack when accepted by quorum
○ All Previous Entries
○ This Entry
[Diagram: three Bookies]
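The ack rule on this slide (an entry is acknowledged only once it and all previous entries are accepted by a quorum) can be sketched as follows. This is a minimal in-memory illustration, not BookKeeper client code: `QuorumAckTracker`, `on_bookie_ack`, and the bookie names are hypothetical, and checksums and fsync are left out.

```python
from dataclasses import dataclass, field


@dataclass
class QuorumAckTracker:
    """Hypothetical helper: counts bookie acks and confirms entries in order."""
    ack_quorum: int
    last_add_confirmed: int = -1
    acks: dict = field(default_factory=dict)  # entry_id -> set of bookie names

    def on_bookie_ack(self, entry_id, bookie):
        """Record one bookie ack; return entry ids that may now be acked to the writer."""
        if entry_id <= self.last_add_confirmed:
            return []  # late ack for an entry that is already confirmed
        self.acks.setdefault(entry_id, set()).add(bookie)
        newly_acked = []
        # Advance strictly in order: an entry is only acknowledged when every
        # earlier entry has also reached the ack quorum.
        nxt = self.last_add_confirmed + 1
        while len(self.acks.get(nxt, ())) >= self.ack_quorum:
            newly_acked.append(nxt)
            del self.acks[nxt]
            self.last_add_confirmed = nxt
            nxt += 1
        return newly_acked
```

With an ack quorum of 2, entry 1 reaching quorum first produces no ack; once entry 0 also reaches quorum, both 0 and 1 are released together.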
![Page 24: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/24.jpg)
Consistency - LastAddPushed
[Diagram: ledger entries numbered up to 12; the Writer adds entries; a LastAddPushed pointer marks a position in the sequence]
![Page 25: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/25.jpg)
Consistency - LastAddConfirmed
[Diagram: ledger entries 0 to 12 with LastAddConfirmed pointers; labels: Writer (Add entries, Ack Adds), Reader, Ownership Changed, Fencing]
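One way to picture the two pointers from these slides: the writer advances LastAddPushed as it sends entries, and a tailing reader only consumes entries up to LastAddConfirmed. The sketch below is a single-process illustration under that assumption; `LedgerView` and `TailingReader` are hypothetical names, not BookKeeper classes.

```python
class LedgerView:
    """Hypothetical in-memory ledger with the two pointers from the slides."""

    def __init__(self):
        self.entries = []             # payloads; list index == entry id
        self.last_add_pushed = -1     # highest entry id the writer has sent
        self.last_add_confirmed = -1  # highest entry id acknowledged in order

    def push(self, payload):
        self.entries.append(payload)
        self.last_add_pushed += 1
        return self.last_add_pushed

    def confirm_up_to(self, entry_id):
        if entry_id > self.last_add_pushed:
            raise ValueError("cannot confirm an entry that was never pushed")
        self.last_add_confirmed = max(self.last_add_confirmed, entry_id)


class TailingReader:
    """Reads only entries at or below LastAddConfirmed."""

    def __init__(self, ledger):
        self.ledger = ledger
        self.next_entry = 0

    def read_new(self):
        lac = self.ledger.last_add_confirmed
        batch = self.ledger.entries[self.next_entry:lac + 1]
        self.next_entry = max(self.next_entry, lac + 1)
        return batch
```

Entries that are pushed but not yet confirmed stay invisible to the reader, which keeps readers from seeing data that might still be lost.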
![Page 26: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/26.jpg)
Fencing
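A minimal sketch of the fencing idea, assuming three bookies and an ack quorum of 2: once a new owner has fenced the ledger on enough bookies, the old writer can no longer collect a quorum of acks. `Bookie`, `LedgerFencedError`, and `quorum_add` are hypothetical names for illustration and do not mirror the real BookKeeper protocol messages.

```python
class LedgerFencedError(Exception):
    """Raised when a bookie rejects a write to a fenced ledger."""


class Bookie:
    def __init__(self, name):
        self.name = name
        self.fenced_ledgers = set()
        self.entries = {}  # (ledger_id, entry_id) -> payload

    def fence(self, ledger_id):
        # After fencing, this bookie refuses further adds to the ledger.
        self.fenced_ledgers.add(ledger_id)

    def add_entry(self, ledger_id, entry_id, payload):
        if ledger_id in self.fenced_ledgers:
            raise LedgerFencedError(f"ledger {ledger_id} is fenced on {self.name}")
        self.entries[(ledger_id, entry_id)] = payload


def quorum_add(bookies, ledger_id, entry_id, payload, ack_quorum):
    """Return True only if at least ack_quorum bookies accepted the entry."""
    acks = 0
    for bookie in bookies:
        try:
            bookie.add_entry(ledger_id, entry_id, payload)
            acks += 1
        except LedgerFencedError:
            pass  # a fenced bookie never acks the old writer
    return acks >= ack_quorum
```

In this model, fencing two of three bookies is enough: the old writer can get at most one ack, so its adds are never acknowledged and the split-brain case of two active writers is avoided.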
![Page 27: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/27.jpg)
Read Entry & Read LAC
◉ Read Entry K: Client reads from bookies B1, B2, B3, with speculative reads on timeouts
◉ Read LAC: Client performs a quorum read across B1, B2, B3
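The speculative read on timeouts can be modeled as a simple timeline: the client asks the first bookie, and each time the speculative timeout passes without a reply it also asks the next one, taking whichever reply arrives first. The function below is a deterministic sketch under that assumption; `speculative_read` and its parameters are hypothetical, and latencies are supplied as inputs rather than measured.

```python
def speculative_read(replica_latencies_ms, speculative_timeout_ms):
    """Return (replica_index, completion_time_ms) of the first reply, or None.

    replica_latencies_ms[i] is how long replica i takes to answer once asked;
    None means that replica never answers.
    """
    best = None
    for i, latency in enumerate(replica_latencies_ms):
        send_time = i * speculative_timeout_ms
        # Stop issuing extra requests once a reply has already arrived.
        if best is not None and best[1] <= send_time:
            break
        if latency is None:
            continue
        done = send_time + latency
        if best is None or done < best[1]:
            best = (i, done)
    return best
```

For example, with latencies of 500 ms, 5 ms, and 5 ms and a 10 ms speculative timeout, the read completes at 15 ms from the second bookie instead of waiting 500 ms for the first.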
![Page 28: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/28.jpg)
Long Poll Read
[Diagram: Client issues a Long Poll Read to bookies B1, B2, B3, with a Speculative Long Poll]
![Page 29: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/29.jpg)
Inside a Bookie
![Page 30: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/30.jpg)
Use Cases
Apache BookKeeper as a Building Block
![Page 31: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/31.jpg)
Projects built on BookKeeper
◉ Twitter - Apache DistributedLog
◉ Yahoo! - Pulsar (Cloud Messaging Service)
◉ Salesforce - Distributed Store
◉ Huawei - HDFS NameNode HA
◉ HubSpot - WAL
◉ Majordodo - Distributed Resource Manager
![Page 32: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/32.jpg)
Apache DistributedLog (Twitter)
![Page 33: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/33.jpg)
Apache DistributedLog
[Diagram: a log stream of numbered records ordered from Oldest to Newest, divided into Log Segment X, Log Segment X+1, and Log Segment X+2, stored in Apache BookKeeper]
![Page 34: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/34.jpg)
Apache DistributedLog
[Architecture diagram, from storage up to applications]
◉ Metadata Store
◉ Segments: Log Segment Store (BK), Cold Storage (HDFS)
◉ Raw Streams: Log Streams
○ Abstraction & Naming
○ Data Management
○ Efficient Write & Read
○ Intra-cluster & Geo Replication
◉ Serving: Write Proxy, Read Proxy
○ Ownership Tracking
○ Batching, Compression
○ Record Cache
○ Rate Limiting, Quota
◉ Applications: Different Consumer Models
○ DBs, e.g., Twitter’s Manhattan
○ Deferred RPC (queuing)
○ Self-serve Pub/Sub
○ Stream Computing
○ Cross DC Replication
![Page 35: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/35.jpg)
DistributedLog at Twitter
◉ Manhattan Key/Value Store - WAL
◉ Durable Deferred RPC - Journal
◉ Real-Time Search Indexing - Change Propagation
◉ Self-serve Pub/Sub - Message Delivery, Ads Pipeline
◉ Stream Computing
○ Source & Sink
○ Stateful Processing in Heron (coming soon)
◉ Reliable Cross Datacenter Replication
![Page 36: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/36.jpg)
Scale: DistributedLog at Twitter
◉ 1.5 trillion records/day, 17.5 petabytes/day
◉ O(10) thousand streams, O(1) million live ledgers
◉ O(10^2) bookies, O(10^3) proxies
◉ Record sizes range from 100 bytes to 20 KB and beyond
◉ Data is kept from hours to days, even up to a year
◉ Replication factor is 3 or 5; 9 or 15 for global use cases
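As a rough sanity check on the daily totals quoted on this slide (1.5 trillion records and 17.5 petabytes per day), the implied per-second rates and average record size can be worked out directly. The sketch assumes decimal petabytes (10^15 bytes) and a uniform rate over the day; the slide does not say whether the byte total is before or after replication, so the result is only indicative.

```python
RECORDS_PER_DAY = 1.5e12   # from the slide
BYTES_PER_DAY = 17.5e15    # from the slide; assumes 1 PB = 10**15 bytes
SECONDS_PER_DAY = 24 * 60 * 60


def records_per_second():
    return RECORDS_PER_DAY / SECONDS_PER_DAY


def bytes_per_second():
    return BYTES_PER_DAY / SECONDS_PER_DAY


def average_record_bytes():
    return BYTES_PER_DAY / RECORDS_PER_DAY


if __name__ == "__main__":
    print(f"{records_per_second() / 1e6:.1f} million records/s")
    print(f"{bytes_per_second() / 1e9:.1f} GB/s")
    print(f"{average_record_bytes() / 1e3:.1f} KB average record")
```

Under these assumptions the totals correspond to roughly 17.4 million records per second, about 202.5 GB per second, and an average record of about 11.7 KB, which sits inside the 100 bytes to 20 KB range given on the slide.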
![Page 37: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/37.jpg)
DistributedLog Resources
◉ Website - https://distributedlog.io
◉ Mail List -
◉ Project Ideas - https://cwiki.apache.org/confluence/display/DL/Project+Ideas
◉ Paper - “DistributedLog: A high performance replicated log service” (ICDE 2017)
![Page 38: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/38.jpg)
Yahoo! Pulsar (Cloud Messaging Service)
![Page 39: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/39.jpg)
Yahoo! Pulsar
◉ Distributed Pub/Sub Messaging Platform
◉ Flexible Messaging Model - Topic and Queue
◉ Durable, Low Latency
◉ Strong Ordering and Consistency Guarantees
◉ Geo Replication
◉ Apache BookKeeper as Durable Message Store
![Page 40: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/40.jpg)
Yahoo! Pulsar
![Page 41: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/41.jpg)
Scale: Pulsar at Yahoo!
◉ 100 billion messages per day
◉ More than 1.4 million topics
◉ Average publish latency across services under 5 ms
◉ 10+ data centers, cross-region replication
![Page 42: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/42.jpg)
Pulsar Performance
![Page 43: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/43.jpg)
Salesforce Distributed Store
![Page 44: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/44.jpg)
Salesforce Application Storage
◉ Store for Persistent WAL, Data and Objects
◉ Low, Constant Write Latencies
◉ Low, Constant Random Read Latencies
◉ Highly Available, Consistent
◉ Distributed and Linearly Scalable
◉ On Commodity Hardware
![Page 45: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/45.jpg)
Heterogeneous Stores
![Page 46: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/46.jpg)
Roadmap, Releases, Future
Community
![Page 47: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/47.jpg)
Community
◉ 7 PMC Members
◉ 10+ Committers
◉ 20+ Active Contributors
◉ 5+ Companies actively using/contributing
○ Twitter
○ Yahoo!
○ Salesforce
○ Huawei
○ EMC
![Page 48: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/48.jpg)
Release 4.5.0
◉ Netty 4 Upgrade - Performance Improvements
◉ Security (Authentication & Authorization) Support
◉ Explicit LAC
◉ Long Poll Read Support
◉ Auto Re-replication Improvements
◉ ...
![Page 49: Apache BookKeeper: A High Performance and Low Latency Storage Service](https://reader030.vdocuments.site/reader030/viewer/2022020410/58e4a08b1a28abf5428b60ab/html5/thumbnails/49.jpg)
Future
◉ Scalable Segment Store
○ Object, Log, File, Stream, …
◉ Long Term Storage
○ Disk Scrubber
○ Better Lifecycle Management
○ …
◉ Beyond the limit
○ 128 bits support
○ Scalable metadata management