tolerating latency in replicated state machines through client speculation
DESCRIPTION
Tolerating Latency in Replicated State Machines through Client Speculation. April 22, 2009 Benjamin Wester 1 , James Cowling 2 , Edmund B. Nightingale 3 , Peter M. Chen 1 , Jason Flinn 1 , Barbara Liskov 2 University of Michigan 1 , MIT CSAIL 2 , Microsoft Research 3. - PowerPoint PPT PresentationTRANSCRIPT
Tolerating Latency in Replicated State Machines through Client Speculation
April 22, 2009
Benjamin Wester1, James Cowling2, Edmund B. Nightingale3,Peter M. Chen1, Jason Flinn1, Barbara Liskov2
University of Michigan1, MIT CSAIL2, Microsoft Research3
Simple Service Configuration
NSDI'09 Benjamin Wester University of Michigan CSE
2
x=01
++x
x=1
Replicated State Machines (RSM)
3
x=1
x=1
x=1
++x
2
2
2
2
x=1
x=2
x=2
x=2
x=2
++x
++x
++x
• Agree on request• All non-faulty replies are
identical
NSDI'09 Benjamin Wester University of Michigan CSE
RSMs have high latency
4
222
x=2
x=2
x=2
x=2
1. Need many replies2. Agreement3. Geographic Distribution
NSDI'09 Benjamin Wester University of Michigan CSE
5
Hide the Latency
• Use speculative execution inside RSM• Speculate before consensus is reached
– Without faults, any reply predicts consensus value– Let client continue after receiving one reply
NSDI'09 Benjamin Wester University of Michigan CSE
6
Overview
• Introduction• Improving RSMs with speculation• Application to PBFT• Performance• Conclusion
NSDI'09 Benjamin Wester University of Michigan CSE
Speculative Execution in RSM
• Continue processing while waiting
7
Blocked
Take Checkpoint
x=1
Predict: 1Speculate! Commit
x=1
x=2
Rollback
NSDI'09 Benjamin Wester University of Michigan CSE
Critical path: first reply
• Completion latency less relevant• First reply latency sets critical path
– Speed– Accuracy
• Other desirable properties– Throughput– Stability under contention– Smaller number of replicas
8
11
NSDI'09 Benjamin Wester University of Michigan CSE
Requests while speculative
1. Hold request– Bad performance
2. Distributed commit/rollback– State tracking complex
9
while !check_lottery():submit_tps()
buy_corvette()
win?
yes
Predict win? = yes
buy
What do we do with this?
NSDI'09 Benjamin Wester University of Michigan CSE
buy
Resolve speculations on the replicas
• Explicitly encode dependencies as predicates• No special request handling needed• Replicas need to log past replies• Local decision at replicas matches client
10
buy
yes
if win?=yes: buy
yes
NSDI'09 Benjamin Wester University of Michigan CSE
while !check_lottery():submit_tps()
buy_corvette()
win?
win? = yes
keys
buy = keys
Predict win? = yes
11
Overview
• Introduction• Improving RSMs with speculation• Application to PBFT• Performance• Conclusion
NSDI'09 Benjamin Wester University of Michigan CSE
Practical BFT
12
f=1
primary
client
-CS[Castro and Liskov 1999]
NSDI'09 Benjamin Wester University of Michigan CSE
13
Additional Details
• Tentative execution– PBFT/PBFT-CS complete in 4 phases
• Read-only optimization– Accurate answer from backup replica
• Failure threshold– Bound worst-case failure
• Correctness
NSDI'09 Benjamin Wester University of Michigan CSE
14
Overview
• Introduction• Improving RSMs with speculation• Application to PBFT• Performance• Conclusion
NSDI'09 Benjamin Wester University of Michigan CSE
Benchmarks
• Shared counter– Simple checkpoint– No computation
• NFS: Apache httpd build– Complex checkpoint– Significant computation
15NSDI'09 Benjamin Wester University of Michigan CSE
16
Topology
2.5 or 15 msPrimary
1. Primary-local2. Primary-remote3. Uniform
NSDI'09 Benjamin Wester University of Michigan CSE
17
Base case: no replication
2.5 or 15 ms
1. Primary-local2. Primary-remote3. Uniform
NSDI'09 Benjamin Wester University of Michigan CSE
Shared Counter
18
Primary-local topology
NSDI'09 Benjamin Wester University of Michigan CSE
Shared Counter
19
Primary-local topology
[Kotla et al. 07]
NSDI'09 Benjamin Wester University of Michigan CSE
Shared Counter
20
Uniform & Primary-remote topology
NSDI'09 Benjamin Wester University of Michigan CSE
Shared Counter
21NSDI'09 Benjamin Wester University of Michigan CSE
Uniform & Primary-remote topology
NFS: Apache build
22
Primary-local topology
NSDI'09 Benjamin Wester University of Michigan CSE
NFS: Apache build
23
Uniform topology
NSDI'09 Benjamin Wester University of Michigan CSE
NFS: Apache build
24
Primary-remote topology
NSDI'09 Benjamin Wester University of Michigan CSE
NFS: With Failure
25
(1% fail)
Primary-local topology
NSDI'09 Benjamin Wester University of Michigan CSE
Throughput (Shared Counter)
26
LAN topology
NSDI'09 Benjamin Wester University of Michigan CSE
27
Conclusion
• Integrate client speculation within RSMs• Predicated requests: performance without complexity• Clients less sensitive to latency between replicas• 5x speedup over non-speculative protocol
Makes WAN deployments more practical
NSDI'09 Benjamin Wester University of Michigan CSE