zyzzyva: speculative byzantine fault tolerance

30
1 ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE R.Kotla, L. Alvisi, M. Dahlin, A. Clement and E. Wong U. T. Austin Best Paper Award at SOSP 2007

Upload: abby

Post on 26-Jan-2016

76 views

Category:

Documents


5 download

DESCRIPTION

ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE. R.Kotla, L. Alvisi, M. Dahlin, A. Clement and E. Wong U. T. Austin. Best Paper Award at SOSP 2007. Motivation. Why implement Byzantine Fault-Tolerant replication? Increasing value of data and decreasing cost of hardware - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

1

ZYZZYVA:SPECULATIVE BYZANTINE

FAULT TOLERANCE

R.Kotla, L. Alvisi, M. Dahlin, A. Clement and E. Wong

U. T. Austin

Best Paper Award at SOSP 2007

Page 2: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

2

Motivation

• Why implement Byzantine Fault-Tolerant replication?

– Increasing value of data and decreasing cost of hardware

– More non-stop-fail behaviors than believed – BFT is becoming cheaper– Cost of 3-way non-BFT replication close to

cost of BFT replication

Page 3: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

3

Zyzzyva (I)

• Uses speculation to reduce the cost of BFT replication– Primary replica proposes order of client

requests to all secondary replicas (standard)– Secondary replicas speculatively execute the

request without going through an agreement protocol to validate that order (new idea)

Page 4: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

4

Zyzzyva (II)

• As a result– States of correct replicas may diverge– Replicas may send diverging replies to client

• Zyzzyva’s solution– Clients detect inconsistencies– Help convergence of correct replicas to a

single total ordering of requests– Reject inconsistent replies

Page 5: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

5

How?

• Clients observe a replicated state machine• Replies contain enough information to let clients

ascertain if the replies and the history are stable and guaranteed to be eventually committed

• Replicas have checkpoints

Page 6: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

6

Byzantine agreement (I)

• No solution for less than four entities

Page 7: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

7

Byzantine agreement (II)

• To achieve agreement in the presence of f failed nodes (“traitors”) we need– 3f + 1 entities

Page 8: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

8

Practical BFT (I)

• Practical Byzantine Fault-Tolerant protocol (PBFT) [Castro and Liskov 1999]

Page 9: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

9

Practical BFT (II)

Replicas decide on correct ordering

Page 10: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

10

Practical BFT (III)1. Client sends signed request to primary replica2. Primary assigns a sequence number to the request

and sends to all other replicas aPRE-PREPARE message

3. Secondary replicas validate the message and send a PREPARE message to all replicas

4. Replicas that can collect 2f PREPARE messages send a COMMIT message to all replicas

5. Replicas that can collect 2f+ 1 COMMIT message send a REPLY to the client

Page 11: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

11

A shortened version

Faster agreement is achieved thanks toa more complex view change protocol

Page 12: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

12

The explanation (I)

• "No replicated service that uses the traditional view change protocol can be live without an agreement protocol that includes both the prepare and commit full exchanges"

• "The traditional view change protocol lets correct replicas commit to a view change and become silent in a view without any guarantee that their action will lead to the view change."

Page 13: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

13

The explanation (II)

• Zyzzyva – Adds an extra phase to its view change

protocol– Guarantees that a correct replica will not

abandon a view unless every other correct replica does it

Page 14: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

14

Zyzzyva Agreement (I)

• Common case: no faulty replicas

Page 15: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

15

Explanations

• Secondary replicas assume that– Primary replica gave the right ordering– All secondary replicas will participate in

transaction• Initiate speculative execution• Client receives 3f + 1 mutually consistent

responses

Page 16: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

16

Zyzzyva Agreement (II)

• With a faulty replica

Page 17: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

17

Explanations (I)

• Client receives 3f mutually consistent responses• Gathers at least 2f + 1 mutually consistent

responses• Distributes a commit certificate to the replicas• Once at least 2f + 1 replicas acknowledge

receiving a commit certificate, the client considers the request completed

Page 18: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

18

Explanations (II)

• If enough secondary replicas suspect that the primary replica is faulty, a view change is initiated and a new primary elected

Page 19: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

19

Comparison with traditional solutions

Page 20: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

20

State maintained at each replica

Page 21: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

21

Explanations (I)

• Each replica maintains– A history of the requests it has executed– A copy of the max commit certificate it has

received • Let it distinguish between committed

history and speculative history

Page 22: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

22

Explanations (II)

• Each replica constructs a checkpoint every CP_INTERVAL requests

• It maintains one stable checkpoint with a corresponding stable application state snapshot

• It might also have up to one speculative checkpoint with its corresponding speculative application state snapshot

Page 23: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

23

Explanations (III)

• Checkpoints and application state become committed through a process similar to that of earlier BFT agreement protocols– Replicas send signed checkpoint messages

to all replicas when they generate a tentative checkpoint

– Commit checkpoint after they collect f + 1 signed matching checkpoint messages

Page 24: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

24

View change sub-protocol (I)

Page 25: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

25

Explanations

• Two-phase protocol• Elects a new primary • Guarantees that it will not introduce any changes

in a history that has already completed at a correct client

Page 26: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

26

Performance: throughput

Page 27: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

27

Comments

• Zyzzyva-5 is a special version of Zyzziva requiring more replicas but having a lower overhead

Page 28: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

28

Performance: latency

Page 29: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

29

Scalability: peak throughputs

Page 30: ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE

30

CONCLUSIONS

• Systematically exploiting speculative execution results in a protocol much faster than conventional BFT agreement protocols.

Observe that Zyzzyva is optimized for the most frequent case but provides the correct result in all cases

• A good rule to follow