Distributed Algorithms – 2g1513 Lecture 10 – by Ali Ghodsi Fault-Tolerance in Asynchronous Networks

Upload: oliver-todd

Post on 16-Dec-2015

Distributed Algorithms – 2g1513

Lecture 10 – by Ali Ghodsi

Fault-Tolerance in Asynchronous Networks

Consensus Problems

Consensus problems are very important in distributed systems (DS)

Distributed Databases
  All processes must agree whether to commit or abort a transaction
  If any process says abort, all processes should abort

Atomic Broadcast
  All processes receive the same set of messages, coming from correct processes only
  Can be used to implement consensus, and vice versa

Fischer, Lynch, Paterson 1983/85

Consensus cannot be solved in the asynchronous model with the possibility of even one process crashing

http://www.sics.se/~ali/flp85.pdf

Most influential paper award, PODC 2001

Modified Model

To prove the result, we will modify our model of a distributed system slightly

Processes execute local algorithms, modeled by an STS

But, given any state, a correct process can always execute a “dummy” instruction
  For any state in a process, there exists a transition
  There always exists an applicable event on every process

A crashed process cannot make any transitions

Definition: T-crash fair executions

A t-crash-robust algorithm is a consensus algorithm if it satisfies:

Termination
  All correct processes eventually decide

Agreement
  In every configuration, the decided processes should have decided on the same value (0 or 1)

Non-triviality
  There exists at least one possible input configuration where the decision is 0
  There exists at least one possible input configuration where the decision is 1
  Example: maybe input “0,0,1” -> 0 while “0,1,1” -> 1

Definitions

0-decided configuration
  A configuration with decide “0” on some process

1-decided configuration
  A configuration with decide “1” on some process

0-valent configuration
  A configuration in which every reachable decided configuration is 0-decided

1-valent configuration
  A configuration in which every reachable decided configuration is 1-decided

Bivalent configuration
  A configuration which can reach both a 0-decided and a 1-decided configuration
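The valency definitions can be tried out on a small explicit graph of configurations. The following is only a sketch under assumptions: the names `reachable`, `valency`, `SUCC` and `DECIDED` are hypothetical, configurations are reduced to plain labels, and the toy graph is invented for illustration rather than taken from the lecture.

```python
from typing import Dict, List, Set


def reachable(start: str, succ: Dict[str, List[str]]) -> Set[str]:
    """All configurations reachable from start (including start itself)."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in succ.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def valency(start: str, succ: Dict[str, List[str]], decided: Dict[str, int]) -> str:
    """Classify start by the decision values of the decided configurations it can reach."""
    values = {decided[c] for c in reachable(start, succ) if c in decided}
    if values == {0, 1}:
        return "bivalent"
    if values == {0}:
        return "0-valent"
    if values == {1}:
        return "1-valent"
    return "no decided configuration reachable"


# Hypothetical toy graph: g can move to a or b; a only reaches a 0-decided
# configuration (a0), b only reaches a 1-decided configuration (b1).
SUCC = {"g": ["a", "b"], "a": ["a0"], "b": ["b1"]}
DECIDED = {"a0": 0, "b1": 1}

for config in ("g", "a", "b"):
    print(config, valency(config, SUCC, DECIDED))
```

In this toy graph the root g plays the role of a bivalent configuration, while a and b are the 0-valent and 1-valent configurations one event away from it.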

Definitions Illustrated 1(4)

0-decided configuration
  A configuration with decide “0” on some process

Example 0-decided configuration: { STATE2, STATE5, DECIDE-0, STATE7, {msg1, msg2} }

At least one of the processes is in state DECIDE-0

[Figure: P1 in state2, P2 in state5, P3 in decide0, P4 in state7, with messages msg1 and msg2 in transit.]

Definitions Illustrated 2(4)

0-valent configuration
  No 1-decided configurations are reachable
  Future determined, means “everyone will decide 0”

[Figure: tree of configurations reachable from the 0-valent configuration { P1_state, P2_state, P3_state, P4_state, {msg1} }. Every configuration in the tree is labeled 0-valent, and every decided process shows decide-0.]

Definitions Illustrated 3(4)

1-valent configuration
  No 0-decided configurations are reachable
  Future determined, means “everyone will decide 1”

[Figure: tree of configurations reachable from the 1-valent configuration { P1_state, P2_state, P3_state, P4_state, {msg1} }. Every configuration in the tree is 1-valent, and every decided process shows decide-1.]

Definitions Illustrated 4(4)

Bivalent configuration
  Both 0-decided and 1-decided configurations are reachable
  Future undetermined, could go either way…

[Figure: tree of configurations rooted at the bivalent configuration { P1_state, P2_state, P3_state, P4_state, {msg1} }. Some branches pass through 0-valent configurations and end with decide-0, other branches pass through 1-valent configurations and end with decide-1.]

Bivalent Initial Configuration

Theorem
  For any algorithm that solves the 1-crash consensus problem, there exists an initial bivalent configuration

Proof 1/(10)

We know that the algorithm must be non-trivial
  There should be some initial configuration that will lead to a 0-decide
  There should be some initial configuration that will lead to a 1-decide

Take two such configurations i1 and i2

E.g. with 5 processes
  Initial values (0,1,0,1,1) lead to 1
  Initial values (0,0,1,0,0) lead to 0

Proof 2/(10)

We know there exist inputs (p1, p2, p3, p4, p5)

(0,1,0,1,1) leading to 1
(0,0,1,0,0) leading to 0

Let's look at other initial configurations by flipping the inputs one at a time, transforming the upper input into the lower input

Proof 3/(10)

We know there exist inputs (p1, p2, p3, p4, p5)

(0,1,0,1,1) leading to 1
(0,0,0,1,1) leading to ?
(0,0,1,0,0) leading to 0

Let's look at other initial configurations by flipping the inputs one at a time, transforming the upper input into the lower input

Proof 4/(10)

We know there exist inputs (p1, p2, p3, p4, p5)

(0,1,0,1,1) leading to 1
(0,0,0,1,1) leading to ?
(0,0,1,1,1) leading to ?
(0,0,1,0,0) leading to 0

Let's look at other initial configurations by flipping the inputs one at a time, transforming the upper input into the lower input

Proof 5/(10)

We know there exist inputs (p1, p2, p3, p4, p5)

(0,1,0,1,1) leading to 1
(0,0,0,1,1) leading to ?
(0,0,1,1,1) leading to ?
(0,0,1,0,1) leading to ?
(0,0,1,0,0) leading to 0

Let's look at other initial configurations by flipping the inputs one at a time, transforming the upper input into the lower input

Proof 6/(10)

We know there exist inputs (p1, p2, p3, p4, p5)

(0,1,0,1,1) leading to 1
(0,0,0,1,1) leading to ?
(0,0,1,1,1) leading to ?
(0,0,1,0,1) leading to ?
(0,0,1,0,0) leading to 0

There must exist two neighboring configurations here with two different outcomes, because the first line leads to 1 and the last line leads to 0

Let's look at other initial configurations by flipping the inputs one at a time, transforming the upper input into the lower input

Proof 7/(10)

We know there exist inputs (p1, p2, p3, p4, p5)

(0,1,0,1,1) leading to 1
(0,0,0,1,1) leading to 1
(0,0,1,1,1) leading to 1
(0,0,1,0,1) leading to 0
(0,0,1,0,0) leading to 0

Assume the two neighbors with different outcomes are (0,0,1,1,1) leading to 1 and (0,0,1,0,1) leading to 0

Let's look at other initial configurations by flipping the inputs

Proof 8/(10)

We know there exist inputs (p1, p2, p3, p4, p5)

(0,1,0,1,1) leading to 1
(0,0,0,1,1) leading to 1
(0,0,1,1,1) leading to 1
(0,0,1,0,1) leading to 0
(0,0,1,0,0) leading to 0

Assume the two neighbors with different outcomes are (0,0,1,1,1) leading to 1 and (0,0,1,0,1) leading to 0

These are identical configurations except for process p4

Proof 9/(10)

We know there exist inputs (p1, p2, p3, p4, p5)

(0,0,1,1,1) leading to 1
(0,0,1,0,1) leading to 0

The consensus algorithm should tolerate a crash of p4!
  (0,0,1,X,1) leads to ? (either 0 or 1)

Proof 10/(10)

We know there exist inputs (p1, p2, p3, p4, p5)

(0,0,1,1,1) leading to 1
(0,0,1,0,1) leading to 0

The consensus algorithm should tolerate a crash of p4!
  (0,0,1,X,1) leads to ? (either 0 or 1)

If it leads to 1, then depending on whether p4 crashes or not, (0,0,1,0,1) leads to either 0 or 1 (bivalent)

If it leads to 0, then depending on whether p4 crashes or not, (0,0,1,1,1) leads to either 0 or 1 (bivalent)
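The flipping argument can be replayed mechanically. This is a minimal sketch, not part of the lecture: `flip_chain`, `first_differing_neighbors` and `OUTCOME` are hypothetical names, and the outcome table simply copies the values assumed on the Proof 7 slide (the "?" entries are an assumption there as well).

```python
from typing import Dict, List, Optional, Tuple

Inputs = Tuple[int, ...]


def flip_chain(start: Inputs, end: Inputs) -> List[Inputs]:
    """Initial configurations obtained by turning start into end, one input at a time."""
    chain = [start]
    current = list(start)
    for i, target in enumerate(end):
        if current[i] != target:
            current[i] = target
            chain.append(tuple(current))
    return chain


def first_differing_neighbors(
    chain: List[Inputs], outcome: Dict[Inputs, int]
) -> Optional[Tuple[Inputs, Inputs, int]]:
    """First adjacent pair with different outcomes, plus the index of the flipped input."""
    for a, b in zip(chain, chain[1:]):
        if outcome[a] != outcome[b]:
            flipped = next(i for i in range(len(a)) if a[i] != b[i])
            return a, b, flipped
    return None


# Outcomes as assumed on the slides for the crash-free case.
OUTCOME = {
    (0, 1, 0, 1, 1): 1,
    (0, 0, 0, 1, 1): 1,
    (0, 0, 1, 1, 1): 1,
    (0, 0, 1, 0, 1): 0,
    (0, 0, 1, 0, 0): 0,
}

CHAIN = flip_chain((0, 1, 0, 1, 1), (0, 0, 1, 0, 0))
PAIR = first_differing_neighbors(CHAIN, OUTCOME)
print(PAIR)
```

Because the chain starts at an input leading to 1 and ends at an input leading to 0, such a pair must exist for any outcome table; the flipped index identifies the process whose crash the proof then considers (index 3, i.e. p4, for the table above).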

Initial Bivalence

Intuition
  Given any algorithm, we can find some start state that, depending on the failure of one process, will either lead to a 0-decide or a 1-decide

[Figure: tree rooted at the bivalent initial config { P1_state, P2_state, P3_state, P4_state, {msg1} }. One branch passes through 1-valent configurations and reaches decide-1, the other passes through 0-valent configurations and reaches decide-0.]

Coarse-grained Model of Distributed Systems

In our model, we will now let each event be the receipt of a message
  After the receipt of a message m, a process deterministically makes all internal and send events it can do

In other words, we make our coarse-grained model a bit more fine-grained
  An event represents the receipt of a message, some internal transitions and the sending of some messages

A receipt of message m at process p is always applicable if a message m with destination p is in the network

Intuition behind model

receive <tok, y> from q
for x:=1 to 3 do
begin
  y:=y+1;
  send <tok, y> to neighp[x];
end
receive <tok, z> from q;
print z+y

[Figure annotations: starting from the initial state of p, receipt event e (the first receive) is followed by deterministic transitions (the loop), giving the state of p after receipt of e. Receipt event f (the second receive) is followed by deterministic transitions (the print), giving the state of p after receipt of f.]

Order of events

Intuition
  The order in which two applicable events are executed is not important!

Order Theorem
  Let ep and eq be two events on two different processors p and q which are both applicable in configuration γ. Then ep can be applied to eq(γ), and eq can be applied to ep(γ).
  Moreover, ep(eq(γ)) = eq(ep(γ)).
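The Order Theorem can be checked on a toy instance of the coarse-grained model. The sketch below rests on assumptions: a configuration is modeled as a pair (process states, multiset of messages in the network), an event is the receipt of one message, and `apply_event` and `handler` are hypothetical names with an invented local algorithm.

```python
from collections import Counter


def handler(proc, state, payload):
    """Hypothetical deterministic local algorithm: add the payload, report the sum to r."""
    new_state = state + payload
    return new_state, [("r", new_state)]


def apply_event(config, msg):
    """Apply the receipt of msg = (destination, payload) to config = (states, network)."""
    states, network = config
    dest, payload = msg
    if network[msg] == 0:
        raise ValueError("event not applicable: message is not in the network")
    new_state, sent = handler(dest, states[dest], payload)
    new_states = dict(states)
    new_states[dest] = new_state
    # Remove the received message, add the messages sent in the same event.
    new_network = network - Counter([msg]) + Counter(sent)
    return new_states, new_network


# Two events on different processes, both applicable in GAMMA.
GAMMA = ({"p": 0, "q": 10, "r": 0}, Counter([("p", 1), ("q", 2)]))
E_P = ("p", 1)
E_Q = ("q", 2)

LEFT = apply_event(apply_event(GAMMA, E_Q), E_P)   # ep(eq(gamma))
RIGHT = apply_event(apply_event(GAMMA, E_P), E_Q)  # eq(ep(gamma))
print(LEFT == RIGHT)
```

The two orders agree because each event only touches the state of its own process and only removes its own message from the network; neither can disable the other.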

Definitions

A sequence of events σ = (e1, e2, …, ek) is applicable in configuration γ if e1 is applicable in γ, e2 is applicable in e1(γ), ...

If the resulting configuration is δ, we write σ(γ) = δ or γ ─σ→ δ

If σ only contains events of a subset P of the processes, we write γ ─σ→P δ

Order of sequences

Diamond Theorem
  Let sequences σ1 and σ2 be applicable in configuration γ, and let no process participate in both σ1 and σ2. Then σ2 is applicable in σ1(γ), σ1 is applicable in σ2(γ), and σ1(σ2(γ)) = σ2(σ1(γ))

Proof
  By induction using the Order Theorem

Illustration of the Diamond Theorem

[Figure: diamond. From γ, σ1 leads to σ1(γ) and σ2 leads to σ2(γ). Applying σ2 to σ1(γ) and σ1 to σ2(γ) reaches the same configuration σ2(σ1(γ)) = σ1(σ2(γ)).]

Bivalent Configuration

Any configuration of the 1-crash-robust consensus algorithm is exactly one of these three
  Bivalent
  0-valent
  1-valent

Why?
  Any configuration leads to a decide because of termination
  We know bivalent configurations exist
  If it is not bivalent, it must lead to either 0-decide or 1-decide, so it is either 0-valent or 1-valent

Bivalent Configurations

In any bivalent config γ, either
  one applicable event goes to a bivalent config (Case 1), or
  there exist two applicable events, leading to a 0-valent and a 1-valent configuration respectively (Case 2)

[Figure: Case 1 shows a bivalent config with an event leading to another bivalent config. Case 2 shows a bivalent config with one event leading to a 0-valent config and another event leading to a 1-valent config.]

Staying Bivalent

Theorem
  Given any bivalent config γ and an event e applicable in γ
  There exists a config γ' reachable from γ where e is applicable, and e(γ') is bivalent

[Figure (theorem illustration): from the bivalent config γ, in which e is applicable, a sequence of other events leads to a config where e is applied, and the result is again bivalent.]

Proof definitions

Assume e involves process p

Call the set of all possible configs reachable from γ without applying e the set C

Apply event e to all configs in C and call the resulting configs D

[Figure (theorem illustration): the bivalent config γ and the configs reachable from it without e form the set C. Applying e to each config in C gives the set D.]

Proof intuition

We will prove by contradiction that D contains a bivalent config

I.e., assume there exists no bivalent config in D, and show that this leads to a contradiction or absurdity

[Figure (theorem illustration): the bivalent config γ and the set C of configs reachable without e, with e applied to each config in C to give the set D.]

Proof

Assume D contains no bivalent configs
  I.e. all configs in D are either 0-valent or 1-valent

Then it follows that there exists a 0-valent and a 1-valent config in D (next slides)

Proof

We know we can reach a 0-valent and a 1-valent config from γ, call them α1 and α2 (since γ is bivalent)

Either α1 and α2 are in C or they are not in C

If inside C, then e(α1) and e(α2) are in D and they are 0-valent/1-valent

[Figure: two cases side by side. Left: α1 and α2 are in C, and e is applied to them from inside C. Right: α1 and α2 are not in C, i.e. they are only reached after an application of e.]

Proof

If not inside C, then some β1 and β2 exist on the paths to α1 and α2, such that e(β1) and e(β2) are in D and they are 0-valent/1-valent

[Remember we assumed no bivalent config available in D]

[Figure: the same two cases side by side. Left: α1 and α2 are in C. Right: α1 and α2 are not in C; β1 and β2 are the configs in C on the paths to α1 and α2 where e is applied.]

Reflection

We now know that D must always contain a 0-valent and a 1-valent config, assuming no bivalent config exists in D

Let's call the 0-valent and 1-valent configs in D d0 and d1, respectively

We will now show that this situation is itself a contradiction. Hence, D must contain a bivalent config

Deriving the contradiction

There must exist two configs c0 and c1 in C and an event f such that c1=f(c0), and d0=e(c0) and d1=e(c1)

Let's see why!

[Figure: c0 and c1 in C, connected by event f. Applying e to c0 gives d0 in D, and applying e to c1 gives d1 in D.]

Proving two neighbors exist 1(4)

We know γ is bivalent, and e(γ) is in D and is either 0-valent or 1-valent, assume 0-valent

[Figure: γ in C, with e leading to the 0-valent config e(γ) in D.]

Proving two neighbors exist 2(4)

We know γ is bivalent, and e(γ) is in D and is either 0-valent or 1-valent, assume 0-valent

There is a reachable 1-valent config in D

[Figure: a path γ ─f0→ γ1 → γ2 → … → γm inside C. Applying e to γ gives a 0-valent config in D, and applying e to γm gives a 1-valent config in D.]

Proving two neighbors exist 3(4)

We know γ is bivalent, and e(γ) is in D and is either 0-valent or 1-valent, assume 0-valent

There is a reachable 1-valent config in D

e is applicable in each γi, and e(γi) must be 0-valent or 1-valent

[Figure: the path γ ─f0→ γ1 → γ2 → … → γm inside C, with e applied to every config on the path. e(γ) is 0-valent, e(γm) is 1-valent, and the configs in between are x-valent, y-valent, z-valent, i.e. each is 0-valent or 1-valent.]

Proving two neighbors exist 4(4)

We know γ is bivalent, and e(γ) is in D and is either 0-valent or 1-valent, assume 0-valent

There is a reachable 1-valent config in D

e is applicable in each γi, and e(γi) must be 0-valent or 1-valent

There exist two neighbors on the path, one whose e-successor is 0-valent and one whose e-successor is 1-valent

[Figure: the path γ ─f0→ γ1 ─f1→ γ2 ─f2→ … ─f3→ γm inside C, with e applied to every config on the path. Somewhere along the path the label in D switches from 0-valent to 1-valent between two neighboring configs.]

Proving two neighbors exist 4(4)

We know γ is bivalent, and e(γ) is in D and is either 0-valent or 1-valent, assume 0-valent

There is a reachable 1-valent config in D

e is applicable in each γi, and e(γi) is 0/1-valent

There exist two neighbors, one whose e-successor is 0-valent and one whose e-successor is 1-valent

[Figure: the two neighbors γ1 ─f→ γ2 in C. Applying e to γ1 gives a 0-valent config in D, and applying e to γ2 gives a 1-valent config in D.]

Neighbors lead to contradiction 1(3)

We now know there exist two configs c0 and c1 in C such that c1=f(c0), and d0=e(c0) and d1=e(c1)

Either the events e and f happen on the same processor or on different processors; both cases will lead to contradictions

[Figure: the two neighbors c0 ─f→ c1 in C. Applying e to c0 gives the 0-valent config d0 in D, and applying e to c1 gives the 1-valent config d1 in D.]

Neighbors lead to contradiction 2(3)

We now know there exist two configs c0 and c1 in C such that c1=f(c0), and d0=e(c0) and d1=e(c1)

Assume e and f happen on two different processes p and q
  Then, the order of their execution can be exchanged

Contradiction as d0 is 0-valent, but it can lead to a 1-valent config, hence d0 must be bivalent, but we assumed no bivalent configs exist in D

[Figure: c0 ─f→ c1 in C, e applied to c0 gives d0 (0-valent) and e applied to c1 gives d1 (1-valent); in addition, f applied to d0 also gives d1.]
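The exchange step can be written out as a short derivation. This is a sketch that only uses the Order Theorem stated earlier and the names c0, c1, d0, d1, e, f from the slide; both e and f are applicable in c0, and they happen on different processes.

```latex
\begin{align*}
d_1 &= e(c_1)      && \text{definition of } d_1 \\
    &= e(f(c_0))   && \text{since } c_1 = f(c_0) \\
    &= f(e(c_0))   && \text{Order Theorem: } e \text{ and } f \text{ happen on different processes} \\
    &= f(d_0)      && \text{definition of } d_0
\end{align*}
```

So the 1-valent config d1 is reachable from the 0-valent config d0 by the single event f, which is the contradiction stated on the slide.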

Neighbors lead to contradiction 3(3)

We now know there exist two configs c0 and c1 in C such that c1=f(c0), and d0=e(c0) and d1=e(c1)

Assume e and f happen on the same process p; the algorithm should still work if p is silent

If p is silent, the algorithm should continue from c0 and terminate with a decision in some config A

Since p takes no step on the way from c0 to A, the same steps can also be taken after e, and after f followed by e (Diamond Theorem). So from A we can reach a config reachable from d0, and also a config reachable from d1
  If p is silent, some execution leading to 0 should exist
  If p is silent, some execution leading to 1 should exist

Contradiction as A should be a 0/1-valent configuration, but we have shown that A can lead to both 0 and 1

[Figure: c0 ─f→ c1 in C, with e leading from c0 to d0 (0-valent) and from c1 to d1 (1-valent). A run in which p is silent leads from c0 to the decided config A; applying e to A leads towards 0, and applying f and then e to A leads towards 1.]

Proof Map

Assume there is no bivalent config in D

We know all configs in D are 0-valent or 1-valent

Show that we can find a 0-valent and a 1-valent config in D

Show that two neighboring configs c0 ─f→ c1 exist, where c0 ─e→ “0-valent config” and c1 ─e→ “1-valent config”

Show this is a contradiction

Assumption must be incorrect
  D must contain a bivalent configuration

Final Theorem

No deterministic 1-crash-robust consensus algorithm exists for the asynchronous model

Proof
  1. Start in an initial bivalent config
  2. Given the bivalent config, pick the event e that has been applicable longest
       Pick the execution taking us to another config where e is applicable
       Apply e, and get a bivalent config
  3. Repeat 2.

Consensus not Impossible!

Let's do a deterministic consensus algorithm for a different failure model
  Initially dead processes

Assume t failures can happen initially, where $t = \lfloor (N-1)/2 \rfloor$
  t=4 for N=10, t=5 for N=11

Let L denote $L = \lceil (N+1)/2 \rceil$
  L=6 for N=10, L=6 for N=11

N=t+L
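The two bounds and the identity N=t+L are easy to check numerically. A minimal sketch, assuming t is the largest integer below N/2 and L is the smallest integer above N/2, which matches the values given for N=10 and N=11; the function names are hypothetical.

```python
def max_initially_dead(n: int) -> int:
    """t: largest number of initially dead processes, floor((n - 1) / 2)."""
    return (n - 1) // 2


def quorum_size(n: int) -> int:
    """L: number of identities to collect, ceil((n + 1) / 2), written as n // 2 + 1."""
    return n // 2 + 1


for n in (10, 11):
    print(n, max_initially_dead(n), quorum_size(n))
```

For every N, t and L computed this way add up to N, so waiting for L identities never blocks as long as at most t processes are initially dead.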

Intuition

Assume N processes are connected in an underlying graph, and at most t fail

We know at least L processes are alive after the start
  Broadcast your identity, and receive/collect L identities

For any two correct processes, their sets of collected identities will overlap
  Quorum concept
  There are N nodes, and any two processes have L identities each, i.e. in total $2L = 2\lceil (N+1)/2 \rceil \geq N+1$ identities
  N+1 identities, total N nodes, so at least two must be the same (pigeonhole principle, PHP)
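The overlap claim can be brute-forced for small N. This is only an illustrative sketch: `all_pairs_overlap` and `quorum_size` are hypothetical names, identities are the integers 0..N-1, and the quorum size is taken as L = N // 2 + 1.

```python
from itertools import combinations


def quorum_size(n: int) -> int:
    """L = ceil((n + 1) / 2), written as n // 2 + 1."""
    return n // 2 + 1


def all_pairs_overlap(n: int, size: int) -> bool:
    """True if every two identity sets of the given size, drawn from n nodes, intersect."""
    quorums = [frozenset(c) for c in combinations(range(n), size)]
    return all(a & b for a in quorums for b in quorums)


for n in range(1, 9):
    print(n, quorum_size(n), all_pairs_overlap(n, quorum_size(n)))
```

Collecting one identity fewer is not enough in general: with N=4 and sets of size 2, the sets {0,1} and {2,3} are disjoint, which is why L has to exceed N/2.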


Initially Dead Consensus

Receive L messages

Initial state of p

Any two processes have overlapping Succ

Keep identity of senders in Rcvd

Wait until you’ve received a message from every process that is transitively in each Succ

Every process has the same set Alive

Summary

We have proved that a 1-crash-resilient deterministic consensus algorithm does not exist
  There always exists an execution which stays in bivalent configurations and still keeps applying all applicable events!
  All correct processes execute an infinite number of events, and the execution still leads to no decision!

We have shown an algorithm for consensus in the initially dead processes model