A combinatorial characterization of the distributed tasks which are solvable in the presence of one faulty processor
O. Biran, S. Moran, S. Zaks
Introduction
Fischer, Lynch, and Paterson (FLP) showed that achieving distributed consensus is impossible in the presence of one faulty processor.
We present a condition that is necessary and sufficient for a task to be solvable in the presence of one faulty processor.
We also present a universal protocol that solves any task found to be solvable by our condition.
Definitions
The network is asynchronous and distributed.
There are N > 2 processors, V = {P1, P2, …, PN}.
We assume (w.l.o.g.) that $\text{identity}(P_i) = i$ for every $P_i \in V$.
The messages arrive with no error and in finite but unbounded and unpredictable time.
Definitions
One processor might be faulty (to be defined later).
We assume that the network is complete (clique), but the results can be easily generalized to every biconnected network in which a failure of one processor cannot disconnect the network.
Adjacency Graphs
Let $A^N$ denote the set of all vectors $\bar{a} = (a_1, a_2, \ldots, a_N)$, where $A$ is an arbitrary set and $a_i \in A$ for every $i$ (the Cartesian product of order N).
Let $S \subseteq A^N$ be a set. Two vectors $\bar{s}_1, \bar{s}_2 \in S$ are adjacent if they differ in exactly one component.
Adjacency Graphs
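The adjacency graph of S, $G(S) = (S, E_S)$, is an undirected graph where $(\bar{s}_1, \bar{s}_2) \in E_S$ iff $\bar{s}_1$ and $\bar{s}_2$ are adjacent.
$\bar{s}^i = (s_1, \ldots, s_{i-1}, *, s_{i+1}, \ldots, s_N)$ is called a partial vector (which means that the i-th component is undefined).
For a set of vectors S, $S^i = \{ \bar{s}^i \mid \bar{s} \in S \}$.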
i - Cliques
To each clique C in G(S) there corresponds an integer i, 1 ≤ i ≤ N, such that all the vectors in C differ from one another in exactly the i-th component.
Let C and i be a clique and its corresponding integer; then C is called an i-clique. Notice that C defines a partial vector $\bar{s}^i$.
Maximal i - Cliques
A maximal i-clique is an i-clique which is not contained in any other i-clique.
Notice that every partial vector $\bar{s}^i$ defines a maximal i-clique that includes $\bar{s}$ and all the vectors that differ from $\bar{s}$ in exactly the i-th component.
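The definitions of adjacency, partial vectors and maximal i-cliques can be illustrated with a minimal Python sketch. The function names and the use of `None` for the undefined component are assumptions made for illustration only; they do not come from the slides.

```python
from collections import defaultdict
from itertools import combinations


def adjacent(s1, s2):
    """True if the two equal-length vectors differ in exactly one component."""
    return sum(a != b for a, b in zip(s1, s2)) == 1


def adjacency_graph(S):
    """Edge set E_S of G(S): unordered pairs of adjacent vectors of S."""
    return {frozenset((s1, s2)) for s1, s2 in combinations(S, 2) if adjacent(s1, s2)}


def partial(s, i):
    """Partial vector s^i: component i (1-based) is replaced by None (undefined)."""
    return s[:i - 1] + (None,) + s[i:]


def maximal_i_cliques(S, i):
    """Group S by partial vector s^i; each group is the maximal i-clique defined by s^i."""
    cliques = defaultdict(set)
    for s in S:
        cliques[partial(s, i)].add(s)
    return dict(cliques)


# A small illustrative set of vectors over A = {0, 1} with N = 3.
S = {(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)}
edges = adjacency_graph(S)
cliques_3 = maximal_i_cliques(S, 3)
```

Grouping by the partial vector works because two distinct vectors share the same $\bar{s}^i$ exactly when they differ only in the i-th component.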
Decision Tasks
Let A and B be arbitrary sets. Let $f : A \to 2^B$ be a function that assigns to each element $a \in A$ a subset $f(a)$ of B.
Similarly, for $C \subseteq A$, $f[C] = \bigcup_{c \in C} f(c)$.
Decision Tasks
Let X and D be sets of input_values and decision_values respectively.
A distributed decision task T is a function $T : X_T \to 2^{D^N}$, where $X_T \subseteq X^N$ is called the input set of task T.
Decision Tasks
Similarly, $D_T = T[X_T]$ is called the decision set of T.
Each vector $\bar{x} = (x_1, x_2, \ldots, x_N) \in X_T$ is called an input vector, and it represents the initial assignment of the input value $x_i \in X$ to processor $P_i$.
Similarly for a decision vector $\bar{d} = (d_1, d_2, \ldots, d_N) \in D_T$.
Decision Tasks
A decision task T maps each input vector to a non-empty set of allowable decision vectors.
The adjacency graph G(XT) of the input set XT is called the input graph of T.
Similarly, the adjacency graph G(DT) of the decision set DT is called the decision graph of T.
Examples of Decision Tasks
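Consensus
$X_T = X^N$; for every $\bar{x} \in X_T$: $T(\bar{x}) = \{(0, 0, \ldots, 0), (1, 1, \ldots, 1)\}$
Strong consensus
$X_T = X^N$; for every $\bar{x} \in X_T$: $T(\bar{x}) \subseteq \{(0, 0, \ldots, 0), (1, 1, \ldots, 1)\}$, and there exist $\bar{u}, \bar{v} \in X_T$ with $T(\bar{u}) = \{(0, 0, \ldots, 0)\}$ and $T(\bar{v}) = \{(1, 1, \ldots, 1)\}$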
Examples of Decision Tasks
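Approximate consensus
$X_T = \mathbb{Q}^N$; for any given $\varepsilon > 0$:
$T(\bar{x}) = \{ (d_1, d_2, \ldots, d_N) \in D^N \mid |d_i - d_j| \le \varepsilon \text{ and } m \le d_i \le M \text{ for all } i, j \}$
where $m = \min(x_1, x_2, \ldots, x_N)$ and $M = \max(x_1, x_2, \ldots, x_N)$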
Examples of Decision Tasks
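Order preserving renaming (OPR)
$X_T = \{ \bar{x} \in \mathbb{N}^N \mid x_i \ne x_j \text{ for all } i \ne j \}$; for any given $K$, $K \ge N$:
$T(\bar{x}) = \{ (d_1, d_2, \ldots, d_N) \in D^N \mid 1 \le d_i \le K \text{ for all } i, \text{ and } x_i < x_j \Rightarrow d_i < d_j \text{ for all } i, j \}$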
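As a hedged illustration of the last two examples, the sketch below checks whether a given decision vector is allowed for a given input vector. The function names are hypothetical, and the non-strict bound on $|d_i - d_j|$ is an assumption, since the relation symbols did not survive in the slide text.

```python
from fractions import Fraction


def valid_approx_consensus(x, d, eps):
    """Approximate consensus: all decisions lie in [min(x), max(x)] and are
    pairwise within eps of one another (assumed to be a non-strict bound)."""
    m, M = min(x), max(x)
    in_range = all(m <= di <= M for di in d)
    close = all(abs(di - dj) <= eps for di in d for dj in d)
    return in_range and close


def valid_opr(x, d, K):
    """Order preserving renaming: new names lie in 1..K and preserve the input order."""
    n = len(x)
    in_range = all(1 <= di <= K for di in d)
    order_kept = all(d[i] < d[j] for i in range(n) for j in range(n) if x[i] < x[j])
    return in_range and order_kept


# Rational inputs are modelled with Fraction to stay exact.
x_q = (Fraction(0), Fraction(1), Fraction(1, 2))
d_q = (Fraction(1, 2), Fraction(1, 2), Fraction(3, 4))
```

For instance, with the input vector (10, 12, 11) and K = 4, the vector (1, 4, 2) preserves the order of the inputs while (1, 2, 3) does not.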
Protocols
A protocol for a given network is a set of N programs, each associated with a single processor.
Programs
Each program consists of operations of the following kinds:
• Reading an input value
• Sending a message to a neighbor
• Receiving a message from a neighbor
• Performing a local computation
• Halting
Decision Protocols
A decision protocol is a protocol in which each processor, when it halts, always writes an output value (its associated component of a vector in DT).
Executions
For a given network that is initialized with an input vector $\bar{x} = (x_1, x_2, \ldots, x_N)$ of $X^N$, if each processor executes its own program, the sequence of operations performed by the processors is called an execution on the input vector $\bar{x}$.
Notice that for an input vector $\bar{x}$ there can be more than one execution (due to the asynchronous nature of the system).
The set of all executions of protocol α on an input vector $\bar{x}$ is denoted by $E_\alpha(\bar{x})$.
Terminating Executions
An execution in which all the processors eventually halt is called a terminating execution.
The vector $\bar{d} = (d_1, d_2, \ldots, d_N)$, where $d_i$ is the decision value of the processor $P_i$ in the execution e of protocol α on input $\bar{x}$, is called the output vector of the execution e and is denoted by $D_\alpha(e, \bar{x})$.
Terminating Executions
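$TE_\alpha(\bar{x})$ is the set of all terminating executions of α on the input vector $\bar{x}$.
$D_\alpha(\bar{x}) = \bigcup_{e \in TE_\alpha(\bar{x})} \{ D_\alpha(e, \bar{x}) \}$ is the set of all output vectors of all the terminating executions of the protocol α on the input vector $\bar{x}$.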
Terminating Executions
For a set S, we define $D_\alpha[S]$ to be the union $D_\alpha[S] = \bigcup_{\bar{x} \in S} D_\alpha(\bar{x})$.
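Notice that: [two formulas relating $D_\alpha[X_T]$ and $D_T$ for a task T and a protocol α; the relation symbols are missing from the slide text]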
Solvability
The protocol α solves task T if, for every $\bar{x} \in X_T$:
1. $E_\alpha(\bar{x}) = TE_\alpha(\bar{x})$
2. $D_\alpha(\bar{x}) \subseteq T(\bar{x})$
Solvability
Notice that: for every task T there exists a protocol α such that α solves T.
T must be computable.
A processor that knows all the available information calculates the desired result (the output vector).
Faulty Processors
A processor P is faulty in an execution e if, after a certain time, none of the messages sent by P is received (a fail-stop failure).
1-Solvability
The protocol α 1-solves task T if :
• If no processor is faulty, then α solves T.
• If in an execution e one processor is faulty, then all the other processors eventually halt.
If such a protocol α exists for a decision task T, then T is called 1-solvable.
1-Solvability
Notice that the strong consensus decision task of FLP is not 1-solvable.
On the other hand, the weak consensus is clearly 1-solvable, using the trivial solving protocol.
Conditions for 1-Solvability
We shall present 2 basic conditions for a task T to be 1-solvable by protocol α.
The Connectivity Condition
Theorem MW (S. Moran and Y. Wolfstahl) - Let T be a decision task that has a connected input graph and let α be a given protocol.
If α 1-solves T, then G(Dα[XT]) is connected.
The Connectivity Condition
Theorem 1 - Let T be a decision task. Let $C \subseteq X_T$ be such that G(C) is a connected subgraph of the input graph G(XT). Let α be a given protocol.
If α 1-solves T, then G(Dα[C]) is connected.
The Connectivity Condition - Proof
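Let α be a protocol that 1-solves the task $T : X_T \to D_T$.
We define a new task $T' : C \to D_\alpha[C]$, such that $X_{T'} = C$ and $T'(\bar{x}) = T(\bar{x})$ for every $\bar{x} \in C$.
Clearly α 1-solves T'. By applying Theorem MW to T' we have that $G(D_\alpha[X_{T'}]) = G(D_\alpha[C])$ is connected.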
Restrictions
A task T' is a restriction of a task T if $X_{T'} = X_T$ and $T'(\bar{x}) \subseteq T(\bar{x})$ for every $\bar{x} \in X_{T'}$.
Notice that if α is a protocol which (1-)solves T' then α also (1-)solves T.
Tα
Let T be a task and α a protocol which solves T.
We denote by Tα the task induced by α: $X_{T_\alpha} = X_T$ and $T_\alpha(\bar{x}) = D_\alpha(\bar{x})$ for every $\bar{x} \in X_T$.
Note that Tα is a restriction of T.
Pointwise connected
A task T is pointwise connected if $G(T(\bar{x}))$ is connected for every $\bar{x} \in X_T$.
Corollary 1 - If a protocol α 1-solves a task T then Tα, the task induced by α and T, is pointwise connected.
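Pointwise connectivity can be checked mechanically for a task given as a finite table. The sketch below is an illustration under that assumption: the task is a dictionary from input vectors to sets of allowed decision vectors, and the helper names are hypothetical.

```python
from collections import deque


def adjacent(s1, s2):
    """True if the two vectors differ in exactly one component."""
    return sum(a != b for a, b in zip(s1, s2)) == 1


def is_connected(vectors):
    """Breadth-first search over the adjacency graph of a finite, non-empty set of vectors."""
    vectors = set(vectors)
    start = next(iter(vectors))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for v in vectors:
            if v not in seen and adjacent(current, v):
                seen.add(v)
                queue.append(v)
    return len(seen) == len(vectors)


def pointwise_connected(task):
    """task: dict mapping each input vector to its non-empty set of decision vectors."""
    return all(is_connected(allowed) for allowed in task.values())


# Consensus for N = 3 on one input, and a restriction of it to a single vector.
consensus = {(0, 1, 1): {(0, 0, 0), (1, 1, 1)}}
restricted = {(0, 1, 1): {(0, 0, 0)}}
```

In this toy table the two consensus vectors differ in all three components, so the full decision set is not connected, while the restriction to a single vector trivially is.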
Covering Clique
Let $C(\bar{x}^i)$ be a maximal input i-clique in G(XT), and B an i-clique in G(DT).
We say that B is a covering clique for $C(\bar{x}^i)$ (with respect to the task T) if:
$T(\bar{y}) \cap B \ne \emptyset$ for every $\bar{y} \in C(\bar{x}^i)$
The partial decision vector defined by a covering clique for $C(\bar{x}^i)$ is called a covering partial vector for $C(\bar{x}^i)$.
i-Sleeping Execution
Let α be a protocol that 1-solves a task T. An i-sleeping execution of α is an execution in which all the messages sent by Pi are delayed until all other processors halt and decide.
The Sleeping Processor Condition
Theorem 2 - Let T be a decision task and α a protocol that 1-solves T. Then, in Tα there is a covering clique for each maximal input i-clique.
Sleeping Processor Condition - Proof
Let $C(\bar{x}^i)$ be a maximal input i-clique. Consider the i-sleeping-execution of α in which the input to Pj is xj for each j ≠ i.
Let $\bar{d}^i$ be the partial output vector produced by the non-sleeping processors.
We claim that the maximal i-clique $D = C(\bar{d}^i)$ in $G(D_{T_\alpha})$ is a covering clique for $C(\bar{x}^i)$.
Sleeping Processor Condition - Proof
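Let $y_i$ be any value such that the vector $\bar{y} = (x_1, x_2, \ldots, x_{i-1}, y_i, x_{i+1}, \ldots, x_N)$ is a possible input vector in $C(\bar{x}^i)$. We must show that $D \cap T_\alpha(\bar{y}) \ne \emptyset$, where D is the maximal i-clique defined by $\bar{d}^i$.
For this, assume that $\bar{y}$ is the actual input to α and that Pi is eventually awakened. Pi must eventually decide on a value $d_i$, which yields an output vector $\bar{d} = (d_1, \ldots, d_i, \ldots, d_N)$. This $\bar{d}$ is clearly in $D \cap T_\alpha(\bar{y})$.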
Sleeping Processor Condition
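Let $T_E(C(\bar{x}^i))$ denote the set of all covering partial vectors for $C(\bar{x}^i)$.
It is not difficult to see that $T_E(C(\bar{x}^i)) = \bigcap_{\bar{y} \in C(\bar{x}^i)} T(\bar{y})^i$, where $T(\bar{y})^i$ is the set of partial vectors of $T(\bar{y})$.
Note that there exists a covering clique for $C(\bar{x}^i)$ iff $T_E(C(\bar{x}^i)) \ne \emptyset$.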
Theorem 2 – Example 1
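Consider the OPR task of N = 3 and K = 4.
$X_T = \{ \bar{x}_1 = (10, 12, 13),\ \bar{x}_2 = (10, 12, 11),\ \bar{x}_3 = (10, 12, 9) \}$
$T(\bar{x}_1) = \{(1,2,3), (1,2,4), (1,3,4), (2,3,4)\}$
$T(\bar{x}_2) = \{(1,3,2), (1,4,2), (1,4,3), (2,4,3)\}$
$T(\bar{x}_3) = \{(2,3,1), (2,4,1), (3,4,1), (3,4,2)\}$
For the input 3-clique $C(\bar{v}^3)$, where $\bar{v}^3 = (10, 12, *)$, we hence have:
$T(\bar{x}_1)^3 = \{(1,2,*), (1,3,*), (2,3,*)\}$
$T(\bar{x}_2)^3 = \{(1,3,*), (1,4,*), (2,4,*)\}$
$T(\bar{x}_3)^3 = \{(2,3,*), (2,4,*), (3,4,*)\}$
We can see that $T_E(C(\bar{v}^3)) = \emptyset$, which means that the task is not 1-solvable (by Theorem 2).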
Theorem 2 – Example 2
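Consider the OPR task of N = 3 and K = 5.
$X_T = \{ \bar{x}_1 = (10, 12, 13),\ \bar{x}_2 = (10, 12, 11),\ \bar{x}_3 = (10, 12, 9) \}$
For the input 3-clique $C(\bar{v}^3)$, where $\bar{v}^3 = (10, 12, *)$, we hence have:
$T(\bar{x}_1)^3 = \{(1,2,*), (1,3,*), (1,4,*), (2,3,*), (2,4,*), (3,4,*)\}$
$T(\bar{x}_2)^3 = \{(1,3,*), (1,4,*), (1,5,*), (2,4,*), (2,5,*), (3,5,*)\}$
$T(\bar{x}_3)^3 = \{(2,3,*), (2,4,*), (2,5,*), (3,4,*), (3,5,*), (4,5,*)\}$
We can see that $T_E(C(\bar{v}^3)) = \{(2, 4, *)\}$, which means that the necessary condition of Theorem 2 is satisfied here; the task IS 1-solvable.
P1 and P2 can decide on 2 and 4 without knowing x3, and P3 then decides on 1, 3 or 5: (?, ?, ?) → (2, 4, ?) → (2, 4, 1/3/5)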
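The two examples can be re-computed with a short sketch that enumerates the OPR decision sets and intersects their partial vectors. The helper names and the use of `None` for the undefined component are assumptions made for illustration.

```python
from itertools import product


def opr(x, K):
    """All decision vectors allowed by order preserving renaming for input x and names 1..K."""
    n = len(x)
    return {
        d
        for d in product(range(1, K + 1), repeat=n)
        if all(d[i] < d[j] for i in range(n) for j in range(n) if x[i] < x[j])
    }


def partial(v, i):
    """Partial vector v^i: component i (1-based) replaced by None."""
    return v[:i - 1] + (None,) + v[i:]


def covering_partial_vectors(clique, task, i):
    """Intersection over all inputs y in the clique of the partial vectors of task(y)."""
    sets = [{partial(d, i) for d in task(y)} for y in clique]
    return set.intersection(*sets)


# The input 3-clique used in both examples: the third input value varies.
clique = [(10, 12, 13), (10, 12, 11), (10, 12, 9)]
covering_k4 = covering_partial_vectors(clique, lambda y: opr(y, 4), 3)
covering_k5 = covering_partial_vectors(clique, lambda y: opr(y, 5), 3)
```

With K = 4 the intersection is empty, and with K = 5 it contains the single partial vector (2, 4, *), in agreement with the two examples.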
Conditions for 1-Solvability
Next we shall present two conditions that together are necessary and sufficient for 1-solvability.
Theorem 3: A task T is 1-solvable iff there exists a restriction T' of T satisfying the following:
1. T' is pointwise connected.
2. For each maximal input i-clique $C(\bar{v}^i)$ there is a covering clique in T'. Also, there is a centralized algorithm that on input $\bar{v}^i$ outputs a $\bar{d}^i$ so that $C(\bar{d}^i)$ is such a covering clique.
Theorem 3 – Proof
Only if: T is 1-solvable ⇒ there exists a restriction T' of T that satisfies the 2 conditions.
Theorem 3 – Proof
Condition 1
Let α be a protocol that 1-solves task T (by the assumption that T is 1-solvable).
Tα is a restriction of T (as we saw before).
Tα is pointwise connected (by Corollary 1).
Theorem 3 – Proof
Condition 2
Tα contains a covering clique for each maximal input i-clique (Theorem 2).
For each partial input vector $\bar{x}^i$ the corresponding $\bar{d}^i$ can be computed by simulating an i-sleeping-execution of α on the input $\bar{x}^i$, as described in the proof of Theorem 2.
Theorem 3 – Proof
If: there exists a restriction T' of T that satisfies the 2 conditions ⇒ T is 1-solvable.
Theorem 3 – Proof
Let T be a task which has a restriction T` which satisfies both conditions mentioned.
We will present a protocol which 1-solves T` and hence 1-solves T (as we saw before)
i-Anchor
Let $\bar{x}$ be an input vector and let $D = C(\bar{d}^i)$ be a covering clique for $C(\bar{x}^i)$. Then, an output vector $\bar{d}$ is an i-anchor of $\bar{x}$ if $\bar{d} \in T'(\bar{x}) \cap D$.
This means that i-anchors are those decision vectors that are output in i-sleeping-executions.
i-Anchor – Example
In OPR with N = 3, K = 5, and input vector (10,20,30), and in a 2-sleeping-execution, P1 and P3 will output the partial covering vector (2, *, 4).
(This will happen since without knowing the value of P2, P1 and P3 can decide on 2 and 4 respectively and still allow P2 to decide on 1 / 3 / 5 which cover all the possible combinations)
So (2, 3, 4) is a 2-anchor.
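The anchor in this example can be recovered by filtering the allowed decision vectors against the covering partial vector. This is a minimal sketch that assumes the restriction T' is the OPR task itself; the helper names are hypothetical.

```python
from itertools import product


def opr(x, K):
    """All decision vectors allowed by order preserving renaming for input x and names 1..K."""
    n = len(x)
    return {
        d
        for d in product(range(1, K + 1), repeat=n)
        if all(d[i] < d[j] for i in range(n) for j in range(n) if x[i] < x[j])
    }


def matches(d, partial_vector):
    """True if d agrees with the partial vector on every defined (non-None) component."""
    return all(p is None or p == c for c, p in zip(d, partial_vector))


def i_anchors(x, covering_partial, allowed):
    """Decision vectors allowed for x that lie in the clique defined by the covering partial vector."""
    return {d for d in allowed(x) if matches(d, covering_partial)}


def task(x):
    """The OPR task with K = 5, standing in for T' (an assumption of this sketch)."""
    return opr(x, 5)


anchors = i_anchors((10, 20, 30), (2, None, 4), task)
```

If P2 held a value below 10 or above 30 instead, the same partial vector (2, *, 4) would leave it the names 1 or 5 respectively.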
Theorem 3 – Proof
By condition 2 there is an algorithm COMP.CLIQ that gets as an input a partial input vector $\bar{x}^i$ and outputs a partial covering vector $\bar{d}^i$ for $C(\bar{x}^i)$.
By condition 1 $G(T'(\bar{x}))$ is connected, hence for a given finite set $S_{\bar{x}}$ of i-anchors of $\bar{x}$ there is a finite tree $TR_{\bar{x}}$ in $G(T'(\bar{x}))$ that contains $S_{\bar{x}}$.
Since T is computable there exists an algorithm COMP.TREE that on an input $\bar{x}$ and a finite set $S_{\bar{x}}$ of i-anchors of $\bar{x}$ outputs a tree $TR_{\bar{x}}$ as above and a root $r_{\bar{x}}$ which is an arbitrary vector in $TR_{\bar{x}}$.
The protocol assumes that each processor Pk contains a copy of the algorithms COMP.CLIQ and COMP.TREE described above.
Theorem 3 – the Protocol
STAGE A
Pk sends xk to all the processors.
Pk receives the first (N-1) initial values (including xk).
STAGE B
Pk calculates the partial vector $\bar{x}^i$ and sends it to all the processors.
Pk receives the first (N-1) partial vectors. Notice that the partial vectors are not necessarily identical.
Are all the partial vectors identical?
Yes: Pk decides on its output value according to the common partial vector, $\bar{d}^i = \text{COMP.CLIQ}(\bar{x}^i)$.
No: notice that now Pk knows the entire input vector.
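The local bookkeeping of Stages A and B can be sketched as two small functions. This is an illustration only: message passing is not simulated, `None` marks the unknown component, input values are assumed never to be `None`, and the function names are hypothetical.

```python
def stage_a_view(arrivals, n):
    """Partial input vector known to a processor after the first N-1 values arrive.

    arrivals: list of (sender, value) pairs in arrival order, senders numbered from 1.
    """
    view = [None] * n
    for sender, value in arrivals[: n - 1]:
        view[sender - 1] = value
    return tuple(view)


def stage_b_outcome(partials):
    """Classify the N-1 partial vectors received in Stage B.

    Returns ("identical", partial_vector) if they all agree, otherwise
    ("full", input_vector) obtained by merging the defined components.
    """
    first = partials[0]
    if all(p == first for p in partials):
        return "identical", first
    merged = list(first)
    for p in partials[1:]:
        for index, value in enumerate(p):
            if value is not None:
                merged[index] = value
    return "full", tuple(merged)
```

Two partial vectors that miss different components always merge into the entire input vector, which is why a processor that sees non-identical partial vectors knows the full input.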
Theorem 3 – the Protocol
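$TR_{\bar{x}} = \text{COMP.TREE}(\bar{x}, A_1, A_2, A_3, \ldots, A_N)$, where $A_i$ is an i-anchor of $\bar{x}$ in $C(\text{COMP.CLIQ}(\bar{x}^i))$
If (N-2) of the (N-1) $\bar{x}^j$ messages were identical
    $d = \text{FATHER}(A_s)$ in $TR_{\bar{x}}$
else
    $d = \text{FATHER}(A_k)$ in $TR_{\bar{x}}$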
Theorem 3 – the Protocol
decided := FALSE
l := 1
BROADCAST(SUGGEST, d, 1)
STAGE C of the protocol, in which all the processors try to agree on a common set of two adjacent vectors:
while NOT decided do
begin
    l := l + 1
    RECEIVE (N-1) messages of phase l-1
    IF one of the messages is a DECIDE(d`) message or all (N-1) messages are SUGGEST(d`) messages
        decided := TRUE
        DECIDE(d`)
        BROADCAST(d`)
    else
        if (N-2) of the messages are SUGGEST(d`)
            d := FATHER(d)
        else
            d := FATHER(d)
        end
        BROADCAST(SUGGEST, d`)
end
Theorem 3 – the Protocol
Lx denotes the maximal distance in the tree TRx from an i-anchor to the root rx.
Claim 1 : In each execution of the protocol there is an l ≤ Lx such that at least one processor DECIDEs in phase l.
Let l0 be the minimal l that satisfies Claim 1.
Theorem 3 – the Protocol
Let Pk be a processor that DECIDEs in phase l0.
Let d be the vertex in TRx on which Pk DECIDEs.
Claim 2 : If some processor Pj DECIDEs in phase l0 on a vertex d`, then d` = d.
Theorem 3 – the Protocol
Claim 3 : Exactly one of the following occurs :
a) At least two processors send at phase l0 a DECIDE(d) message.
b) All the (non-faulty) processors except Pk send at phase l0 a message SUGGEST(FATHER(d)).
Theorem 3 – the Protocol
Claim 4 : For each j = 1, …, N, if Pj is non-faulty then it DECIDEs at phase l0 or at phase l0+1, on d or on FATHER(d).
(This comes from Claim 3 and STAGE C of the protocol).
Theorem 3 – the Protocol
If all the processors DECIDE on two adjacent vertices then the vector they output is one of these vertices.
This completes the proof of Theorem 3
Lower Bounds
Having characterized the tasks that are 1-solvable, we now present lower bounds on the message complexity required to solve such tasks.
Lower Bounds - FIFO
We assume that the system satisfies the FIFO discipline on each communication link.
Clearly, if a protocol 1-solves a task T, it must also 1-solve it under this restrictive assumption.
Also, if a task T is 1-solvable by a protocol that assumes the FIFO discipline, it is also 1-solvable by protocols that don’t assume it.
This is true since each processor can number the messages it sends.
In conclusion, every lower bound that assumes this discipline is also applicable in cases in which this discipline is not assumed.
Lower Bounds – Lemma 1
Lemma 1: Let α be a protocol that 1-solves a task T. Let $\bar{x}$ be in $X_T$. If at most M messages are sent in any (FIFO) execution of α on $\bar{x}$, then $|D_\alpha(\bar{x})| < (N+1)^{2M}$.
Lower Bounds – Lemma 2
Lemma 2: There exist tasks T such that for each arbitrarily large M there exists an input vector $\bar{x}$ such that the distance between any 1-anchor and any 2-anchor of $\bar{x}$ is greater than $(N+1)^{2M}$.
Lower Bounds – Theorem 4
Theorem 4 - For a given N ≥ 3, there is a 1-solvable distributed task T for N processors that satisfies the following:
For every arbitrary constant M there is an input $\bar{x}$ to T such that every protocol that 1-solves T must send, in the worst case, at least M messages on input $\bar{x}$.
Lower Bounds – Theorem 4
Theorem 4 – Proof – Let T be a task that satisfies Lemma 2.
Let M be given, and let $\bar{x}$ be an input vector whose existence is guaranteed by Lemma 2.
Then, by the proof of the "Only if" part of Theorem 3, we know that every protocol α that 1-solves T must satisfy that $G(D_\alpha(\bar{x}))$ is connected and contains an i-anchor of $\bar{x}$ for i = 1, 2, …, N.
By Lemma 2 this implies that $|D_\alpha(\bar{x})| > (N+1)^{2M}$, since a connected graph containing two vertices at distance greater than $(N+1)^{2M}$ has more than $(N+1)^{2M}$ vertices.
By Lemma 1 this implies that α may send more than M messages on input $\bar{x}$.