CSCE 668: Distributed Algorithms and Systems
Fall 2011, Prof. Jennifer Welch
Set 19: Asynchronous Solvability
Problems Solvable in Failure-Prone Asynchronous Systems
Although consensus is not solvable in failure-prone asynchronous systems (in either message passing or read/write shared memory), some interesting problems are solvable:
- set consensus (a weakening of consensus)
- approximate agreement (a weakening of consensus)
- renaming (the "opposite" of consensus)
- k-exclusion (a fault-tolerant variant of mutex)
Model Assumptions
- asynchronous shared memory with read/write registers
  - heavy use of atomic snapshot objects
- at most f crash failures of procs
- results can be translated to message passing if f < n/2 (cf. Chapter 10)
- may be a few asides into message passing
Set Consensus Motivation
- By judiciously weakening the definition of the consensus problem, we can overcome the asynchronous impossibility.
- We've already seen a weakening of consensus: a weaker termination condition for randomized algorithms.
- How about weakening the agreement condition?
- One weakening is to allow more than one decision value: allow a set of decisions.
Set Consensus Definition
Termination: Eventually, each nonfaulty processor decides.
k-Agreement: The number of different values decided on by nonfaulty processors is at most k.
Validity: Every nonfaulty processor decides on a value that is the input of some processor.
Set Consensus Algorithm
Uses a shared atomic snapshot object X (which can be implemented with read/write registers):
- update your segment of X with your input
- repeatedly scan X until there are at least n - f nonempty segments
- decide on the minimum value appearing in any segment
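The three steps above can be tried out in a small sequential simulation. This is only a sketch under stated assumptions: the snapshot object X is modeled as a Python list whose reads and writes are treated as atomic, asynchrony is modeled by a random scheduler, and a crashed proc is one that never takes a step. The names `set_consensus_proc`, `run_set_consensus`, `crashed`, and `seed` are illustrative and do not come from the slides.

```python
import random


def set_consensus_proc(i, x, X, n, f, decisions):
    """Processor i with input x; each `yield` ends one atomic step."""
    X[i] = x                                      # update own segment of X
    yield
    while True:
        seen = [v for v in X if v is not None]    # atomic scan of X
        if len(seen) >= n - f:
            decisions[i] = min(seen)              # decide on the minimum value seen
            return
        yield


def run_set_consensus(inputs, f, crashed=(), seed=0):
    """Interleave processors at random; those in `crashed` never take a step."""
    rng = random.Random(seed)
    n = len(inputs)
    X = [None] * n                                # None stands for an empty segment
    decisions = {}
    live = {i: set_consensus_proc(i, inputs[i], X, n, f, decisions)
            for i in range(n) if i not in crashed}
    while live:
        i = rng.choice(sorted(live))              # scheduler picks who moves next
        try:
            next(live[i])
        except StopIteration:
            del live[i]
    return decisions


decisions = run_set_consensus([5, 3, 9, 1, 7], f=2, crashed={3})
print(decisions)
```

With n = 5 and f = 2, every proc waits for at least 3 nonempty segments, so whatever the schedule, the live procs decide on at most f + 1 = 3 different values.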
Correctness of Set Consensus Algorithm
- Termination: at most f procs crash, so at least n - f segments of X eventually become nonempty.
- Validity: every decision is some proc's input.
- Why does k-agreement hold? We'll show it does as long as k > f.
- Sanity check: when k = 1, we have standard consensus. As long as there are fewer than 1 failures (f = 0), we can solve the problem.
k-Set Agreement Condition
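- Let S be the set of minimum values in the final scans of the nonfaulty procs; these are the nonfaulty decisions.
- Suppose in contradiction that |S| > f + 1.
- Let v be the largest value in S, and let pi be a proc that decides v.
- Then S contains at least f + 1 values smaller than v, each the input of a different proc. Since v is the minimum in pi's final scan, that scan misses all of these values, so it has fewer than n - f nonempty segments, contradicting the code.
- Hence |S| ≤ f + 1 ≤ k.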
Synchronous vs. Asynchronous?
How does the previous, asynchronous, algorithm compare to the synchronous algorithm for k-set consensus from the Chapter 5 homework?
Recall that the synchronous algorithm runs in ⌊f/k⌋ + 1 rounds.
Set Consensus Lower Bound
Theorem: There is no asynchronous algorithm for solving k-set consensus in the presence of f failures, if f ≥ k.
- Straightforward extensions of the consensus impossibility result fail; even proving the existence of an initial bivalent configuration is quite involved.
- The original proof of set-consensus impossibility used concepts from algebraic topology.
- The textbook's proof uses more elementary machinery, but is still very involved.
Approximate Agreement Motivation
An alternative way to weaken the agreement condition for consensus:
Require that the decisions be close to each other, but not necessarily equal
Seems appropriate for continuous-valued problems (as opposed to discrete)
Approximate Agreement Definition
Termination: Eventually, each nonfaulty processor decides.
ε-Agreement: All nonfaulty decisions are within ε of each other.
Validity: Every nonfaulty decision is within the range of the input values.
Approximate Agreement Algorithm
- Assume procs know the range from which input values are drawn: let D be the length of this range
- wait-free: up to n - 1 procs can fail
- algorithm is structured as a series of "asynchronous rounds":
  - exchange values via a snapshot object, one per round
  - compute midpoint for next round
- continue until spread of values is within ε, which requires about log2 (D/ε) rounds
Approximate Agreement Algorithm
Shared atomic snapshot objects ASO[1], ASO[2], ...
Initially local variable v = pi's input
Initially local variable r = 1
while true do
    update pi's segment of ASO[r] to be v
    let scan be the set of values obtained by scanning ASO[r]
    v := midpoint(scan)
    if r = ⌈log2 (D/ε)⌉ + 1 then decide v and terminate
    else r++
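The round structure above can be exercised with a small sequential simulation. It is a sketch under stated assumptions, not the textbook's implementation: each snapshot object is a Python list with atomic reads and writes, a random scheduler stands in for asynchrony, crashed procs never take a step, and the round count is taken to be log2(D/ε) rounded up, plus 1. The names `approx_proc`, `run_approx_agreement`, `crashed`, and `seed` are illustrative.

```python
import math
import random


def midpoint(values):
    return (min(values) + max(values)) / 2


def approx_proc(i, x, ASO, M, decisions):
    """Processor i with input x; each `yield` ends one atomic step."""
    v = x
    for r in range(1, M + 1):
        ASO[r][i] = v                                  # update own segment of ASO[r]
        yield
        scan = [u for u in ASO[r] if u is not None]    # atomic scan of ASO[r]
        v = midpoint(scan)
        yield
    decisions[i] = v                                   # decide after round M


def run_approx_agreement(inputs, D, eps, crashed=(), seed=0):
    """Interleave processors at random; those in `crashed` never take a step."""
    rng = random.Random(seed)
    n = len(inputs)
    M = max(1, math.ceil(math.log2(D / eps)) + 1)      # rounds, assumed rounded up
    ASO = {r: [None] * n for r in range(1, M + 1)}     # None stands for an empty segment
    decisions = {}
    live = {i: approx_proc(i, inputs[i], ASO, M, decisions)
            for i in range(n) if i not in crashed}
    while live:
        i = rng.choice(sorted(live))                   # scheduler picks who moves next
        try:
            next(live[i])
        except StopIteration:
            del live[i]
    return decisions


decisions = run_approx_agreement([0.0, 10.0, 4.0, 7.5], D=10.0, eps=0.5, crashed={2})
print(decisions)
```

Because every proc writes to ASO[r] before scanning it, each scan contains the first value written in that round, which is what makes the spread shrink from one round to the next.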
Analysis of Approx. Agreement Alg.
Definitions w.r.t. a particular execution:
- M = ⌈log2 (D/ε)⌉ + 1
- U0 = set of input values
- Ur = set of all values ever written to ASO[r]
Helpful Lemma
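Lemma (16.8): Consider any round r < M. Let u be the first value written to ASO[r]. Then the values written to ASO[r+1] are in this range:
[(min(Ur) + u)/2, (max(Ur) + u)/2]
[Figure: number line marking, from left to right, min(Ur), (min(Ur) + u)/2, u, (max(Ur) + u)/2, max(Ur); the elements of Ur+1 lie between the two midpoints.]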
Implications of Lemma
- The range of values written to the ASO object for round r + 1 is contained within the range of values written to the ASO object for round r: range(Ur+1) ⊆ range(Ur)
- The spread (max - min) of values written to the ASO object for round r + 1 is at most half the spread of values written to the ASO object for round r: spread(Ur+1) ≤ spread(Ur)/2
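Both implications follow from the lemma by a short calculation. A sketch, assuming the lemma's interval is [(min(U_r) + u)/2, (max(U_r) + u)/2] and using min(U_r) ≤ u ≤ max(U_r), which holds because u is itself a value written to ASO[r]:

```latex
\begin{align*}
\frac{\min(U_r) + u}{2} &\ge \min(U_r), \qquad
\frac{\max(U_r) + u}{2} \le \max(U_r)
  && \Rightarrow\ \mathrm{range}(U_{r+1}) \subseteq \mathrm{range}(U_r), \\
\mathrm{spread}(U_{r+1}) &\le \frac{\max(U_r) + u}{2} - \frac{\min(U_r) + u}{2}
  = \frac{\max(U_r) - \min(U_r)}{2}
  = \frac{\mathrm{spread}(U_r)}{2}.
\end{align*}
```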
Correctness of Algorithm
Termination: Each proc executes M asynchronous rounds.
Validity: The range at each round is contained in the range at the previous round.
ε-Agreement: spread(UM) ≤ spread(U0)/2^M ≤ D/2^M ≤ ε
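A worked instance of the last inequality in this chain. The numbers D = 100 and ε = 1 are illustrative and not from the slides, and M is assumed to be log2(D/ε) rounded up, plus 1:

```latex
M = \lceil \log_2 (D/\varepsilon) \rceil + 1
  = \lceil \log_2 100 \rceil + 1 = 7 + 1 = 8,
\qquad
\frac{D}{2^{M}} = \frac{100}{256} \approx 0.39 \le \varepsilon = 1 .
```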
Handling Unknown Input Range
- Range might not be known.
- Actual range in an execution might be much smaller than the maximum possible range.
- First idea: have a preprocessing phase in which procs try to determine the input range
  - but asynchrony and possible failures make this approach problematic
Handling Unknown Input Range
- Use just one atomic snapshot object
- Dynamically recalculate how many rounds are needed as more inputs are revealed
- Skip over rounds to try to catch up to the maximum observed round
- Only consider values associated with the maximum observed round
- Still use midpoint
Unknown Input Range Algorithm
shared atomic snapshot object A; initially all segments ⊥
updatei(A, [x, 1, x]), where x is pi's input   // [original input, round number, current estimate]
repeat
    scan A
    let S be the spread of all inputs in non-⊥ segments
    if S = 0 then maxRound := 0
    else maxRound := ⌈log2(S/ε)⌉
    let rmax be the largest round in non-⊥ segments
    let values be the set of candidates in segments with round number rmax
    update pi's segment in A with [x, rmax+1, midpt(values)]
until rmax ≥ maxRound
decide midpoint(values)
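The body of the repeat loop is a pure function of what the scan returns, so it can be sketched in isolation. The function below is a hedged illustration of one iteration only, not a full concurrent implementation: `step` and `view` are hypothetical names, `None` stands in for an empty segment, and maxRound is assumed to be log2(S/ε) rounded up.

```python
import math


def midpoint(values):
    return (min(values) + max(values)) / 2


def step(view, x, eps):
    """One loop iteration for a proc with input x, given the result of a scan.

    `view` is a list of segments, each either None (empty) or a tuple
    (original input, round number, current estimate).
    Returns (segment to write next, whether the loop exits after this write).
    """
    segs = [s for s in view if s is not None]
    inputs = [s[0] for s in segs]
    S = max(inputs) - min(inputs)                       # spread of revealed inputs
    max_round = 0 if S == 0 else math.ceil(math.log2(S / eps))
    r_max = max(s[1] for s in segs)                     # largest round observed
    values = [s[2] for s in segs if s[1] == r_max]      # only max-round candidates
    new_segment = (x, r_max + 1, midpoint(values))
    return new_segment, r_max >= max_round


# Two inputs revealed, both still in round 1: average them and keep going.
print(step([(0.0, 1, 0.0), (8.0, 1, 8.0)], x=8.0, eps=1.0))
# A late proc skips ahead to round 4 and ignores its own round-1 estimate.
print(step([(0.0, 3, 1.0), (8.0, 1, 8.0)], x=8.0, eps=1.0))
```

The second call shows the "skip over rounds" idea: the proc's own estimate carries round number 1, so it is excluded from `values` once some segment shows a later round.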
Analysis of Unknown Input Range Algorithm
Definitions w.r.t. a particular execution:
- U0 = set of all input values
- Ur = set of all values ever written to A with round number r
- M = largest r s.t. Ur is not empty
With these changes, the correctness proof is similar to that for the known input range algorithm.
Key Differences in Proof
Why does termination hold?
- A proc's local maxRound variable can only increase if another proc wakes up and increases the spread of the observable inputs. This can happen at most n - 1 times.
Why does ε-agreement hold?
- If pi's input is observed by pj the last time pj computes its maxRound, the same argument as before applies.
- Otherwise, when pi wakes up, it ignores its own input and uses values from maxRound or later.
Renaming
Procs start with unique names from a large domain
Procs should pick new names that are still distinct but that are from a smaller domain
Motivation: Suppose original names are serial numbers (many digits), but we'd like the procs to do some kind of time slicing based on their ids
Renaming Problem Definition
Termination: Eventually every nonfaulty proc pi decides on a new name yi
Uniqueness: If pi and pj are distinct nonfaulty procs, then yi ≠ yj.
We are interested in anonymous algorithms: procs don't have access to their indices, just to their original names. Code depends only on your original name.
Performance of Renaming Algorithm
New names should be drawn from {1,2,…,M}.
We would like M to be as small as possible.
Uniqueness implies M must be at least n.
Due to the possibility of failures, M will actually be larger than n.
Renaming Results
- Algorithm for the wait-free case (f = n – 1) with M = n + f = 2n – 1.
- Algorithm for general f with M = n + f.
- Lower bound that M must be at least n + 1, for the wait-free case. The proof is similar to the impossibility of wait-free consensus.
- Stronger lower bound that M must be at least 2n – 1, for the wait-free case, if n satisfies a certain number-theoretic property.
  - If n does not satisfy the property (this includes n = 6, 10, 14, ...), there is a wait-free algorithm with M = 2n – 2.
Wait-Free Renaming Algorithm
Shared atomic snapshot object A; initially all segments ⊥
s := 1   // suggestion for my new name
while true do
    update pi's segment of A to be [x, s], where x is pi's original name
    scan A
    if s is also someone else's suggestion then
        let r be the rank of x among the original names of non-⊥ segments
        let s be the r-th smallest positive integer not currently suggested by another proc
    else decide on s for new name and terminate
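A sequential simulation of this loop can make the suggestion rule concrete. It is a sketch under stated assumptions: the snapshot object A is a Python list with atomic reads and writes, a random scheduler stands in for asynchrony, and crashed procs never take a step. The segment index `i` is used only to write a proc's own segment; every decision depends on original names alone. The names `rth_free_name`, `renaming_proc`, `run_renaming`, `crashed`, and `seed` are illustrative.

```python
import random


def rth_free_name(r, taken):
    """Return the r-th smallest positive integer that is not in `taken`."""
    c = 0
    while r > 0:
        c += 1
        if c not in taken:
            r -= 1
    return c


def renaming_proc(i, x, A, decisions):
    """Processor with original name x; each `yield` ends one atomic step."""
    s = 1                                              # suggestion for my new name
    while True:
        A[i] = (x, s)                                  # atomic update of own segment
        yield
        view = [seg for seg in A if seg is not None]   # atomic scan of A
        others = {sug for name, sug in view if name != x}
        if s not in others:
            decisions[x] = s                           # nobody else suggests s
            return
        r = sorted(name for name, _ in view).index(x) + 1   # rank of x
        s = rth_free_name(r, others)
        yield


def run_renaming(names, crashed=(), seed=0):
    """Interleave processors at random; those in `crashed` never take a step."""
    rng = random.Random(seed)
    A = [None] * len(names)                            # None stands for an empty segment
    decisions = {}
    live = {i: renaming_proc(i, x, A, decisions)
            for i, x in enumerate(names) if i not in crashed}
    while live:
        i = rng.choice(sorted(live))                   # scheduler picks who moves next
        try:
            next(live[i])
        except StopIteration:
            del live[i]
    return decisions


new_names = run_renaming([1040, 17, 999983, 256])
print(new_names)
```

With n = 4 procs, the new names are distinct and fall in {1, ..., 2n – 1} = {1, ..., 7}; the call `rth_free_name(4, {1, 2, 3})` reproduces the worst case n + n – 1 = 7.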
Analysis of Renaming Algorithm
Uniqueness: Suppose in contradiction pi and pj choose same new name, s.
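[Figure: timeline. Without loss of generality, pi's last scan before deciding s precedes pj's last scan before deciding s. pi's last update before deciding, which suggests s, precedes both scans. So pj's last scan sees s as pi's suggestion, and pj doesn't decide s, a contradiction.]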
Analysis of Renaming Algorithm
New name space is {1, …, 2n – 1}. Why?
- The rank of a proc pi's original name is at most n (the largest one).
- The worst case is when each of the n – 1 other procs has suggested a different new name for itself, so the suggested names are {1, …, n – 1}.
- Then pi suggests n + n – 1 = 2n – 1.
Analysis of Renaming Algorithm
Termination: Suppose in contradiction that some set T of nonfaulty procs never decide in some execution.
Consider the suffix α of the execution in which
- each proc in T has already done at least one update, and
- only procs in T take steps (others have either already crashed or decided).
Analysis of Renaming Algorithm
- Let F be the set of new names that are free (not suggested at the beginning of the suffix α by any proc not in T)
  - the trying procs need to choose new names from this set.
- Let z1, z2, … be the names in F in order.
- By the definition of α, no proc wakes up during α and reveals an additional original name, so all procs in T are working with the same set of original names during α.
- Let pi be the proc whose original name has the smallest rank (among this set of original names). Let r be this rank.
Analysis of Renaming Algorithm
Eventually procs other than pi stop suggesting zr as a new name:
- After the suffix α starts, every scan indicates a set of free names that is no larger than F.
- Every trying proc other than pi has a larger rank and thus continually suggests a new name for itself that is larger than zr, once it does its first scan in α.
Analysis of Renaming Algorithm
Eventually pi does suggest zr as its new name:
- By the choice of zr as the r-th smallest free new name, and the fact that eventually other trying procs stop suggesting z1 through zr, eventually pi sees zr as the free name with the r-th smallest rank.
- This contradicts the assumption that pi is trying (i.e., stuck).
So termination holds.
General Renaming
Suppose we know that at most f procs will fail, where f is not necessarily n - 1.
We can use the wait-free algorithm, but it is wasteful in the size of the new name space, 2n – 1, if f < n – 1.
We can do better (if f < n – 1) with a slightly different algorithm:
- keep track in the snapshot object of whether you have decided
- an undecided proc suggests a new name only if its original name is among the f + 1 lowest names of procs that have not yet decided.
k-Exclusion Problem
A fault-tolerant version of mutual exclusion.
Processors can fail by crashing, even in the critical section (stay there forever).
Allow up to k processors to be in the critical section simultaneously.
If fewer than k processors fail, then any nonfaulty processor that wishes to enter the critical section eventually does so.
k-Assignment Problem
A specialization of k-Exclusion to include:
- Uniqueness: Each proc in the critical section has a variable called slot, which is an integer between 1 and m. If pi and pj are in the C.S. concurrently, then they have different slots.
Models the situation when there is a pool of identical resources, each of which must be used exclusively:
- k is the number of procs that can be in the pool concurrently
- m is the number of resources
- To handle failures, m should be larger than k
k-Assignment Algorithm Schema
k-assignment entry section:
- k-exclusion entry section
- renaming using m = 2k - 1 names
k-assignment exit section:
- k-exclusion exit section