Around Self-Stabilization
Part 2: Strengthened Forms of Self-Stabilization
Stéphane Devismes
Post-Doc CNRS at the LRI (Paris VII)
06/08/2008 Computer Science Department, University of Osaka 2
Roadmap
1. Self-Stabilization (recall)
2. Motivation
3. Tolerating more types of fault
4. FTSS
5. Enhance the convergence
6. Snap-Stabilization
7. Conclusion
Self-Stabilization (recall)
• [Dijkstra 1974]
• General approach for recovering from the effects of any transient fault
Motivation
• Self-stabilization has several advantages:
1. Tolerance to any transient fault:
• No hypothesis on the nature or extent of transient faults
• Recovers from the effects of those faults in a unified manner
2. No initialization:
• Large-scale systems
3. Dynamicity:
• Self-organization in sensor and ad hoc networks
Motivation
• But also several drawbacks:
1. Impossibility results
• Some fundamental problems have no self-stabilizing solution
2. Overhead
• Self-stabilizing protocols can use a large amount of resources
3. Usually not tolerant of other kinds of faults
4. Eventual safety
• During the convergence, almost nothing is guaranteed
Weakened Forms
Strengthened Forms
Motivation
• Strengthened Forms for:
Tolerating more types of faults
Enhancing the convergence property
• Converging quickly in some (frequent) cases
• Ensuring some weak safety property when there are faults
Tolerating more types of faults
• Types of faults:
Transient
Intermittent
Crash
Byzantine
Tolerating more types of fault
• Transient Faults:
Usually handled by self-stabilization
Duration: finite
Periodicity: rare
Effect: alter the content of some component(s) of the network (processes and/or links)
E.g., memory/message corruption, crash-recover, loss of messages…
Tolerating more types of fault
• Intermittent Faults:
Duration: finite
Periodicity: frequent
Effect: alter the content of some component(s) of the network (processes and/or links)
E.g., memory/message corruption, crash-recover, loss of messages…
Some papers deal with both self-stabilization and certain types of intermittent faults, e.g., [Delaët and Tixeuil, JPDC’02]
• Fair message loss + a finite number of message corruptions
Tolerating more types of fault
• Crash Failures:
Duration: permanent
Effect: some component(s) of the network (processes and/or links) permanently stop working
E.g., process crash, link removal
Fault-Tolerant Self-Stabilization (FTSS) [Gopal and Perry, PODC’93]
• Usually considers process crashes only.
Tolerating more types of fault
• Byzantine Failures:
Duration: unlimited
Effect: some component(s) of the network (usually processes) work in an
arbitrary manner
E.g., processes hit by an attack
Byzantine-Tolerant Self-Stabilization [Dolev and Welch, PODC’95]
• Restriction on the number of Byzantine processes and/or
• Some synchrony assumptions
Robust Stabilizing Leader Election
Carole Delporte-Gallet (LIAFA)
Stéphane Devismes (CNRS, LRI)
Hugues Fauconnier (LIAFA)
Topics
• Designing leader election protocols in the message-passing model that are
1. Crash-tolerant
2. Self-stabilizing
3. Communication-efficient
4. Based on weak synchrony assumptions
Model
• Fully-connected network
• Communications using messages
• Link:
Unidirectional
No order on deliveries
May be synchronous
• Process:
Synchronous or crashed
With identifier
State initially arbitrary
[Figure: network of four processes numbered 1 to 4]
Communication-Efficiency
[Larrea, Fernandez, and Arevalo, 2000]:
« An algorithm is communication-efficient if it
eventually only uses n - 1 unidirectional links »
[Figure: network of four processes numbered 1 to 4]
Related Works
• [Gopal and Perry, PODC’93]
• [Anagnostou and Hadzilacos, WDAG’93]
• [Beauquier and Kekkonen-Moneta, JSS’97]
Communication-Efficiency never considered
Self-Stabilizing Leader Election in a fully timely network?
Yes + communication-efficiency
Algorithm (1/4)
• Each process p such that Leader = p periodically sends ALIVE,p to every other process
[Figure: four processes; process 1 has Leader=1 and two other processes have Leader=2; ALIVE,1 and ALIVE,2 messages are each sent on three links]
Algorithm (2/4)
• When an alive process p such that Leader = p receives ALIVE from process q,
Leader := q if q < p
[Figure: the same four processes exchanging ALIVE,1 and ALIVE,2 messages; one process updates its variable to Leader=1]
Algorithm (3/4)
• Each alive process q such that Leader ≠ q always chooses as leader the process from which it most recently received ALIVE
[Figure: only ALIVE,1 messages remain in transit; the processes adopt Leader=1]
Algorithm (4/4)
• On timeout, each alive process p sets Leader to p
[Figure: four processes with arbitrary Leader values (Leader=3, Leader=2, Leader=4) and ALIVE,1 and ALIVE,2 messages in transit; the Leader variables are then updated]
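The four rules of Algorithm (1/4) to (4/4) can be sketched as a round-based simulation. This is only an illustration under stated assumptions: the network is fully timely, so every ALIVE sent in a round arrives in that round; a round with no ALIVE received stands for a timeout; the names `run_round` and `elect` are hypothetical and do not come from the slides.

```python
def run_round(leader, alive):
    """Run one synchronous round and return the set of ALIVE senders."""
    # Rule 1: every alive process that considers itself leader sends ALIVE.
    senders = sorted(p for p in alive if leader[p] == p)
    for p in alive:
        heard = [q for q in senders if q != p]
        if not heard:
            # Rule 4: nothing received during the round, modelled as a timeout.
            leader[p] = p
            continue
        for q in heard:  # arrival order inside a round is an arbitrary choice
            if leader[p] == p:
                if q < p:
                    leader[p] = q  # Rule 2: yield to a smaller identifier
            else:
                leader[p] = q      # Rule 3: follow the most recent ALIVE
    return set(senders)


def elect(leader, alive, rounds):
    """Run several rounds and return the senders of the last round."""
    senders = set()
    for _ in range(rounds):
        senders = run_round(leader, alive)
    return senders


# Arbitrary (corrupted) initial state; process 1 has crashed.
leader = {1: 1, 2: 1, 3: 1, 4: 3}
alive = {2, 3, 4}
last_senders = elect(leader, alive, rounds=4)
```

In this run every alive process ends with Leader = 2, and in the last round only process 2 still sends ALIVE, which matches the communication-efficiency goal of eventually using only the leader's outgoing links.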
Communication-Efficient Self-Stabilizing Leader Election in a system where at most one link is asynchronous?
No
Impossibility of Communication-Efficiency in a system with at most one asynchronous link
• Claim: Any process p such that Leader ≠ p must periodically receive messages within a bounded time; otherwise it chooses another leader
Self-Stabilizing (non communication-efficient) Leader Election in a system where some links are asynchronous?
Yes
Self-Stabilizing Leader Election in a system with a timely routing overlay
• For each pair of alive processes (p,q), there exist at least two paths of timely links:
From p to q
From q to p
Algorithm
• Each process computes the set of alive processes and chooses as leader the process with the smallest identifier in this set
• To compute the set:
– Each process p periodically sends ALIVE,p to every other process
– Any ALIVE,p message is repeated n - 1 times (so that every other process periodically receives such a message)
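A minimal sketch of the alive-set computation above, assuming that "repeated n - 1 times" means each ALIVE is relayed along timely links for at most n - 1 hops. The function `alive_sets` and the dictionary `timely` (which maps each process to the processes it reaches through a timely link) are hypothetical names used only for this illustration.

```python
def alive_sets(timely, alive):
    """Return, for each alive process, the set of identifiers it hears about."""
    n = len(timely)                      # total number of processes
    known = {p: {p} for p in alive}
    to_relay = {p: {p} for p in alive}   # identifiers p still has to forward
    for _ in range(n - 1):               # an ALIVE travels at most n - 1 hops
        nxt = {p: set() for p in alive}
        for p in alive:
            for q in timely[p]:
                if q in alive:           # crashed processes neither receive nor relay
                    new = to_relay[p] - known[q]
                    known[q] |= new
                    nxt[q] |= new
        to_relay = nxt
    return known


# Process 1 has crashed; timely links form the cycle 2 -> 3 -> 4 -> 2.
timely = {1: [2], 2: [3], 3: [4], 4: [2]}
alive = {2, 3, 4}
leaders = {p: min(ids) for p, ids in alive_sets(timely, alive).items()}
```

Here every alive process learns the set {2, 3, 4} and elects process 2. Because the set is rebuilt from fresh ALIVE messages each period, a corrupted initial set would be overwritten rather than trusted.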
Self-Stabilizing Leader Election in a system without timely routing overlay?
No
Conclusion
• Obtaining algorithms that are both self-stabilizing and crash tolerant is
highly desirable
• But designing communication-efficient solutions requires strong synchrony assumptions, even if the network is fully connected
• Solution: FTPS (Fault-Tolerant Pseudo-Stabilization)
Enhance The Convergence
• Fault-containing Self-Stabilization
• Time-Adaptive Self-Stabilization
• Safe-Converging Self-Stabilization
• Superstabilization
• Snap-Stabilization
Fault-Containing Self-Stabilization
• [Ghosh et al, PODC’96]
• Self-stabilizing and, if there are only a few faults:
Spatial containment: only a few processes can be contaminated by the faults
Fast convergence time
Time-Adaptive Self-Stabilization
• [Kutten & Patt-Shamir, PODC’97]
• Self-stabilizing and, if f<k processes are faulty:
The output of the algorithm stabilizes in O(f) time
[Timeline: faults hit f processes, then the output is stabilized, then the state is stabilized]
Safe-Converging Self-Stabilization
• [Kakugawa & Masuzawa, IPDPS’06]
• Self-stabilizing and fast convergence to a weaker (useful) predicate
• E.g., Minimal Dominating Set (MDS): arbitrary initial configuration → DS → MDS
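To make the two predicates of the MDS example concrete, here is a small sketch that checks them on a graph given as an adjacency dictionary. The function names and the example graph are assumptions made for illustration; the slide does not give them.

```python
def is_dominating(graph, s):
    """Weaker predicate (DS): every node is in s or has a neighbour in s."""
    return all(v in s or any(u in s for u in graph[v]) for v in graph)


def is_minimal_dominating(graph, s):
    """Stronger predicate (MDS): s is dominating and no member can be removed."""
    return is_dominating(graph, s) and not any(
        is_dominating(graph, s - {v}) for v in s
    )


# Path 1 - 2 - 3
path = {1: [2], 2: [1, 3], 3: [2]}
```

On this path, {1, 2, 3} already satisfies the weaker predicate (it is a DS) but is not minimal, while {2} satisfies both. A safe-converging algorithm would reach some DS quickly and only later refine it into an MDS.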
Superstabilization
• [Dolev & Herman, CJTCS’97]
• A Superstabilizing Algorithm
Must be self-stabilizing
Must preserve a “passage predicate”
Passage Predicate - Defined with respect to a class of topology changes (A
topology change falsifies legitimacy and therefore the passage predicate must
be weaker than legitimacy but strong enough to be useful).
Topological change Passage Predicate
Passage Predicate - Example
In a token ring:
A processor crash can lose the token but still not falsify
the passage predicate
Passage Predicate: at most one token exists in the system (e.g., the existence of 2 tokens isn’t legal).
Legitimate State: exactly one token exists in the system.
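The two token-ring predicates can be written down directly. In this sketch a configuration is assumed to be a list of booleans, one per processor, telling whether that processor holds a token; the representation and function names are illustrative only.

```python
def passage_predicate(tokens):
    """At most one token exists in the system."""
    return sum(tokens) <= 1


def legitimate(tokens):
    """Exactly one token exists in the system."""
    return sum(tokens) == 1


ring = [False, True, False, False]   # processor 1 holds the token
after_crash = ring[:1] + ring[2:]    # processor 1 crashes and the token is lost
```

After the crash the configuration is no longer legitimate (no token is left), but the passage predicate still holds, which is exactly the situation the slide describes.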
Snap-Stabilization
• [Bui et al, WSS’99]
• A snap-stabilizing algorithm immediately operates
correctly after the end of the faults
• Request-based algorithm and user-centric point of view:
Each time a user initiates a request, it obtains a correct result for its request
Self vs. Snap
[Figure labels: 1.X, 2.X, N.X]
Snap-Stabilization in Message-Passing Systems
Sylvie Delaët (LRI)
Stéphane Devismes (CNRS, LRI)
Mikhail Nesterenko (Kent State University)
Sébastien Tixeuil (LIP6)
Message-Passing Model
• Network bidirectional and fully-connected
• Communications by messages
• Links asynchronous, fair, and FIFO
• Ids on processes
• Transient faults
[Figure: four processes numbered 1 to 4, with messages (m1, m2, m3, ma, mb) in transit on the links]
Related Works in message-passing (reliable communication in self-stabilization)
• [Gouda & Multari, 1991] Deterministic + Unbounded Capacity => Unbounded Counter
Deterministic + Bounded Capacity => Bounded Counter
• [Afek & Brown, 1993] Probabilistic + Unbounded Capacity + Bounded Counter
[Figure: the question <How old are you, Captain?> and the answers <I’m 12>, <I’m 21>, and <I’m 60>]
Related Works in message-passing (self-stabilization)
• [Varghese, 1993] Deterministic + Bounded Capacity
• [Katz & Perry, 1993] Unbounded Capacity, deterministic, infinite counter
• [Delaët et al.] Unbounded Capacity, deterministic, finite memory, silent tasks
Related Works (snap-stabilization)
• Nothing in the Message-Passing Model
• Only in State Model:
Locally Shared Memory
Composite Atomicity
• [Cournier et al, 2003]
Snap-Stabilization in Message-Passing Systems
Case 1: unbounded capacity links
• Impossible for safety-distributed specifications
Safety-distributed specification
[Figure: processes p and q, labeled A and B]
Example: Mutual Exclusion
Safety-distributed specification
[Figure: process p (A), state sp, messages m1 m2 m3 m4 m5; process q (B), state sq, messages m’1 m’2 m’3 m’4]
Case 2: bounded capacity links
• Problem to solve: Reliable Communication
• Starting from any configuration, if Tintin sends a question to Captain Haddock, then:
• Tintin eventually receives good answers
• Tintin only delivers the good answers
Case 2: bounded capacity links
• Case Study: Single-Message Capacity
0 or 1 message in each direction
Case 2: bounded capacity links
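• Sequence number State ∈ {0,1,2,3,4}
[Figure: p holds Statep and NeigStatep, q holds Stateq and NeigStateq. With Statep = 0, p sends <0,NeigStatep,Qp,Ap>; q records 0 and replies <Stateq,0,Qq,Aq>; p then moves to Statep = 1 and sends <1,NeigStatep,Qp,Ap>, and so on.]
Until Statep = 4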
Case 2: bounded capacity links
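• Pathological Case:
[Figure: initially Statep = 0, NeigStateq = 1, the link from q to p holds <?,0,?,?>, and the link from p to q holds <2,?,?,?>. p receives <?,0,?,?> and moves to Statep = 1; q sends <?,1,?,?> and p moves to 2; q receives <2,?,?,?> and sends <?,2,?,?>, and p moves to 3. Then p sends <3,NeigStatep,Qp,Ap>, q replies <Stateq,3,Qq,Aq>, and p moves to 4.]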
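The sequence-number handshake and its pathological case can be replayed with a simplified simulation. This is a sketch under strong assumptions, not the authors' algorithm: only the echoed State field is modelled, the schedule is a fixed round-robin, and each message carries a `genuine` flag (pure instrumentation, hypothetical) telling whether it originates from a message p really sent.

```python
def handshake(link_pq, link_qp, neig_state_q):
    """Return the list of (new Statep, caused by a genuine echo?) increments.

    link_pq, link_qp: initial content of each single-message link
    (None or a stale value); neig_state_q: initial, possibly stale, NeigStateq.
    """
    state_p = 0
    in_pq = None if link_pq is None else (link_pq, False)
    in_qp = None if link_qp is None else (link_qp, False)
    neig_q = (neig_state_q, False)
    increments = []
    while state_p < 4:
        # p consumes the message in the link q -> p, if any.
        if in_qp is not None:
            echo, genuine = in_qp
            in_qp = None
            if echo == state_p:
                state_p += 1
                increments.append((state_p, genuine))
        # q sends back the value it currently holds in NeigStateq.
        in_qp = neig_q
        # q consumes the message in the link p -> q, if any.
        if in_pq is not None:
            neig_q = in_pq
            in_pq = None
        # p (re)sends its current state.
        in_pq = (state_p, True)
    return increments


pathological = handshake(link_pq=2, link_qp=0, neig_state_q=1)
# -> [(1, False), (2, False), (3, False), (4, True)]
```

In the pathological run the three stale values push Statep from 0 to 3, but the step from 3 to 4 can only come from an echo of a message that p really sent. This is why five values are used with single-message links.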
Generalizations
• Arbitrary bounded capacity: 2×Cmax+3 values
[Figure: p and q; Cmax values in each link, 1 value at each process]
Generalizations
• PIF in fully-connected network
[Figure: a message m is sent to every other process, and each of them answers with Am]
Application
Mutual Exclusion
in a fully-connected & identified network
using the PIF
Mutual Exclusion
• Specification:
Any process that requests the CS enters the CS in finite time (Liveness)
If a requesting process enters the CS, then it executes the CS alone (Safety)
N.B. Some non-requesting processes may be initially in the CS
Principles (1/6)
• Let L be the process with the smallest ID
• L decides using ValueL which process is authorized to access the CS
§ if ValueL = 0, then L is authorized
§ if ValueL = i, then the ith neighbour of L is authorized
• When a process learns that it is authorized by L to access the CS:
1. It ensures that no other process can execute the CS
2. It executes the CS, if it requests it
3. It notifies L when it terminates Step 2 (so that L increments ValueL)
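The role of ValueL can be illustrated with two small helpers. This is a sketch with assumptions: the neighbours of L are taken in a fixed order, and the increment wraps around modulo the number of processes (the slide only says that L increments ValueL). The function names are hypothetical.

```python
def authorized(value_l, leader, neighbours):
    """Return the process that ValueL currently authorizes to access the CS.

    neighbours: the neighbours of L in a fixed order (the ith entry is the
    ith neighbour, counting from 1).
    """
    return leader if value_l == 0 else neighbours[value_l - 1]


def next_value(value_l, n):
    """Increment ValueL; the wrap-around modulo n is an assumption."""
    return (value_l + 1) % n


# Identifiers taken from the example in the following slides: 2, 3, 5, 8.
leader, neighbours = 2, [3, 5, 8]
order = [authorized(v, leader, neighbours) for v in range(4)]
# -> [2, 3, 5, 8]
```

With this wrap-around every process is authorized in turn, which is one simple way to obtain the liveness part of the specification.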
Principles (2/6)
• Each process sequentially executes 4 phases infinitely often
• A requesting process p can enter the CS only after executing Phases 1 to 4 consecutively
The CS is in Phase 4
Principles (3/6)
• Process p evaluates the IDs
[Figure: Phase 1. Process 5 sends Id? to processes 2, 3, and 8, receives their IDs, and sets Leader=2]
Principles (4/6)
• Process p asks each other process q whether Valueq = p
[Figure: Phase 2. Process 5 (Leader=2) sends Ok? to processes 2, 3, and 8 and receives the answers No, Yes, and No; it sets Ok=true]
Principles (5/6)
• If Winner(p), then p broadcasts EXIT to every other process
[Figure: Phase 3. Winner(5)=true; process 5 sends Exit to processes 2, 3, and 8, whose Phase, Leader, and Ok variables were unknown; they return to Phase=1 and answer Ok]
Principles (6/6)
• If Winner(p), then p executes the CS; afterwards, if p ≠ L, then p broadcasts ExitCS, else p increments Valuep
[Figure: Phase 4. Winner(5)=true; process 5 executes <CS> and sends ExitCS to processes 2, 3, and 8, which answer Ok; a Value variable is then updated to 3]
Conclusion
Snap-stabilization in message-passing is no longer an open question
Extensions
• Apply snap-stabilization in message-passing to:
Other topologies (tree, arbitrary topology)
Other problems
Other failure patterns
• Space requirement
Thank you very much!