
Around Self-Stabilization

Part 2: Strengthened Forms of Self-Stabilization

Stéphane Devismes

Post-Doc CNRS at the LRI (Paris VII)

06/08/2008 Computer Science Department, University of Osaka

Roadmap

1. Self-Stabilization (recall)

2. Motivation

3. Tolerating more types of fault

4. FTSS

5. Enhance the convergence

6. Snap-Stabilization

7. Conclusion


Self-Stabilization (recall)

• [Dijkstra 1974]

• General approach for recovering from the effect of

any transient faults


Motivation

• Self-stabilization has several advantages:

1. Tolerance to any transient fault:

• No hypothesis on the nature or extent of transient faults

• Recovers from the effects of those faults in a unified manner

2. No initialization:

• Large-scale systems

3. Dynamicity:

• Self-organization in sensor and ad hoc networks



• But also several drawbacks:

1. Impossibility results:

• Some fundamental problems have no self-stabilizing solution

2. Overhead:

• Self-stabilizing protocols can use a large amount of resources

3. Usually not tolerant of other kinds of faults

4. Eventual safety:

• During convergence, almost nothing is guaranteed

Weakened Forms

Strengthened Forms



• Strengthened forms for:

Tolerating more types of faults

Enhancing the convergence property

• Converging quickly in some (frequent) cases

• Ensuring some weak safety property when there are faults


Tolerating more types of faults

• Types of faults:

Transient

Intermittent

Crash

Byzantine


Tolerating more types of faults

• Transient Faults:

Usually handled by self-stabilization

Duration: finite

Periodicity: rare

Effect: alter the content of some component(s) of the network (processes and/or links)

E.g., memory/message corruption, crash-recover, loss of messages…



• Intermittent Faults:

Duration: finite

Periodicity: frequent

Effect: alter the content of some component(s) of the network (processes and/or links)

E.g., memory/message corruption, crash-recover, loss of messages…

Some papers deal with both self-stabilization and certain types of intermittent faults, e.g., [Delaët and Tixeuil, JPDC’02]

• Fair loss of messages + finite number of message corruptions



• Crash Failures:

Duration: permanent

Effect: some component(s) of the network (processes and/or links) permanently stop working

E.g., process crash, link removal

Fault-Tolerant Self-Stabilization (FTSS) [Gopal and Perry, PODC’93]

• Usually considers process crashes only.



• Byzantine Failures:

Duration: unlimited

Effect: some component(s) of the network (usually processes) work in an

arbitrary manner

E.g., processes hit by an attack

Byzantine-Tolerant Self-Stabilization [Dolev and Welch, PODC’95]

• Restriction on the number of Byzantine processes and/or

• Some synchrony assumptions

Robust Stabilizing Leader Election

Carole Delporte-Gallet (LIAFA)

Stéphane Devismes (CNRS, LRI)

Hugues Fauconnier (LIAFA)



Topics

• Designing Leader Election protocols in the message-passing model that are

1. Crash tolerant

2. Self-Stabilizing

3. Communication-Efficient

4. With weak synchrony assumption


Model

• Fully-connected network

• Communications using messages

• Link:

Unidirectional

No order on the deliveries

May be synchronous

• Process:

Synchronous or crashed

With identifier

State initially arbitrary



Communication-Efficiency

[Larrea, Fernandez, and Arevalo, 2000]:

« An algorithm is communication-efficient if it

eventually only uses n - 1 unidirectional links »



Related Works

• [Gopal and Perry, PODC’93]

• [Anagnostou and Hadzilacos, WDAG’93]

• [Beauquier and Kekkonen-Moneta, JSS’97]

Communication-Efficiency never considered


Self-Stabilizing Leader Election in a fully timely network?

Yes + communication-efficiency


Algorithm (1/4)

• Each process p such that Leader = p periodically sends ALIVE,p to every other process

[Figure: four-process example showing Leader values and ALIVE,1 / ALIVE,2 messages in transit]



• When an alive process p such that Leader = p receives ALIVE from a process q, it sets Leader := q if q < p

[Figure: same four-process example; a process that receives ALIVE,1 switches to Leader=1]



• Each alive process q such that Leader ≠ q always chooses as leader the process from which it most recently received ALIVE

[Figure: same four-process example; only ALIVE,1 messages remain in transit]



• On timeout, each alive process p sets Leader to p

[Figure: same four-process example after a timeout, showing updated Leader values]
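The four rules of this algorithm can be sketched as a round-based simulation. This is a minimal sketch rather than the authors' code: it assumes synchronous rounds in a fully timely network, a hypothetical `TIMEOUT` of two rounds, and that when several ALIVE messages arrive in the same round the one with the smallest identifier counts as the most recent. The names `Process`, `step`, and `run` are illustrative.

```python
from dataclasses import dataclass

TIMEOUT = 2  # assumption: rounds without ALIVE before a process elects itself


@dataclass
class Process:
    pid: int
    leader: int          # arbitrary initial value (self-stabilization)
    timer: int = 0       # rounds since the last accepted ALIVE, also arbitrary
    alive: bool = True


def step(procs):
    """Run one synchronous round; return how many links carried a message."""
    # Rule 1: every alive process p with Leader = p sends ALIVE,p to all others.
    senders = [p.pid for p in procs if p.alive and p.leader == p.pid]
    for p in procs:
        if not p.alive:
            continue
        received = [q for q in senders if q != p.pid]
        if p.leader == p.pid:
            # Rule 2: a self-declared leader yields to a sender with a smaller id.
            smaller = [q for q in received if q < p.pid]
            if smaller:
                p.leader, p.timer = min(smaller), 0
        elif received:
            # Rule 3: adopt the most recent sender (assumed: smallest id arrives last).
            p.leader, p.timer = min(received), 0
        else:
            # Rule 4: on timeout, the process elects itself.
            p.timer += 1
            if p.timer >= TIMEOUT:
                p.leader, p.timer = p.pid, 0
    return len(senders) * (len(procs) - 1)


def run(procs, rounds):
    """Run several rounds and return the link count of the last one."""
    links = 0
    for _ in range(rounds):
        links = step(procs)
    return links
```

In this sketch, once a single alive process remains a self-declared leader, it is the only sender, so each round uses n - 1 links, which matches the communication-efficiency definition quoted earlier. The elected leader is not necessarily the alive process with the smallest identifier.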


Communication-Efficient Self-Stabilizing Leader Election in a system where at most one link is asynchronous?

No


Impossibility of Communication-Efficiency in a system with at most one asynchronous link

• Claim: Any process p such that Leader ≠ p must periodically receive messages within a bounded time; otherwise, it chooses another leader


Self-Stabilizing (non communication-efficient) Leader Election in a system where some links are asynchronous?

Yes


Self-Stabilizing Leader Election in a system with a timely routing overlay

• For each pair of alive processes (p,q), there exist at least two paths of timely links:

From p to q

From q to p


Algorithm

• Each process computes the set of alive processes and chooses as leader the

smallest process of this set

• To compute the set:

– Each process p periodically sends ALIVE,p to every other process

– Any ALIVE,p message is repeated n - 1 times (so that every other process periodically receives such a message)
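The relay-and-minimum idea above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the timely links are given as a set of directed pairs, relaying is modelled as n - 1 synchronous hops, and the timers that would discard stale entries from an arbitrary initial state are left out. The function name `elect` and its parameters are hypothetical.

```python
def elect(all_ids, alive, timely_links):
    """Return {p: leader chosen by p} for every alive process p.

    timely_links is a set of directed pairs (u, v). ALIVE messages are
    relayed along these links for at most n - 1 hops.
    """
    n = len(all_ids)
    heard = {p: {p} for p in alive}           # ids whose ALIVE has reached p
    for _ in range(n - 1):                    # n - 1 hops cover any simple path
        nxt = {p: set(s) for p, s in heard.items()}
        for u, v in timely_links:
            if u in alive and v in alive:     # crashed processes relay nothing
                nxt[v] |= heard[u]
        heard = nxt
    # Each process elects the smallest id in its computed set of alive processes.
    return {p: min(s) for p, s in heard.items()}
```

For example, with a directed ring of timely links 1 → 2 → 3 → 4 → 1 and no crash, every process hears from every other one within three hops and all of them elect process 1.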


Self-Stabilizing Leader Election in a system without timely routing overlay?

No


Conclusion

• Obtaining algorithms that are both self-stabilizing and crash-tolerant is highly desirable

• But designing communication-efficient solutions requires strong synchrony assumptions, even if the network is fully-connected

• Solution: FTPS (Fault-Tolerant Pseudo-Stabilization)


Enhance The Convergence

• Fault-containing Self-Stabilization

• Time-Adaptive Self-Stabilization

• Safe-Converging Self-Stabilization

• Superstabilization

• Snap-Stabilization


Fault-Containing Self-Stabilization

• [Ghosh et al, PODC’96]

• Self-stabilizing + if there are only a few faults:

Spatial containment: only a few processes can be contaminated by the faults

Fast convergence time


Time-Adaptive Self-Stabilization

• [Kutten & Patt-Shamir, PODC’97]

• Self-stabilizing and, if f < k processes are faulty:

The output of the algorithm stabilizes in O(f) time

Timeline: faults hit f processes → the output is stabilized → the state is stabilized


Safe-Converging Self-Stabilization

• [Kakugawa & Masuzawa, IPDPS’06]

• Self-stabilizing and fast convergence to a weaker (useful) predicate

• E.g., Minimal Dominating Set (MDS): arbitrary initial configuration → DS → MDS
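To make the two predicates in the MDS example concrete, here is a minimal sketch of checkers for "dominating set" (the weaker predicate) and "minimal dominating set" (the legitimate one). The graph encoding, a dict from node to set of neighbours, and the function names are assumptions made for illustration. It does not implement the safe-converging algorithm itself.

```python
def is_dominating(graph, s):
    """True if every node is in s or has at least one neighbour in s."""
    return all(v in s or any(u in s for u in graph[v]) for v in graph)


def is_minimal_dominating(graph, s):
    """True if s dominates and removing any single node breaks domination."""
    return is_dominating(graph, s) and not any(
        is_dominating(graph, s - {v}) for v in s
    )
```

On the path 1 - 2 - 3, the set {1, 2, 3} is dominating but not minimal, while {2} is a minimal dominating set. A safe-converging execution would first reach a set of the first kind, then one of the second kind.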


Superstabilization

• [Dolev & Herman, CJTCS’97]

• A superstabilizing algorithm:

Must be self-stabilizing

Must preserve a “passage predicate”

Passage predicate: defined with respect to a class of topology changes. A topology change falsifies legitimacy, so the passage predicate must be weaker than legitimacy but strong enough to be useful.

[Figure: timeline showing a topological change and the passage predicate]


Passage Predicate - Example

In a token ring:

A processor crash can lose the token but still not falsify the passage predicate

Passage predicate: at most one token exists in the system (e.g., the existence of 2 tokens is not legal)

Legitimate state: exactly one token exists in the system.
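The difference between the two predicates in the token-ring example can be written down directly. The sketch below assumes a ring represented as a dict from process id to a boolean "holds a token", and models a crash as the removal of one entry. All names are illustrative.

```python
def tokens(ring):
    """Number of tokens currently held in the ring."""
    return sum(1 for has_token in ring.values() if has_token)


def passage(ring):
    """Passage predicate: at most one token exists."""
    return tokens(ring) <= 1


def legitimate(ring):
    """Legitimate state: exactly one token exists."""
    return tokens(ring) == 1


def crash(ring, pid):
    """Topology change: process pid disappears, along with any token it held."""
    return {q: t for q, t in ring.items() if q != pid}
```

Crashing the token holder yields a ring that is no longer legitimate but still satisfies the passage predicate, whereas a ring with two tokens violates both.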


Snap-Stabilization

• [Bui et al, WSS’99]

• A snap-stabilizing algorithm operates correctly immediately after the faults end

• Request-based algorithm and user-centric point of view:

Each time a user initiates a request, it obtains a correct result for its request




Self vs. Snap

[Figure: comparison of self-stabilization and snap-stabilization]

Snap-Stabilization in Message-Passing Systems


Sylvie Delaët (LRI)

Stéphane Devismes (CNRS, LRI)

Mikhail Nesterenko (Kent State University)

Sébastien Tixeuil (LIP6)


Message-Passing Model

• Network bidirectional and fully-connected

• Communications by messages

• Links asynchronous, fair, and FIFO

• Ids on processes

• Transient faults



Related Works in message-passing (reliable communication in self-stabilization)

• [Gouda & Multari, 1991]

Deterministic + Unbounded Capacity => Unbounded Counter

Deterministic + Bounded Capacity => Bounded Counter

• [Afek & Brown, 1993] Probabilistic + Unbounded Capacity + Bounded Counter

[Figure: the question <How old are you, Captain?> receives the conflicting answers <I’m 12>, <I’m 21>, and <I’m 60>]


Related Works in message-passing (self-stabilization)

• [Varghese, 1993] Deterministic + Bounded Capacity

• [Katz & Perry, 1993] Unbounded Capacity, deterministic, infinite counter

• [Delaët et al] Unbounded Capacity, deterministic, finite memory, silent tasks


Related Works (snap-stabilization)

• Nothing in the Message-Passing Model

• Only in State Model:

Locally Shared Memory

Composite Atomicity

• [Cournier et al, 2003]


Case 1: unbounded capacity links

• Impossible for safety-distributed specifications


Safety-distributed specification

Example: Mutual Exclusion

[Figure: processes p (A, state sp) and q (B, state sq), with messages m1 to m5 and m’1 to m’4 in the links]


Case 2: bounded capacity links

• Problem to solve: Reliable Communication

• Starting from any configuration, if Tintin sends a question to Captain Haddock, then:

• Tintin eventually receives good answers

• Tintin only delivers the good answers



• Case Study: Single-Message Capacity

Each direction of the link holds 0 or 1 message



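• Sequence number: State ∈ {0,1,2,3,4}

[Figure: p holds Statep and NeigStatep, q holds Stateq and NeigStateq; p starts with Statep = 0 and sends <0,NeigStatep,Qp,Ap>; q replies with <Stateq,0,Qq,Aq>; p then sets Statep = 1]

Repeat until Statep = 4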



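• Pathological Case:

[Figure: initially NeigStateq = 1, the link from p to q holds <2,?,?,?>, and the link from q to p holds <?,0,?,?>; p successively receives <?,0,?,?>, <?,1,?,?>, and <?,2,?,?>, so Statep goes from 0 to 3; only the reply <Stateq,3,Qq,Aq> moves Statep to 4]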
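The handshake and its pathological case can be replayed in a small simulation. This is a simplified sketch under stated assumptions, not the algorithm from the slides: only p initiates, messages carry two fields instead of four (a sequence value and a payload), a send on a full link is lost, and the names `System`, `p_send`, `q_receive`, `q_send`, `p_receive`, and `run_round_robin` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

QUESTION = "How old are you, Captain?"
Msg = Tuple[int, str]  # (sequence value, payload)


def answer(question: str) -> str:
    return f"answer to <{question}>"


@dataclass
class System:
    state_p: int = 0                  # reset to 0 when p starts a request
    neig_q: int = 0                   # q's copy of State_p (arbitrary initially)
    stored_q: str = "junk"            # last question seen by q (arbitrary initially)
    chan_pq: Optional[Msg] = None     # capacity-1 link from p to q
    chan_qp: Optional[Msg] = None     # capacity-1 link from q to p
    delivered: Optional[str] = None   # answer delivered by p at State_p = 4


def p_send(s: System) -> None:
    if s.state_p < 4 and s.chan_pq is None:   # a send on a full link is lost
        s.chan_pq = (s.state_p, QUESTION)


def q_receive(s: System) -> None:
    if s.chan_pq is not None:
        s.neig_q, s.stored_q = s.chan_pq      # q copies p's sequence value
        s.chan_pq = None


def q_send(s: System) -> None:
    if s.chan_qp is None:
        s.chan_qp = (s.neig_q, answer(s.stored_q))  # echo + answer


def p_receive(s: System) -> None:
    if s.chan_qp is None:
        return
    echo, ans = s.chan_qp
    s.chan_qp = None
    if s.state_p < 4 and echo == s.state_p:   # echo matches: move to next value
        s.state_p += 1
        if s.state_p == 4:
            s.delivered = ans                 # deliver only on the last step


def run_round_robin(s: System, max_rounds: int = 100) -> System:
    for _ in range(max_rounds):
        if s.state_p == 4:
            break
        p_send(s); p_receive(s); q_send(s); q_receive(s)
    return s


# Pathological start: three stale values (0 in q -> p, 1 in q, 2 in p -> q).
demo = run_round_robin(System(neig_q=1, chan_pq=(2, "junk"), chan_qp=(0, "stale")))
```

In this model the three stale values can push Statep from 0 to 3 at most, because stale echoes reach p before any genuine one and take at most three distinct values in order. The step from 3 to 4 therefore needs an echo that q produced after receiving p's real question, which is why five sequence values are used for single-message links.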


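Generalizations

• Arbitrary Bounded Capacity

2 × Cmax + 3 values (for Cmax = 1, this gives the 5 values {0,1,2,3,4} of the single-message case)

[Figure: p and q hold 1 value each; each of the two links holds up to Cmax values]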


Generalizations

• PIF in fully-connected network

[Figure: the initiator sends m to every other process and receives Am back from each of them]


Application

Mutual Exclusion

in a fully-connected & identified network

using the PIF


Mutual Exclusion

• Specification:

Any process that requests the CS enters the CS in finite time (Liveness)

If a requesting process enters the CS, then it executes the CS alone (Safety)

N.B.: Some non-requesting processes may initially be in the CS


Principles (1/6)

• Let L be the process with the smallest ID

• L decides using ValueL which process is authorized to access the CS:

– if ValueL = 0, then L is authorized

– if ValueL = i, then the ith neighbour of L is authorized

• When a process learns that it is authorized by L to access the CS:

1. It ensures that no other process can execute the CS

2. It executes the CS, if it requests it

3. It notifies L when it terminates Step 2 (so that L increments ValueL)
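The role of ValueL as an arbiter can be sketched in a few lines. Two assumptions are made here that the slides do not state: the neighbours of L are numbered in increasing ID order, and ValueL wraps around modulo the number of processes when L increments it. The function names are illustrative.

```python
def authorized(ids, value_l):
    """Return the ID of the process that Value_L currently authorizes."""
    ordered = sorted(ids)
    leader, neighbours = ordered[0], ordered[1:]   # L has the smallest ID
    # Value_L = 0 designates L itself; Value_L = i designates L's ith neighbour
    # (assumption: neighbours are numbered by increasing ID).
    return leader if value_l == 0 else neighbours[value_l - 1]


def next_value(ids, value_l):
    """Value_L after an increment (assumption: it wraps around modulo n)."""
    return (value_l + 1) % len(ids)
```

With the IDs 5, 2, 3, and 8, L is process 2, and successive values 0, 1, 2, 3 authorize processes 2, 3, 5, and 8 in turn before wrapping back to L.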



• Each process sequentially executes 4 phases infinitely often

• A requesting process p can enter the CS only after executing Phases 1 to 4 consecutively

The CS is in Phase 4



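• Process p evaluates the IDs

[Figure: Phase 1 with processes 5, 2, 3, and 8; Id? messages are sent to the other processes, their IDs come back, and the result is Leader=2]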



• Process p asks each other process q whether Valueq = p

[Figure: Phase 2 with processes 5, 2, 3, and 8; Ok? messages are sent to the other processes, which answer Yes or No according to their Value; Leader=2 and Ok=true]



• If Winner(p) then p broadcasts EXIT to every other process

[Figure: Phase 3 with Winner(5)=true; Exit is broadcast and acknowledged with Ok; the Phase, Leader, and Ok values of the other processes, initially unknown, are reset so that they restart at Phase=1]



• If Winner(p), then p executes the CS; afterwards, if p ≠ L, p broadcasts ExitCS, otherwise p increments Valuep

[Figure: Phase 4 with Winner(5)=true; process 5 executes <CS> and broadcasts ExitCS, which is acknowledged with Ok; the other processes are at Phase=1]


Conclusion

Snap-stabilization in message-passing is no longer an open question


Extensions

• Apply snap-stabilization in message-passing to:

Other topologies (tree, arbitrary topology)

Other problems

Other failure patterns

• Space requirement

Thank you very much!
