
Leader Election

Leader Election: the idea

We study Leader Election in rings

Why rings?

• historical reasons
  – original motivation: regenerate lost token in token ring networks

• illustrates techniques and principles

• good for lower bounds and impossibility results

Outline

• Specification of Leader Election
• YAIR
• Leader election in asynchronous rings:
  – An O(n^2) algorithm
  – An O(n log n) algorithm
• The revenge of the lower bound!
• Leader election in synchronous rings
  – Breaking the Ω(n log n) barrier

Message passing: Model

• n processors p_0, …, p_{n-1}

• connected by bi-directional communication channels

• topology represented by undirected graph

[Figure: example topology on processors p_0, …, p_4; some links may be missing]

Processors

Each pi is a state machine

• state set Qi

• distinguished initial states

• could be infinite

p_i’s state includes

• outbuf_i[l]: set of messages sent on the l-th channel and not yet delivered
• inbuf_i[l]: set of messages delivered on the l-th channel and not yet processed
• inbuf_i is initially empty
• outbuf_i is not accessible to p_i

State Transitions

A state transition:

• input: accessible state of pi (doesn’t depend on outbufi)

• consumes all messages in inbufi

• outputs at most a message per channel

Terminology

Definition: A configuration is a vector C = (q_0, …, q_{n-1})
• each q_i is a state of p_i
• the contents of the outbuf_i are the messages in transit

In an initial configuration each q_i is an initial state of p_i

Definition: An event is
• a computation event comp(i), or
• a delivery event del(i,j,m)

Definition: An execution is an infinite sequence C_0, φ_0, C_1, φ_1, … where
• C_0 is an initial configuration
• each C_i is a configuration
• each φ_i is an event

Definition: A schedule for the above execution is the sequence of events φ_0, φ_1, …

Safety and Liveness

Safety property: “nothing bad happens”
• holds in every finite execution prefix
  – Windows™ never crashes
  – if one general attacks, both do
  – a program never terminates with a wrong answer

Liveness property: “something good eventually happens”
• no partial execution is irremediable
  – Windows™ always reboots
  – both generals eventually attack
  – a program eventually terminates

Admissible executions satisfy safety and liveness properties for a particular system type.

A really cool theorem

Every property is a combination of a safety property and a liveness property

(Alpern and Schneider)

Asynchronous Message-Passing Systems

C_0, φ_0, C_1, φ_1, C_2, …

if φ_k = del(i,j,m)
• in C_{k-1}
  – m is in outbuf_i[l], where l is p_i’s label for channel {p_i, p_j}
• in C_k
  – m is removed from outbuf_i[l]
  – m is added to inbuf_j[h], where h is p_j’s label for channel {p_i, p_j}

if φ_k = comp(i)
• p_i changes state according to its transition function
• inbuf_i, as it was in C_{k-1}, is emptied
• messages might be added to outbuf_i in C_k

Admissible if:
• Every processor takes an infinite number of computation steps
• Every message sent is eventually delivered
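The two event types can be made concrete with a small sketch. This is only an illustration under stated assumptions: the `Processor` class, the list-based buffers, and the `ping_once` transition function are hypothetical names introduced here, not part of the slides. A delivery event moves a message from the sender's outbuf to the receiver's inbuf; a computation event consumes the whole inbuf and emits at most one message per channel.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Processor:
    state: Any
    neighbours: List[int]  # position l in this list is the local label of the channel
    inbuf: List[list] = field(init=False)
    outbuf: List[list] = field(init=False)

    def __post_init__(self) -> None:
        # one (initially empty) buffer per channel
        self.inbuf = [[] for _ in self.neighbours]
        self.outbuf = [[] for _ in self.neighbours]


# A transition maps (state, inbuf contents) to (new state, {channel label: message}).
Transition = Callable[[Any, List[list]], Tuple[Any, Dict[int, Any]]]


def deliver(procs: List[Processor], i: int, j: int, m: Any) -> None:
    """del(i, j, m): move m from outbuf_i[l] to inbuf_j[h]."""
    l = procs[i].neighbours.index(j)  # p_i's label for channel {p_i, p_j}
    h = procs[j].neighbours.index(i)  # p_j's label for the same channel
    procs[i].outbuf[l].remove(m)
    procs[j].inbuf[h].append(m)


def compute(procs: List[Processor], i: int, transition: Transition) -> None:
    """comp(i): consume all of inbuf_i, change state, send at most one message per channel."""
    p = procs[i]
    received = p.inbuf
    p.inbuf = [[] for _ in p.neighbours]
    p.state, sends = transition(p.state, received)
    for l, m in sends.items():
        p.outbuf[l].append(m)


def ping_once(state: Tuple[bool, int], received: List[list]):
    """Hypothetical transition: count received messages, send one 'ping' on the first step."""
    started, count = state
    count += sum(len(msgs) for msgs in received)
    sends = {} if started else {0: "ping"}
    return (True, count), sends


procs = [Processor((False, 0), [1]), Processor((False, 0), [0])]
compute(procs, 0, ping_once)   # comp(0): "ping" enters outbuf_0[0]
deliver(procs, 0, 1, "ping")   # del(0, 1, "ping")
compute(procs, 1, ping_once)   # comp(1): p_1 consumes the ping and sends its own
```

An admissible asynchronous execution would keep interleaving such events so that every processor computes infinitely often and every buffered message is eventually delivered; the sketch only shows three events.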

Synchronous Message-Passing Systems

C_0, φ_0, C_1, φ_1, C_2, …

• all asynchronous constraints, plus

• execution partitioned into disjoint rounds

• one delivery event for every message in every outbuf

• followed by one computation event for every processor

Remarks• not realistic, but

• good for algorithm design

• good for lower bounds

Complexity

TIME

• each processor’s state set includes terminated states

• termination:
  – all processors in terminated states
  – no messages in transit

Synchronous: count number of rounds until termination

Asynchronous: set unit of time as maximum message delay

SPACE

• Count maximum total number of messages

The Problem

• Final states of processes are partitioned into two classes: elected and non-elected

• In every admissible execution, exactly one process (the leader) enters an elected state. All remaining processes enter a non-elected state

• Once a process enters an elected or non-elected state, it remains in that state

Lots of variations...

• The ring can be unidirectional or bidirectional

• The number n of processors may be known or unknown

• Processors can be identical or can be somehow distinguished

• Communication may be synchronous or asynchronous

Uni- vs. Bidirectional

In unidirectional rings, messages can only be sent in a clockwise direction

Can processors be distinguished?

If no, anonymous algorithms

• Processors have no UID

• Formally: identical automata

• Can distinguish between left and right.

Can processors be distinguished?

If yes:

• processors have unique IDs

• chosen from some large totally ordered space of ids (e.g. N+)

• no constraint on which ID are used (e.g. integers may not be consecutive)

• IDs can be either manipulated only by certain operations (e.g. comparison)

• or by unrestricted operations

Is n known?

If no, uniform algorithms

• Algorithm cannot use information about ring size

Communication: Asynchronous vs. Synchronous

Asynchronous:
• no upper bound on message delivery time
• no centralized clock
• no bound on relative speed of processes

Synchronous:
• communication in rounds
• in a round a process:
  – delivers all pending messages
  – takes an execution step (which may involve sending one or more messages)

If there are no failures, every message sent is eventually delivered

An Impossibility Result

Theorem: There is no deterministic solution to the leader election problem for a synchronous, non-uniform, anonymous bidirectional ring.

Proof: Suppose that a solution exists for a system A of n > 1 processes. Each process of A starts in the same state.

Lemma: The states of all processors at the end of each round of the execution of A are the same.

Proof: By induction on the number of rounds k
• Base case: k = 0
  – Easy, since processes start in the same state.
• Inductive step: assume the Lemma holds for k = t-1
  – processors have identical states at the end of round t-1
  – so they send the same messages to their left and right neighbors
  – every processor receives the same pair of messages on its left and right channels
  – all processors apply the same transition function to identical states in round t
  – all processors have identical states at the end of round t

Then, if one processor enters the leader state, all do, contradicting the requirement that exactly one process is elected.
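The symmetry argument of the Lemma can be observed in a small simulation. The sketch below is an illustration only: `run_anonymous_ring` and the transition function `mix` are hypothetical names, and `mix` is just one arbitrary deterministic rule. Whatever deterministic rule is plugged in, identical automata on a synchronous ring end every round in identical states.

```python
from typing import Callable, Hashable, List, Optional, Tuple

State = Hashable
Msg = Optional[Hashable]
# step(state, msg_from_left, msg_from_right) -> (new_state, msg_to_left, msg_to_right)
Step = Callable[[State, Msg, Msg], Tuple[State, Msg, Msg]]


def run_anonymous_ring(n: int, init: State, step: Step, rounds: int) -> List[List[State]]:
    """Run n identical automata for the given number of synchronous rounds.

    Returns the state vector after round 0 (initial), 1, ..., rounds.
    """
    states = [init] * n
    to_left: List[Msg] = [None] * n    # what p_i sent to its left neighbour last round
    to_right: List[Msg] = [None] * n   # what p_i sent to its right neighbour last round
    history = [list(states)]
    for _ in range(rounds):
        results = [
            # p_i hears from its left neighbour p_{i-1} and its right neighbour p_{i+1}
            step(states[i], to_right[(i - 1) % n], to_left[(i + 1) % n])
            for i in range(n)
        ]
        states = [r[0] for r in results]
        to_left = [r[1] for r in results]
        to_right = [r[2] for r in results]
        history.append(list(states))
    return history


def mix(state: int, from_left: Msg, from_right: Msg) -> Tuple[int, Msg, Msg]:
    """Arbitrary deterministic transition, chosen only for illustration."""
    new_state = (31 * state + 7 * (from_left or 0) + (from_right or 0)) % 1009
    return new_state, state + 1, state + 2


history = run_anonymous_ring(n=5, init=1, step=mix, rounds=20)
symmetric = all(len(set(states)) == 1 for states in history)
```

Here `symmetric` records that no round ever separates the processors, so no single one can be the only processor in an elected state.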

Observations

• What are the implications for asynchronous rings?

• What are the implications for uniform rings?

The LCR Algorithm

LeLann (1977), Chang and Roberts (1979)

• unidirectional

• asynchronous

• non-anonymous: every process has a UID

• uniform (does not depend on n)

upon receiving no message:
    send uid_i to left (clockwise)

upon receiving m from right:
    case
        m.uid > uid_i:
            send m to left
        m.uid < uid_i:
            discard m
        m.uid = uid_i:
            leader := i
            send <terminate, i> to left
            terminate
    endcase

upon receiving <terminate, j> from right neighbor:
    leader := j
    send <terminate, j> to left
    terminate

Correctness

• messages from process with highest ID are never discarded

• therefore the correct leader is elected

• no other processor ID can traverse the entire ring

• therefore no one else is elected
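The LCR rules can be exercised with a short simulation. This is a minimal sketch under stated assumptions: the function name `lcr` is hypothetical, UIDs are distinct, a single FIFO queue stands in for one possible asynchronous schedule, and only UID messages are counted (the n terminate messages are left out).

```python
from collections import deque
from typing import List, Optional, Tuple


def lcr(uids: List[int]) -> Tuple[Optional[int], int]:
    """Simulate LCR on a unidirectional ring; return (leader UID, UID messages sent).

    Process i forwards to process (i + 1) % n, which plays the role of its
    clockwise neighbour.
    """
    n = len(uids)
    # every process starts by sending its own UID to its clockwise neighbour
    queue = deque(((i + 1) % n, uid) for i, uid in enumerate(uids))
    messages = n
    leader = None
    while queue:
        dest, uid = queue.popleft()
        if uid > uids[dest]:
            # larger UID: forward it
            queue.append(((dest + 1) % n, uid))
            messages += 1
        elif uid == uids[dest]:
            # own UID came all the way around: elected
            leader = uid
        # smaller UID: discarded, nothing is sent
    return leader, messages


leader, messages = lcr([3, 1, 4, 2])
```

With the ring `[3, 1, 4, 2]` the token carrying 4 is the only one that completes the circle; the arrangement of the UIDs changes the message count but not the winner.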

Complexity

Message complexity: O(n^2)

Time complexity: O(n)

Can we do better?

This bound is tight…

[Figure: ring with UIDs 0, 1, 2, …, n-2, n-1 placed in order around it]
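One way to see that the quadratic bound is actually reached, assuming the UIDs 0, …, n-1 are placed so that they decrease in the direction in which messages travel: the token with the k-th smallest UID is then forwarded by every process it meets until it reaches the maximum, so it travels k hops, and the maximum itself travels all n hops.

```latex
\[
  \sum_{k=1}^{n} k \;=\; \frac{n(n+1)}{2} \;=\; \Theta(n^2)
\]
```

In the opposite arrangement every token except the maximum is discarded after one hop, for a total of $(n-1) + n = 2n-1$ messages, so the cost depends heavily on the placement of the UIDs.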

The HS algorithm

Hirschberg and Sinclair (1980)

• Ring is bidirectional
• Each process p_i operates in phases
• In each phase l, p_i sends out “tokens” containing uid_i in both directions
• Tokens are intended to travel distance 2^l and return to p_i

[Figure: token paths from p_i in phases 0, 1, and 2]

• However, tokens may not make it back
• A token continues outbound only if its UID is greater than the UIDs of the processes on its path
• Otherwise it is discarded
• All processes always forward tokens moving inbound

If p_i receives its own token while it is going outbound, p_i is the leader

The Protocol

Init:
    asleep := true
    phase := 0

upon receiving no message:
    if asleep then
        asleep := false
        send <uid_i, out, 1> to left and right

upon receiving <uid_j, out, h> from left:
    case
        uid_j > uid_i and h > 1:
            send <uid_j, out, h-1> to right
        uid_j > uid_i and h = 1:
            send <uid_j, in, 1> to left
        uid_j = uid_i:
            leader := i
    endcase

upon receiving <uid_j, out, h> from right:
    case
        uid_j > uid_i and h > 1:
            send <uid_j, out, h-1> to left
        uid_j > uid_i and h = 1:
            send <uid_j, in, 1> to right
        uid_j = uid_i:
            leader := i
    endcase

upon receiving <uid_j, in, 1> from right, with uid_j ≠ uid_i:
    send <uid_j, in, 1> to left

upon receiving <uid_j, in, 1> from left, with uid_j ≠ uid_i:
    send <uid_j, in, 1> to right

upon receiving <uid_i, in, 1> from both left and right:
    phase := phase + 1
    send <uid_i, out, 2^phase> to left and right

Correctness

Same as LCR:

• messages from process with highest ID are never discarded

• therefore the correct leader is elected

• no other processor ID can traverse the entire ring

• therefore no one else is elected
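The HS protocol can also be tried out in simulation. The sketch below is an illustration under assumptions: the name `hs` is hypothetical, UIDs are distinct, n ≥ 2, all processes wake up at once, and a single FIFO queue plays the role of one asynchronous schedule. A message is a tuple (destination, travel direction, UID, kind, hops left), where direction +1 means "moving right".

```python
from collections import deque
from typing import List, Optional, Tuple


def hs(uids: List[int]) -> Tuple[Optional[int], int]:
    """Simulate HS on a bidirectional ring; return (leader UID, messages sent)."""
    n = len(uids)
    phase = [0] * n        # current phase of each process
    returned = [0] * n     # how many of its two tokens have come back this phase
    leader = None
    messages = 0
    queue = deque()

    def send(src: int, direction: int, uid: int, kind: str, hops: int) -> None:
        nonlocal messages
        messages += 1
        queue.append(((src + direction) % n, direction, uid, kind, hops))

    for i, uid in enumerate(uids):          # phase 0: tokens travel distance 2^0 = 1
        send(i, -1, uid, "out", 1)
        send(i, +1, uid, "out", 1)

    while queue:
        i, direction, uid, kind, hops = queue.popleft()
        if kind == "out":
            if uid == uids[i]:
                leader = uid                              # own token met while outbound
            elif uid > uids[i]:
                if hops > 1:
                    send(i, direction, uid, "out", hops - 1)
                else:
                    send(i, -direction, uid, "in", 1)     # turn the token around
            # uid < uids[i]: token is discarded
        elif uid != uids[i]:
            send(i, direction, uid, "in", 1)              # always relay inbound tokens
        else:
            returned[i] += 1
            if returned[i] == 2:                          # both tokens are back
                returned[i] = 0
                phase[i] += 1
                send(i, -1, uid, "out", 2 ** phase[i])
                send(i, +1, uid, "out", 2 ** phase[i])
    return leader, messages


leader, messages = hs([5, 9, 2, 7, 1, 8, 3, 6, 4])
```

The process with the largest UID is the only one whose tokens are never discarded, so it is the one that eventually meets its own outbound token.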

Communication Complexity

• Every processor sends a token in phase 0: 4n messages
• For phase l > 0,
  – the only processors to send tokens are those who “won” in phase l-1
  – there is at most one winner for every $2^{l-1}+1$ processors
  – winners entering phase l > 0: at most $\lfloor n/(2^{l-1}+1) \rfloor$
  – tokens travel distance $2^l$
  – total number of messages sent in phase l is bounded by $4 \cdot 2^l \cdot \lfloor n/(2^{l-1}+1) \rfloor \le 8n$
• Total number of phases: $1 + \lceil \log n \rceil$
• No. of messages bounded by $8n(1 + \lceil \log n \rceil)$, which is O(n log n)

Time Complexity

• Time for each phase l: $2 \cdot 2^l = 2^{l+1}$
• Final phase takes n (tokens only traveling outbound)
• Next to last phase is $l = \lceil \log n \rceil - 1$
• Total time complexity excluding last phase: $2 \cdot 2^{\lceil \log n \rceil}$

Time complexity is at most 3n if n is a power of 2, and at most 5n otherwise

The revenge of the lower bound

So far we have seen:
• a simple O(n^2) algorithm
• a more clever O(n log n) algorithm
• focus on message complexity

Facts:
• Ω(n log n) lower bound in asynchronous networks
• Ω(n log n) lower bound in synchronous networks when using only comparisons

Outline





• The rise and fall of randomization

Leader Election with fewer than O(n log n) messages

• Synchronous rings

• UID are positive integers

• Can be manipulated using arbitrary arithmetic operations

TimeSlice

• n is known to all processors

• unidirectional communication

• O(n) messages

VariableSpeeds

• n is not known to all processors

• unidirectional communication

• O(n) messages

What about Time complexity?

What is special about synchronous rings?

• Can convey information by not sending a message

“when your phone doesn’t ring, it’s me”

TimeSlice

Runs in phases
• each phase consists of n rounds
• in phase i ≥ 0
  – if no one has been elected yet,
  – the processor with id i
  – declares itself the leader
  – sends a token with its UID around the ring

Message complexity: n

Time complexity: n · UID_min
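TimeSlice can be sketched in a few lines. The code assumes distinct positive integer UIDs and numbers the phases 1, 2, 3, …, so that phase v covers rounds (v-1)·n+1 through v·n; that numbering is an assumption made here to match the stated time bound n · UID_min, and the function name `time_slice` is hypothetical.

```python
from typing import List, Tuple


def time_slice(uids: List[int]) -> Tuple[int, int, int]:
    """Simulate TimeSlice on a ring of known size; return (leader UID, rounds, messages)."""
    n = len(uids)
    position = {uid: i for i, uid in enumerate(uids)}
    rounds = 0
    phase = 0
    while True:
        phase += 1
        holder = position.get(phase)
        if holder is None:
            # nobody has UID == phase: the whole phase passes in silence
            rounds += n
            continue
        # The process with UID == phase has heard nothing so far, so no smaller
        # UID exists; it declares itself leader and sends its token around.
        messages = 0
        at = holder
        for _ in range(n):
            at = (at + 1) % n
            messages += 1
            rounds += 1
        return phase, rounds, messages


leader, rounds, messages = time_slice([7, 3, 5])
```

The silent phases are what carry the information: a process learns that no UID smaller than its own exists simply because nothing arrived. The price is a running time proportional to the value of the smallest UID rather than to n alone.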

VariableSpeeds

• Each process pi initiates a token

• Different tokens travel at different speeds:
  – for the token carrying UID_v, 1 message every $2^{UID_v}$ rounds
  – (each process waits $2^{UID_v}$ rounds after receiving the token before sending it out)
• Each process keeps track of the smallest UID seen
• Discard any token with UID greater than the smallest UID seen

Complexity Analysis

• By the time UIDmin goes around the ring, the second smallest UID has gone only half way, third smallest a fourth of the way, etc.

• Forwarding the token carrying UIDmin has caused more messages than all the other tokens combined

• Message complexity bounded by 2n

• Time complexity: $n \cdot 2^{UID_{min}}$
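The speed argument can be checked with an event-driven sketch. It assumes distinct positive integer UIDs, all processes starting in the same round, and a token with UID v being held for 2^v rounds at every process, including its originator; `variable_speeds` is a hypothetical name.

```python
import heapq
from typing import List, Tuple


def variable_speeds(uids: List[int]) -> Tuple[int, int, int]:
    """Simulate VariableSpeeds; return (leader UID, round of election, messages sent)."""
    n = len(uids)
    min_seen = list(uids)  # smallest UID each process has seen, its own included
    # event = (round in which the token is sent on, UID carried, process holding it)
    events = [(2 ** uid, uid, i) for i, uid in enumerate(uids)]
    heapq.heapify(events)
    messages = 0
    while events:
        now, uid, holder = heapq.heappop(events)
        if min_seen[holder] < uid:
            continue                     # a smaller UID arrived meanwhile: discard
        receiver = (holder + 1) % n
        messages += 1
        if uid == uids[receiver]:
            return uid, now, messages    # token came back to its originator
        if uid < min_seen[receiver]:
            min_seen[receiver] = uid
            heapq.heappush(events, (now + 2 ** uid, uid, receiver))
        # otherwise the receiver has already seen a smaller UID and drops the token
    raise RuntimeError("no leader elected; UIDs must be distinct")


leader, elected_at, messages = variable_speeds([3, 1, 2])
```

In this run the token with UID 1 completes the ring in round 3 · 2^1 = 6, while the slower tokens are overtaken and discarded before they get far, which is why the total stays within 2n messages.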

Variable start times

Processors can start the protocol at different times

• processors that wake up spontaneously (participants) send token with UID around ring

• processors that wake up on receiving a UID (relays) do not initiate their own token

A message life cycle

• A message is in phase one
  – until it is received by an awake processor
  – forwarded immediately

• A message is in phase two
  – once received by an awake processor
  – forwarded after $2^{UID_i} - 1$ rounds

The New Algorithm

When a participant receives a message from p_i:

• if UID_i is larger than the minimal UID seen (including its own), swallow it

• otherwise, delay it for $2^{UID_{min}} - 1$ rounds

When a relay receives a message from p_i:

• if UID_i is larger than the minimal UID seen (not including its own), swallow it

• otherwise, delay it for $2^{UID_{min}} - 1$ rounds

Correctness

Lemma: Only the participant processor with the smallest identifier receives its token back

Proof:

• Let p_i be the participating processor with the smallest UID

• No processor can swallow UID_i

• All other tokens must go through p_i, and will be swallowed there

• No other processor can receive its token back

Complexity

Three categories of messages:

• phase one messages

• phase two messages sent before the message of eventual leader enters its second phase

• phase two messages sent after the eventual leader enters its second phase

Complexity

Lemma: The total number of messages in the first category is at most n.

Proof: The lemma follows because at most one phase one message is forwarded by each processor

• Suppose p_i forwards two phase one messages, carrying UID_j and UID_k

• Assume, WLOG, that p_j is closer to p_i than p_k
• Then, the phase one message with UID_k must go through p_j

• If p_j is awake, then it becomes a phase two message
• Otherwise, p_j becomes a relay and does not send its UID

Complexity

Lemma: The total number of messages in the second category is at most n

Proof:
• After the first process awakens, it takes at most n rounds before the message with UID_min reaches a participant
• During this time, the token with UID_v is responsible for at most $n / 2^{UID_v}$ messages
• Max number of messages obtained when UIDs are small (0, 1, …, n-1)
• Max number of messages in second category: $\sum_{v=1}^{n} \frac{n}{2^{UID_v}} < n$

Complexity

Lemma: The total number of messages in the third category is at most 2n

Proof: analogous to complexity analysis for Variable Speeds

In summary:

Message Complexity: At most 4n

Time complexity: $n + n \cdot 2^{UID_{min}}$

And now for something completely different...

RANDOMIZATION

Randomized Algorithms

Extend transition function to accept as input

• a random number

• from a bounded range

• under some fixed distribution

Why is it important?

The bad news:

randomization alone does not generally affect

• impossibility results – leader election in anonymous network still impossible!

• worst case bounds

The good news:

randomization + weakening of problem statement does

Example: Randomized Leader Election

• Impossibility in anonymous rings still holds
• but can now elect a leader with some probability
• So weaken LE as follows

Safety: In every configuration of every admissible execution, at most one processor is in an elected state

Liveness: At least one processor is elected with some non-zero probability

Behaviors allowed by weakened specification:

• terminate without a leader
• never terminate

Back to Leader Election

• Use randomization to have processes generate a pseudo identifier

• Use a deterministic leader election algorithm to work with pseudo identifiers

• Not just any deterministic LE algorithm:
  – needs to work correctly if multiple processes generate the same pseudo id

• a plus is the ability to detect if no leader elected

A first result

Assume
• synchronous ring

• non-uniform ring

• processors can randomly choose identifiers

Theorem: There is a randomized algorithm which, with probability c > 1/e, elects a leader in a synchronous ring; the algorithm sends O(n^2) messages

The Algorithm

Code for processor p_i

Initially:
    pid_i := 1 with probability 1 - 1/n
             2 with probability 1/n
    send <pid_i> to left

upon receiving <S> from right:
    if |S| = n then
        if pid_i is unique max(S) then
            elected := true
        else
            elected := false
    else
        send <S || pid_i> to left

Observations:

• randomization used once

• one execution for each element of ℜ = {1,2}^n

Definitions

• exec(R): execution of R in ℜ

• Given a predicate P on executions, Pr[P] is the probability of the event {R ∈ ℜ : exec(R) satisfies P}

Analysis

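What is the probability that the algorithm terminates with a leader?

$$c = \binom{n}{1}\frac{1}{n}\left(1-\frac{1}{n}\right)^{n-1} = \left(1-\frac{1}{n}\right)^{n-1} > \left(1-\frac{1}{n}\right)^{n} \to \frac{1}{e}$$

A leader is elected exactly when a single processor draws 2. Since $(1-1/n)^{n-1}$ decreases to 1/e as n grows, c > 1/e for every n > 1.

Message Complexity: O(n^2)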
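The success probability can be verified exactly for small rings by enumerating every outcome of the coin flips. The sketch below is an illustration with hypothetical names (`run_ring`, `success_probability`); it assumes each pseudo identifier is 2 with probability 1/n and 1 otherwise, that each sequence S travels once around the ring, and that a process is elected only if its own pseudo identifier is the unique maximum of S.

```python
from fractions import Fraction
from itertools import product
from typing import List, Sequence, Tuple


def run_ring(pids: Sequence[int]) -> Tuple[List[bool], int]:
    """Run the deterministic part of the algorithm for fixed pseudo identifiers.

    Returns (elected flag per process, number of messages sent).
    """
    n = len(pids)
    elected = [False] * n
    messages = 0
    for i in range(n):
        seq = [pids[i]]                 # the sequence started by p_i
        at = i
        while True:
            at = (at - 1) % n           # pass the sequence on to the left neighbour
            messages += 1
            if len(seq) == n:           # back at p_i with all n pseudo identifiers
                top = max(seq)
                elected[at] = seq.count(top) == 1 and pids[at] == top
                break
            seq.append(pids[at])
    return elected, messages


def success_probability(n: int) -> Fraction:
    """Exact probability that exactly one process is elected on a ring of size n."""
    p_two = Fraction(1, n)
    total = Fraction(0)
    for pids in product((1, 2), repeat=n):
        elected, _ = run_ring(pids)
        if sum(elected) == 1:
            weight = Fraction(1)
            for pid in pids:
                weight *= p_two if pid == 2 else 1 - p_two
            total += weight
    return total


c = success_probability(4)
```

For n = 4 the enumeration gives 27/64, which equals (1 - 1/4)^3, and every run sends n^2 messages regardless of the coin flips.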

Not good enough?

Trade off more time and messages for higher probability of success
• if |S| = n and p_i detects no single max in S
  – choose new pid_i
  – restart algorithm

• ℜ becomes a set of n-tuples, each of which is a possibly infinite sequence over {1,2}

Analysis

Probability of success in iteration k: $(1-c)^{k-1} \cdot c$

Time complexity:
• worst-case number of iterations: ∞
• expected number of iterations: $1/c < e$

Expected value of T: $E[T] = \sum_{x \in T} x \cdot \Pr[T = x]$

Expected message complexity: O(n^2)
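The expected number of iterations follows from the geometric distribution. As a worked step, assuming iterations succeed independently with the same probability c, and writing T for the index of the first successful iteration:

```latex
\[
  E[T] \;=\; \sum_{k \ge 1} k\,(1-c)^{k-1}\,c
       \;=\; c \cdot \frac{1}{\bigl(1-(1-c)\bigr)^{2}}
       \;=\; \frac{1}{c} \;<\; e
  \qquad\text{since } c > \tfrac{1}{e}.
\]
```

The middle step uses $\sum_{k \ge 1} k\,x^{k-1} = 1/(1-x)^2$ for $|x| < 1$. With fewer than e iterations expected and O(n^2) messages per iteration, the expected message complexity remains O(n^2).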

Impossibility of Uniform Algorithms

Theorem: There is no uniform randomized algorithm for leader election in a synchronous anonymous ring that terminates in even a single execution for a single ring size

Summary

• No deterministic solution for anonymous rings
• No solution for uniform anonymous rings (even when using randomization)
• Protocols with O(n^2) and O(n log n) messages for uniform rings
• Ω(n log n) lower bound on message complexity for practical protocols
• O(n) message complexity for uniform synchronous rings
