
Concurrency and OS recap

Based on Operating System Concepts with Java, Sixth Edition, 2003, Avi Silberschatz, Peter Galvin and Greg Gagne


4. Silberschatz, Galvin and Gagne ©2003 Operating System Concepts with Java

Process Concept

• An operating system executes a variety of programs:
  • Batch system – jobs
  • Time-shared systems – user programs or tasks

• Textbook uses the terms job and process almost interchangeably

• Process – a program in execution; process execution must progress in sequential fashion

• A process includes:
  • program counter
  • stack
  • data section


Diagram of Process State


Representation of Process Scheduling


Addition of Medium Term Scheduling


Context Switch

• When CPU switches to another process, the system must save the state of the old process and load the saved state for the new process

• Context-switch time is overhead; the system does no useful work while switching

• Time dependent on hardware support


Interprocess Communication (IPC)

• Mechanism for processes to communicate and to synchronize their actions
• Message system – processes communicate with each other without resorting to shared variables
• IPC facility provides two operations:
  • send(message) – message size fixed or variable
  • receive(message)
• If P and Q wish to communicate, they need to:
  • establish a communication link between them
  • exchange messages via send/receive
• Implementation of communication link
  • physical (e.g., shared memory, hardware bus)
  • logical (e.g., logical properties)
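The send/receive pair can be sketched with a queue standing in for the communication link. This is only an illustration under assumptions: `MessageLink` and `demo` are hypothetical names, and an in-memory `LinkedBlockingQueue` between two threads replaces a real link between processes P and Q.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical one-way link: P sends, Q receives.
class MessageLink {
    private final BlockingQueue<String> link = new LinkedBlockingQueue<>();

    // send(message): deposit the message on the link without blocking.
    void send(String message) {
        link.add(message);
    }

    // receive(message): block until a message is available.
    String receive() {
        try {
            return link.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // P runs in its own thread; the caller plays the role of Q.
    static String demo() {
        MessageLink pToQ = new MessageLink();
        Thread p = new Thread(() -> pToQ.send("hello from P"));
        p.start();
        return pToQ.receive();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

No variable is shared between the two sides apart from the link itself, which is the point of the message-system style.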


Single and Multithreaded Processes


Background

• Concurrent access to shared data may result in data inconsistency

• Maintaining data consistency requires mechanisms to ensure the orderly execution of cooperating processes


Race Condition

• count++ could be implemented as

    register1 = count
    register1 = register1 + 1
    count = register1

• count-- could be implemented as

    register2 = count
    register2 = register2 - 1
    count = register2

• Consider this execution interleaving, with count = 5 initially:

    S0: producer execute register1 = count {register1 = 5}
    S1: producer execute register1 = register1 + 1 {register1 = 6}
    S2: consumer execute register2 = count {register2 = 5}
    S3: consumer execute register2 = register2 - 1 {register2 = 4}
    S4: producer execute count = register1 {count = 6}
    S5: consumer execute count = register2 {count = 4}
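The interleaving S0 to S5 can be replayed deterministically in a single thread. This sketch is illustrative only: the class and method names are hypothetical, and a real race comes from two threads, not from a scripted order.

```java
// Replays the interleaving S0..S5 step by step, starting from count = 5.
class RaceInterleaving {
    static int replay() {
        int count = 5;
        int register1 = count;         // S0: producer reads count (5)
        register1 = register1 + 1;     // S1: producer computes 6
        int register2 = count;         // S2: consumer reads the stale count (5)
        register2 = register2 - 1;     // S3: consumer computes 4
        count = register1;             // S4: producer writes 6
        count = register2;             // S5: consumer overwrites with 4
        return count;
    }

    public static void main(String[] args) {
        System.out.println(replay());  // prints 4
    }
}
```

One increment and one decrement should leave count at 5; this interleaving loses the producer's update and ends at 4.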


Solution to Critical-Section Problem

1. Mutual Exclusion - If process Pi is executing in its critical section, then no other processes can be executing in their critical sections

2. Progress - If no process is executing in its critical section and there exist some processes that wish to enter their critical section, then the selection of the processes that will enter the critical section next cannot be postponed indefinitely

3. Bounded Waiting - A bound must exist on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted
   • Assume that each process executes at a nonzero speed
   • No assumption concerning relative speed of the N processes


Semaphore

• Synchronization tool that does not require busy waiting (spin lock)
• Semaphore S – integer variable
• Two standard operations modify S: acquire() and release()
  • Originally called P() and V()
• Less complicated
• Can only be accessed via two indivisible (atomic) operations

    acquire(S) {
        while (S <= 0)
            ; // no-op: this basic definition spins; a blocking implementation suspends the caller instead
        S--;
    }

    release(S) {
        S++;
    }
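As a usage sketch, the shared counter from the race-condition slide can be protected with a semaphore initialized to 1. This uses the library class `java.util.concurrent.Semaphore` rather than the definition above; `SemaphoreCounter` and `run` are hypothetical names.

```java
import java.util.concurrent.Semaphore;

class SemaphoreCounter {
    private static int count = 0;
    private static final Semaphore mutex = new Semaphore(1);

    // Two threads each increment the shared counter perThread times.
    static int run(int perThread) {
        count = 0;
        Runnable worker = () -> {
            for (int i = 0; i < perThread; i++) {
                mutex.acquireUninterruptibly();   // acquire(S)
                try {
                    count++;                      // critical section
                } finally {
                    mutex.release();              // release(S)
                }
            }
        };
        Thread t1 = new Thread(worker);
        Thread t2 = new Thread(worker);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(run(10000));
    }
}
```

With the semaphore in place no update is lost, so `run(10000)` returns 20000.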


Deadlock and Starvation

• Deadlock – two or more processes are waiting indefinitely for an event that can be caused by only one of the waiting processes
• Let S and Q be two semaphores initialized to 1

        P0                P1
    acquire(S);       acquire(Q);
    acquire(Q);       acquire(S);
      . . .             . . .
    release(S);       release(Q);
    release(Q);       release(S);

• Starvation – indefinite blocking. A process may never be removed from the semaphore queue in which it is suspended.
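The deadlocked state of the S/Q example can be inspected without actually hanging a program. This is a minimal sketch, assuming a single thread plays both P0 and P1 and uses the non-blocking `tryAcquire()` for the second step; `DeadlockState` and `probe` are hypothetical names.

```java
import java.util.concurrent.Semaphore;

class DeadlockState {
    // Returns {P0 could take Q, P1 could take S} after each process's first acquire.
    static boolean[] probe() {
        Semaphore s = new Semaphore(1);
        Semaphore q = new Semaphore(1);
        s.acquireUninterruptibly();            // P0: acquire(S)
        q.acquireUninterruptibly();            // P1: acquire(Q)
        boolean p0CanTakeQ = q.tryAcquire();   // P0's next step, acquire(Q), would block
        boolean p1CanTakeS = s.tryAcquire();   // P1's next step, acquire(S), would block
        return new boolean[] { p0CanTakeQ, p1CanTakeS };
    }

    public static void main(String[] args) {
        boolean[] r = probe();
        System.out.println(r[0] + " " + r[1]); // prints false false
    }
}
```

Both probes fail: each process holds the semaphore the other needs, and with blocking `acquire` calls neither would ever reach its `release`.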


The Deadlock Problem

• A set of blocked processes each holding a resource and waiting to acquire a resource held by another process in the set.
• Example
  • System has 2 tape drives.
  • P1 and P2 each hold one tape drive and each needs another one.
• Example
  • semaphores A and B, initialized to 1

        P0            P1
    wait(A);      wait(B);
    wait(B);      wait(A);


Bridge Crossing Example

• Traffic only in one direction.
• Each section of a bridge can be viewed as a resource.
• If a deadlock occurs, it can be resolved if one car backs up (preempt resources and rollback).
• Several cars may have to be backed up if a deadlock occurs.
• Starvation is possible.


System Model

• Resource types R1, R2, . . ., Rm
  CPU cycles, memory space, I/O devices
• Each resource type Ri has Wi instances.
• Each process utilizes a resource as follows:
  • request
  • use
  • release


Deadlock Characterization

Deadlock can arise if four conditions hold simultaneously.

• Mutual exclusion: only one process at a time can use a resource.

• Hold and wait: a process holding at least one resource is waiting to acquire additional resources held by other processes.

• No preemption: a resource can be released only voluntarily by the process holding it, after that process has completed its task.

• Circular wait: there exists a set {P0, P1, …, Pn} of waiting processes such that P0 is waiting for a resource that is held by P1, P1 is waiting for a resource that is held by P2, …, Pn–1 is waiting for a resource that is held by Pn, and Pn is waiting for a resource that is held by P0.
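The circular-wait condition can be checked mechanically on a wait-for relation. This is a minimal sketch, assuming each waiting process waits on exactly one other process; `WaitForGraph` and `hasCircularWait` are hypothetical names, not taken from the textbook.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class WaitForGraph {
    // waitsFor.get(p) is the process holding the resource that p waits for;
    // a process that is not waiting has no entry.
    static boolean hasCircularWait(Map<String, String> waitsFor) {
        for (String start : waitsFor.keySet()) {
            Set<String> seen = new HashSet<>();
            String p = start;
            // Follow the chain of waits until it ends or revisits a process.
            while (p != null && seen.add(p)) {
                p = waitsFor.get(p);
            }
            if (p != null) {
                return true; // a process was reached twice: the chain closes into a cycle
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasCircularWait(Map.of("P0", "P1", "P1", "P2", "P2", "P0")));
        System.out.println(hasCircularWait(Map.of("P0", "P1", "P1", "P2")));
    }
}
```

The first call describes P0 waiting for P1, P1 for P2 and P2 for P0, so a circular wait exists; in the second, P2 is not waiting and the chain ends.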


Methods for Handling Deadlocks

• Ensure that the system will never enter a deadlock state.

• Allow the system to enter a deadlock state and then recover.

• Ignore the problem and pretend that deadlocks never occur in the system; used by most operating systems, including UNIX.


Dining-Philosophers Problem

• Shared data

    Semaphore chopStick[] = new Semaphore[5];
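One way to use the shared `chopStick` array is sketched below. The slide gives only the shared data, so everything else is an assumption: each philosopher picks up the lower-numbered chopstick first (a resource-ordering technique chosen here to avoid circular wait, not necessarily the textbook's solution), and `DiningPhilosophers`, `dine` and `rounds` are hypothetical names.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class DiningPhilosophers {
    // Each of the 5 philosophers eats `rounds` times; returns the total number of meals.
    static int dine(int rounds) {
        final int n = 5;
        Semaphore[] chopStick = new Semaphore[n];
        for (int i = 0; i < n; i++) {
            chopStick[i] = new Semaphore(1);
        }
        AtomicInteger meals = new AtomicInteger();
        Thread[] philosophers = new Thread[n];
        for (int i = 0; i < n; i++) {
            int left = i;
            int right = (i + 1) % n;
            // Assumption: always take the lower-numbered chopstick first.
            int first = Math.min(left, right);
            int second = Math.max(left, right);
            philosophers[i] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    chopStick[first].acquireUninterruptibly();
                    chopStick[second].acquireUninterruptibly();
                    meals.incrementAndGet();      // eat
                    chopStick[second].release();
                    chopStick[first].release();
                }
            });
            philosophers[i].start();
        }
        for (Thread t : philosophers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return meals.get();
    }

    public static void main(String[] args) {
        System.out.println(dine(100));
    }
}
```

If every philosopher instead took the left chopstick first, all five could hold one chopstick and wait forever for the other, which is exactly the circular wait described earlier.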


Monitor with condition variables


CPU Scheduler

• Selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them
• CPU scheduling decisions may take place when a process:
  1. Switches from running to waiting state
  2. Switches from running to ready state
  3. Switches from waiting to ready
  4. Terminates
• Scheduling under 1 and 4 is nonpreemptive
• All other scheduling is preemptive


Scheduling Criteria

• CPU utilization – keep the CPU as busy as possible
• Throughput – # of processes that complete their execution per time unit
• Turnaround time – amount of time to execute a particular process
• Waiting time – amount of time a process has been waiting in the ready queue
• Response time – amount of time it takes from when a request was submitted until the first response is produced, not output (for time-sharing environment)


Round Robin (RR)

• Each process gets a small unit of CPU time (time quantum), usually 10-100 milliseconds. After this time has elapsed, the process is preempted and added to the end of the ready queue.

• If there are n processes in the ready queue and the time quantum is q, then each process gets 1/n of the CPU time in chunks of at most q time units at once. No process waits more than (n-1)q time units.

• Performance
  • q large ⇒ FIFO
  • q small ⇒ q must be large with respect to context switch, otherwise overhead is too high
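The effect of the quantum can be seen in a small simulation. This is a sketch under simplifying assumptions: all processes arrive at time 0, context switches cost nothing, and `RoundRobin`, `completionTimes` and the burst values are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

class RoundRobin {
    // Returns the completion time of each process for CPU bursts `burst` and quantum q.
    static int[] completionTimes(int[] burst, int q) {
        int n = burst.length;
        int[] remaining = burst.clone();
        int[] done = new int[n];
        ArrayDeque<Integer> ready = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            ready.add(i);
        }
        int t = 0;
        while (!ready.isEmpty()) {
            int p = ready.poll();
            int slice = Math.min(q, remaining[p]);  // run for at most one quantum
            t += slice;
            remaining[p] -= slice;
            if (remaining[p] > 0) {
                ready.add(p);                       // preempted: back to the end of the queue
            } else {
                done[p] = t;
            }
        }
        return done;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(completionTimes(new int[] {24, 3, 3}, 4)));
        System.out.println(Arrays.toString(completionTimes(new int[] {24, 3, 3}, 100)));
    }
}
```

With q = 4 the two short processes finish at times 7 and 10 and the long one at 30; with q = 100 the schedule degenerates to FIFO and the completion times become 24, 27 and 30.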


Real-Time Scheduling

• Hard real-time systems – required to complete a critical task within a guaranteed amount of time
• Soft real-time computing – requires that critical processes receive priority over less fortunate ones


Dispatch Latency


Foundations of Distributed Computing

Marco Aiello

Distributed Systems, a.y. 2007/08

Rijksuniversiteit Groningen


Two generals problem

• Two generals need to coordinate an attack against an enemy. If they attack individually, they will lose; if they attack together, they will win.

• But the enemy lies in the middle and can intercept the coordination messages and prevent their delivery

• Can the generals defeat the enemy?


Two generals

• Theorem: there is no non-trivial protocol that guarantees that the generals will always attack simultaneously

• Proof: By contradiction. Suppose there is such a protocol, and consider one that does the job with the minimum number of messages n > 0.

Consider the last message sent, the n-th. Since it may be lost, the state of the sender cannot depend on its receipt and the state of the receiver cannot depend on its arrival, so neither of them needs the n-th message. We would then have a protocol with n-1 messages, which contradicts the minimality of n.

• Fact: A solution requires reliable message delivery.


Basic definitions

• A distributed system is a collection of n processes $p_i$ and network links $(p_i, p_j)$ among processes

• Each process $p_i$ is modeled as a possibly infinite state machine with state set $Q_i$

• A configuration is a vector $C = (q_0, \ldots, q_{n-1})$ where $q_i$ is the state of $p_i$

• An event $\phi_i$ is a transition in the state machine of the process i. We distinguish two types of events: computation events and message passing events. The latter are divided into send(i, j, m) and receive(i, j, m) events.

• An execution segment $\alpha$ of an asynchronous message-passing system is a sequence $C_0, \phi_1, C_1, \phi_2, C_2, \phi_3, \ldots$


Complexity measures

• An execution is admissible if each process has an infinite number of events and every message sent is eventually delivered (in a synchronous system, the "eventually delivered" condition may be omitted).

• A system is terminated if all of its processes are in final states of their respective state machines and there are no messages in transit.

• The message complexity of an algorithm is the maximum, over all admissible executions of the algorithm, of the total number of messages sent.

• The time complexity of an (asynchronous) algorithm is the maximum number of rounds in any (timed) admissible execution of the algorithm until termination.

Informally, a timed execution is one for which the longest time for a message delivery experienced in the system is taken as upper bound


Logical time

Causality, clocks and other ways to miss appointments


Happened before relation $\phi_i \to \phi_j$

Event $\phi_i$ happened before event $\phi_j$, written $\phi_i \to \phi_j$, if

i. the two events occurred at the same process and $i < j$, or

ii. $\phi_i$ is the event sending the uniquely identified message <M> and $\phi_j$ is the event receiving the very same message <M>, or

iii. (transitivity) there exists a sequence of events $\phi_{i+1}, \phi_{i+2}, \ldots, \phi_{i+k}$ with k ≥ 0, such that $\phi_i \to \phi_{i+1} \to \phi_{i+2} \to \ldots \to \phi_{i+k} \to \phi_j$

The relation is an irreflexive partial order.


Space-time diagrams

• A space-time diagram is a graphical representation of the evolution of events occurring at processes

• a, b, c, d, e, f are events. What is the happened before relation among all of them?


Logical time and logical clocks (Lamport 1978)

• A logical clock is a monotonically increasing software counter. It need not relate to a physical clock. Each process $p_i$ has a logical clock $LT_i$, which can be used to apply logical timestamps to events

• In the initial configuration, all logical clocks are set to 0

• With every message sent by process i the logical clock of i is piggybacked on the message

• Any internal or send event at process i will increase the logical clock $LT_i$ by one

• Upon receiving a message from process j, process i will set its logical clock to $\max(LT_i, LT_j) + 1$


Logical time and logical clocks (Lamport 1978)

• What is the logical clock at all the described events?

    Event:          a  b  c  d  e  f
    Logical clock:  1  2  3  4  1  5
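The logical clock rules can be replayed in code. The space-time diagram itself is not reproduced in this transcript, so the scenario below is an assumption chosen to be consistent with the timestamps above: a and b occur at one process (b sends to a second process), c and d at the second (c receives b, d sends to a third), e and f at the third (f receives d). `LamportClock` and its methods are hypothetical names.

```java
import java.util.Arrays;

class LamportClock {
    private int time = 0;                       // all logical clocks start at 0

    // Internal or send event: increase the clock by one.
    int tick() {
        return ++time;
    }

    // Receive event: max of own clock and piggybacked clock, plus one.
    int receive(int piggybacked) {
        time = Math.max(time, piggybacked) + 1;
        return time;
    }

    // Replays the assumed scenario and returns the timestamps of a..f.
    static int[] replay() {
        LamportClock p0 = new LamportClock();
        LamportClock p1 = new LamportClock();
        LamportClock p2 = new LamportClock();
        int a = p0.tick();
        int b = p0.tick();       // send to p1
        int c = p1.receive(b);
        int d = p1.tick();       // send to p2
        int e = p2.tick();
        int f = p2.receive(d);
        return new int[] {a, b, c, d, e, f};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(replay())); // [1, 2, 3, 4, 1, 5]
    }
}
```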


Facts

• Theorem: Given an execution and two events $\phi_i$, $\phi_j$ in the execution, if $\phi_i \to \phi_j$, then $LT(\phi_i) < LT(\phi_j)$

• Question: is the converse true?

The problem is that < is a total order over the integers, while happened before is a partial order. In the running example, LT(e) = 1 < LT(b) = 2, yet e did not happen before b.


Vector clocks

• A vector clock $VC_i$ is a vector of the size of the system, whose values $VC_i[j]$ are monotonically increasing.

• In the initial configuration, all entries of all vector clocks are set to 0

• With every message sent by process i the vector clock of i is piggybacked on the message

• Any internal or send event at process i will result in $VC_i[i] = VC_i[i] + 1$

• Upon receiving a message from process j, process i updates its vector clock in the following way:

  $\forall k \neq i:\ VC_i[k] = \max(VC_i[k], VC_j[k])$
  $VC_i[i] = VC_i[i] + 1$


Vector clocks

• What is the vector clock at all the described events? What are parallel events?

    Event:         a        b        c        d        e        f
    Vector clock:  <1,0,0>  <2,0,0>  <2,1,0>  <2,2,0>  <0,0,1>  <2,2,2>
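The vector clock rules can be sketched in the same way. As before, the diagram is not part of this transcript, so the message pattern is an assumption consistent with the vectors above (b is sent to the second process and received as c; d is sent to the third and received as f). `VectorClock`, `less` and `parallel` are hypothetical names.

```java
import java.util.Arrays;

class VectorClock {
    private final int[] vc;   // VC_i, all entries initially 0
    private final int self;   // index i of the owning process

    VectorClock(int self, int n) {
        this.self = self;
        this.vc = new int[n];
    }

    // Internal or send event: VC_i[i] = VC_i[i] + 1.
    int[] tick() {
        vc[self]++;
        return vc.clone();
    }

    // Receive event: componentwise max for k != i, then increment own entry.
    int[] receive(int[] other) {
        for (int k = 0; k < vc.length; k++) {
            if (k != self) {
                vc[k] = Math.max(vc[k], other[k]);
            }
        }
        vc[self]++;
        return vc.clone();
    }

    // u < v: every entry of u is <= the entry of v, and at least one is smaller.
    static boolean less(int[] u, int[] v) {
        boolean strict = false;
        for (int k = 0; k < u.length; k++) {
            if (u[k] > v[k]) return false;
            if (u[k] < v[k]) strict = true;
        }
        return strict;
    }

    // Two events are parallel if their vectors are incomparable.
    static boolean parallel(int[] u, int[] v) {
        return !less(u, v) && !less(v, u) && !Arrays.equals(u, v);
    }

    // Replays the assumed scenario and returns the vectors of a..f.
    static int[][] replay() {
        VectorClock p0 = new VectorClock(0, 3);
        VectorClock p1 = new VectorClock(1, 3);
        VectorClock p2 = new VectorClock(2, 3);
        int[] a = p0.tick();
        int[] b = p0.tick();       // send to p1
        int[] c = p1.receive(b);
        int[] d = p1.tick();       // send to p2
        int[] e = p2.tick();
        int[] f = p2.receive(d);
        return new int[][] {a, b, c, d, e, f};
    }

    public static void main(String[] args) {
        for (int[] v : replay()) {
            System.out.println(Arrays.toString(v));
        }
    }
}
```

Comparing the resulting vectors with `parallel` shows, for instance, that e is parallel with b, while `less` confirms that a precedes f.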


Facts

• Proposition For any j, in every reachable configuration, $VC_j[i] \le VC_i[i]$

• Two events are parallel, $\phi_i \parallel \phi_j$, if $VC(\phi_i)$, $VC(\phi_j)$ are incomparable

• Theorem $\phi_i \to \phi_j \iff VC(\phi_i) < VC(\phi_j)$

• Theorem If VC is a function that maps each event in an execution to a vector in a field in a manner that captures concurrency, then the size of the vector is at least as big as the size of the system to which the execution refers.


Consistent Cuts

• A cut through an execution is a vector $k = \langle k_0, \ldots, k_{n-1} \rangle$ of positive integers (just number all events at all processes consecutively).

• A cut k is consistent if, for all i and j, the $(k_i + 1)$-th computation event of i does not happen before the $k_j$-th computation event in j. (I.e., no event inside the cut depends on any event happening after the cut.) Formally: $\phi_j \in k \wedge \phi_i \to \phi_j \Rightarrow \phi_i \in k$

[Figure: space-time diagram of two processes, with events numbered 1 to 4 and 1 to 6, showing the cuts <1,3>, <2,4> and <2,6>]


Facts

• Fact Given a cut, there is a unique maximal consistent cut preceding it.

• A distributed snapshot is a cut computed by the processes.

• How to compute a snapshot?

• Assumptions: FIFO channels and each message timestamped


Distributed Snapshot (Chandy & Lamport, 1985)

i. process i selects a time t for the snapshot

ii. process i broadcasts the "take a snapshot" request to all processes

iii. when process j receives a snapshot request for the first time, from h

   a. record local state

   b. send "take a snapshot" to all neighboring processes

   c. record messages from all channels

iv. when process j receives a second snapshot request

   a. stop recording from the channel on which it arrived

v. when process j has stopped recording on all channels, it sends its recording to the initiating process i

Theorem The algorithm delivers a consistent cut of the distributed system subsequent to t.


Leader Election

Democracy... at last


Basic definitions

• Every process terminates in one of two final states: elected or non elected

• In every admissible execution, one and only one process will be in the elected state and all others in the non elected one

• Assumption: the topology is a directed ring

• A ring is anonymous if the processes do not have a unique identifier associated with them

• An algorithm is uniform if the number of nodes in the ring is not known to the processes


Bad news...

• Theorem There is no anonymous leader election algorithm A for asynchronous ring systems. (The result holds even under stronger assumptions.)

• Theorem There is no anonymous leader election algorithm A for nonuniform synchronous ring systems. (Proof by contradiction.)

• Lemma For every round k of the admissible execution of A in the ring R, the states of all processes at the end of round k are the same. (Proof by induction.)

Therefore if one state machine is in the elected state, so are all the others, contradicting the requirement that exactly one process is elected. The second theorem implies the first one.


Fault Tolerant Consensus

it is all about agreement


Consensus

• The consensus problem is a coordination problem where a number of processes have to agree on a common output

• We first consider the synchronous case with possible crash or Byzantine failures, then the asynchronous case

• A system that can tolerate up to f crashes is called f-resilient

• We identify a subset F of the processes of the system as faulty processes

• Each round contains exactly one computation for all processes not in F and at most one for the ones in F


Consensus in Synchronous Systems with Crashes

• Each process $p_i$ has an input variable $x_i$ and an output variable $y_i$ (also called decision)

• Initially, $x_i$ can be any value in a given domain and $y_i$ is undefined. Assignment of $y_i$ is irreversible and thus final.

• A solution to the consensus problem must guarantee the following in every admissible execution:

  Termination: $\forall p_i \notin F : y_i \neq \bot$

  Agreement: $\forall p_i, p_j \notin F : y_i \neq \bot \wedge y_j \neq \bot \Rightarrow y_i = y_j$

  Validity: $\forall p_i\ x_i = v \Rightarrow \forall p_j \notin F,\ y_j \neq \bot : y_j = v$


A simple algorithm

Initially V = {x}

round k, 1 ≤ k ≤ f+1:

    send $\{v \in V : p_i \text{ has not already sent } v\}$ to all processes

    receive $S_j$ from $p_j$, 0 ≤ j ≤ n-1 and j different from i

    $V := V \cup \bigcup_{j=0}^{n-1} S_j$

    if k = f+1 then y := min(V)
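The algorithm can be exercised in a small synchronous simulation. This is a sketch under stated assumptions, not a definitive implementation: the crash model is encoded by two hypothetical arrays (`crashRound[i]` is the round in which $p_i$ crashes, 0 meaning never, and in that round its message reaches only the processes with index below `reach[i]`), and the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class FloodingConsensus {
    // Returns the decision y of every process; crashed processes stay undecided (null).
    static Integer[] run(int[] x, int f, int[] crashRound, int[] reach) {
        int n = x.length;
        List<TreeSet<Integer>> V = new ArrayList<>();
        List<TreeSet<Integer>> sent = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            TreeSet<Integer> v = new TreeSet<>();
            v.add(x[i]);                              // Initially V = {x}
            V.add(v);
            sent.add(new TreeSet<>());
        }
        for (int k = 1; k <= f + 1; k++) {
            // Every process prepares {v in V : not already sent} from its state at the start of the round.
            List<TreeSet<Integer>> S = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                TreeSet<Integer> s = new TreeSet<>(V.get(i));
                s.removeAll(sent.get(i));
                S.add(s);
            }
            // Deliver the messages; a process crashing in this round reaches only a prefix of the others.
            for (int i = 0; i < n; i++) {
                if (crashRound[i] != 0 && crashRound[i] < k) continue;   // crashed in an earlier round
                int limit = (crashRound[i] == k) ? reach[i] : n;
                for (int j = 0; j < limit; j++) {
                    if (j != i) V.get(j).addAll(S.get(i));               // V := V united with S_i at p_j
                }
                sent.get(i).addAll(S.get(i));
            }
        }
        Integer[] y = new Integer[n];
        for (int i = 0; i < n; i++) {
            if (crashRound[i] == 0) y[i] = V.get(i).first();             // y := min(V) after round f+1
        }
        return y;
    }

    public static void main(String[] args) {
        // Four processes, f = 1; the process with input 1 crashes in round 1 after reaching only p0.
        Integer[] y = run(new int[] {3, 1, 4, 2}, 1, new int[] {0, 1, 0, 0}, new int[] {0, 1, 0, 0});
        System.out.println(Arrays.toString(y));
    }
}
```

In the scenario in `main`, only p0 learns the value 1 in round 1; the second round lets p0 relay it, so all surviving processes decide 1. With a single round they would disagree, which illustrates why f + 1 rounds are used.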


Simple algorithm and beyond

• Lemma In every execution at the end of round f + 1, Vi = Vj, for every two nonfaulty processes pi and pj.

• Theorem The algorithm solves the consensus problem in the presence of f crash failures within f + 1 rounds.

• Theorem Any consensus algorithm for n processes that is resilient to f crashes requires at least f + 1 rounds in some admissible execution, for all n ≥ f + 2.


The Byzantine case

• A Byzantine army is attacking a city, and its generals can use reliable messengers. They need to decide whether to attack or not (agreement). If they are unanimous in the attack decision, then they should attack (validity). But some of the generals could be Byzantine traitors and send malicious, conflicting messages or even form a coalition.

• Theorem In a system with three processes and one Byzantine process, there is no algorithm that solves the consensus problem.

• Theorem (lower bound on number of faulty processes) In a system with n processes and f Byzantine processes, there is no algorithm that solves the consensus problem if n ≤ 3 f.


The Byzantine case

• Theorem There exists an algorithm for n processes that solves the consensus problem in the presence of f Byzantine failures within f + 1 rounds using exponential size messages, if n > 3 f.

• Theorem There exists an algorithm for n processes that solves the consensus problem in the presence of f Byzantine failures within 2 (f + 1) rounds using constant size messages, if n > 4 f.


The asynchronous case

• Theorem There is no wait-free algorithm for solving the consensus problem in an asynchronous shared memory system with n processes and the possibility of crashes.

• Theorem There is no algorithm for solving the consensus problem in an asynchronous message-passing system with n processes, even if only one of them may fail by crashing.


References

• Hagit Attiya and Jennifer Welch, Distributed Computing: Fundamentals, Simulations and Advanced Topics, Wiley, 2004.

• Lorenzo Alvisi’s course on Distributed Computing at Univ. of Texas (Google it)
