University of Pittsburgh Manas Saksena 1
Distributed Mutual Exclusion
Manas Saksena
University of Pittsburgh
Mutual Exclusion: Requirements
♦ Deadlock free
• Not all processes are stuck waiting to enter a CS
♦ Starvation free
• Any process that wants to enter a CS eventually enters its CS
♦ Fairness
• e.g., FIFO
♦ Fault Tolerance
• Tolerance to process failures -- the system should be able to re-organize itself
Centralized Solution
♦ One Site is the Arbitrator
♦ All Requests go to the arbitrator
♦ Arbitrator sends “Grant” to one site at a time
• Other requests are queued
♦ No Deadlocks, No Starvation
♦ Easy to Achieve Fairness
♦ Problems
• Single Point of Failure
• Overload on the Arbitrator
• Performance: Throughput
Performance
♦ T = Average Message Delay
♦ E = Average Critical Section (CS) Length
♦ sd = Synchronization Delay (time between one site exiting the CS and the next site entering it)
♦ Throughput = 1 / (sd + E)
[Figure: timeline -- Exit CS, synchronization delay (sd), Enter CS, CS execution (E), Exit CS]
Centralized Solution: Performance
♦ Synchronization Delay: 2T (release reaches the arbitrator, then grant reaches the next site)
♦ Throughput: 1 / (2T + E)
♦ Number of Messages per CS: 3 (request, grant, release)
Distributed Mutual Exclusion
♦ Local FIFO orderings are not consistent
• different sites will come to different conclusions
♦ Need Consistent Global Ordering of Events
• each site can come to the same conclusion
♦ How?
• Time stamp = (local clock, site #)
• (Ca, Sa) ⇒ (Cb, Sb) iff (1) Ca < Cb, or (2) Ca = Cb and Sa < Sb
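The tie-breaking rule above can be sketched directly in code; a minimal illustration (Python, names illustrative):

```python
def precedes(a, b):
    """Return True if event a = (clock, site) is ordered before b.

    Ties on the logical clock are broken by site number, giving a
    total order over all events in the system.
    """
    ca, sa = a
    cb, sb = b
    return ca < cb or (ca == cb and sa < sb)

# Python's tuple comparison happens to implement the same rule:
# clocks are compared first, then site numbers break ties.
print(precedes((1, 2), (2, 1)))  # True: site 2's request at clock 1 is first
print(precedes((2, 1), (2, 2)))  # True: equal clocks, lower site number wins
```

Note that representing timestamps as `(clock, site)` tuples means the built-in `<` ordering already matches the algorithm's priority rule.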
Lamport’s Distributed Algorithm
♦ Enter CS:
• Send a Request Message to all sites
• Include a Timestamp with the Request Message
♦ On Receipt of Request Message
• Return a Reply Message
• Include a Timestamp
• Enter the Request in the Local Queue
Lamport’s Algorithm (contd)
♦ Entering Critical Section
• Must have received a reply from all sites
− Reply timestamps must be larger than the request timestamp (of this site)
• Own request must be at the head of the (local) request queue
♦ Exiting Critical Section
• Send a time-stamped Release msg to all
♦ On Receipt of Release Msg
• Remove the corresponding request from the request queue
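The rules above can be sketched as one site class. This is an illustrative simulation under stated assumptions: channels are reliable and FIFO, and "sending" a message is simulated by calling the receiver's handler directly (all names are made up for the sketch):

```python
import heapq

class LamportSite:
    """One site in Lamport's mutual-exclusion algorithm (sketch)."""

    def __init__(self, site_id):
        self.id = site_id
        self.peers = []            # the other sites
        self.clock = 0             # Lamport logical clock
        self.queue = []            # local request queue: min-heap of (clock, site)
        self.replies = set()
        self.my_request = None

    def tick(self, seen=0):
        self.clock = max(self.clock, seen) + 1
        return self.clock

    def request_cs(self):
        # Timestamp the request, queue it locally, send it to all sites.
        self.my_request = (self.tick(), self.id)
        heapq.heappush(self.queue, self.my_request)
        self.replies = set()
        for p in self.peers:
            p.on_request(self.my_request, self)

    def on_request(self, ts, sender):
        # Queue the request and return a timestamped reply.
        self.tick(ts[0])
        heapq.heappush(self.queue, ts)
        sender.on_reply((self.tick(), self.id), self)

    def on_reply(self, ts, sender):
        self.tick(ts[0])
        if ts > self.my_request:   # reply timestamp exceeds our request's
            self.replies.add(sender.id)

    def can_enter(self):
        # Replies from all sites, and own request at the head of the queue.
        return (len(self.replies) == len(self.peers)
                and bool(self.queue) and self.queue[0] == self.my_request)

    def release_cs(self):
        heapq.heappop(self.queue)  # own request is at the head
        for p in self.peers:
            p.on_release(self.my_request)

    def on_release(self, ts):
        self.queue.remove(ts)      # drop the released request
        heapq.heapify(self.queue)

s1, s2 = LamportSite(1), LamportSite(2)
s1.peers, s2.peers = [s2], [s1]
s1.request_cs()
s2.request_cs()
print(s1.can_enter(), s2.can_enter())  # True False: s1's timestamp is smaller
s1.release_cs()
print(s2.can_enter())                  # True: s2's request is now at the head
```

Because delivery is synchronous here, the interesting behavior is the queue ordering: both sites converge on the same head-of-queue decision, which is exactly what the consistent global ordering buys.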
Example
[Figure: two sites time-stamp their requests -- site 1’s request is (2,1), site 2’s is (1,2); each label is (local clock, site number)]
Example: Reply Messages
[Figure: reply messages; since (1,2) ⇒ (2,1), site 2’s request is ordered first and site 2 enters its critical section]
Example: Release Messages
[Figure: release messages; site 2 exits and broadcasts a release; since (1,2) ⇒ (2,1), site 1 enters its critical section next]
Proof of Correctness
♦ By Contradiction
♦ Suppose Si and Sj are concurrently in CS
• At Si, Si’s request must be at head of queue
• At Sj, Sj’s request must be at head of queue
♦ Assume Si’s request has higher priority
♦ Also,
• Sj must have received a time stamp larger than its own request from Si for it to enter CS
• If channels are FIFO, then Si’s request must be in Sj’s local request queue
− Contradiction -- Sj’s request cannot be at head
Lamport’s Algorithm: Performance
♦ Synchronization Delay: T
♦ Throughput: 1 / (T + E)
♦ Number of Messages per CS: 3(N − 1) (N − 1 requests, N − 1 replies, N − 1 releases)
Ricart-Agrawala Algorithm
♦ Improvement over Lamport’s algorithm
♦ Basic Idea
• No Release Messages
• Reply Message
− Delayed
− Serves as both reply and release message
Ricart-Agrawala
♦ On Receipt of a Request Message
• (say a request from Si, received by Sj)
• Sj sends a reply msg to Si if
− Sj is not requesting, or
− Sj is requesting, but its request time stamp is larger than Si’s request time stamp
• Otherwise, defer the reply
♦ Enter CS when replies from all sites have been received
• Note: No local request queues
♦ Exit: send the deferred replies
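The deferred-reply rule can be sketched in the same style as the Lamport sketch; again, an illustrative simulation (reliable channels assumed, delivery by direct handler calls, names made up):

```python
class RASite:
    """One site in the Ricart-Agrawala algorithm (sketch)."""

    def __init__(self, site_id):
        self.id = site_id
        self.peers = []
        self.clock = 0
        self.requesting = False
        self.my_ts = None
        self.replies = set()
        self.deferred = []         # sites whose reply we are holding back

    def tick(self, seen=0):
        self.clock = max(self.clock, seen) + 1
        return self.clock

    def request_cs(self):
        self.requesting = True
        self.my_ts = (self.tick(), self.id)
        self.replies = set()
        for p in self.peers:
            p.on_request(self.my_ts, self)

    def on_request(self, ts, sender):
        self.tick(ts[0])
        if self.requesting and self.my_ts < ts:
            self.deferred.append(sender)   # defer: doubles as the release
        else:
            sender.on_reply(self)          # not requesting, or lower priority

    def on_reply(self, sender):
        self.replies.add(sender.id)

    def can_enter(self):
        # No local request queue: replies from all sites suffice.
        return self.requesting and len(self.replies) == len(self.peers)

    def exit_cs(self):
        # No release messages: the deferred replies play that role.
        self.requesting = False
        for p in self.deferred:
            p.on_reply(self)
        self.deferred = []

s1, s2 = RASite(1), RASite(2)
s1.peers, s2.peers = [s2], [s1]
s1.request_cs()
s2.request_cs()
print(s1.can_enter(), s2.can_enter())  # True False: s2's reply to s1 came at once
s1.exit_cs()                           # deferred reply released
print(s2.can_enter())                  # True
```

Compared with the Lamport sketch, note what disappeared: the local queue and the release handler. The deferred reply carries both pieces of information.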
Example: Deferred Reply Messages
[Figure: site 2’s request (1,2) precedes site 1’s (2,1), so site 2 enters its critical section while deferring its reply to site 1; on exit it sends the deferred reply, and site 1 enters its critical section]
Proof of Correctness
♦ By Contradiction
♦ Suppose Si and Sj are concurrently in CS
• Si must have received a reply from Sj
• Sj must have received a reply from Si
♦ Assume Si’s request has higher priority
♦ Also,
• Si must have received Sj’s request after it made its own (else, Si’s req would have lower priority)
• The reply could not have been sent unless Si was finished with its own CS
− Contradiction
Ricart-Agrawala: Performance
♦ Number of Messages per CS: 2(N − 1) (N − 1 requests, N − 1 replies)
♦ Synchronization Delay: T
♦ Throughput: 1 / (T + E)
Maekawa’s Algorithm
♦ Basic Ideas:
• Send request to only a subset of sites
− the request set of a site
• Request sets of two sites should have at least one common member
− the common member determines the order in which the two sites will be served
• A site can send only one Reply message at a time
− the next Reply is sent only after a Release msg for the previous Reply has been received
Maekawa’s Algorithm
♦ Construction of Request Sets
♦ Necessary Conditions (for correctness)
• Request sets of any two sites have at least one common member
• A site belongs to its own request set
♦ Other Desirable Conditions
• All request sets are equal in size (= K)
• Any site is contained in K request sets
− N = K (K − 1) + 1
− K ≈ √N
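Maekawa's optimal K ≈ √N sets come from finite projective planes, which are fiddly to build. A simpler grid construction (a sketch, not Maekawa's own) gives sets of size 2√N − 1 that still satisfy both necessary conditions -- pairwise intersection and self-membership:

```python
import math

def grid_request_sets(n):
    """Request sets via a grid construction (illustrative sketch).

    Sites 0..n-1 are laid out in a sqrt(n) x sqrt(n) grid; a site's
    request set is its entire row plus its entire column. Any two
    sets intersect, because any row meets any column in one cell.
    """
    r = math.isqrt(n)
    assert r * r == n, "grid construction assumes N is a perfect square"
    sets = {}
    for s in range(n):
        row, col = divmod(s, r)
        members = {row * r + c for c in range(r)}      # my row
        members |= {rr * r + col for rr in range(r)}   # my column
        sets[s] = members
    return sets

sets = grid_request_sets(9)
print(sets[4])  # {1, 3, 4, 5, 7}: site 4's row and column, size 2*3 - 1
# Both necessary conditions hold:
assert all(sets[i] & sets[j] for i in range(9) for j in range(9))
assert all(s in sets[s] for s in range(9))
```

The trade-off is set size: 2√N − 1 here versus roughly √N for the projective-plane construction, but the correctness conditions are identical.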
Maekawa’s Algorithm
♦ Request:
• send a request message to all sites in the req. set
♦ On Receipt of Request Message
• send a reply, provided there is no outstanding reply (i.e., the last reply sent has received a release message)
• else, queue the request
♦ Enter CS: when all replies are received
♦ Exit CS: send a release to all sites in the req. set (all that replied)
♦ On Receipt of Release:
• send the deferred reply to the next request in the queue, if any
Maekawa’s Algorithm
♦ Deadlocks
♦ Suppose Si, Sj, Sk want to enter CS
♦ Suppose Sij, Sik, Sjk are the common sites in their request sets
♦ It is possible that:
• Sj is blocked at Sij
• Sk is blocked at Sjk
• Si is blocked at Sik
♦ Why?
• Requests are not prioritized by timestamps
Maekawa’s Algorithm
♦ Detecting Deadlocks
• Use timestamps
• A site can detect a possible deadlock if
− it has sent a reply to a larger timestamp (lower priority) request, and a lower timestamp (higher priority) request is blocked
♦ Resolving Deadlocks
• After detection, use extra messages
Deadlock Resolution
♦ Failed Message (Si to Sj)
• indicates that Si cannot grant Sj’s request because it has granted permission to a higher priority request
♦ Inquire Message (Si to Sj)
• indicates that Si would like to find out from Sj if Sj has succeeded in locking all sites in its req set
♦ Yield Message (Si to Sj)
• indicates that Si is returning permission to Sj (to yield to a higher priority request at Sj)
Deadlock Resolution
♦ Request (ts, i) blocks at Sj (because Sj has granted Sk)
• Sj sends a failed message to Si if Si’s request has lower priority
• else, Sj sends an inquire message to Sk
♦ Response to the Inquire from Sj
• Sk sends back a yield if it has received a failed from some other site in its request set, or
• if it sent a yield to a site in its request set, and has not received a reply
♦ Response to the Yield from Sk
• Sj assumes a release, and places the request at the appropriate location in the queue; it sends a reply to the head of the queue
A Reality Check
♦ Distributed algorithms studied so far (as compared to the centralized one)
• Do not add any fault-tolerance
− in fact, make it worse (why?)
• Add many more messages
• May improve throughput
− reduce synchronization delay from 2T to T
♦ Lessons
• so far, not very useful, but
• showed that it was possible (if not better)
• may lead to better algorithms in the future
Token Bus/Ring Algorithm
♦ Communication Medium = CS (the shared medium is the mutually exclusive resource)
♦ Logical Ring (in software)
• sites ordered in some way
♦ A Token determines who has access to the medium
♦ When done, pass the token to the next in order
[Figure: six sites (0–5) arranged in a logical ring; the token circulates around the ring]
Token Ring: Problems
♦ What if the token gets lost?
• The site holding it crashes (and comes back, perhaps, but doesn’t remember it had the token)
• Error on the communication medium
♦ Detect Token Loss
• Who starts the process?
• What if multiple sites start the process?
♦ Regenerate Token
• make sure that two tokens don’t get generated
Bus Access
♦ Centralized Arbiter
• makes the decision of when and who
♦ Daisy Chaining
• a centralized arbiter decides when
• who is determined by location
− nearer sites (to the arbiter) get first chance
♦ Distributed Self-Selection
• independent decision at each site, based on its own code and what is on the bus
♦ Ethernet (CSMA/CD)
Token Based Algorithms
♦ Generalization of the token bus/token ring protocol
♦ Basic Idea
• single token
− can enter CS only if in possession of the token
− correctness proof is trivial
• explicit requests for the token
− when a site wants to enter its CS
• release the token when done with the CS
− send it to the “next” requesting site
Suzuki-Kasami Algorithm
♦ Basic Idea
• Broadcast Requests
− necessarily received by the token holder
♦ Outdated Requests
• Each site assigns a sequence number to its requests
• Each site maintains an array of request numbers
− one entry for each site
− the largest sequence number seen from that site
• Outdated requests are easy to determine with this information
Suzuki-Kasami Algorithm
♦ Token
• consists of
− a queue of requesting sites
− an array of integers, one entry per site
– sequence number of the request last executed by the site
– updated by a site when it finishes execution
♦ When finished executing the CS
• How do we know which sites have outstanding requests?
• How is starvation avoided?
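Both questions are answered by the exit-time bookkeeping. A sketch of that step (names like `rn`, `ln` follow the usual RN/LN convention; the dict-based token shape is illustrative):

```python
from collections import deque

def finish_cs(i, rn, token):
    """Bookkeeping when token-holding site i exits its CS (sketch).

    rn    : site i's array, rn[j] = highest request number seen from j
    token : {'ln': [...], 'queue': deque}, where ln[j] is the sequence
            number of j's most recently executed request
    Returns the site to send the token to next, or None to keep it.
    """
    token["ln"][i] = rn[i]                 # record the request just served
    for j in range(len(rn)):
        # Site j has an outstanding (non-outdated) request exactly
        # when rn[j] == ln[j] + 1; anything smaller is outdated.
        if j != i and rn[j] == token["ln"][j] + 1 and j not in token["queue"]:
            token["queue"].append(j)       # FIFO queue avoids starvation
    return token["queue"].popleft() if token["queue"] else None

# Site 0 exits; site 1's request (number 1) is outstanding, site 2's is not.
token = {"ln": [0, 0, 0], "queue": deque()}
print(finish_cs(0, [1, 1, 0], token))  # 1: the token goes to site 1
```

The comparison `rn[j] == ln[j] + 1` is the whole outdated-request test, and appending to a FIFO queue (rather than re-scanning by site number) is what prevents starvation.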
Singhal’s Heuristic Algorithm
♦ Basic Idea
• Avoid broadcasting of Request Messages
• How?
− Maintain additional information about who might have the token
− Send requests to only those sites
♦ Tricky Part
• must make sure that the request set contains at least one site that will get the token in the near future
Singhal’s Heuristic Algorithm
♦ Maintaining Additional Information
• Carry information around in the token
− state of each site (requesting, holding, none)
− latest sequence number
• The same information is kept at each site
• Basically, if you receive a request from a site, then
− add it to the request set
♦ Details in the book
Raymond’s Algorithm
♦ Sites are logically arranged as a directed tree
• edges pointing towards the root
♦ Token Holder is the root of the tree
♦ Structure is distributed
• Local Variable: holder
− points to the parent in the tree
− sufficient to maintain the tree structure
♦ Requesting:
• send requests along the directed edges of the tree
Raymond’s Algorithm
[Figure: sites arranged in a tree; edges point toward the root, which holds the token]
Raymond’s Algorithm
♦ Requests
• propagated along tree edges to the token holder
• intermediate nodes add the request to their local request queue
− forward the request only if no other request has been forwarded earlier
♦ Release
• propagate the token along the tree edges in the reverse direction
− based on “who” sent the request
• reverse tree edges (point towards the new token holder)
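The holder pointer, local queue, and edge reversal can be sketched as one node class. This is an illustrative, synchronous simulation (messages delivered by direct calls; names made up for the sketch):

```python
class RaymondSite:
    """One node in Raymond's tree-based algorithm (sketch)."""

    def __init__(self, site_id):
        self.id = site_id
        self.holder = self       # parent edge toward the token; self = token here
        self.using = False       # currently in the CS?
        self.asked = False       # request already forwarded toward the holder?
        self.queue = []          # local FIFO of requesters (neighbors or self)

    def request_cs(self):
        if self.holder is self and not self.using and not self.queue:
            self.using = True    # token already here and idle: enter at once
            return
        self.queue.append(self)
        self._forward_request()

    def on_request(self, child):
        self.queue.append(child)
        if self.holder is self and not self.using:
            self._pass_token()
        else:
            self._forward_request()

    def _forward_request(self):
        # Forward at most one request toward the holder at a time.
        if self.holder is not self and not self.asked and self.queue:
            self.asked = True
            self.holder.on_request(self)

    def exit_cs(self):
        self.using = False
        if self.queue:
            self._pass_token()

    def _pass_token(self):
        head = self.queue.pop(0)
        if head is self:
            self.using = True            # enter the CS ourselves
            return
        self.holder = head               # reverse the tree edge
        head.on_token()
        if self.queue:                   # still have waiters: ask again
            self.asked = True
            self.holder.on_request(self)

    def on_token(self):
        self.holder = self
        self.asked = False
        if self.queue:
            self._pass_token()

a, b, c = RaymondSite("a"), RaymondSite("b"), RaymondSite("c")
b.holder, c.holder = a, b    # a is the root and holds the token
c.request_cs()
print(c.using)               # True: the token travelled a -> b -> c
a.request_cs()               # queued at b, then at c, while c is in its CS
c.exit_cs()
print(a.using)               # True: token went back, reversing edges again
```

The single `holder` pointer per site is the whole distributed tree; reversing it as the token moves is what keeps every site's pointer aimed at the current holder.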
Raymond’s Algorithm
[Figure: two panels -- a token request propagating up the tree, and a token release passing the token down while reversing edges]
Raymond’s Algorithm
What happens to the other request?
Deadlocks: Causes
♦ Exclusive Resource Access
♦ Wait while hold
• blocking calls to acquire more resources
♦ No Preemption
• e.g., not possible with the CPU as one of the resources
♦ Circular Wait
Deadlocks: Strategies
♦ Ostrich Model
• Ignore (depend on outside mechanisms)
• Probably the most common
♦ Detection & Resolution
• Check if deadlocked. Resolve by breaking the cycle.
♦ Prevention
• Prevent at grant time, using structural properties
♦ Avoidance: impractical (Banker’s Algorithm)
Deadlock Detection
♦ Construct a Process-Resource graph
• Directed edge from Process to Resource
− Process is waiting for the Resource
• Directed edge from Resource to Process
− Process is holding the Resource
♦ Deadlock
• Cycle in the graph
♦ Works with single-unit resources
• can be generalized to multi-unit resources
♦ Easy to do on a centralized system
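On a centralized system the cycle check is a plain depth-first search over the process-resource graph; a minimal sketch (graph representation is illustrative):

```python
def find_deadlock(edges):
    """Detect a cycle in a process-resource graph (sketch).

    edges maps each node to the nodes it points to:
    process -> resource it waits for, resource -> process holding it.
    Returns a list of nodes on some cycle, or None if acyclic.
    """
    WHITE, GREY, BLACK = 0, 1, 2       # unvisited / on stack / done
    color = {n: WHITE for n in edges}

    def dfs(n, path):
        color[n] = GREY
        path.append(n)
        for m in edges.get(n, ()):
            if color.get(m, WHITE) == GREY:
                return path[path.index(m):]        # back edge: cycle found
            if color.get(m, WHITE) == WHITE:
                cycle = dfs(m, path)
                if cycle:
                    return cycle
        path.pop()
        color[n] = BLACK
        return None

    for n in list(edges):
        if color[n] == WHITE:
            cycle = dfs(n, [])
            if cycle:
                return cycle
    return None

# P1 holds R1 and waits for R2; P2 holds R2 and waits for R1.
g = {"P1": ["R2"], "R2": ["P2"], "P2": ["R1"], "R1": ["P1"]}
print(find_deadlock(g))  # ['P1', 'R2', 'P2', 'R1']
```

With single-unit resources a cycle is both necessary and sufficient for deadlock, which is why this simple traversal settles the question.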
Distributed Deadlock Detection
♦ Centralized Scheme
• A coordinator builds/maintains the resource graph for the entire system
• If a cycle is detected, then kill someone to break the deadlock
♦ Three Strategies
• Update whenever an edge is added/deleted
• Update periodically
• Coordinator requests explicitly
♦ Problems: Inconsistent State Information
Distributed Deadlock Detection
♦ Generalize the centralized algorithm with distributed control
♦ Various Published Algorithms
• see the text-book
♦ Many of them proven incorrect
• see the text-book
♦ Often impractical
• e.g., Chandy’s algorithm
− requires sending of/responding to probes by a blocked process
Deadlock Prevention
♦ Eliminate one of the 4 conditions for deadlock
♦ Examples:
• Acquire all resources in one atomic step
− eliminates the wait-while-hold condition
• Preemption
− a blocked (lower priority) process releases resources requested by an active (higher priority) process
• Circular Wait
− order resources
− request resources in ascending order only
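Ordered acquisition is easy to enforce in code. A minimal sketch with threads and locks (the resource names and global order are hypothetical):

```python
import threading

# Hypothetical resources and a global order over them; acquiring locks
# only in ascending order makes a circular wait impossible.
LOCK_ORDER = {"disk": 0, "printer": 1, "network": 2}
LOCKS = {name: threading.Lock() for name in LOCK_ORDER}

def acquire_in_order(names):
    """Acquire the named locks in the globally agreed order.

    Returns the order in which the locks were actually taken.
    """
    taken = sorted(names, key=LOCK_ORDER.__getitem__)
    for name in taken:
        LOCKS[name].acquire()
    return taken

def release_all(names):
    for name in reversed(names):
        LOCKS[name].release()

held = acquire_in_order(["printer", "disk"])
print(held)  # ['disk', 'printer'] -- never printer before disk
release_all(held)
```

No matter what subset of resources each process asks for, every process climbs the same ordering, so no cycle of waits can form.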
Distributed Deadlock Prevention
♦ Transactions
• distributed computations
• can be safely aborted
− execute again by restarting
♦ Global Time (e.g., using Lamport’s scheme)
• Can be used to prioritize requests
• Give priority to older processes
− they have consumed more system resources
Distributed Deadlock Prevention
♦ Never allow a “younger” process to wait for an “older” one
• if a wait is going to occur, kill the younger process
• (it can restart later and do the work)
♦ Preempt
• Younger processes wait for older ones
• Older ones preempt younger ones
− if an older process needs a resource held by a younger process
– kill the younger one, releasing its resources
– the younger one can restart later
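These two schemes are commonly known as wait-die (kill the younger requester) and wound-wait (the older requester preempts the younger holder). The entire decision fits in one function; a sketch (timestamps and names illustrative, smaller timestamp = older process):

```python
def resolve(requester_ts, holder_ts, scheme):
    """Decide what happens when a requester wants a resource that is
    currently held; smaller timestamp means the older process.

    wait-die  : older requester waits; younger requester is killed (dies)
    wound-wait: older requester preempts (wounds) the younger holder;
                younger requester waits
    """
    requester_older = requester_ts < holder_ts
    if scheme == "wait-die":
        return "wait" if requester_older else "die"
    if scheme == "wound-wait":
        return "wound holder" if requester_older else "wait"
    raise ValueError(f"unknown scheme: {scheme}")

print(resolve(1, 5, "wait-die"))     # wait: older requester may wait
print(resolve(5, 1, "wait-die"))     # die: younger requester is killed
print(resolve(1, 5, "wound-wait"))   # wound holder: older preempts younger
print(resolve(5, 1, "wound-wait"))   # wait: younger requester waits
```

In both schemes, waits only ever form in one direction of the age ordering, so no cycle of waits -- and hence no deadlock -- can arise; restarted processes keep their original timestamps so they eventually become oldest and make progress.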