1 lecture 1 distributed algorithms gadi taubenfeld © 2011 distributed algorithms gadi taubenfeld...
TRANSCRIPT
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
1
Distributed AlgorithmsDistributed Algorithms
Gadi TaubenfeldGadi Taubenfeld
Lecture 1
INTRODUCTIONINTRODUCTION
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
2
A note on the use of these ppt slides:I am making these slides freely available to all (faculty, students, readers). They are in PowerPoint form so you can add, modify, and delete slides and slide content to suit your needs. They obviously represent a lot of work on my part. In return for use, I only ask the following:
That you mention their source, after all, I would like people to use my
book!
That you note that they are adapted from (or perhaps identical to) my slides, and note my copyright of this material.
Thanks and enjoy! Gadi Taubenfeld
All material copyright 2011Gadi Taubenfeld, All Rights Reserved
A note on the use of these ppt slides:I am making these slides freely available to all (faculty, students, readers). They are in PowerPoint form so you can add, modify, and delete slides and slide content to suit your needs. They obviously represent a lot of work on my part. In return for use, I only ask the following:
That you mention their source, after all, I would like people to use my
book!
That you note that they are adapted from (or perhaps identical to) my slides, and note my copyright of this material.
Thanks and enjoy! Gadi Taubenfeld
All material copyright 2011Gadi Taubenfeld, All Rights Reserved
This lecture is based on Chapter 1 from the textbook
To get the most updated version of these slides go to: http://www.faculty.idc.ac.il/gadi/book.htm
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
3
SynchronizationSynchronization
Type of synchronization Contention Coordination
Why synchronization is difficult? Easy between humans Communication between
computers is restricted• reading and writing of notes• sending and receiving
messages
Type of synchronization Contention Coordination
Why synchronization is difficult? Easy between humans Communication between
computers is restricted• reading and writing of notes• sending and receiving
messages
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
4
The Too-Much-Milk ProblemThe Too-Much-Milk Problem
Time Alice Bob
5:00 Arrive Home
5:05 Look in fridge; no milk
5:10 Leave for grocery
5:15 Arrive home
5:20 Arrive at grocery Look in fridge; no milk
5:25 Buy milk Leave for grocery
5:30 Arrive home; put milk in fridge
3:40 Buy milk
3:45 Arrive home; too much milk!
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
5
Solving the Too-Much-Milk ProblemSolving the Too-Much-Milk Problem
Correctness properties Mutual Exclusion:
Only one person buys milk at a time
Progress: Someone always buys milk if it is needed
Communication primitives
Leave a note (set a flag)
Remove a note (reset a flag)
Read a note (test the flag)
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
6
Important DistinctionImportant Distinction
Safety Property Nothing bad happens ever Example: mutual exclusion
Liveness Property Something good happens
eventually Example: progress
Safety Property Nothing bad happens ever Example: mutual exclusion
Liveness Property Something good happens
eventually Example: progress
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
7
Solution 1Solution 1
Does it work?
Aliceif (no note) { if (no milk) { leave Note buy milk remove note }}
Bobif (no note) { if (no milk) { leave Note buy milk remove note }}
No, may end up with two milk.
mutual exclusionprogress
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
8
Solution 2Solution 2
Aliceleave note Aif (no note B) { if (no milk) {buy milk}}remove note A
Bobleave note Bif (no note A) { if (no milk) {buy milk}}remove note B
Using labeled notes
Does it work? No, may end up with no milk.
mutual exclusion progress
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
9
Solution 3Solution 3
Does it work?
Aliceleave note Awhile (note B) {skip} if (no milk) {buy milk}
remove note A
Bobleave note Bif (no note A) { if (no milk) {buy milk}} remove note B
No ! a timing assumption is needed
mutual exclusion progress
Bob’s code is as before
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
10
Is solution #3 a good solution? Is solution #3 a good solution? NoNo
A timing assumption is needed! Unfair to one of the processes It is asymmetrical and complicated Alice is busy waiting – consuming CPU
cycles without accomplishing useful work
A timing assumption is needed! Unfair to one of the processes It is asymmetrical and complicated Alice is busy waiting – consuming CPU
cycles without accomplishing useful work
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
11
Solution 4Solution 4Using 4 labeled
notes
A1 A2 B2 B1
the fridge’s door
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
12
Solution 4Solution 4Notations
color:there is a note A1 A2 B2
Alice’s turn
B2dotted line:there is no note
solid line without color:we do not care if there is a note
who will buy milk ?
XX
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
13
A1 A2 B2
Alice’s turn
A1 A2 B2 B1
Alice’s turn
A2 B2 B1
Bob’s turn
A1 A2 B2 B1
Alice’s turn
A1 A2 B2 B1
Bob’s turn
A1 A2 B2 B1
Bob’s turn
Solution 4Solution 4
B2
B2
X X
X X
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
14
Aliceleave A1if B2 {leave A2} else {remove A2}while B1 and ((A2 and B2) or (no A2 and no B2)) {skip} if (no milk) {buy milk} remove A1
Bobleave B1if (no A2) {leave B2} else {remove B2}while A1 and ((A2 and no B2) or (no A2 and B2)) {skip} if (no milk) {buy milk} remove note B1
Solution 4Solution 4
A correct, fair, symmetric algorithm!
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
15
A variant of Solution 4A variant of Solution 4The order of the first two statements is replaced
Is it correct ?
No, may end up with two milk. mutual exclusionprogress
Aliceif B2 {leave A2} else {remove A2}leave A1while B1 and ((A2 and B2) or (no A2 and no B2)) {skip} if (no milk) {buy milk} remove A1
Bobif (no A2) {leave B2} else {remove B2}leave B1while A1 and ((A2 and no B2) or (no A2 and B2)) {skip} if (no milk) {buy milk} remove note B1
MILKMILK
A1 B2 B1
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
16
QuestionQuestion
Alice and Bob have a daughter C,make it work for all the three of them.
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
17
QuestionQuestion
Process Afor i = 1 to 5 { x:=x+1 }
Process Bfor i = 1 to 5 { x:=x+1 }
x is an atomic register, initially 0
Q: What are all the possible values for x after both processes terminate?
A: 2 through 10 inclusive.
00
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
18
The smallest value is 2The smallest value is 2
00
A
0 1
B
0 1
112233441122334455
1 2
22
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
19
QuestionQuestion
Process Ax:=x+1;x:=x+1
Process Bx:=x+1;x:=x+1
Process Cx:=10
x is an atomic register, initially 0
Q: What are all the possible values for x after the three processes terminate?
A: 2,3,4,10,11,12,13,14.
00
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
20
The coordinated attack problemThe coordinated attack problem• Blue army wins if both blue camps attack simultaneously• Design an algorithm to ensure that both blue camps attack simultaneously
Enemy
1 2
The problem is due to by Jim Gray (1977)
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
21
The coordinated attack problemThe coordinated attack problem• Communication is done by sending messengers across valley• Messengers may be lost or captured by the enemy
Enemy
1 2
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
22
The coordinated attack problemThe coordinated attack problem
Theorem: There is no algorithm that solves the problem !
• Assume to the contrary that such an algorithm exits.
• Let P be an algorithm that solves with fewest
messages.
• P should work even when the last messenger is
captured.
• So P should work even if the last messenger is never
sent.
• But this is a new algorithm with one less message.
• A contradiction.
Proof:
The (asynchronous) model is too weak.
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
23
The first problem we cover in detail. A generalization of the too-much-milk problem.
The mutual exclusion problemThe mutual exclusion problemChapters 2+3
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
24
Blocking and Non-blocking SynchronizationBlocking and Non-blocking SynchronizationConcurrent data structuresConcurrent data structures
P1 P2 P3
data structure
counter, stack, queue, link listcounter, stack, queue, link list
Chapter 4
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
25
Blocking Blocking
P1 P2 P3
data structure
Chapter 4
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
26
Non-blockingNon-blocking
P1 P2 P3
data structure
Chapter 4
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
27
Coarse-Grained LockingCoarse-Grained Locking
Easier to program, but not scalable.
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
28
Fine-Grained LockingFine-Grained Locking
Efficient and scalable, but too complicated (deadlocks, etc.).
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
29
ConsensusConsensus
1
1
0
1
1
1
1
1
0
1
1
1
0
1
requirements: validity agreement termination
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
30
Impossibility of Consensus Impossibility of Consensus with One Faulty Processwith One Faulty Process
What can be done?
Strong primitives.
Timing assumptions.
Randomizations.
Theorem: There is no algorithm that solve the consensus problem using atomic read/write registers in the presence of a single faulty process.We give a detailed proof of this important result in Chapter 9 of the book.
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
31
Barr
ier
P1P1
P2P2
P3P3
P4P4
Barr
ier
P1P1
P2P2
P3P3
P4P4
P1P1
P2P2
P3P3
P4P4
time
Barr
ier
Barrier SynchronizationBarrier SynchronizationChapter 5
four processes approach the barrier
all except P4 arrive
Once all arrive, they continue
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
32
Example: Prefix SumExample: Prefix Sum
abegin b c d e f
aend a+b a+b+c a+b+c+da+b+c+d+e
a+b+c+d+e+f
time
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
33
Example: Sequential Prefix SumExample: Sequential Prefix Sum
abegin b c d e f
a a+b c d e f
a a+b a+b+c a+b+c+d e f
aend a+b a+b+c a+b+c+da+b+c+d+e
a+b+c+d+e+f
a a+b a+b+c d e f
a a+b a+b+c a+b+c+da+b+c+d+e f
time
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
34
Example: Parallel Prefix SumExample: Parallel Prefix Sum
abegin b c d e f
a a+b b+c c+d d+e e+f
a a+b a+b+c a+b+c+db+c+d+ec+d+e+f
aend a+b a+b+c a+b+c+da+b+c+d+e
a+b+c+d+e+f
time
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
35
Example: Parallel Prefix SumExample: Parallel Prefix Sum
abegin b c d e f
a a+b b+c c+d d+e e+f
a a+b a+b+c a+b+c+db+c+d+ec+d+e+f
aend a+b a+b+c a+b+c+da+b+c+d+e
a+b+c+d+e+f
barrier
barriertime
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
36
Leader electionLeader election
0 0 1 0 0 0 0
O(n log n) messages is possible.
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
37
Distributed minimum weight spanning treeDistributed minimum weight spanning treeGraph algorithmsGraph algorithms
4
O(n log n + E) message complexity is possible.
MIS, BFS, ...
3
51
259
5
8
7
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
38
SnapshotSnapshot
How to “learn” the global state of a network by collectingthe local state of all the processes (and of the links)
deadlock detection, termination detection, etc.deadlock detection, termination detection, etc.
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
39
1. P1 reports 2002. P1 transfer 100 to P23. P2 reports 200Manager knows of 400 $
1. P1 reports 2002. P1 transfer 100 to P23. P2 reports 200Manager knows of 400 $
M
P2
200 100
100
200100
Snapshot (example)Snapshot (example)
Solutions Freezing Global clock Sending everything via M The snapshot
algorithm !!!
Solutions Freezing Global clock Sending everything via M The snapshot
algorithm !!!
P1
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
40
End-to-end communicationEnd-to-end communication
sender network receiver
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
41
RenamingRenamingreducing the name space
1
100
5
2
500
7
7
1
999
13
700
5
88
10
n
777,888
3
Tight bound of 2n-1 names (using atomic registers).
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
42
Clock synchronizationClock synchronization(Data synchronization)
The problem requires processors to bring their clocks close together, by using communication among them.
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
43
CommunicationCommunicationmodels of computationmodels of computation
Concurrent programming Shared memory safe registers atomic registers (read/write) test&set swap compare-and-swap load-link/store-conditional read-modify-write semaphore monitor
Concurrent programming Shared memory safe registers atomic registers (read/write) test&set swap compare-and-swap load-link/store-conditional read-modify-write semaphore monitor
What is the relative power of these communication primitives?Answer: see Chapter 9.
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
44
CommunicationCommunicationmodels of computationmodels of computation
Message passing send/receive multi-cast broadcast network topology
Message passing send/receive multi-cast broadcast network topology
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
45
TimeTimemodels of computationmodels of computation
Synchronous(CRCW, EREW)
AsynchronousPartiallysynchronous
What is the relative power of these timing models?
Concurrent/Distributed
Parallel
• less powerful• realistic• extremely powerful
• unrealistic
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
46
How many can fail ? constant,
n/3, ...
Wait-freedom, Non-blocking, Lock-freedom
what is it ?
Self-stabilization what is it ?
How many can fail ? constant,
n/3, ...
Wait-freedom, Non-blocking, Lock-freedom
what is it ?
Self-stabilization what is it ?
Fault-toleranceFault-tolerancemodels of computationmodels of computation
What can fail? processes shared memory links
Type of faults crash omission Byzantine
What can fail? processes shared memory links
Type of faults crash omission Byzantine
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
47
TerminologyTerminology
Hardware Processors
Software Threads, processes
Sometimes OK to confuse them, sometimes not.
Scheduler
Hardware Processors
Software Threads, processes
Sometimes OK to confuse them, sometimes not.
Scheduler
In this course it is always ok to confuse between processes and threads.
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
48
Problems Models
read/write
Algorithms Upper and lower bounds Impossibility results Simulations & Implementations
Problems & ModelsProblems & Models
barrier send/receive
swap
LL/SC
readers&writers
consensus
mutex
Lecture 1 Distributed Algorithms Gadi Taubenfeld © 2011
49
History of the mutex problemHistory of the mutex problem
1965 Defined by Dijkstra First solution by Dekker General solution by Dijkstra
Fischer, Knuth, Lynch, Rabin, Rivest, ...
1974 Lamport’s bakery algorithm 1981 Peterson’s solution There are hundreds of published
solutions Lots or related problems
E. W. Dijkstrawww.cs.utexas.edu/users/EWD/(Photo by Hamilton Richards 2002)