DCS 6. Basic Distributed Algorithms Fundamentals
Wei Yuan
November 21, 2013
Outline
• Physical Clocks
• Logical Clocks
– Lamport’s Logical Clock
– Vector Clock
• Global Snapshots
Physical Clocks
• Most computers today keep track of the passage of time with a battery-backed CMOS clock circuit driven by a quartz oscillator.
– The battery backup lets the clock keep measuring time while the power is off.
• Two registers are associated with the quartz crystal: a counter and a holding register.
• A Programmable Interval Timer generates an interrupt (clock tick) periodically.
• The interrupt service procedure simply adds one to a counter in memory.
Problem
• Getting two systems to agree on the time is hard.
– Two clocks hardly ever agree.
– Quartz oscillators oscillate at slightly different frequencies.
• Clocks tick at different rates.
– This creates an ever-widening gap in perceived time.
– This effect is called clock drift.
• The difference between two clocks at one point in time is called clock skew.
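To get a feel for how drift turns into skew, here is a minimal Python sketch. The function name `skew_after` and the drift rates of ±20 ppm are illustrative assumptions, not figures from the slides; the model simply assumes each clock runs at a constant rate.

```python
def skew_after(elapsed_seconds, drift_a_ppm, drift_b_ppm):
    """Skew in seconds between two clocks that were in sync at time 0.

    Drift rates are given in parts per million (ppm). Both values are
    illustrative assumptions, not measurements from the slides.
    """
    relative_drift = abs(drift_a_ppm - drift_b_ppm) * 1e-6
    return relative_drift * elapsed_seconds


ONE_DAY = 24 * 60 * 60
# One clock runs 20 ppm fast, the other 20 ppm slow.
print(f"skew after one day: {skew_after(ONE_DAY, 20.0, -20.0):.3f} s")
```

Under these assumed rates the two clocks end up roughly 3.5 seconds apart after a single day, which is why some form of synchronization is needed.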
Solution
• International Atomic Time (TAI)
• Coordinated Universal Time (UTC)
• ……
• Time synchronization algorithms
Logical Clocks
Lamport’s Logical Clock
• A distributed system consists of a collection of distinct processes that are spatially separated and that communicate with one another by exchanging messages.
– A network of interconnected computers, such as the ARPA net.
– A single computer: the central control unit, the memory units, and the input-output channels are separate processes.
• Lamport L. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 1978, 21(7): 558-565.
Lamport’s happened before (→) relation
• Define the "happened before" relation without using physical clocks (a partial ordering).
• Assumptions
– The system is composed of a collection of processes.
– Each process consists of a sequence of events.
– Depending on the application, an event may be the execution of a subprogram on a computer or the execution of a single machine instruction.
• We assume that the events of a process form a sequence, where a occurs before b in this sequence if a happens before b.
Lamport’s happened before (→) relation
(1) If a and b are events in the same process, and a comes before b, then a → b.
(2) If a is the sending of a message by one process and b is the receipt of the same message by another process, then a → b.
(3) If a → b and b → c, then a → c.
• Two distinct events a and b are said to be concurrent if a ↛ b and b ↛ a.
• Assume that a ↛ a for any event a. (→ is an irreflexive partial ordering)
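The three rules can be sketched in Python as a small closure computation. The function names `happened_before` and `concurrent`, the event names, and the two-process example are illustrative assumptions; the sketch is only meant for tiny event sets.

```python
from itertools import product


def happened_before(process_events, messages):
    """Return the set of pairs (a, b) such that a -> b.

    process_events: dict mapping a process name to its event sequence.
    messages: list of (send_event, receive_event) pairs.
    """
    order = set()
    # Rule 1: events of one process are ordered by their position.
    for events in process_events.values():
        for i, a in enumerate(events):
            for b in events[i + 1:]:
                order.add((a, b))
    # Rule 2: a send happens before the matching receive.
    order.update(messages)
    # Rule 3: close the relation under transitivity.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(order), repeat=2):
            if b == c and (a, d) not in order:
                order.add((a, d))
                changed = True
    return order


def concurrent(a, b, order):
    """Two distinct events are concurrent if neither happened before the other."""
    return a != b and (a, b) not in order and (b, a) not in order


# Hypothetical example: P sends a message at p1 that Q receives at q1.
procs = {"P": ["p1", "p2"], "Q": ["q1", "q2"]}
hb = happened_before(procs, [("p1", "q1")])
print(("p1", "q2") in hb, concurrent("p2", "q2", hb))
```

Here p1 → q2 follows only from rule (3), while p2 and q2 are concurrent because no chain of process order and messages connects them.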
space-time diagram
• Horizontal direction: space
• Vertical direction: time
• Dots: events
• Vertical lines: processes
• Wavy lines: messages
• A clock is just a way of assigning a number to an event (an abstract definition).
– A clock Ci for each process Pi
• Ci assigns a number Ci(a) to any event a in the process.
– A clock C for the entire system
• C(b) = Cj(b) if b is an event in process Pj.
• Clock Condition
– For any events a, b: if a → b then C(a) < C(b).
– We cannot expect the converse condition to hold, since that would imply that any two concurrent events must occur at the same time (e.g., p2 and p3 are both concurrent with q3).
• A process’s clock “ticks”
– (1) means that there must be a tick line between any two events on a process line.
– (2) means that every message line must cross a tick line.
Event counting example
Lamport’s logical timestamps
• Process Pi’s clock is represented by a register Ci, so Ci(a) is the value contained by Ci during the event a.
• All processes use a local counter (logical clock) with an initial value of zero.
• Just before each event, the local counter is incremented by 1 and assigned to the event as its timestamp.
• A send (message) event carries its timestamp.
• For a receive (message) event, the counter is updated to max(receiver’s local counter, message timestamp) + 1.
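The counter rules above can be sketched as a small Python class. The class name `LamportClock` and its method names are illustrative assumptions; only the update rules come from the slide.

```python
class LamportClock:
    """Minimal sketch of a Lamport logical clock for one process."""

    def __init__(self):
        self.counter = 0  # initial value of zero

    def local_event(self):
        # Increment just before the event and use the result as its timestamp.
        self.counter += 1
        return self.counter

    def send(self):
        # A send is an event; the returned timestamp travels with the message.
        return self.local_event()

    def receive(self, message_timestamp):
        # max(receiver's local counter, message timestamp) + 1
        self.counter = max(self.counter, message_timestamp) + 1
        return self.counter


p, q = LamportClock(), LamportClock()
a = p.local_event()   # first event on p
b = p.send()          # p sends a message stamped with b
c = q.local_event()   # first event on q
d = q.receive(b)      # q receives the message
print(a, b, c, d)
```

The receive timestamp d is larger than the send timestamp b, so the Clock Condition holds across the message.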
Event counting example
Applying Lamport’s algorithm
Problem: Identical timestamps
• Concurrent events (e.g., b and g; i and k) may have the same timestamp, or they may not.
• Total ordering: every event is assigned a unique timestamp (number).
Unique timestamps (total ordering)
We can force each timestamp to be unique.
• Define a global logical timestamp (Ti, i).
– Ti represents the local Lamport timestamp.
– i represents the process number (globally unique), e.g., (host address, process ID).
• Compare timestamps: (Ti, i) < (Tj, j) if and only if
– Ti < Tj, or
– Ti = Tj and i < j.
• The resulting order does not necessarily relate to the actual event ordering.
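Python tuples already compare lexicographically, so the rule above can be sketched by sorting on (Ti, i) pairs. The event names, timestamps, and process numbers below are made up for illustration.

```python
# Each event is (name, local Lamport timestamp Ti, process number i).
events = [("c", 1, 2), ("a", 1, 1), ("d", 3, 2), ("b", 2, 1)]


def global_timestamp(event):
    """Return the pair (Ti, i); tuples compare by Ti first, then by i."""
    _, lamport_time, process_number = event
    return (lamport_time, process_number)


ordered = sorted(events, key=global_timestamp)
names = [name for name, _, _ in ordered]
print(names)
```

Events a and c share the Lamport timestamp 1, and the process number breaks the tie. The tie-break is arbitrary, which is why this order need not reflect any real causal relationship.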
• Unique (totally ordered) timestamps
Problem: Detecting causal relations
• If C(e) < C(e’)
– We cannot conclude that e → e’.
• By looking at Lamport timestamps
– We cannot conclude which events are causally related.
• Solution: use a vector clock.
Vector clocks
Rules:
1. The vector is initialized to 0 at each process.
2. Process Pi increments its own element of its local vector before timestamping an event: Vi[i] = Vi[i] + 1.
3. A message is sent from process Pi with Vi attached to it.
4. When Pj receives the message, it compares the vectors element by element and sets each element of its local vector to the higher of the two values.
• For example, received: [0, 5, 12, 1], have: [2, 8, 10, 1], new timestamp: [2, 8, 12, 1]
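The four rules can be sketched in Python as follows. The names `VectorClock`, `merge`, and `tick` are illustrative assumptions. The sketch also assumes that a receive counts as an event, so the receiver applies rule 2 right after the element-wise merge of rule 4.

```python
def merge(have, received):
    """Rule 4: element-wise maximum of the two vectors."""
    return [max(mine, theirs) for mine, theirs in zip(have, received)]


class VectorClock:
    """Minimal sketch of a vector clock for process number `index`."""

    def __init__(self, index, num_processes):
        self.index = index
        self.vector = [0] * num_processes   # rule 1

    def tick(self):
        self.vector[self.index] += 1        # rule 2
        return list(self.vector)

    def send(self):
        return self.tick()                  # rule 3: attach a copy to the message

    def receive(self, received):
        self.vector = merge(self.vector, received)  # rule 4
        return self.tick()                  # assumption: the receive is itself an event


# The merge step on the example vectors from the slide.
print(merge([2, 8, 10, 1], [0, 5, 12, 1]))

# A three-process run: P0 has one local event, then sends to P1.
p0, p1 = VectorClock(0, 3), VectorClock(1, 3)
first = p0.tick()
message = p0.send()
on_receive = p1.receive(message)
after = p1.tick()
print(first, message, on_receive, after)
```

Each process only ever increments its own element, so element i of any vector counts how many events of process i are known to have happened before.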
Comparing vector timestamps
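• Define
– V = V’ iff V[i] = V’[i] for i = 1, …, N
– V ≤ V’ iff V[i] ≤ V’[i] for i = 1, …, N
– V < V’ iff V ≤ V’ and V ≠ V’
• For any two events e, e’
– if e → e’ then V(e) < V(e’) … just like Lamport’s algorithm
– if V(e) < V(e’) then e → e’
• Two events are concurrent if neither V(e) ≤ V(e’) nor V(e’) ≤ V(e)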
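These comparisons translate almost directly into Python. The function names `vc_leq`, `vc_before`, and `vc_concurrent` are illustrative assumptions, and the sample vectors are made up.

```python
def vc_leq(v, w):
    """V <= W iff every element of V is <= the matching element of W."""
    return all(a <= b for a, b in zip(v, w))


def vc_before(v, w):
    """V < W iff V <= W and V != W; this holds exactly when e -> e'."""
    return vc_leq(v, w) and v != w


def vc_concurrent(v, w):
    """Concurrent iff neither V <= W nor W <= V."""
    return not vc_leq(v, w) and not vc_leq(w, v)


print(vc_before([2, 0, 0], [2, 1, 0]))       # causally ordered
print(vc_concurrent([2, 0, 0], [0, 0, 1]))   # neither dominates the other
```

Unlike Lamport timestamps, the comparison works in both directions, so causality can be read off the timestamps alone.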
Vector timestamps
[Figure sequence: space-time diagram of three processes, each starting with the vector (0,0,0). The timestamps that survive in the labels are (1,0,0), (2,0,0), (2,1,0), and (2,2,0).]
Two events are concurrent if neither V(e) ≤ V(e’) nor V(e’) ≤ V(e)
Global Snapshots
“Distributed snapshots: determining global states of distributed systems”, K. Mani Chandy and Leslie Lamport, ACM TOCS 1985
Model of a Distributed System
• A finite set of processes, modeled as nodes.
• A finite set of channels, modeled as edges.
• Channels have infinite buffers, are error-free, and are FIFO.
• The delay experienced by a message is arbitrary but finite.
[Figure: a graph with processes p, q, r as nodes and channels c1, c2, c3, c4 as edges.]
A banking example to illustrate recording of consistent states
Global State of a Distributed System
Global State:
• The union of the local states of the individual processes and the states of the channels.
• The state of a channel is determined by the messages in transit: messages that have been sent along the channel but not yet received.
• Initial global state of the system:
– Each process is in its initial state.
– The state of each channel is the empty sequence.
Each component of a distributed system has a local state. Process state: described by the local memory and the activity history. Channel state: described by the sequence of messages sent along the channel, minus the messages received along it.
Global State Detection
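• Many problems in distributed systems can be solved by detecting a global state of the system.
• Stable property detection
– A stable property is one that, once it becomes true, remains true forever.
– E.g., termination, deadlock, token loss.
• Checkpointing in distributed systems
– E.g., debugging, failure recovery.
A distributed system has no shared memory and no global clock. Distributed characteristics such as local clocks and local memory make it hard to record the global state of the system efficiently.
Detecting stable properties such as deadlock and termination requires examining the global state of the system. For failure recovery, the global state of the distributed system must be saved periodically (a checkpoint); restoring the system to the most recently saved global state lets recovery proceed from the point of the process failure.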
Distributed Computation
• A distributed computation is a sequence of events.
• There are three kinds of events: local, send, and receive.
• An event is an atomic action that may change the state of the process p and the state of at most one channel that is incident on p.
Definition of Event e
• An event is a five-tuple e = <p, s, s', M, c>, where
• p is the process in which the event occurs,
• s is the state of p immediately before the event,
• s' is the state of p immediately after the event,
• M is the message sent or received along the channel c.
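As a rough illustration, the five-tuple can be written as a Python dataclass and applied to a global state. The `kind` field, the `apply_event` helper, the account balances, and the channel name `c1` are assumptions added for this sketch; they are not part of the definition above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    p: str                    # process in which the event occurs
    s: int                    # state of p immediately before the event
    s_after: int              # state of p immediately after the event (s')
    M: Optional[int] = None   # message sent or received, if any
    c: Optional[str] = None   # channel used, if any
    kind: str = "local"       # hypothetical tag: "local", "send", or "receive"


def apply_event(event, process_state, channel_state):
    """Apply one atomic event: it changes p and at most one channel."""
    if process_state[event.p] != event.s:
        raise ValueError("event is not enabled in this global state")
    process_state[event.p] = event.s_after
    if event.kind == "send":
        channel_state[event.c].append(event.M)
    elif event.kind == "receive":
        if channel_state[event.c].pop(0) != event.M:
            raise ValueError("FIFO channel head does not match the message")


# Hypothetical banking run: p transfers 10 to q over channel c1.
processes = {"p": 100, "q": 50}
channels = {"c1": []}
apply_event(Event("p", 100, 90, 10, "c1", "send"), processes, channels)
in_transit = list(channels["c1"])
apply_event(Event("q", 50, 60, 10, "c1", "receive"), processes, channels)
print(in_transit, processes, channels)
```

Between the two events the transfer exists only in the channel, which is why the channel state has to be part of the global state.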
Consistent Global State
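• Consistency: every message that is recorded as received has also been recorded as sent.
• The consistent global states determined by a snapshot are states that may have occurred during the computation.
A consistent global state satisfies both of the following conditions:
C1. Message conservation: a message mij recorded as sent in the local state of process pi must appear either in the state of channel Cij or in the local state of the receiving process pj.
C2. For every effect in the resulting global state, the cause of that effect must also be present.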
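For a single channel, the two conditions can be checked with plain set operations. This is a simplified sketch: the function name `is_consistent` and the use of message identifiers such as "m1" are assumptions made for illustration.

```python
def is_consistent(recorded_sent, recorded_received, channel_state):
    """Check C1 and C2 for one channel, using sets of message identifiers."""
    # C2: every message recorded as received was also recorded as sent.
    if not recorded_received <= recorded_sent:
        return False
    # C1: every message recorded as sent is in the channel or at the receiver.
    return recorded_sent <= (recorded_received | channel_state)


print(is_consistent({"m1", "m2"}, {"m1"}, {"m2"}))   # m2 is still in transit
print(is_consistent({"m1"}, {"m1", "m2"}, set()))    # m2 received but never sent
print(is_consistent({"m1", "m2"}, {"m1"}, set()))    # m2 has vanished
```

The second case is the classic inconsistent cut: the receiver's state reflects an effect whose cause is missing from the sender's recorded state.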
Chandy–Lamport Algorithm
• Each process in the system records its local state and the state of its incoming channels.
• The recorded states form a consistent global state.
• The snapshot algorithm runs concurrently with the computation but does not alter the underlying computation.
• The snapshot algorithm uses a marker as a recording signal.
• Any process can initiate the snapshot by recording its own state and sending a marker on all of its outgoing channels.
• On receiving its first marker, a process records its own local state and then records the states of its incoming channels.
Chandy–Lamport Algorithm contd.
Marker-Sending Rule for Process pi
(1) Process pi records its state.
(2) For each outgoing channel C on which a marker has not been sent, pi sends a marker along C before pi sends further messages along C.
Chandy–Lamport Algorithm contd.
Marker-Receiving Rule for Process pj
On receiving a marker along channel C:
if pj has not recorded its state then
    Record the state of C as the empty set
    Execute the “marker sending rule”
else
    Record the state of C as the set of messages received along C after pj’s state was recorded and before pj received the marker along C
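The two rules can be tried out on a tiny simulated banking system. The following Python sketch is one possible reading under stated assumptions: two processes p and q, one FIFO channel in each direction, hand-scheduled deliveries, and hypothetical names (`Process`, `MARKER`, `deliver_next`). It is not a networked implementation.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical marker token, distinct from any transfer amount


class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.out = {}               # destination name -> FIFO channel (deque)
        self.incoming = []          # names of processes with a channel into this one
        self.recorded_state = None  # local state captured by the snapshot
        self.channel_state = {}     # source name -> messages recorded for that channel
        self.recording = set()      # incoming channels still being recorded

    def send(self, dest, amount):
        self.balance -= amount
        self.out[dest].append(amount)

    def record_and_send_markers(self):
        """Marker-sending rule: record the state, then send a marker on every channel."""
        self.recorded_state = self.balance
        self.recording = set(self.incoming)
        self.channel_state = {src: [] for src in self.incoming}
        for channel in self.out.values():
            channel.append(MARKER)

    def deliver(self, src, message):
        if message == MARKER:
            # Marker-receiving rule.
            if self.recorded_state is None:
                self.record_and_send_markers()  # channel from src stays empty
            self.recording.discard(src)         # stop recording this channel
        else:
            self.balance += message
            if src in self.recording:
                self.channel_state[src].append(message)


def deliver_next(src, dest):
    """Deliver the oldest message on the channel from src to dest."""
    dest.deliver(src.name, src.out[dest.name].popleft())


p, q = Process("p", 100), Process("q", 50)
p.out["q"], q.out["p"] = deque(), deque()
p.incoming, q.incoming = ["q"], ["p"]

p.send("q", 10)
q.send("p", 20)
p.record_and_send_markers()   # p initiates the snapshot
deliver_next(p, q)            # q receives the transfer of 10
deliver_next(p, q)            # q receives the marker and records its state
deliver_next(q, p)            # p receives 20, recorded as in transit
deliver_next(q, p)            # p receives the marker

snapshot_total = (p.recorded_state + q.recorded_state
                  + sum(p.channel_state["q"]) + sum(q.channel_state["p"]))
print(p.recorded_state, q.recorded_state, p.channel_state, q.channel_state)
```

In this run the recorded balances plus the recorded in-transit transfer add up to the initial total of 150, even though the two local states were recorded at different moments.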
Thanks!
Q&A
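Appendix
• A relation on a set is called a partial order relation, or partial order, if and only if it is reflexive, antisymmetric, and transitive.
• Partial order: let A be a nonempty set and P a relation on A. If P satisfies the following conditions:
1. for every a ∈ A, (a, a) ∈ P (reflexivity);
2. if (a, b) ∈ P and (b, a) ∈ P, then a = b (antisymmetry);
3. if (a, b) ∈ P and (b, c) ∈ P, then (a, c) ∈ P (transitivity);
then P is called a partial order on A. If P is a partial order on A, we write a ≤ b to denote (a, b) ∈ P.
• Let ≤ be a partial order on A. If for every a, b ∈ A either a ≤ b or b ≤ a, then ≤ is called a total order or linear order on A.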
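For a finite set, these definitions can be checked by brute force. The function names and the example set {1, 2, 3, 6} with the divisibility relation are illustrative assumptions.

```python
def is_partial_order(A, P):
    """Check reflexivity, antisymmetry, and transitivity of relation P on set A."""
    reflexive = all((a, a) in P for a in A)
    antisymmetric = all(a == b for (a, b) in P if (b, a) in P)
    transitive = all((a, d) in P for (a, b) in P for (c, d) in P if b == c)
    return reflexive and antisymmetric and transitive


def is_total_order(A, P):
    """A partial order in which every pair of elements is comparable."""
    return is_partial_order(A, P) and all(
        (a, b) in P or (b, a) in P for a in A for b in A
    )


A = {1, 2, 3, 6}
divides = {(a, b) for a in A for b in A if b % a == 0}
less_equal = {(a, b) for a in A for b in A if a <= b}

print(is_partial_order(A, divides), is_total_order(A, divides))
print(is_partial_order(A, less_equal), is_total_order(A, less_equal))
```

Divisibility is a partial order on this set but not a total one, because 2 and 3 are not comparable. This mirrors how "happened before" leaves concurrent events unordered.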