20 February 2004 UKC, February 2004 1
mmnet Summer School
Tuesday 20 – Wednesday 21 July, 2004, Canterbury
Speakers:
• David Bacon, IBM TJ Watson.
• Emery Berger, UMass.
• Robert Berry, IBM Hursley.
• Hans Boehm, HP.
• Dave Detlefs, Sun Microsystems.
• Rick Hudson, Intel.
• Eliot Moss, UMass.
Birrell’s Reference Listing Revisited
Richard Jones, University of Kent
Peter Dickman, University of Glasgow
Luc Moreau, University of Southampton
Outline
1. Distributed reference counting – benefits & issues.
2. Birrell’s algorithm – example.
3. Weaknesses of Birrell’s description.
4. Our approach
• Graphical notation
• Formalisation & Proof
5. Extensions
• Fault tolerance
• Optimisation
6. Conclusion.
Problems of a distributed world
Concurrency everywhere
• must avoid race conditions, etc.
Communication is costly
• changing the reference count of a remote object may cost 10,000 times as much as changing the count of a local object
Not easy to get complete knowledge of object graph
• synchronisation is expensive
Faults everywhere
• communications, processes
Terminology
Processes: partition computational and storage resources.
Messages pass in point-to-point channels between processes.
Channels have properties, such as FIFO or lossy.
A reference is local if it refers to an object allocated in the same process; otherwise, it is remote (or global).
The owner of a reference is the process that initially allocated the object to which the reference refers.
Distributed Reference Counting/Listing
Most widely used DGC technique
• Maintain a count of remote references to each global object
• Reference listing alternative
Benefits
• Scalable solution
• Easy to implement
But…
• Cannot reclaim garbage cycles
• Easy to implement wrong!
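The "easy to implement wrong" point can be illustrated with the classic increment/decrement race in naive distributed reference counting. The sketch below is a toy under stated assumptions: the `Owner` class, the `run` helper and the message names `inc`/`dec` are hypothetical and do not come from the slides. Process A copies a reference to B and then drops its own; if the owner sees A's decrement before B's increment, the count touches zero while B still holds a reference.

```python
class Owner:
    """Toy owner that keeps a naive remote reference count for one object."""

    def __init__(self, initial_count):
        self.count = initial_count
        self.reclaimed = False

    def deliver(self, message):
        # Apply one message; reclaim as soon as the count reaches zero.
        if message == "inc":
            self.count += 1
        elif message == "dec":
            self.count -= 1
        if self.count == 0:
            self.reclaimed = True


def run(delivery_order):
    """Deliver the messages in the given order and report whether the
    object was (wrongly or rightly) reclaimed at any point."""
    owner = Owner(initial_count=1)  # A holds the only remote reference
    for message in delivery_order:
        owner.deliver(message)
    return owner.reclaimed


# A sends the reference to B (B will send "inc") and then drops its own
# reference (A sends "dec"). The network may deliver these in either order.
safe_order = run(["inc", "dec"])   # count goes 1 -> 2 -> 1
racy_order = run(["dec", "inc"])   # count goes 1 -> 0 -> 1, reclaimed at 0
print(safe_order, racy_order)
```

In the second ordering the object is reclaimed even though B still holds a reference, which is the kind of race that dirty/clean acknowledgement protocols are designed to rule out.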
Birrell’s algorithm
Birrell, Evers, Nelson, Owicki, and Wobber. Distributed Garbage Collection for Network Objects. DEC SRC technical report 116, 1993.
Widely used: Modula-3 Network Objects; Java RMI.
Based on reference listing; avoids the race conditions of naïve implementations; fault tolerant.
Birrell’s description
[Figure: Process P (owner of O) and Process Q (a client of O), each with an object table entry for w(o). P holds the concrete O with o.dirtySet = {Q,…}; Q holds the surrogate for O; one table entry is labelled "weak ref".]
Concrete and surrogate objects.
• Client invokes the surrogate, whose methods perform RPC to owner.
WireRep: unique ID of owner, plus index of object at the owner.
• Marshalling
Object table: maps a wireRep w(o) to the local instance of the object.
• Client has surrogate for o ⇒ concrete o in object table
Dirty set: identifiers of processes that have surrogates.
• Dirty set = ∅ ⇒ o can be removed from the object table.
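As a rough illustration of these data structures, the sketch below models a wireRep, an owner-side object table with dirty sets, and a client-side table that holds its surrogate weakly. All class and method names (`WireRep`, `OwnerProcess`, `wirerep_for`, and so on) are hypothetical; the slides do not prescribe an API, and a real implementation would exchange messages rather than call methods directly.

```python
import gc
import weakref
from dataclasses import dataclass


@dataclass(frozen=True)
class WireRep:
    owner_id: str   # unique ID of the owning process
    index: int      # index of the object at the owner


class Concrete:
    """Owner-side object; dirty_set names the processes that have surrogates."""

    def __init__(self, name):
        self.name = name
        self.dirty_set = set()


class Surrogate:
    """Client-side stand-in for a remote object."""

    def __init__(self, wirerep):
        self.wirerep = wirerep


class OwnerProcess:
    def __init__(self, pid):
        self.pid = pid
        self.object_table = {}   # WireRep -> Concrete
        self.next_index = 0

    def wirerep_for(self, obj):
        # Allocate a wireRep and enter the object in the object table.
        w = WireRep(self.pid, self.next_index)
        self.next_index += 1
        self.object_table[w] = obj
        return w

    def dirty(self, w, client_id):
        self.object_table[w].dirty_set.add(client_id)

    def clean(self, w, client_id):
        obj = self.object_table[w]
        obj.dirty_set.discard(client_id)
        if not obj.dirty_set:
            # Empty dirty set: the entry can leave the object table.
            del self.object_table[w]


owner = OwnerProcess("P")
w = owner.wirerep_for(Concrete("o"))
owner.dirty(w, "Q")

# Client Q keeps its surrogate in a weak table, so the local collector
# can still discover that the surrogate is unreachable.
client_table = weakref.WeakValueDictionary()
surrogate = Surrogate(w)
client_table[w] = surrogate
has_surrogate = w in client_table

del surrogate
gc.collect()
surrogate_gone = w not in client_table
owner.clean(w, "Q")
entry_removed = w not in owner.object_table
print(has_surrogate, surrogate_gone, entry_removed)
```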
P marshalls o to Q
P pushes o onto its stack; sends w(o) to Q.
1. Q looks w(o) up in its object table.
• Present: use the object.
• w(o)=NIL: surrogate being created; suspend.
2. Absent: enter w(o)=NIL in object table; send dirty(o) to owner(o).
3. Owner adds Q to its dirtySet(o) and dirty(o) returns.
4. Q creates surrogate(o) and adds it to its object table.
5. Later, Q deletes surrogate(o) and sends clean(o) to owner(o).
[Figure: processes P and Q, with object O.]
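The numbered steps can be sketched as a single-threaded simulation. This is a minimal sketch under stated assumptions: the `Owner` and `Client` classes, the `unmarshal` and `release` methods, and the `"suspend"` return value are all hypothetical, and the dirty call is modelled as a direct method call rather than an RPC.

```python
NIL = object()  # placeholder entry: a surrogate is being created


class Owner:
    def __init__(self):
        self.dirty_sets = {}  # wirerep -> set of client ids

    def dirty(self, wirerep, client_id):
        # Step 3: add the client to the dirty set, then return (the ack).
        self.dirty_sets.setdefault(wirerep, set()).add(client_id)
        return "ack"

    def clean(self, wirerep, client_id):
        self.dirty_sets[wirerep].discard(client_id)


class Client:
    def __init__(self, client_id, owner):
        self.client_id = client_id
        self.owner = owner
        self.object_table = {}

    def unmarshal(self, wirerep):
        entry = self.object_table.get(wirerep)
        if entry is NIL:
            # Step 1, NIL case: another thread is creating the surrogate.
            return "suspend"
        if entry is not None:
            # Step 1, present case: reuse the existing surrogate.
            return entry
        # Step 2: absent, so reserve the slot and make the dirty call.
        self.object_table[wirerep] = NIL
        self.owner.dirty(wirerep, self.client_id)
        # Step 4: the dirty call has returned; create the surrogate.
        surrogate = ("surrogate", wirerep)
        self.object_table[wirerep] = surrogate
        return surrogate

    def release(self, wirerep):
        # Step 5: delete the surrogate and make the clean call.
        del self.object_table[wirerep]
        self.owner.clean(wirerep, self.client_id)


owner = Owner()
q = Client("Q", owner)
first = q.unmarshal("w(o)")
second = q.unmarshal("w(o)")
dirty_after_unmarshal = set(owner.dirty_sets["w(o)"])
q.release("w(o)")
dirty_after_release = set(owner.dirty_sets["w(o)"])
print(dirty_after_unmarshal, dirty_after_release)
```

The NIL placeholder matters only when several threads unmarshal the same wireRep concurrently, which this sequential sketch does not exercise.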
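Dirty calls
[Figure: A sends a copy of the reference to B; B sends dirty to the owner, whose dirty set grows from {A} to {A,B}; an ack is returned.]
Log
• keep ref on stack
• copy
• dirty
• ack
• remove from stack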
Weaknesses
Tightly bound to RPC
• Acknowledgement mechanism.
Implementation specific
• Assumes method invocation pushes arguments onto stack;
• Unique surrogate per process (object-listing)
Under-specified
• Critical sections
• Race conditions
• Other scenarios
Informal proof
• Depends on hard-to-formalise aspects (e.g. stack)
Our contribution
Novel graphical notation.
Formalisation.
Discovered requirement for pivotal new states.
Proof.
New graphical notation
Intuitive.
Precise.
Uniformity of ‘direction’ of transitions.
‘Obvious’ where transitions are needed.
Lifecycle of references
[Figure: state diagram of the lifecycle of a reference in a process. State labels: nil, OK, ccit, ccitnil. Transition labels: "Receive reference and note the source" (twice); "dirty_ack from Owner, send copy_ack to Sender"; "GC unreachable"; "send copy"; "rcv Ack".]
Obvious where transitions are needed
• E.g. Receive reference at state ccit.
• ccitnil critical for correctness.
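One way to read the lifecycle diagram is as a small state machine. The table below is only a sketch: the state names nil, OK, ccit and ccitnil come from the slide, but the `absent` state, the event names and the exact set of transitions are assumptions inferred from the figure labels, not a transcription of the authors' diagram.

```python
ABSENT = "absent"  # hypothetical name for "no entry for this reference"

# Assumed transition table: (state, event) -> next state.
TRANSITIONS = {
    (ABSENT, "receive_copy"): "nil",       # receive reference, note the source
    ("nil", "dirty_ack"): "OK",            # dirty_ack from owner
    ("OK", "receive_copy"): "OK",          # already usable
    ("OK", "gc_unreachable"): "ccit",      # local GC found it unreachable
    ("ccit", "receive_copy"): "ccitnil",   # reference arrives again
    ("ccit", "clean_ack"): ABSENT,
    ("ccitnil", "clean_ack"): "nil",
}


def step(state, event):
    """Return the next state, or raise if no transition is defined."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def run(events, state=ABSENT):
    for event in events:
        state = step(state, event)
    return state


# A reference is received, becomes usable, is dropped, and is then
# received again before the clean has been acknowledged.
reached = run(["receive_copy", "dirty_ack", "gc_unreachable", "receive_copy"])
final = run(["clean_ack", "dirty_ack"], state=reached)
print(reached, final)
```

Writing the transitions as a table makes a missing case visible: without an entry for receiving a reference in state ccit, `step` would simply fail, which mirrors the slide's point that the notation makes it obvious where transitions are needed.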
Slicing
Fault tolerance
Slicing
• Owner is aware we have a reference.
• Owner is not aware we have a reference.
[Figure: the lifecycle states nil, ccit, ccitnil and OK, with sliced variants labelled ccit_l, ccit_u, ccitnil_l, ccitnil_u, nil_l and nil_u.]
Benefits
Intuitive – fault-tolerant version literally encapsulates failure-free version.
Identify precisely when failures can be detected.
Define states reached after failures detected.
Remedial actions.
Formalisation
Abstract machine
• Processes communicating by asynchronous message passing.
• Atomic transitions involve 1 process at a time.
Receipt of message changes only a process's internal state
• Trigger sending of another message?
• Store some info in a to-do table?
Benefits
Inputs and outputs desynchronised.
Size of critical sections explicit and minimised.
Asynchronous outputs (e.g. a background daemon processes the to-do tables).
Suitable for mechanical proof.
Formalisation
Rule format: name: guard { pseudo-statements }
make_copy(p1,p2,r):
  p1 ≠ p2 ∧ receive_T(p1,r)=OK ∧ locallyReachable(p1,r)
{
  id := new Identifier;
  dirty_T(p1,r) := dirty_T(p1,r) ∪ {<p1,p2,id>};
  post(p1, p2, copy(r,id));
}
Slide annotations: name (make_copy), guard (the condition line), table (the dirty_T update), message (the post).
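A guarded rule of this shape can be mimicked directly in code. The following is a sketch only: the dictionaries standing in for receive_T, dirty_T and the channels, the `locally_reachable` set and the integer identifiers are illustrative assumptions, and the guard conjuncts (including p1 ≠ p2) follow a reading of the slide whose logical symbols were lost in extraction.

```python
import itertools
from collections import defaultdict

fresh_ids = itertools.count()     # stand-in for "new Identifier"

receive_T = {}                    # (process, ref) -> state
dirty_T = defaultdict(set)        # (process, ref) -> set of entries
channels = defaultdict(list)      # (sender, receiver) -> bag of messages
locally_reachable = set()         # (process, ref) pairs; hypothetical stand-in


def make_copy(p1, p2, r):
    """Fire the make_copy rule if its guard holds; return True if fired."""
    guard = (
        p1 != p2
        and receive_T.get((p1, r)) == "OK"
        and (p1, r) in locally_reachable
    )
    if not guard:
        return False
    ident = next(fresh_ids)
    # Table update: record the temporary entry at the sender.
    dirty_T[(p1, r)].add((p1, p2, ident))
    # Message: post copy(r, id) on the channel from p1 to p2.
    channels[(p1, p2)].append(("copy", r, ident))
    return True


receive_T[("A", "r")] = "OK"
locally_reachable.add(("A", "r"))

fired = make_copy("A", "B", "r")          # guard holds
self_copy = make_copy("A", "A", "r")      # fails p1 != p2
no_reference = make_copy("B", "A", "r")   # B is not in state OK
print(fired, self_copy, no_reference)
```

Keeping the guard and the body separate, as the rule format does, makes each transition's critical section explicit: everything inside `make_copy` after the guard is meant to happen atomically in one process.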
More formally
Tables defined as functions whose first argument is a process.
Channels are bags of messages between pairs of processes.
A configuration of the abstract machine is a tuple of all tables and message channels.
Pseudo-statements act as configuration transformers:
• Given a configuration <…,table_T,…,k>,
• table_T(a0,…,an) := V denotes <…,table_T',…,k> where
  table_T'(x0,…,xn) = table_T(x0,…,xn) if (x0,…,xn) ≠ (a0,…,an)
  table_T'(a0,…,an) = V
• post(p1,p2,m) denotes <…,table_T,…,k'> where
  k'(p1,p2) = k(p1,p2) ∪ {m} (bag union)
  k'(pi,pj) = k(pi,pj), for (pi,pj) ≠ (p1,p2)
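The idea of pseudo-statements as configuration transformers can be sketched with pure functions that return a new configuration instead of mutating the old one. The dictionary layout (`"tables"`, `"channels"`) and the function names `assign` and `post` are assumptions chosen for illustration; channels are modelled as `collections.Counter` bags.

```python
import copy
from collections import Counter


def assign(config, table, args, value):
    """table(args) := value, as a transformer: every other entry is unchanged."""
    new = copy.deepcopy(config)
    new["tables"][table][args] = value
    return new


def post(config, p1, p2, message):
    """Add one occurrence of message to the bag for channel (p1, p2)."""
    new = copy.deepcopy(config)
    new["channels"].setdefault((p1, p2), Counter())[message] += 1
    return new


c0 = {"tables": {"receive_T": {}}, "channels": {}}
c1 = assign(c0, "receive_T", ("p", "r"), "OK")
c2 = post(c1, "p", "q", "copy")
c3 = post(c2, "p", "q", "copy")   # bags may hold duplicates

print(c0["tables"]["receive_T"], c3["channels"][("p", "q")]["copy"])
```

Because each transformer leaves its input untouched, a proof by case analysis can compare the configuration before and after a rule fires, which is the style of reasoning the invariants rely on.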
Proof style
Safety & Liveness
Invariance-based proof
• Induction on length of transitions.
• Case analysis of transitions.
• Termination measure.
Benefits
• Systematic.
• Less error prone than temporal reasoning.
– E.g. establishing fine details such as mutual exclusivity is complicated in a formalism based on temporal reasoning.
Example proof
Lemma: For any processes p1, p2, for any reference r, for any identifier id and for any configuration, the following implication holds:
If <p1,p2,id> ∈ dirty_T(p1,r) then receive_T(p1,r) = OK
Proof: In the initial configuration, dirty tables are empty and the implication trivially holds. We consider the four rules that add/remove entries to/from dirty tables and that modify the content of receive tables to/from OK.
• make_copy(p1,p2,r): make_copy adds an entry <p1,p2,id>, and its guard ensures that the receive-table is in the OK state.
• …
Key Lemmas
Safety Lemma 3: Unusable Reference
For any process p1, for any reference r and for any configuration, the following implication holds:
If receive_T(p1,r)=nil ∨ receive_T(p1,r)=ccitnil, then there exists p such that p ∈ dirty_T(owner(r),r) or there exist p, id such that <owner(r),p,id> ∈ dirty_T(owner(r),r).
Safety Lemma 2: Reference in Transit
For any processes p1, p2, for any reference r, for any identifier id and for any configuration, the following implication holds:
If copy(r,id) ∈ k(p1,p2), then p1 ∈ dirty_T(owner(r),r), if p1 ≠ owner(r); or <owner(r),p2,id> ∈ dirty_T(owner(r),r), if p1 = owner(r).
Safety Lemma 1: Usable Reference
For any processes p1 and p2, for any reference r with p1=owner(r) and p1≠p2, and for any configuration, the following implication holds:
If receive_T(p1,r)=OK, then p1 ∈ dirty_T(p2,r).
Dirty-table entries of the form p are permanent; entries of the form <owner(r),p,id> are temporary.
Birrell’s algorithm is Safe
A DGC algorithm is safe if the collector cannot reclaim live objects. For Birrell's algorithm, there must be an entry in the owner's dirty table for every live object.
The proof follows directly from the 3 safety lemmas.
Birrell's Safety Requirement For all references r, and for all processes p1 and p2 and all identifiers id,
If receive_T(p1,r)=OK ∨ receive_T(p1,r)=nil ∨ receive_T(p1,r)=ccitnil ∨ copy(r,id) ∈ k(p1,p2),
then there exists p such that p ∈ dirty_T(owner(r),r) or there exist p, id such that <owner(r),p,id> ∈ dirty_T(owner(r),r).
Liveness
Liveness guarantees that if all references to an object are deleted, the owner’s dirty table will eventually become empty.
To prove this,
• We show that whenever there’s a message in a channel, a transition can be fired to consume it.
• We introduce a termination measure on the configurations that shows how far the abstract machine is from completing, and show that DGC transitions cause this measure to decrease.
• Hence all transition paths terminate.
Termination measures
termination_measure(c) = tab_measure + Σ msg_measure(m) + Σ rt_measure(receive_T(p,r))
• tab_measure: size of tables
• Σ msg_measure(m): over messages m between pairs of processes
• Σ rt_measure(receive_T(p,r)): over states of references r in processes p
tab_measure = 9|dirty_call_todo_T| + 7|dirty_ack_todo_T| + 2|copy_ack_todo_T| + 2|clean_ack_todo_T| + 2|blocked_T|
and
rt_measure(OK) = 5
rt_measure(ccitnil) = 2
rt_measure(ccit) = 1
rt_measure(nil) = 1
rt_measure(⊥) = 0
msg_measure(copy) = 14
msg_measure(dirty) = 8
msg_measure(dirtyack) = 6
msg_measure(clean) = 3
msg_measure(copyack) = 1
msg_measure(cleanack) = 1
Values chosen "arbitrarily".
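To make the measure concrete, the sketch below evaluates it for a small, made-up configuration using the weights listed on the slide. The function signature and the way a configuration is summarised (table sizes, a list of in-flight message kinds, a list of receive-table states) are assumptions for illustration; `None` stands for the state with measure 0.

```python
TAB_WEIGHTS = {
    "dirty_call_todo_T": 9,
    "dirty_ack_todo_T": 7,
    "copy_ack_todo_T": 2,
    "clean_ack_todo_T": 2,
    "blocked_T": 2,
}
MSG_MEASURE = {
    "copy": 14, "dirty": 8, "dirtyack": 6,
    "clean": 3, "copyack": 1, "cleanack": 1,
}
RT_MEASURE = {"OK": 5, "ccitnil": 2, "ccit": 1, "nil": 1, None: 0}


def termination_measure(table_sizes, messages, receive_states):
    """Weighted table sizes + in-flight messages + reference states."""
    tab = sum(TAB_WEIGHTS[name] * size for name, size in table_sizes.items())
    msg = sum(MSG_MEASURE[kind] for kind in messages)
    rt = sum(RT_MEASURE[state] for state in receive_states)
    return tab + msg + rt


# Hypothetical configuration: one blocked entry, a copy and a dirty
# message in flight, one reference in state OK and one in state nil.
measure = termination_measure(
    table_sizes={"blocked_T": 1},
    messages=["copy", "dirty"],
    receive_states=["OK", "nil"],
)
print(measure)  # 2*1 + (14 + 8) + (5 + 1) = 30
```

A liveness argument in this style then checks, rule by rule, that firing a transition yields a configuration with a strictly smaller value.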
Example: receive_dirty_ack
receive_dirty_ack(p1,p2,r):
  dirtyack(r) ∈ k(p1,p2)
{
  receive(p1,p2,dirtyack(r)); // -6
  copyack_todo_T(p2) := copyack_todo_T(p2) ∪ blocked_T(p2,r); // -X
  // Deserialisation code to be resumed for each entry in blocked_T(p2,r)
  blocked_T(p2,r) := ∅; // -X
  receive_T(p2,r) := OK; // +5
}
Thus, the termination measure decreases by 1 (Δmeasure = -1).
Optimisations
FIFO channels
• Less synchronisation needed
• Fewer messages: no clean_ack
• Fewer tables.
Sender is owner
• No need for dirty_call and copy_ack
• But need message ordering to avoid races
Receiver is owner
• Fewer dirty table entries
• Again need message ordering
Future work
Convince ourselves of appropriateness of Birrell’s remedial actions.
Correctness proof of fault-free version.
Explore applicability of our techniques
• Graphical notation
• Proof-techniques
• Generality
Auto-generation of code from formalism.
Conclusion
Intuitive graphical notation.
Formal, implementation-independent specification and proof of a widely used algorithm.
Discovered weaknesses in original presentation.
A widely applicable technique?
Questions?
FINIS