Garbage Collection in the Next C++ Standard Hans-J. Boehm, Mike Spertus, Symantec


Post on 26-Mar-2015



Garbage Collection in the Next C++ Standard

Hans-J. Boehm,

Mike Spertus, Symantec


The Context (1)

• Conservative garbage collection for C and C++ has been used for 20+ years.

– Usually works, possibly with a small amount of tweaking.
– Especially for 64-bit applications.

• More attractive with multi-core processors.
– Explicit memory management gets harder with threads.
– Some parallel programming techniques much more difficult/expensive without GC.
– GC parallelizes better than malloc/free.

• GC-based leak detectors are also common.
• One major limiting factor:

– C and C++ standards don’t fully sanction garbage collecting implementations.

– Programmers are hesitant to use nonstandard tools.


The Context (2)

• C++ standard is undergoing revision.
• “C++0x” expected somewhere near 2010 or 2011.
– Initial committee draft was put out for review.

• Many other new features:
– “Concepts” (templates type-checked in isolation).
– Threads support (threads API, memory model, atomics).

• Struggling with object lifetime issues.
– Library-based classic reference counting (shared_ptr).
– R-value references (references to otherwise inaccessible values) support low-cost shared_ptr moves.

• Microsoft’s C++/CLI provides a separate garbage-collected heap.


Our Goal

• “Transparent” garbage collection.
– Ordinary pointers; works with existing library code.
– Supports
    • Code designed for GC
    • Leak detection
    • “Litter collection”

– Supports atomic pointers with cheap assignment.


Our Proposal, version 1

• GC support in the implementation “mandatory”.
• GC use optional, but must be consistent across application.
– If you have to trace a section of the heap, you might as well collect it.
• Program sections specify “gc_forbidden”, “gc_required”, or “gc_safe” (default).
– Linker diagnoses conflicts.

• Annotations can specify when integral types may contain pointers.

• This proposal is currently on hold, not in CD.


Issues with original proposal (1)

• gc_required / gc_forbidden must be consistent for whole program:
– Too coarse.
– Need to deal with plug-ins with limited interface.


Issues with original proposal (2)

• Finalization is needed for interaction of GC with explicit resource management.

• Finalization is problematic in the presence of dead variable elimination.

    class C {
        int indx;   // E[indx] contains associated data.
                    // Finalizer cleans up E[indx].
        void foo() {
            int i = indx;
            // `this` is dead here.
            // May be finalized?
            bar(E[i]);
        }
    };


Our proposal, version 2

• Minimal compromise proposal
– Garbage collected implementations are allowed, not required.
• Officially allows collection of memory allocated with built-in operator new.
– malloc() is arguably in the domain of the C committee.
– malloc() garbage collection may be harder to retrofit.
• Not intended as long term replacement for proposal 1.
• In current Committee Draft.


Proposal 2 components

1. Allow unreachable objects to be reclaimed.

2. Provide a simple API to
    • Explicitly prevent reclamation of specified objects (declare_reachable()).
    • Declare that certain objects do not need to be traced because they contain no pointers (declare_no_pointers()).


Reclamation of unreachable objects in C++

• Existing conservative collectors reclaim objects not reachable via pointer chains from variables.

• Leak detectors make similar assumptions.

• But current standard does not guarantee that unreachable objects are dead.

    intptr_t q = ~(intptr_t)p;
    p = 0;
    p = (foo *)(~q);
    … *p …

• Disallow this!
• Unavoidably a compatibility issue.
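The hidden-pointer fragment on this slide can be made compilable as the sketch below. The names `foo::value`, `hide`, `reveal`, and `round_trip_demo` are illustrative assumptions, not part of the talk. In a non-collecting implementation the read through the recovered pointer succeeds; under a collector that reclaims unreachable objects, the object could be reclaimed while only the complemented integer exists.

```cpp
#include <cassert>
#include <cstdint>

struct foo { int value; };  // hypothetical payload type

// Hide a pointer as its bitwise complement, so no variable holds a
// recognizable pointer to the object.
inline std::intptr_t hide(foo* p) {
    return ~reinterpret_cast<std::intptr_t>(p);
}

// Undo hide(): complement again and convert back to a pointer.
inline foo* reveal(std::intptr_t q) {
    return reinterpret_cast<foo*>(~q);
}

// Returns the value read through the recovered pointer.
inline int round_trip_demo() {
    foo* p = new foo{42};
    std::intptr_t q = hide(p);
    p = nullptr;        // the object is now unreachable via pointer chains
    p = reveal(q);      // a collector running before this line could have freed it
    int v = p->value;   // works without GC; this is the pattern to disallow
    delete p;
    return v;
}
```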


This isn’t as easy as it looks …

• Initial attempt:
– Objects that were once unreachable may not be dereferenced (incl. deallocation).

• Insufficient:

    intptr_t q = ~(intptr_t)p;
    foo *r = (foo *)(~q);
    p = 0;
    … *r …


A better formulation

• Only safely-derived pointers may be dereferenced.
• A safely-derived pointer was computed without intervening integer arithmetic from another safely-derived pointer.
• Safely-derived pointers may only be stored in
– pointer objects.
– integer objects of sufficient size.
– aligned character arrays.
• Whether a value is safely derived depends on how it was computed, not on the bits representing the pointer.
– Sometimes p safely derived, r not, but p == r.

• Draft standard contains a precise inductive definition. Thanks to Clark Nelson (Intel).
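A minimal sketch of the "p == r" case, assuming a hypothetical type `foo` and helper name: `p` comes straight from `new`, while `r` is rebuilt through integer arithmetic. The two compare equal, yet only `p` counts as safely derived under the formulation above.

```cpp
#include <cassert>
#include <cstdint>

struct foo { int value; };  // hypothetical payload type

// Returns true when a pointer rebuilt through integer arithmetic compares
// equal to the pointer it was computed from.
inline bool same_bits_different_derivation() {
    foo* p = new foo{7};                                     // safely derived: result of new
    std::intptr_t q = ~reinterpret_cast<std::intptr_t>(p);   // integer arithmetic intervenes
    foo* r = reinterpret_cast<foo*>(~q);                     // same bits, not safely derived
    bool equal = (p == r);
    delete p;                                                // deallocate via the safely-derived pointer
    return equal;
}
```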


API addition 1

• declare_reachable() / undeclare_reachable() allow a pointer to be dereferenced even if it is not safely-derived.
– No-ops in non-GC implementation.
– Allow old code to be retrofitted.

• undeclare_reachable() returns safely derived copy of pointer.


declare_reachable() example

    declare_reachable(p);
    intptr_t q = ~(intptr_t)p;
    p = 0;
    p = undeclare_reachable((foo *)(~q));
    … *p …
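The example can be tried on any compiler with the sketch below. The two functions in namespace `sketch` are stand-ins written for this illustration (an assumption, not a library API); they model the non-GC case, where both calls are no-ops and undeclare_reachable() simply hands the pointer back. The type `foo` and the function name are likewise hypothetical.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {
// Stand-ins modelling a non-GC implementation: both calls do nothing.
inline void declare_reachable(void*) {}

template <class T>
T* undeclare_reachable(T* p) { return p; }  // returns a usable copy of p
}  // namespace sketch

struct foo { int value; };  // hypothetical payload type

// Hides a pointer across a region where it is not visible as a pointer,
// bracketed by declare_reachable / undeclare_reachable.
inline int hidden_pointer_with_declare() {
    foo* p = new foo{5};
    sketch::declare_reachable(p);                           // keep *p alive while hidden
    std::intptr_t q = ~reinterpret_cast<std::intptr_t>(p);
    p = nullptr;
    p = sketch::undeclare_reachable(reinterpret_cast<foo*>(~q));
    int v = p->value;                                       // allowed: p came from undeclare_reachable
    delete p;
    return v;
}
```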


Implementation Challenges

• Implemented as global GC-visible multiset representation, but:
– declare_reachable() applies to complete objects. undeclare_reachable() argument need not match exactly.

– Matching calls don’t need to come from the same thread: Scalability with thread/processor count.
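One way to picture the multiset is the sketch below. It is an illustration under stated assumptions, not the prototype's data structure: `ReachableRegistry` and `note_allocation` are hypothetical, the latter standing in for the collector's own knowledge of object bounds. A single mutex keeps it correct across threads, which is exactly the scalability concern the slide raises.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>

// Counts declare_reachable calls per complete object; interior pointers
// are mapped back to the object that contains them.
class ReachableRegistry {
public:
    // Hypothetical hook: record an object's bounds (a real GC knows these).
    void note_allocation(void* base, std::size_t size) {
        std::lock_guard<std::mutex> lock(mu_);
        objects_[reinterpret_cast<std::uintptr_t>(base)] = Entry{size, 0};
    }
    // Returns false if p does not point into a known object.
    bool declare_reachable(void* p) {
        std::lock_guard<std::mutex> lock(mu_);
        Entry* e = find(p);
        if (e == nullptr) return false;
        ++e->count;
        return true;
    }
    // p may be any pointer into the object; returns false if nothing to undo.
    bool undeclare_reachable(void* p) {
        std::lock_guard<std::mutex> lock(mu_);
        Entry* e = find(p);
        if (e == nullptr || e->count == 0) return false;
        --e->count;
        return true;
    }
    std::size_t count(void* p) {
        std::lock_guard<std::mutex> lock(mu_);
        Entry* e = find(p);
        return e == nullptr ? 0 : e->count;
    }

private:
    struct Entry { std::size_t size; std::size_t count; };

    // Finds the object whose range [base, base + size) contains p.
    Entry* find(void* p) {
        const auto addr = reinterpret_cast<std::uintptr_t>(p);
        auto it = objects_.upper_bound(addr);
        if (it == objects_.begin()) return nullptr;
        --it;
        if (addr - it->first >= it->second.size) return nullptr;
        return &it->second;
    }

    std::mutex mu_;
    std::map<std::uintptr_t, Entry> objects_;
};
```

The per-object counter gives multiset behaviour: two declare calls need two undeclare calls, and the undeclare argument only has to land inside the same object.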


API Addition 2

• declare_no_pointers(p,n) / undeclare_no_pointers(p,n) declares the address range [p, p+n) to not hold pointers; safely derived pointers may not be stored there.

• Allows the programmer to specify more “type” information.

• Much more compatible with C++ constructor/destructor model than allocation-time specifications.

• Can be applied to static/stack/heap objects.
• undeclare_no_pointers() must be called before explicit deallocation.


declare_no_pointers() example

    class foo {
        foo * next;
        char cmprsd[N];
    public:
        foo() { … declare_no_pointers(cmprsd, N); }
        ~foo() { … undeclare_no_pointers(cmprsd, N); }
        …
    };
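A compilable variant of the class is sketched below. The functions in namespace `sketch` are stand-ins written for this illustration (an assumption); they only count live registrations so the constructor/destructor pairing can be observed. The value of `N` and the `successor()` accessor are also hypothetical.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {
// Stand-ins that only count currently registered ranges.
inline std::size_t& registered_ranges() {
    static std::size_t n = 0;
    return n;
}
inline void declare_no_pointers(char*, std::size_t) { ++registered_ranges(); }
inline void undeclare_no_pointers(char*, std::size_t) { --registered_ranges(); }
}  // namespace sketch

constexpr std::size_t N = 64;  // assumed buffer size

class foo {
    foo* next = nullptr;
    char cmprsd[N] = {};  // compressed data: never holds pointers
public:
    foo() { sketch::declare_no_pointers(cmprsd, N); }
    ~foo() { sketch::undeclare_no_pointers(cmprsd, N); }
    // Copying would duplicate the buffer without registering it, so forbid it here.
    foo(const foo&) = delete;
    foo& operator=(const foo&) = delete;
    foo* successor() const { return next; }
};
```

Tying the calls to the constructor and destructor means the range is undeclared before the storage goes away, for stack, static, and heap objects alike.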


Implementation Challenges

• Efficient handling for frequently constructed stack objects.

• Scalability.


Prototype Implementation

• Currently just track registered ranges.
– Processing deferred to GC time.

• Keep a small number of ranges in a thread-local data structure.

• Very small ranges and smaller objects are currently ignored.
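The bullets above could be realized roughly as follows. This is a guess at the shape of such a structure, not the prototype's code: the capacity `kLocalCapacity`, the threshold `kMinRange`, and the overflow to a mutex-protected global list are all assumptions made for the sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

namespace sketch {

struct Range { char* start; std::size_t len; };

constexpr std::size_t kLocalCapacity = 4;  // assumed: ranges kept per thread
constexpr std::size_t kMinRange = 32;      // assumed: smaller ranges are ignored

struct LocalRanges {
    std::array<Range, kLocalCapacity> slots{};
    std::size_t used = 0;
};

// Per-thread storage: the common case needs no locking.
inline LocalRanges& local_ranges() {
    thread_local LocalRanges r;
    return r;
}

inline std::mutex& overflow_mutex() {
    static std::mutex m;
    return m;
}

// Assumed fallback once the thread-local slots are full.
inline std::vector<Range>& overflow_ranges() {
    static std::vector<Range> v;
    return v;
}

// Only records the range; any real processing is deferred to GC time.
// Returns false when the range is too small to be worth tracking.
inline bool declare_no_pointers(char* p, std::size_t n) {
    if (n < kMinRange) return false;
    LocalRanges& local = local_ranges();
    if (local.used < kLocalCapacity) {
        local.slots[local.used++] = Range{p, n};
        return true;
    }
    std::lock_guard<std::mutex> lock(overflow_mutex());
    overflow_ranges().push_back(Range{p, n});
    return true;
}

}  // namespace sketch
```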


Preliminary Performance Measurements

[Figure: processor nsecs/op-pair (vertical axis) vs. threads (horizontal axis)]


Conclusions

• Current C++0x draft explicitly allows garbage-collected implementations.

• Support APIs differ from existing implementations.
– For good reasons, we think.

• New set of implementation challenges.
• More extensive GC support will be considered after C++0x.
• Not too late for comments.


Questions?