free-me: a static analysis for automatic individual object reclamation samuel z. guyer, kathryn...

Post on 03-Jan-2016

219 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Free-Me: A Static Analysis for Automatic Individual Object

Reclamation Samuel Z. Guyer, Kathryn McKinley,

Daniel Frampton

Presented by: Dimitris Prountzos

Some slides adapted from the original PLDI talk

Motivation Automatic memory reclamation (GC)

No need for explicit “free” Garbage collector reclaims memory Eliminates many programming errors

Problem: when do we get memory back? Frequent GCs:

Reclaim memory quickly (minimize memory footprint), with high overhead

Infrequent GCs: Lower overhead, but lots of garbage in memory

Can we combine the software engineering advantage of garbage collection with the

low-cost incremental reclamation of explicit memory management ?

Can we combine the software engineering advantage of garbage collection with the

low-cost incremental reclamation of explicit memory management ?

Example

Notice: String idName is often garbageMemory:

void parse(InputStream stream) { while (not_done) { String idName = stream.readToken(); Identifier id = symbolTable.lookup(idName); if (id == null) { id = new Identifier(idName); symbolTable.add(idName, id); } computeOn(id);}}

Read a token (new String)

Look up insymbol table

If not there, create new identifier, add

to symbol tableCompute on

identifier

Explicit Reclamation as the solution

Garbage does not accumulateMemory:

void parse(InputStream stream) { while (not_done) { String idName = stream.readToken(); Identifier id = symbolTable.lookup(idName) if (id == null) { id = new Identifier(idName); symbolTable.add(idName, id); } else free(idName); computeOn(id);}} String idName is garbage,

free immediately

FreeMe as the solution Adds free() automatically

FreeMe compiler pass inserts calls to free() Preserve software engineering benefits

Can’t determine lifetimes for all objects Works with the garbage collector Implementation of free() depends on collector

Goal: Incremental, “eager” memory reclamation Results: reduce GC load, improve performance

Potential: 1.7X performancemalloc/free vs GC

in tight heaps(Hertz & Berger, OOPSLA 2005)

Outline

• Motivation• Analysis• Runtime support• Experimental evaluation• Conclusions

FreeMe Analysis Goal:

Determine when an object becomes unreachable

Not a whole-program analysis*

Idea: pointer analysis + liveness Pointer analysis for reachability Liveness analysis for when

Within a method, for allocation site “p = new A” where can we place a call to “free(p)”?

I’ll describe the interprocedural

parts later

Pointer Analysis

idName

symbolTable

readTokenString

Identifier

(global)

id

String idName = stream.readToken();Identifier id = symbolTable.lookup(idName);if (id == null) { id = new Identifier(idName); symbolTable.add(idName, id);}computeOn(id);

Connectivity graph Variables Allocation sites Globals (statics)

Analysis algorithm Flow-insensitive, field-insensitive

Pointer Analysis in more depth (1) Data Structures

void function(A p1, A p2, A p3)

{

v1 = new O

p1.f = v1

p2 = p1.f

p3 = p1

}

Pointer Analysis in more depth (2)Calculating the Points-To relation

p1p1

Np1Np1

NI1NI1

p2p2

Np2Np2

NI2NI2

p3p3

Np3Np3

NI3NI3

∀i,PtsTo(pi) = {NPi }

∀ i,PtsTo(NPi ) = {N I i }

∀ i,PtsTo(N I i ) = {N I i }

v1v1

O1O1

∀n ∈ PtsTo(p1) :PtsTo(n)∪= PtsTo(v1)

PtsTo(p2)∪= PtsTo * (p1)

PtsTo(p3)∪= PtsTo(p1)

PtsTo(p1) = {NP1 ,O1}

PtsTo(p2) = {NP1 ,O1,N I1 ,NP 2}

PtsTo(p3) = {NP1 ,NP3}

Interprocedural component Detection of factory methods

Return value is a new object Can be freed by the caller

Effects of methods called

Describes how parameters are connected

Compilation strategy: Summaries pre-computed for all methods Free-me only applied to hot methods

String idName = stream.readToken();

symbolTable.add(idName, id);

Hashtable.add: (0 → 1) (0 → 2)

Generating summaries in more depth

PtsTo(p1) = {NP1 ,O1}

PtsTo(p2) = {NP1 ,O1,N I1 ,NP 2}

PtsTo(p3) = {NP1 ,NP3}

(p2, p1),(p2,*p1)

(p3, p1)

p1p1

Np1Np1

NI1NI1

p2p2

Np2Np2

NI2NI2

p3p3

Np3Np3

NI3NI3

v1v1

O1O1

void function(A p1, A p2, A p3){ v1 = new O p1.f = v1 p2 = p1.f p3 = p1 }

Getfield is needed because a single pointer link in summary may represent multiple pointers in the callee

Getfield is needed because a single pointer link in summary may represent multiple pointers in the callee

The need for liveness analysis

• Liveness identifies when objects become unreachable, not just whether or not they escape

idName

symbolTable

readTokenString

Identifier

(global)

id

Adding Liveness Key :

idName

readTokenString

Identifier(global)

An object is reachable only when all incoming pointers are live

From a variable: Live range of the variable

From a global: Live from the pointer store onward

Live from the pointer store until source object becomes unreachable

From other object:

Reachability is union of all these live ranges

Liveness Analysis Computed as sets of edges

Variables

Heappointers

String idName = stream.readToken();

id = new Identifier(idName);

computeOn(id);

if (id == null)

Identifier id = symbolTable.lookup(idName);

symbolTable.add(idName, id);

idName

(global)

readTokenString

Identifier

Where can we free it? Where object

exists

-minus-

Where reachable

String idName = stream.readToken();

id = new Identifier(idName);

computeOn(id);

if (id == null)

Identifier id = symbolTable.lookup(idName);

symbolTable.add(idName, id);

readTokenString

Compiler inserts call to free(idName)

Free placement issues

• Select earliest point A, eliminate all B: A dom B

• Deal with double free’s

Limitations & Extensions• Free-me frees short-lived objects with explicit

pointers to them in the code

• Factory variability

• Potential extensions:– Container internals

• interprocedural liveness & pointer analysis || shape analysis– Large data structures

• Global context sensitive heap model • interprocedural liveness & pointer analysis

Runtime support for FreeMe Run-time: depends on collector

Mark/sweepFree-list: free() operation

Generational mark/sweepUnbump: move nursery “bump pointer” backward (LIFO frees)

Unreserve: reduce copy reserve Very low overhead Run longer without collecting

Size to free defined statically/dynamically (query object)

Experimental Evaluation

Volume freed – in MB

100%

50%

0% compress

105

jess

263

raytrace

91

mtrt

98

javac

183

jack

271

pseudojbb

180

xalan

8195antlr

1544716

bloat

fop

103

hsqldb

515

jython

348

pmd

822

ps

523

db

74

SPEC benchmarks DaCapo benchmarks

Increasing alloc size Increasing alloc size

Volume freed – in MB

100%

50%

0% compress

105

jess

263

raytrace

91

mtrt

98

javac

183

jack

271

pseudojbb

180

xalan

8195antlr

1544716

bloat

fop

103

hsqldb

515

jython

348

pmd

822

ps

523

db

74

016

7373

24

163

34 1607

673

22230

57

75

278

22

45

FreeMeMean: 32%

Comparing FreeMe & other approaches

Stack-like • free() allocations of same method• Restrict free instrumentation to end of method • No factory methods• No conditional freeing

Uncond • Prove objects dead on all paths • Influence of free on some paths

Mark/sweep – time

20%

15%

6%

All benchmarks

Mark/sweep – GC time

All benchmarks30%

9%

GenMS – time

All benchmarks

Brings into question all techniques that target short-lived objects

GenMS – GC time

Why doesn’t this help?

Note: the number of GCs is greatly reduced

FreeMe mostly finds short-lived objects

All benchmarks

Nursery reclaims dead objects for free

(cost ~ survivors)

Bloat – GC time

12%

Conclusions• FreeMe analysis

– Finds many objects to free: often 30% - 60%– Most are short-lived objects

• GC + explicit free()– Advantage over stack/region allocation: no need to make

decision at allocation time

• Generational collectors– Nursery works very well

• Mark-sweep collectors– 50% to 200% speedup– Works better as memory gets tighter

Embedded applications:Compile-ahead

Memory constrainedNon-moving collectors

Discussion

• Are the percentages of freed memory good enough ?

• Is compile-time memory management inherently incompatible with generational copying collection?

• Abandon techniques that replace nursery?• How close is the actual gap between compiler

analysis + ‘weaker’ GC aglorithms vs. generational collection

top related