
Post on 21-Jan-2016


Incrementalized Pointer and Escape Analysis

Martin Rinard, MIT LCS

Context

• Unsafe languages (C, C++, …)
• Safe languages (Java, ML, Scheme, …)
• Big difference: garbage collection
  • Advantages
    • No dangling references
    • No memory leaks
  • Disadvantages
    • Collection overhead
    • Collection pauses

Goal: analyze program to safely allocate objects on the call stack instead of in garbage-collected heap

Program With Allocation Sites

[Figure: program sketch with the procedures compute(d,e), multiplyAdd(a,b,c), multiply(m), add(u,v), main(i,j), evaluate(i,j), abs(r) and scale(n,m), repeated across animation frames.]

• When program runs, it allocates objects in heap
• When objects become unreachable, garbage collector (eventually) reclaims memory

Stack Allocation

[Figure: program sketch with the procedures compute, multiplyAdd, multiply, add, main, evaluate, abs and scale, repeated across animation frames.]

• Correlate lifetimes of objects with lifetimes of procedures
• Allocate object on activation record of procedure
• When procedure returns, object automatically deallocated

Problem

• How does compiler determine if it is safe to allocate objects on stack?
• Classic problem in program analysis
• Standard approach
  • Analyze whole program
  • Use information to find “captured” objects
  • Allocate captured objects on stack
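The difference between a captured object and one that escapes can be illustrated with a small Java sketch. The classes `Point` and `Lifetimes` are hypothetical and not from the talk; whether a given compiler really places the first object on the stack depends on its analysis.

```java
// Hypothetical example classes; not taken from the talk.
class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

class Lifetimes {
    static Point last; // a global reference: anything stored here escapes

    // The Point is reachable only from the local variable p, so its lifetime
    // is bounded by this call: a candidate for stack allocation.
    static int captured() {
        Point p = new Point(3, 4);
        return p.x + p.y;
    }

    // The Point is stored in a static field and returned, so it outlives
    // the call and has to stay in the heap.
    static Point escaping() {
        Point p = new Point(1, 2);
        last = p;
        return p;
    }
}
```

In `captured()` the object dies with the activation record; in `escaping()` the caller and the static field can still reach it after the return.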

Stack Allocation

[Figure: bar chart, Percentage of Memory Allocated on Stack (axis 0 to 100) for barnes, water, jlex, db, raytrace and compress. Whole-Program Analysis.]

Normalized Execution Times

[Figure: bar chart, normalized execution times (axis 0 to 100) for barnes, water, jlex, db, raytrace and compress. Whole-Program Analysis. Reference: execution time without optimization.]

Analysis Times

[Figure: bar chart, Analysis Time (seconds) (axis 0 to 150) for barnes, water, jlex, db, raytrace and compress. Whole-Program Analysis. The values 223 and 645 are printed on the chart, presumably for bars that exceed the axis.]

Key Observation Number One:

Most optimizations require only the analysis of a small part of program surrounding the object allocation site

Key Observation Number Two:

Most of the optimization benefit comes from a small percentage of the allocation sites

[Figure: program sketch with the procedures compute, multiplyAdd, multiply, add, main, evaluate, abs and scale. 99% of objects allocated at these two sites.]

Intuition for Better Analysis

• Locate important allocation sites

• Use demand-driven approach to analyze region surrounding site

• Somehow avoid sinking analysis resources into unprofitable sites

[Figure: program sketch with the procedures compute, multiplyAdd, multiply, add, main, evaluate, abs and scale. 99% of objects allocated at these two sites.]

Example

Employee Database Example

Traverse database, extract max salary

Name        Salary
John Doe    $45,000
Ben Bit     $30,000
Jane Roe    $55,000

(records stored in a Vector)

max salary = $55,000
highest paid = Jane Roe

Coding Max Computation (in Java)

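import java.util.Enumeration;
import java.util.Vector;

class EmployeeDatabase {
    Vector database = new Vector();
    Employee highestPaid;

    void computeMax() {
        int max = 0;
        // `enum` is a reserved word since Java 5; rename it to compile with a current javac.
        Enumeration enum = database.elements();
        while (enum.hasMoreElements()) {
            // A raw Enumeration returns Object, so the element must be cast.
            Employee e = (Employee) enum.nextElement();
            if (max < e.salary()) {
                max = e.salary();
                highestPaid = e;
            }
        }
    }
}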


Would like to allocate enum object on stack, not on the heap

Whole Program Analysis

Bottom Up and Compositional

[Figure: the example program as a set of procedures: computeMax(), elements(), elementData(), hasMoreElements(), printStatistics(), nextElement() and salary(). Legend: currently analyzed procedure; currently analyzed part of the program.]

Points-to Graph in Example

[Figure: the computeMax() code shown next to its points-to graph. Labels in the graph: this, database, vector, elementData, [ ], enum, e and highestPaid.]

Escape Information

• Escaped nodes
  • parameter nodes
  • returned nodes
  • nodes reachable from other escaped nodes
• Captured is the opposite of escaped

[Figure: points-to graph for computeMax with the labels this, database, vector, elementData, [ ], enum, e and highestPaid. green = escaped, white = captured]
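The escape rule on this slide amounts to a reachability computation over the points-to graph. Below is a minimal Java sketch of that idea, assuming a hypothetical `PointsToGraph` class with string node names; it is not the Flex implementation. The edges in `exampleGraph()` are a guess at the figure, with the enumeration object referenced only by a local variable.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical points-to graph: nodes are named objects, edges are references.
class PointsToGraph {
    private final Map<String, List<String>> edges = new HashMap<>();
    private final Set<String> roots = new HashSet<>(); // parameter and returned nodes

    void addNode(String node) {
        edges.computeIfAbsent(node, k -> new ArrayList<>());
    }

    void addEdge(String from, String to) {
        addNode(from);
        addNode(to);
        edges.get(from).add(to);
    }

    void markEscapeRoot(String node) {
        addNode(node);
        roots.add(node);
    }

    // A node escapes if it is a root or is reachable from an escaped node.
    Set<String> escaped() {
        Set<String> seen = new HashSet<>(roots);
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            for (String next : edges.get(work.pop())) {
                if (seen.add(next)) {
                    work.push(next);
                }
            }
        }
        return seen;
    }

    // Captured is the opposite of escaped.
    boolean isCaptured(String node) {
        return !escaped().contains(node);
    }
}

class EscapeDemo {
    // Assumed shape of the computeMax graph; the edge set is illustrative.
    static PointsToGraph exampleGraph() {
        PointsToGraph g = new PointsToGraph();
        g.markEscapeRoot("this");            // the receiver is a parameter node
        g.addEdge("this", "vector");         // this.database
        g.addEdge("vector", "array");        // vector.elementData
        g.addEdge("array", "employee");      // array elements
        g.addEdge("this", "employee");       // this.highestPaid
        g.addEdge("enumeration", "vector");  // the enumeration refers to the vector
        return g;
    }
}
```

With these assumed edges, `exampleGraph().isCaptured("enumeration")` is true: the enumeration points at escaped nodes, but no escaped node points at it.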

Stack Allocation Optimization

• Examine graph from end of procedure
• If a node is captured in this graph
  • Allocate corresponding objects on stack (may need to inline procedures to apply optimization)

[Figure: points-to graph for computeMax with the labels this, database, vector, elementData, [ ], enum, e and highestPaid. green = escaped, white = captured]

Can allocate enum object on stack

Whole Program Analysis

[Figure: the example program (computeMax(), elements(), elementData(), hasMoreElements(), printStatistics(), nextElement(), salary()), repeated across eight animation frames.]

Incrementalized Analysis

Incrementalized Analysis Requirements

Must be able to
• Analyze procedure independently of callers
  • Whole-program analysis is compositional
  • Already does this
• Skip analysis of invoked procedures
• But later incrementally integrate analysis results if desirable to do so

First Extension to Whole-Program Analysis

• Skip the analysis of invoked procedures
• Parameters are marked as escaping into skipped call site

Assume analysis skips enum.nextElement()

[Figure: points-to graph for computeMax with the labels this, database, vector, enum, e and highestPaid, plus a node numbered 1. Node 1 escapes into enum.nextElement()]

First Extension Almost Works

• Can skip analysis of invoked procedures
• If allocation site is captured, great!
• If not, escape information tells you what procedures you should have analyzed…

Should have analyzed enum.nextElement()

[Figure: points-to graph for computeMax with the labels this, database, vector, enum, e and highestPaid, plus a node numbered 1. Node 1 escapes into enum.nextElement()]

Second Extension to Base Algorithm

• Record enough information to undo skip and incorporate analysis into existing result
  • Parameter mapping at call site
  • Ordering information for call sites
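One way to picture the bookkeeping described here is a record per skipped call that stores the parameter mapping and the call's position in the caller. The Java sketch below is a heavily simplified assumption: the names `SkippedCall` and `CallerResult` are hypothetical, and the callee's result is reduced to a set of formals that escape, whereas the real analysis merges points-to graphs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical record kept for each call whose analysis was skipped.
class SkippedCall {
    final String callee;
    final int order;                    // position among the caller's call sites
    final Map<String, String> paramMap; // callee formal -> caller node

    SkippedCall(String callee, int order, Map<String, String> paramMap) {
        this.callee = callee;
        this.order = order;
        this.paramMap = paramMap;
    }
}

// Hypothetical per-caller analysis result.
class CallerResult {
    private final List<SkippedCall> skipped = new ArrayList<>();
    private final Set<String> escaped = new HashSet<>();
    private int nextOrder = 0;

    // Skip a call: its arguments now escape into the skipped call site.
    SkippedCall skip(String callee, Map<String, String> paramMap) {
        SkippedCall call = new SkippedCall(callee, nextOrder++, paramMap);
        skipped.add(call);
        return call;
    }

    // Callees of the skipped call sites that a node escapes into, in call order.
    List<String> escapesInto(String node) {
        List<String> callees = new ArrayList<>();
        for (SkippedCall call : skipped) {
            if (call.paramMap.containsValue(node)) {
                callees.add(call.callee);
            }
        }
        return callees;
    }

    // Undo a skip: only formals that escape inside the callee keep their node escaped.
    void integrate(SkippedCall call, Set<String> escapingFormals) {
        skipped.remove(call);
        for (String formal : escapingFormals) {
            String node = call.paramMap.get(formal);
            if (node != null) {
                escaped.add(node);
            }
        }
    }

    boolean isCaptured(String node) {
        return !escaped.contains(node) && escapesInto(node).isEmpty();
    }
}
```

In this sketch, skipping `hasMoreElements` and `nextElement` with the same receiver node leaves that node uncaptured until both skips are integrated and neither callee lets its receiver escape.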

Incrementalized Analysis

[Figure: the example program (computeMax(), elements(), elementData(), hasMoreElements(), printStatistics(), nextElement(), salary()), repeated across animation frames with the following captions.]

• Attempt to stack allocate Enumeration object from elements
• Analyze elements (intraprocedurally)
  • Escapes only into the caller
• Analyze computeMax (intraprocedurally)
  • Escapes to hasMoreElements and nextElement
• Analyze hasMoreElements and nextElement (intraprocedurally)
• Incorporate results
• Enumeration object captured in computeMax
  • Inline elements
  • Stack allocate enumeration object
• We skipped the analysis of some procedures
• We ignored some other procedures

Result

• We can incrementally analyze
  • Only what is needed
  • For whatever allocation site we want
  • And even temporarily suspend analysis part of the way through!

New Issue

But…
• Lots of analysis opportunities
• Not all opportunities are profitable
• Where to invest analysis resources?
• How many resources to invest?

Analysis Policy

Formulate policy as solution to an investment problem

Goal: Maximize optimization payoff from invested analysis resources

Analysis Policy Implementation

• For each allocation site, estimate marginal return on invested analysis resources
• Loop
  • Invest a unit of analysis resources (time) in site that offers best return
    • Expand analyzed region surrounding site
  • When unit expires, recompute marginal returns (best site may change)
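The loop above can be sketched in Java as a greedy investment over sites. The interface `AnalysisSite`, the class `AnalysisPolicy` and the toy model in `ToySite` are assumptions made for illustration; in particular, treating one unit as the analysis of exactly one skipped call site is a simplification.

```java
import java.util.List;

// Hypothetical view of an allocation site as seen by the policy.
interface AnalysisSite {
    double marginalReturn(); // current estimate of payoff per unit of analysis
    void investUnit();       // expand the analyzed region around the site
    boolean resolved();      // true once no further analysis is useful
}

class AnalysisPolicy {
    // Spends up to budgetUnits and returns the number of units invested.
    static int run(List<? extends AnalysisSite> sites, int budgetUnits) {
        int spent = 0;
        while (spent < budgetUnits) {
            AnalysisSite best = null;
            // Estimates are re-read every round because the best site may change.
            for (AnalysisSite s : sites) {
                if (s.resolved()) {
                    continue;
                }
                if (best == null || s.marginalReturn() > best.marginalReturn()) {
                    best = s;
                }
            }
            if (best == null) {
                break; // nothing left to analyze
            }
            best.investUnit();
            spent++;
        }
        return spent;
    }
}

// Toy model: the return is objects per remaining skipped call site.
class ToySite implements AnalysisSite {
    final double objects;
    int skippedCalls;

    ToySite(double objects, int skippedCalls) {
        this.objects = objects;
        this.skippedCalls = skippedCalls;
    }

    public double marginalReturn() { return objects / skippedCalls; }
    public void investUnit() { skippedCalls--; }
    public boolean resolved() { return skippedCalls == 0; }
}
```

With a budget of two units, a toy site with 990 objects and two skipped calls absorbs the whole budget before a site with 10 objects receives any.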

Marginal Return Estimate

\[ \frac{N \cdot P(d)}{C \cdot T} \]

N = Number of objects allocated at the site
P(d) = Probability of capturing the site, knowing we explored a region of call depth d
C = Number of skipped call sites the allocation site escapes through
T = Average time needed to analyze a call site
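As a worked instance of the estimate, the Java sketch below evaluates N · P(d) / (C · T) for one site. The class name `MarginalReturn` and the input numbers are made up for illustration; they are not measurements from the talk.

```java
// Hypothetical helper that evaluates the marginal return estimate.
class MarginalReturn {
    // n: objects allocated at the site
    // p: P(d), estimated probability of capturing the site at call depth d
    // c: skipped call sites the allocation site escapes through
    // t: average time (seconds) needed to analyze a call site
    static double estimate(double n, double p, int c, double t) {
        if (c <= 0 || t <= 0) {
            throw new IllegalArgumentException("c and t must be positive");
        }
        return (n * p) / (c * t);
    }

    public static void main(String[] args) {
        // Illustrative inputs: 1000 objects, P(d) = 0.5, 2 skipped calls, 0.25 s each.
        System.out.println(estimate(1000, 0.5, 2, 0.25)); // prints 1000.0
    }
}
```

Halving C or T doubles the estimate, which is why sites that escape through few, cheap call sites are analyzed first.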

As analysis resources are invested
• explore larger regions around allocation sites
• get more information about sites
• marginal return estimates improve
• analysis makes better investment decisions!

Usage Scenarios

• Ahead of time compiler
  • Give algorithm an analysis budget
  • Algorithm spends budget
  • Takes whatever optimizations it uncovered
• Dynamic compiler
  • Algorithm acquires analysis budget as a percentage of run time
  • Periodically spends budget, delivers additional optimizations
  • Longer program runs, more optimizations
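For the dynamic compiler scenario, the budget can be pictured as an account that grows with run time and is drawn down in analysis units. The Java sketch below assumes a hypothetical `AnalysisBudget` class; the fraction and the unit size are illustrative parameters, not values from the talk.

```java
// Hypothetical budget account for a dynamic compiler.
class AnalysisBudget {
    private final double fraction;      // share of run time granted to analysis
    private double availableMillis = 0; // budget accumulated so far

    AnalysisBudget(double fraction) {
        this.fraction = fraction;
    }

    // Called periodically with the run time elapsed since the last call.
    void recordRunTime(double elapsedMillis) {
        availableMillis += elapsedMillis * fraction;
    }

    // Withdraws as many whole units as the budget covers and returns the count.
    int withdrawUnits(double unitMillis) {
        int units = (int) (availableMillis / unitMillis);
        availableMillis -= units * unitMillis;
        return units;
    }
}
```

With a fraction of 0.25 and 100 ms units, one second of run time buys two units and leaves 50 ms in the account, so longer runs keep delivering more analysis.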

Experimental Results

Methodology

• Implemented analysis in MIT Flex System
• Obtained several benchmarks
  • Scientific computations: barnes, water
  • Our lexical analyzer: jlex
  • Spec benchmarks: db, raytrace, compress

Analysis Times

[Figure: bar chart, Analysis Time (seconds) (axis 0 to 150) for barnes, water, jlex, db, raytrace and compress, comparing Incrementalized Analysis with Whole-Program Analysis. The labels 223, 645 and jdk 1.2 also appear on the chart.]

Stack Allocation

[Figure: bar chart, Percentage of Memory Allocated on Stack (axis 0 to 100) for barnes, water, jlex, db, raytrace and compress, comparing Incrementalized Analysis with Whole-Program Analysis.]

Normalized Execution Times

[Figure: bar chart, normalized execution times (axis 0 to 100) for barnes, water, jlex, db, raytrace and compress, comparing Incrementalized Analysis with Whole-Program Analysis. Reference: execution time without optimization.]

Other Analyses and Uses

• Whole-program pointer and escape analysis (OOPSLA 1999)
  • Stack allocation
  • Synchronization elimination
• Pointer and escape analysis for multithreaded programs (PLDI 1999, PPoPP 2001)
  • Correct use of region-based allocation
  • Elimination of dynamic region checks
  • Synchronization elimination
• Memory bank disambiguation (RAW, DeepC)
• Foundation for other analyses
  • Bitwidth analysis (MIT and CMU)
  • Symbolic accessed region analysis

Other Analyses and Uses

• Symbolic analysis of accessed regions of memory blocks (PPoPP 1999, PLDI 2000)
  • Automatic parallelization of sequential divide and conquer programs
  • Data race freedom for parallel divide and conquer programs
  • Absence of array bounds violations for both
• Bitwidth analysis

Future Directions

Information from designers and developers

• Type systems for program properties
  • Automatic parallelization and data race freedom for divide and conquer programs (CC 2001)
  • Data race freedom for OO programs (OOPSLA 2001)
• Application-specific design properties
  • Shapes of recursive data structures
  • Object role transitions and constraints
• Distributed systems, component-based systems
  • Interaction patterns
  • Failure propagation properties
  • Global view of system
