Shape Analysis by Graph Decomposition

R. Manevich, M. Sagiv

Tel Aviv University

G. Ramalingam

MSR India

J. Berdine, B. Cook

MSR Cambridge

Motivation

Challenge: precise and efficient shape analyses
  Prove properties of dynamically allocated linked data structures
Observation: often many correlations are irrelevant for proving shape properties
Our approach: develop a flexible abstraction that takes advantage of this

Example program – 2 lists

[Figure: two singly-linked lists, one from h1 to t1 and one from h2 to t2]

// @assume h1!=null && h1==t1 && h1.n==null &&
//         h2!=null && h2==t2 && h2.n==null
//
// @loop_invariant Reach(h1,t1) &&
//                 Reach(h2,t2) &&
//                 DisjointLists(h1,h2)
EnqueueEvents() {
  L1: while (...) {
    // Allocate a new cell and append it to one of the two lists
    List temp = new List(getEvent());
    if (nondet()) {
      t1.n = temp;
      t1 = temp;
    } else {
      t2.n = temp;
      t2 = temp;
    }
  }
}

The correlation between the two lists is irrelevant for proving the loop invariant

Abstract states – full heaps [VMCAI’05]

[Figure: grid of shape graphs over h1, t1, h2, t2, indexed on each axis by list size (size=1, size=2, size>2); list segments are labeled 1 or >1]

Graph decomposition

[Figure: the grid of full-heap shape graphs over h1, t1, h2, t2, with segments labeled 1 or >1]

Graph decomposition

Connected components by undirected reachability

[Figure: a shape graph over h1, t1, h2, t2 is decomposed into connected component 1 and connected component 2]
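A minimal sketch of decomposition by undirected reachability, assuming a hypothetical encoding in which heap nodes are integers and each n-pointer is a directed edge. The class and method names are illustrative and are not taken from the authors' prototype; union-find is one of several ways to compute the components.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical encoding: nodes are 0..n-1, edges[i] = {source, target} of an n-pointer.
class ComponentDecomposition {
    // Partition the nodes into connected components, ignoring edge direction.
    static List<Set<Integer>> decompose(int n, int[][] edges) {
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        for (int[] e : edges) union(parent, e[0], e[1]);

        // Group nodes by the representative of their component.
        Map<Integer, Set<Integer>> byRoot = new TreeMap<>();
        for (int i = 0; i < n; i++)
            byRoot.computeIfAbsent(find(parent, i), r -> new TreeSet<>()).add(i);
        return new ArrayList<>(byRoot.values());
    }

    static int find(int[] p, int x) {
        while (p[x] != x) {
            p[x] = p[p[x]]; // path halving
            x = p[x];
        }
        return x;
    }

    static void union(int[] p, int a, int b) {
        p[find(p, a)] = find(p, b);
    }

    public static void main(String[] args) {
        // Nodes 0 -> 1 model the list h1..t1, nodes 2 -> 3 model the list h2..t2.
        System.out.println(decompose(4, new int[][] {{0, 1}, {2, 3}}));
    }
}
```

Each returned node set corresponds to one subgraph; in the two-list example the lists never share a cell, so they always fall into separate components.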

Abstract states – decomposed heaps

[Figure: three shape subgraphs for the list h1..t1 and three for the list h2..t2, with segments labeled 1 or >1]

For k lists:
  full heap abstraction generates 3^k abstract states
  decomposed heap abstraction generates 3×k abstract states

The coarser abstraction is precise enough to prove the invariant but generates fewer states
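The gap between the two counts can be tabulated with a few lines of Java. This is only an arithmetic illustration, assuming (as in the running example) that each list has exactly three abstract shapes; the class name is hypothetical.

```java
// Arithmetic illustration only: 3 abstract shapes per list is an assumption
// taken from the running example (size=1, size=2, size>2).
class StateCount {
    // Full heaps: one shape graph per combination of per-list shapes, i.e. 3^k.
    static long fullHeapStates(int k) {
        long n = 1;
        for (int i = 0; i < k; i++) n *= 3;
        return n;
    }

    // Decomposed heaps: one subgraph per list and per shape, i.e. 3*k.
    static long decomposedStates(int k) {
        return 3L * k;
    }

    public static void main(String[] args) {
        for (int k = 1; k <= 5; k++)
            System.out.println(k + " lists: " + fullHeapStates(k)
                + " shape graphs vs " + decomposedStates(k) + " subgraphs");
    }
}
```

For k = 2 this gives 9 shape graphs against 6 subgraphs; for k = 5 it is already 243 against 15.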

Overall view

[Figure: concrete heaps, shape graphs, and shape subgraphs over h1, t1, h2, t2, connected by the FH and GD abstraction and concretization maps]

Concrete domain: concrete heaps
Full heaps domain: shape graphs
Decomposed heaps domain: shape subgraphs

Shape graphs track ALL correlations
Shape subgraphs track SOME correlations

Main results

New abstraction for shape analysis reduces exponential factors by:
  Connected component decomposition
  Abstracting away null-value correlations
Sound and sufficiently precise transformers
  Most precise transformers are FNP-complete
  Polynomial time efficient transformers
  Sufficiently precise
Implementation and empirical results
  Sufficiently precise on set of benchmarks, including Windows device driver models
  State space/time reduced by factor of 33/212

Outline

Full heap abstraction [VMCAI’05]
  Reference abstraction
Further abstraction by decomposition
  Connected component decomposition
  Abstracting away null-value correlations (details in paper)
Abstract transformers
  Concretization by composition
Experimental results

Full heap abstraction [VMCAI’05]

Abstraction for singly-linked lists
Basic concepts:
  Interruptions (bounded number of)
  Uninterrupted list segments (bounded number of)
Abstraction keeps interruptions and abstracts segment lengths to {1,>1}
Result is a shape graph

[Figure: a concrete heap with variables x and y is mapped by $\beta_{FH}$ to a shape graph whose segments are labeled 1 or >1]

$\alpha_{FH}$ is obtained from $\beta_{FH}$ by point-wise extension
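The length abstraction to {1,>1} can be sketched as follows. This is a simplified illustration, not the VMCAI’05 definition: it assumes an acyclic list stored in a hypothetical `next` array, and takes the set of interruptions as a given boolean array rather than computing it.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration: assumes an acyclic list; all names are hypothetical.
class SegmentAbstraction {
    // A segment of exactly one edge stays "1"; anything longer collapses to ">1".
    static String abstractLength(int edges) {
        return edges == 1 ? "1" : ">1";
    }

    // next[i] is the successor of node i, or -1 for null.
    // interruption[i] marks the nodes that the abstraction keeps.
    // Returns the abstract lengths of the segments between consecutive interruptions.
    static List<String> abstractList(int head, int[] next, boolean[] interruption) {
        List<String> labels = new ArrayList<>();
        int edges = 0;
        for (int node = next[head]; node != -1; node = next[node]) {
            edges++;
            if (interruption[node]) {
                labels.add(abstractLength(edges));
                edges = 0;
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        // x -> a -> b -> y: only the first and last cells are interruptions.
        int[] next = {1, 2, 3, -1};
        boolean[] interruption = {true, false, false, true};
        System.out.println(abstractList(0, next, interruption)); // [>1]
    }
}
```

Because every segment length maps to one of two labels and the number of interruptions is bounded, the number of distinct shape graphs is finite.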

Graph decomposition abstraction

Abstraction of shape graphs
  Further abstraction over shape graphs
Decouples connected components
  Intuitively different components = different logical data structures
Result = set of shape subgraphs

Connected components decomposition

[Figure: the GD abstraction maps shape graphs over h1, t1, h2, t2 to their connected components, each a shape subgraph over h1, t1 or over h2, t2]

Concretization $\gamma_{GD}$

[Figure: overall view of the concrete heaps, shape graphs, and shape subgraphs domains, connected by the FH and GD maps]

Abstracting correlations

[Figure: shape graphs over h1, t1, h2, t2, the shape subgraphs obtained from them by the GD abstraction, and the shape graphs obtained by recomposing those subgraphs]

Abstract transformers

Need transformers for program statements
  x=new List()
  x=null
  x=y
  x=y.n
  x.n=y
  assume(x!=y)
  assume(x==y)
  …

Abstract transformers outline

Induced transformers by concretization (from subgraphs and shape graphs)
  Problem: concretization introduces exponential space blow-up
Most precise transformers by partial concretization
  Avoids exponential space blow-up
  Requires oracle to test strong feasibility
  Strong feasibility test NP-complete
Conservative transformers
  Give up on strong feasibility test
  Avoids exponential time blow-up

Most precise transformer [CC’77]

[Figure: a statement st applied at the concrete level and at the abstract level, with the FH and GD maps connecting the domains]

Problem: concretization takes exponential space in the worst case

Partial concretization

Compose weakly-feasible subgraphs
  Subgraphs that do not share any variables
Compose only subgraphs in footprint of statement
  Compose at most 2 or 3 subgraphs

[Figure: shape subgraphs over h1, t1 composed with shape subgraphs over h2, t2]
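Partial concretization can be sketched under a deliberately coarse model in which a subgraph is just a textual shape plus the set of variables it contains. The `Subgraph` class, the method names, and the example shapes are all hypothetical; real shape subgraphs carry nodes and edges that this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class PartialComposition {
    // Hypothetical model: a subgraph is a textual shape plus the variables it contains.
    static final class Subgraph {
        final String shape;
        final Set<String> vars;
        Subgraph(String shape, String... vars) {
            this.shape = shape;
            this.vars = new TreeSet<>(Arrays.asList(vars));
        }
    }

    // Weak feasibility in the sense of the slide: no shared variables.
    static boolean weaklyFeasible(Subgraph a, Subgraph b) {
        return Collections.disjoint(a.vars, b.vars);
    }

    // Keep only the subgraphs that mention a variable of the statement.
    static List<Subgraph> footprint(List<Subgraph> all, Set<String> stmtVars) {
        List<Subgraph> out = new ArrayList<>();
        for (Subgraph g : all)
            if (!Collections.disjoint(g.vars, stmtVars)) out.add(g);
        return out;
    }

    // Compose every weakly-feasible pair drawn from the footprint.
    static List<Subgraph> composePairs(List<Subgraph> fp) {
        List<Subgraph> out = new ArrayList<>();
        for (int i = 0; i < fp.size(); i++)
            for (int j = i + 1; j < fp.size(); j++) {
                Subgraph a = fp.get(i), b = fp.get(j);
                if (!weaklyFeasible(a, b)) continue;
                Set<String> union = new TreeSet<>(a.vars);
                union.addAll(b.vars);
                out.add(new Subgraph(a.shape + " + " + b.shape, union.toArray(new String[0])));
            }
        return out;
    }

    static List<Subgraph> example() {
        return Arrays.asList(
            new Subgraph("temp", "temp"),
            new Subgraph("h1=t1", "h1", "t1"),
            new Subgraph("h1 -1-> t1", "h1", "t1"),
            new Subgraph("h2=t2", "h2", "t2"));
    }

    public static void main(String[] args) {
        // The statement t1.n = temp mentions t1 and temp.
        Set<String> stmtVars = new TreeSet<>(Arrays.asList("t1", "temp"));
        for (Subgraph g : composePairs(footprint(example(), stmtVars)))
            System.out.println(g.shape + " " + g.vars);
    }
}
```

In this example the subgraph over h2, t2 never enters the composition, which is what keeps the intermediate result small.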

Transformer example

[Figure: the statement t1.n = temp applied to shape subgraphs over temp, h1, t1 and over h2, t2]

Most precise transformer

Most precise requires strong feasibility test
  Check that subgraphs can be extended to include all variables

[Figure: subgraphs M1, …, M5 over the variables x, y, z, w]

Can we extend to have variable w?

Inconsistency: shared variable x

Inconsistency: shared variable y

Conclusion: can’t extend with w

M1 and M4 are weakly-feasible but not strongly-feasible in {M1,…,M5}

Strong feasibility is NP-complete
  Therefore the most precise transformer is FNP-complete
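A brute-force strong feasibility test can be sketched as a backtracking search, assuming subgraphs are modeled only by their variable sets. The instance in `pool()` is a hypothetical one chosen to reproduce the slide's conclusion (M1 and M4 cannot be extended with w); it is not guaranteed to be the exact instance drawn in the figure, and the search is exponential in the worst case, consistent with the NP-completeness claim.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StrongFeasibility {
    static Set<String> set(String... vars) {
        return new HashSet<>(Arrays.asList(vars));
    }

    // Hypothetical instance: M1..M5 given only by their variable sets.
    static List<Set<String>> pool() {
        List<Set<String>> p = new ArrayList<>();
        p.add(set("x", "z")); // M1
        p.add(set("w", "x")); // M2
        p.add(set("y", "w")); // M3
        p.add(set("y"));      // M4
        p.add(set("z"));      // M5
        return p;
    }

    // True if `chosen` can be extended with subgraphs from `pool` so that every
    // variable in allVars is covered and no variable is shared between subgraphs.
    static boolean stronglyFeasible(List<Set<String>> pool, Set<String> allVars,
                                    List<Set<String>> chosen) {
        Set<String> covered = new HashSet<>();
        for (Set<String> g : chosen) {
            if (!Collections.disjoint(covered, g)) return false; // not even weakly feasible
            covered.addAll(g);
        }
        return extend(pool, allVars, covered);
    }

    private static boolean extend(List<Set<String>> pool, Set<String> allVars,
                                  Set<String> covered) {
        if (covered.containsAll(allVars)) return true;
        // Pick one uncovered variable and try every disjoint subgraph that contains it.
        String missing = null;
        for (String v : allVars)
            if (!covered.contains(v)) { missing = v; break; }
        for (Set<String> g : pool) {
            if (!g.contains(missing) || !Collections.disjoint(covered, g)) continue;
            Set<String> next = new HashSet<>(covered);
            next.addAll(g);
            if (extend(pool, allVars, next)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<Set<String>> p = pool();
        Set<String> all = set("x", "y", "z", "w");
        // M1 and M4: w can only come from M2 (shares x) or M3 (shares y).
        System.out.println(stronglyFeasible(p, all, Arrays.asList(p.get(0), p.get(3))));
        // M2 and M5: M4 supplies the missing y without any clash.
        System.out.println(stronglyFeasible(p, all, Arrays.asList(p.get(1), p.get(4))));
    }
}
```

The conservative transformer avoids this search by checking a cheaper consistency condition, at the cost of possibly keeping subgraphs that no full shape graph contains.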

Making the transformers efficient

Vanilla transformer inefficient in practice
Incremental transformers
  Reuse results of previous iterations
  Details in paper
Engineering optimizations
  Avoid unnecessarily composing subgraphs
  …
Optimized transformers linear time in practice

Prototype implementation

Implemented in Java
Supports assertions
  assertReach(x,y)
  assertDisjointLists(x,y)
  assertAcyclicList(x)
  assertCyclicList(x)
  assert(x==y)
  assert(x!=y)
Check cleanness properties
  Absence of null derefs
  Absence of memory leaks
  No misuse of dangling pointers

Experiments – precision

Precision lost in just 2/21 benchmarks
  getLast
    Unable to prove x points to last cell
    Due to imprecise transformer
    Can be avoided by simple and efficient heuristics
  queue_2_stack
    Intentionally constructed
    Loss of correlations important to prove property
Same precision as full heap analysis on other benchmarks

Experiments – “standard” suite

Programs operating on 1-2 lists
  insert, delete, reverse, merge…
New analysis slightly less efficient
  But running times < 0.6 seconds so…

Experiments – multiple lists

[Figure: bar chart of the ratio (number of shape graphs / number of subgraphs) per benchmark; bar labels 1.4, 0.5, 12.0, 33.5, 2.4, 4.6, 11.6; the 11.6 bar is annotated (89,430 / 7,733)]

Experiments – multiple lists

[Figure: bar chart of the ratio (full shape graph analysis time / graph decomposition analysis time) per benchmark; bar labels 1.0, 0.5, 25.0, 95.0, 14.6, 21.7, 212.5; the 212.5 bar is annotated (552.6 / 2.6)]

Properties of the abstraction

No loss of precision when connected components represent completely independent lists
  Reduces state space exponentially
Loss of precision when mixing abstract states
  $\gamma_{GD}(X_1 \cup X_2) \supseteq \gamma_{GD}(X_1) \cup \gamma_{GD}(X_2)$
So where is this technique useful?

Related work

Partial isomorphism join [Manevich et al. SAS’04]
  Applied in more generic context but does not reduce exponential blow-ups addressed in this paper
Heap analysis by separation [Yahav et al. PLDI’04] [Hackett et al. POPL’05]
  Decompose verification problem itself and conservatively approximate contexts
Heap decomposition for interprocedural analysis [Rinetzky et al. POPL’05] [Rinetzky et al. SAS’05] [Gotsman et al. SAS’06] [Gotsman et al. PLDI’07]
  Decompose/compose at procedure boundaries
Predicate/variable clustering [Clark et al. CAV’00]
  Statically-determined decomposition

Conclusions

New abstraction scheme to control precision/cost trade-off for shape analyses
Efficient algorithms for abstract domain operations
  Abstraction
  Partial concretization
  Transformers
  …
Applicable beyond singly-linked lists
  E.g., class of graphs supported by Lev-Ami et al. [CAV’06]
    Doubly-linked lists
    Trees
    …

Ongoing work

Extension for concurrent program analysis
Future work:
  Tune abstraction by counterexample-guided refinement

Questions?

Conservative transformer

Computes a superset of the subgraphs computed by the most precise transformer
Algorithm sketch:
  Compose components in footprint of statement
  Apply local st on footprint and decompose result
  Test consistency instead of strong feasibility
  Pass other components as is
Time(st) polynomial in #vars in st
  x=null: linear
  x.n=y: quadratic
  assume(x==y): cubic

Concretization $\gamma_{GD}$

Maps sets of shape subgraphs to sets of full shape graphs
Mathematically: $\gamma_{GD}(XG) = \{G \mid \beta_{GD}(G) \subseteq XG\}$
Algorithmically: by composing weakly-feasible subgraphs
  Subgraphs that do not share any variables
  Full shape graph includes all program variables
