Efficient Field-Sensitive Pointer Analysis for C. David J. Pearce, Paul H.J. Kelly and Chris Hankin, Imperial College, London, UK. www.doc.ic.ac.uk/~djp1/

Page 1: Efficient Field-Sensitive Pointer Analysis for C

Efficient Field-Sensitive Pointer Analysis for C

David J. Pearce, Paul H.J. Kelly and Chris Hankin

Imperial College, London, UK

www.doc.ic.ac.uk/~djp1/

Page 2: Efficient Field-Sensitive Pointer Analysis for C

What is Pointer Analysis?

Determine pointer targets without running the program

What is flow-insensitive pointer analysis?
> One solution for all statements, so precision is lost
> This trades precision for efficiency
> This work considers flow-insensitive pointer analysis only

    int a,b,*p,*q = NULL;
    p = &a;
    if(…) q = p;   // p → {a,b}, q → {a,NULL}
    p = &b;

Page 3: Efficient Field-Sensitive Pointer Analysis for C

Pointer analysis via set-constraints

Generate set-constraints from program and solve them
> Use constraint graph for efficient solving

(program) (constraints)

    int a,b,c,*p,*q,*r;
    p = &a;          // p ⊇ { a }
    r = &b;          // r ⊇ { b }
    q = &c;          // q ⊇ { c }
    if(...) q = p;   // q ⊇ p
    else q = r;      // q ⊇ r

(constraint graph)

Initial solutions: p {a}, r {b}, q {c}
Edges: p → q, r → q
After solving: p {a}, r {b}, q {a,b,c}
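
The solving step on this slide can be sketched in C as follows. This is only an illustration under stated assumptions: points-to sets are stored as bitmasks, the variable numbering and the names `Graph`, `add_address_of`, `add_copy` and `solve` are hypothetical, and the loop is a naive iterate-until-fixpoint propagation rather than the efficient solver the talk refers to.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical numbering of the variables in the slide's example. */
enum { A, B, C, P, Q, R, NVARS };

typedef unsigned int PtsSet;      /* bit i set: "may point to variable i" */

typedef struct {
    PtsSet sol[NVARS];            /* current solution of each node */
    bool   edge[NVARS][NVARS];    /* edge[src][dst]: dst is a superset of src */
} Graph;

/* Constraint "x includes { target }": seed the solution of x. */
static void add_address_of(Graph *g, int x, int target) {
    assert(x >= 0 && x < NVARS && target >= 0 && target < NVARS);
    g->sol[x] |= 1u << target;
}

/* Constraint "dst includes src": add the edge src -> dst. */
static void add_copy(Graph *g, int dst, int src) {
    assert(dst >= 0 && dst < NVARS && src >= 0 && src < NVARS);
    g->edge[src][dst] = true;
}

/* Propagate solutions along edges until nothing changes. */
static void solve(Graph *g) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (int src = 0; src < NVARS; src++)
            for (int dst = 0; dst < NVARS; dst++)
                if (g->edge[src][dst] && (g->sol[src] & ~g->sol[dst])) {
                    g->sol[dst] |= g->sol[src];
                    changed = true;
                }
    }
}

/* Builds and solves the slide's example; returns the points-to set of q. */
static PtsSet solve_slide_example(void) {
    Graph g = {0};
    add_address_of(&g, P, A);   /* p = &a */
    add_address_of(&g, R, B);   /* r = &b */
    add_address_of(&g, Q, C);   /* q = &c */
    add_copy(&g, Q, P);         /* q = p  */
    add_copy(&g, Q, R);         /* q = r  */
    solve(&g);
    return g.sol[Q];
}
```

Calling `solve_slide_example()` from a `main` function yields a mask with the bits for a, b and c set, matching the solved graph above.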

Page 7: Efficient Field-Sensitive Pointer Analysis for C

Field-Sensitivity

How to deal with aggregate types?
> Standard approach treats them as single variables

    typedef struct { int *f1; int *f2; } t1;
    int a,b,*p,*q,*r;
    t1 x;
    p = &a;      // p ⊇ { a }
    q = &b;      // q ⊇ { b }
    x.f1 = p;    // x ⊇ p
    x.f2 = q;    // x ⊇ q
    r = x.f1;    // r ⊇ x

(constraint graph)

Initial solutions: p {a}, q {b}, x {}, r {}
Edges: p → x, q → x, x → r
After solving: p {a}, q {b}, x {a,b}, r {a,b}

Because both fields share the single node x, r picks up b even though only x.f1 is read.

Page 9: Efficient Field-Sensitive Pointer Analysis for C

Field-Sensitivity – A simple solution

Use a separate node per field for each aggregate
> Node “x” split in two

    typedef struct { int *f1; int *f2; } t1;
    int a,b,*p,*q,*r;
    t1 x;
    p = &a;      // p ⊇ { a }
    q = &b;      // q ⊇ { b }
    x.f1 = p;    // xf1 ⊇ p
    x.f2 = q;    // xf2 ⊇ q
    r = x.f1;    // r ⊇ xf1

(constraint graph)

Initial solutions: p {a}, q {b}, xf1 {}, xf2 {}, r {}
Edges: p → xf1, q → xf2, xf1 → r
After solving: p {a}, q {b}, xf1 {a}, xf2 {b}, r {a}
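
The difference between the two treatments can be reproduced with a small C sketch. It is a hedged illustration only: the node numbering, the bitmask representation and the function `points_to_of_r` are assumptions made for this example, and in the field-insensitive case the node `X_F1` simply stands in for the whole of x.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical node numbering: two targets, three pointers, two field nodes. */
enum { A, B, P, Q, R, X_F1, X_F2, NNODES };

typedef unsigned int PtsSet;      /* bit i set: "may point to node i" */

typedef struct {
    PtsSet sol[NNODES];
    bool   edge[NNODES][NNODES];  /* edge[src][dst]: dst is a superset of src */
} Graph;

static void add_edge(Graph *g, int src, int dst) {
    assert(src >= 0 && src < NNODES && dst >= 0 && dst < NNODES);
    g->edge[src][dst] = true;
}

/* Naive propagation to a fixpoint. */
static void solve(Graph *g) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (int src = 0; src < NNODES; src++)
            for (int dst = 0; dst < NNODES; dst++)
                if (g->edge[src][dst] && (g->sol[src] & ~g->sol[dst])) {
                    g->sol[dst] |= g->sol[src];
                    changed = true;
                }
    }
}

/* Solves the slide's example and returns the points-to set of r.
 * With field_sensitive == false, both fields collapse onto one node. */
static PtsSet points_to_of_r(bool field_sensitive) {
    Graph g = {0};
    int f1 = X_F1;
    int f2 = field_sensitive ? X_F2 : X_F1;
    g.sol[P] |= 1u << A;    /* p = &a    */
    g.sol[Q] |= 1u << B;    /* q = &b    */
    add_edge(&g, P, f1);    /* x.f1 = p  */
    add_edge(&g, Q, f2);    /* x.f2 = q  */
    add_edge(&g, f1, R);    /* r = x.f1  */
    solve(&g);
    return g.sol[R];
}
```

With `points_to_of_r(false)` the result contains both a and b, while `points_to_of_r(true)` contains only a, which mirrors the two solved graphs on the slides.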

Page 11: Efficient Field-Sensitive Pointer Analysis for C

Problem – can take address of field in C

System thus far has no mechanism for this

First idea – use string concatenation operator ||
> Works well for this example

    typedef struct { int *f1; int *f2; } t1;
    int **p;
    t1 x,*s;
    s = &x;          // s ⊇ { x }
    p = &(s->f2);    // p ⊇ (*s) || f2
                     // p ⊇ { x } || f2
                     // p ⊇ { xf2 }

Nodes: xf1 {..}, xf2 {..}

Page 14: Efficient Field-Sensitive Pointer Analysis for C

Problem – compatible types

First idea – use string concatenation operator ||
> Casting identical types except for field names
> Derivation same as before, but node xf2 no longer exists!

    typedef struct { int *f1; int *f2; } t1;
    typedef struct { int *f3; int *f4; } t2;
    int **p;
    t1 *s; t2 x;
    s = (t1*) &x;    // s ⊇ { x }
    p = &(s->f2);    // p ⊇ (*s) || f2
                     // p ⊇ { x } || f2
                     // p ⊇ { xf2 }

Nodes: xf3 {..}, xf4 {..}
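
The failure can be made concrete with a short C sketch of a name-based lookup. Everything here is an assumption for illustration: the node table, `find_node` and `concat_lookup` are hypothetical names, and the nodes are those created for the declaration `t2 x`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Nodes that exist for the example: x was declared with type t2. */
static const char *const nodes[] = { "p", "s", "xf3", "xf4" };

/* Returns the index of the node called name, or -1 if there is none. */
static int find_node(const char *name) {
    for (size_t i = 0; i < sizeof nodes / sizeof nodes[0]; i++)
        if (strcmp(nodes[i], name) == 0)
            return (int)i;
    return -1;
}

/* Models "{ var } || field": build the concatenated name and look it up. */
static int concat_lookup(const char *var, const char *field) {
    char name[32];
    int n = snprintf(name, sizeof name, "%s%s", var, field);
    assert(n >= 0 && (size_t)n < sizeof name);
    return find_node(name);
}
```

Here `concat_lookup("x", "f2")` returns -1 because the access goes through a `t1*` but the nodes were named after the fields of `t2`, whereas `concat_lookup("x", "f4")` finds the node that was actually meant.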

Page 16: Efficient Field-Sensitive Pointer Analysis for C

Field-Sensitivity – Our Solution

Our solution – map variables to integers
> Solution sets become integer sets
> Use integer addition to model taking address of field
> Address of aggregate modelled by address of its first field

    typedef struct { int *f1; int *f2; } t1;
    typedef struct { int *f3; int *f4; } t2;
    int **p;
    t1 *s; t2 x;
    s = (t1*) &x;    // s ⊇ { xf3 }
                     // s ⊇ { 2 }
    p = &(s->f2);    // p ⊇ s + 1
                     // p ⊇ { 2 } + 1
                     // p ⊇ { 3 }

Variable-to-integer mapping:

    Variable   p   s   xf3   xf4
    Integer    0   1   2     3
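
The integer encoding can be sketched in C as below. This is a minimal sketch, assuming integer sets are stored as bitmasks so that adding an offset k becomes a left shift by k; the names `IntSet`, `add_offset` and `solve_p` are hypothetical, and the two constraints are evaluated once in program order rather than by a general solver.

```c
#include <assert.h>

/* Variable-to-integer mapping taken from the slide. */
enum { P = 0, S = 1, X_F3 = 2, X_F4 = 3, NVARS = 4 };

typedef unsigned int IntSet;   /* bit i set: integer i is in the set */

/* Returns { v + k | v in set }, modelling the constraint term "set + k". */
static IntSet add_offset(IntSet set, unsigned k) {
    assert(k < NVARS);
    IntSet shifted = set << k;
    assert((shifted >> NVARS) == 0);   /* result must still name known variables */
    return shifted;
}

/* Evaluates the slide's two constraints and returns the solution of p. */
static IntSet solve_p(void) {
    IntSet sol[NVARS] = {0};
    sol[S] |= 1u << X_F3;              /* s = (t1*) &x;  s includes { xf3 } = { 2 } */
    sol[P] |= add_offset(sol[S], 1);   /* p = &(s->f2);  p includes s + 1           */
    return sol[P];
}
```

`solve_p()` returns a set containing only the integer 3, that is xf4. The field names of `t1` never enter the computation, only the position of `f2` within the struct, which is why the cast no longer causes a lookup failure.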

Page 19: Efficient Field-Sensitive Pointer Analysis for C

Experimental Study

    Benchmark (LOC)            Analysis            Time (s)   Avg Deref Size
    bash (55324 LOC)           Field-insensitive   0.51       543.0
                               Field-sensitive     0.53       86.7
    emacs (93151 LOC)          Field-insensitive   0.4        79.3
                               Field-sensitive     0.69       5.4
    sendmail (49053 LOC)       Field-insensitive   0.49       558.4
                               Field-sensitive     2.05       214.2
    named (75599 LOC)          Field-insensitive   30.0       2865.5
                               Field-sensitive     129.1      2167.7
    ghostscript (159853 LOC)   Field-insensitive   277.4      7703.1
                               Field-sensitive     2510.4     7365.2

Page 20: Efficient Field-Sensitive Pointer Analysis for C

Conclusion

Field-sensitive Pointer Analysis
> Presented new technique for C language
> Elegantly copes with language features
- Taking address of field
- Compatible types and casting
- Technique also handles function pointers without modification
> Experimental evaluation over 7 common C programs
- Considerable improvements in precision obtained
- But, much higher solving times
- And, relative gains appear to diminish with larger benchmarks

Page 21: Efficient Field-Sensitive Pointer Analysis for C

Constraint Graphs (continued)

What about statements involving a pointer dereference?
> Cannot be represented in the constraint graph
> Instead, add edges as solution of q becomes known
> Thus, computation similar to dynamic transitive closure

(program) (constraints)

    int a,*r,*s,**p,**q;
    p = &r;    // p ⊇ { r }
    s = &a;    // s ⊇ { a }
    q = p;     // q ⊇ p
    *q = s;    // *q ⊇ s, which gives r ⊇ s once q ⊇ { r } is known

(constraint graph)

Initial solutions: p {r}, s {a}, q {}, r {}
Initial edge: p → q
Step 1: propagating along p → q gives q {r}
Step 2: since r is now in the solution of q, the constraint *q ⊇ s adds the edge s → r
Step 3: propagating along s → r gives r {a}
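
The way edges are added during solving can be sketched in C as follows. This is a hedged illustration, not the algorithm from the talk: the `store` table, the variable numbering and the function names are assumptions, only constraints of the form *q ⊇ s are handled, and the loop is a naive fixpoint rather than an efficient dynamic transitive closure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical numbering of the variables in the slide's example. */
enum { A, R, S, P, Q, NVARS };

typedef unsigned int PtsSet;       /* bit i set: "may point to variable i" */

typedef struct {
    PtsSet sol[NVARS];
    bool   edge[NVARS][NVARS];     /* edge[src][dst]: dst is a superset of src  */
    bool   store[NVARS][NVARS];    /* store[ptr][src]: *ptr is a superset of src */
} Graph;

/* Records the constraint "*ptr includes src". */
static void add_store(Graph *g, int ptr, int src) {
    assert(ptr >= 0 && ptr < NVARS && src >= 0 && src < NVARS);
    g->store[ptr][src] = true;
}

static void solve(Graph *g) {
    bool changed = true;
    while (changed) {
        changed = false;
        /* For every known target t of ptr, "*ptr includes src" becomes src -> t. */
        for (int ptr = 0; ptr < NVARS; ptr++)
            for (int src = 0; src < NVARS; src++) {
                if (!g->store[ptr][src])
                    continue;
                for (int t = 0; t < NVARS; t++)
                    if (((g->sol[ptr] >> t) & 1u) && !g->edge[src][t]) {
                        g->edge[src][t] = true;
                        changed = true;
                    }
            }
        /* Propagate solutions along all edges present so far. */
        for (int src = 0; src < NVARS; src++)
            for (int dst = 0; dst < NVARS; dst++)
                if (g->edge[src][dst] && (g->sol[src] & ~g->sol[dst])) {
                    g->sol[dst] |= g->sol[src];
                    changed = true;
                }
    }
}

/* Builds and solves the slide's example; returns the points-to set of r. */
static PtsSet solve_deref_example(void) {
    Graph g = {0};
    g.sol[P] |= 1u << R;     /* p = &r */
    g.sol[S] |= 1u << A;     /* s = &a */
    g.edge[P][Q] = true;     /* q = p  */
    add_store(&g, Q, S);     /* *q = s */
    solve(&g);
    return g.sol[R];
}
```

`solve_deref_example()` returns a set containing only a. The edge s → r does not exist when solving starts and appears only after q has received r.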