textbook: principles of program analysis chapter 4msagiv/courses/apl17/forward11.pdf · iterative...
TRANSCRIPT
1
Iterative Program Analysis
Abstract Interpretation
Mooly Sagiv
Textbook: Principles of Program Analysis
Chapter 4
CC79, CC92
Fixed Points
A monotone function f: L L where (L, , , , , ) is a complete lattice
Fix(f) = { l: l L, f(l) = l}
Red(f) = {l: l L, f(l) l}
Ext(f) = {l: l L, l f(l)}
– l1 l2 f(l1 ) f(l2 )
Tarski’s Theorem 1955: if f is monotone then:
– lfp(f) = Fix(f) = Red(f) Fix(f)
– gfp(f) = Fix(f) = Ext(f) Fix(f)
f()
f()
f2()
f2()
Fix(f)
Ext(f)
Red(f)
gfp(f)
lfp(f)
Chaotic Iterations A lattice L = (L, , , , , ) with finite strictly increasing chains
Ln = L L … L
A monotone function f: LnLn
Compute lfp(f)
The simultaneous least fixed of the system {x[i] = fi(x) : 1 i n }
x := (, , …, )
while (f(x) x ) do
x := f(x)
for i :=1 to n do
x[i] =
WL = {1, 2, …, n}
while (WL ) do
select and remove an element i WL
new := fi(x)
if (new x[i]) then
x[i] := new;
Add all the indexes that directly depends on i to WL
Specialized Chaotic Iterations
System of Equations
S =
dfentry[s] =
dfentry[v] = {f(u, v) (dfentry[u]) | (u, v) E }
FS:LnLn
FS (X)[s] =
FS(X)[v] = {f(u, v)(X[u]) | (u, v) E }
lfp(S) = lfp(FS)
Specialized Chaotic Iterations
Chaotic(G(V, E): Graph, s: Node, L: Lattice, : L, f: E (L L) ){
for each v in V to n do dfentry[v] :=
df[s] =
WL = {s}
while (WL ) do
select and remove an element u WL
for each v, such that. (u, v) E do
temp = f(e)(dfentry[u])
new := dfentry(v) temp
if (new dfentry[v]) then
dfentry[v] := new;
WL := WL {v}
Specialized Chaotic Iterations
System of Equations
S =
dfentry[s] =
dfentry[v] = {f(u, v) (dfentry[u]) | (u, v) E }
FS:LnLn
FS (X)[s] =
FS(X)[v] = {f(u, v)(X[u]) | (u, v) E }
lfp(S) = lfp(FS)
Specialized Chaotic Iterations
Chaotic(G(V, E): Graph, s: Node, L: Lattice, : L, f: E (L L) ){
for each v in V to n do dfentry[v] :=
df[s] =
WL = {s}
while (WL ) do
select and remove an element u WL
for each v, such that. (u, v) E do
temp = f(e)(dfentry[u])
new := dfentry(v) temp
if (new dfentry[v]) then
dfentry[v] := new;
WL := WL {v}
z =3
x =1
while (x>0)
if (x=1)
y =7 y =z+4
x=3
print y
e.e[z3]
e.e[x1]
e. if x >0 then e else
e. if e x 0 then e else
e. e [x1, y , z] e. if e x 0 then e else
e.e[y7] e.e[ye(z)+4]
e.e[x3]
e.e
1
2
3
4
56
7
8
[x0, y0, z0]
WL dfentry]v]
{1}
{2} df[2]:=[x0, y0, z3]
{3} df[3]:=[x1, y0, z3]
{4} df[4]:=[x1, y0, z3]
{5} df[5]:=[x1, y0, z3]
{7} df[7]:=[x1, y7, z3]
{8} df[8]:=[x3, y7, z3]
{3} df[3]:=[x, y, z3]
{4} df[4]:=[x, y, z3]
{5,6} df[5]:=[x1, y, z3]
{6,7} df[6]:=[x, y, z3]
{7} df[7]:=[x, y7, z3]
The Abstract Interpretation
Technique (Cousot & Cousot)
The foundation of program analysis
Defines the meaning of the information computed by static tools
A mathematical framework
Allows proving that an analysis is sound in a local way
Identify design bugs
Understand where precision is lost
New analysis from old
Not limited to certain programming style
Abstract (Conservative) interpretation
abstract
representation
Set of states
abstraction
Abstract
semantics
statement sabstract
representation
abstraction
Operational
semantics
statement sSet of states
abstract
representation
Abstract (Conservative) interpretation
abstract
representation
Set of states
concretization
Abstract
semantics
statement sabstract
representation
concretization
Operational
semantics
statement sSet of states Set of states
Galois Connections Lattices C and A and functions : C A and : AC
The pair of functions (, ) form
Galois connection if
– and are monotone
– a A
» ( (a)) a
– c C
» c ((C))
Alternatively if:
c C
a A
(c) a iff c (a)
and uniquely determine each other
The Abstraction Function (CP) Map collecting states into constants
The abstraction of an individual state
CP:[Var* Z] [Var* Z{, }]
CP() =
The abstraction of set of states
CP:P([Var* Z]) [Var* Z{, }]
CP (CS) = { CP () | CS} = {| CS}
Soundness
CP (Reach (v)) df(v)
Completeness
The Concretization Function Map constants into collecting states
The formal meaning of constants
The concretization
CP: [Var* Z{, }] P([Var* Z])
CP (df) = {| CP () df} = { | df}
Soundness
Reach (v) CP (df(v))
Completeness
Galois Connection Constant
Propagation
CP is monotone
CP is monotone
df [Var* Z{, }]
– CP( CP (df)) df
c P([Var* Z])
– c CP CP ( CP(C))
Upper Closures
Define abstractions on sets of concrete states
: P() P() such that
– is monotone, i.e., X Y X Y
– is extensive, i.e., X X
– is closure, i.e., ( X) = X
Every Galois connection defines an upper closure
Proof of Soundness
Define an “appropriate” operational semantics
Define “collecting” operational semantics by pointwise
extension
Establish a Galois connection between collecting states
and abstract states
(Local correctness) Show that the abstract interpretation
of every atomic statement is sound
w.r.t. the collecting semantics
(Global correctness) Conclude that the analysis is sound
Collecting Semantics
The input state is not known at
compile-time
“Collect” all the states for all
possible inputs to the program
No lost of precision
A Simple Example Program
z = 3
x = 1
while (x > 0) (
if (x = 1) then y = 7
else y = z + 4
x = 3
print y
)
{[x0, y0, z0]}
{[x1, y0, z3]}
{[x1, y0, z3], [x3, y0, z3],}
{[x0, y0, z3]}
{[x1, y7, z3], [x3, y7, z3]}
{[x1, y7, z3], [x3, y7, z3]}
{[x3, y7, z3]}
{[x3, y7, z3]}
An “Iterative” Definition
Generate a system of monotone equations
The least solution is well-defined
The least solution is the collecting interpretation
But may not be computable
Equations Generated for Collecting Interpretation
Equations for elementary statements
– [skip]
CSexit(1) = CSentry(l)
– [b]CSexit(1) = {: CSentry(l), b=tt}
– [x := a]
CSexit(1) = {(s[x Aas]) | s CSentry(l)}
Equations for control flow constructsCSentry(l) = CSexit(l’) l’ immediately precedes l in the control flow graph
An equation for the entryCSentry(1) = { | Var* Z}
Specialized Chaotic Iterations
System of Equations
(Collecting Semantics)S =
CSentry[s] ={0}
CSentry[v] = {f(e)(CSentry[u]) | (u, v) E }
where f(e) = X. {st(e) | X} for atomic statements
f(e) = X.{ | b(e) =tt }
FS:LnLn
Fs(X)[v] = {f(e)[u] | (u, v) E }
lfp(S) = lfp(FS)
The Least Solution
2n sets of equations
CSentry(1), …, CSentry (n), CSexit(1), …, CSexit (n)
Can be written in vectorial form
The least solution lfp(Fcs) is well-defined
Every component is minimal
Since Fcs is monotone such a solution always exists
CSentry(v) = {s|s0| <P, s0 > * (S’, s)),
init(S’)=v}
Simplify the soundness criteria
)CS(CS csF
f()
f()
f2()
f2()
f(x)=x
f(x)x
f(x)x
gfp(f)
lfp(f)
f#()
f#()
f#2()
f#2()
f#(y)=y
f#(y)y
f#(y)y
gfp(f#)
lfp(f#)
a: f((a)) (f#(a))
Soundness Theorem(1)
1. Let (, ) form Galois connection from C to A
2. f: C C be a monotone function
3. f# : A A be a monotone function
4. aA: f((a)) (f#(a))
lfp(f) (lfp(f#))
(lfp(f)) lfp(f#)
Soundness Theorem(2)
1. Let (, ) form Galois connection from C to A
2. f: C C be a monotone function
3. f# : A A be a monotone function
4. cC: (f(c)) f#((c))
(lfp(f)) lfp(f#)
lfp(f) (lfp(f#))
Soundness Theorem(3)
1. Let (, ) form Galois connection from C to A
2. f: C C be a monotone function
3. f# : A A be a monotone function
4. aA: (f((a))) f#(a)
(lfp(f)) lfp(f#)
lfp(f) (lfp(f#))
Proof of Soundness (Summary)
Define an “appropriate” operational semantics for
atomic statements
Define “collecting” operational semantics
Establish a Galois connection between collecting
states and abstract domains
(Local correctness) Show that the abstract
interpretation of every atomic statement is sound
w.r.t. the collecting semantics
(Global correctness) Conclude that the analysis is
sound
Constant Propagation
: [Var Z] [Var Z{, }]
– () = ()
: P([Var Z]) [Var Z{, }]
– (X) = {() | X} = { | X}
:[Var Z {, }] P([Var Z])
– (#) = { | () # } = { | # }
Local Soundness
– st#(#) ({ st | (#) = { st | # }
Optimality (Induced)
– st#(#) = ({ st | (#)} = { st | # }
Soundness
Completeness
Proof of Soundness (Summary)
Define an “appropriate” structural operational
semantics
Define “collecting” structural operational
semantics
Establish a Galois connection between collecting
states and reaching definitions
(Local correctness) Show that the abstract
interpretation of every atomic statement is sound
w.r.t. the collecting semantics
(Global correctness) Conclude that the analysis is
sound
Best (Conservative) interpretation
abstract
representation
Set of states
concretization
Abstract
semantics
statement sabstract
representation
abstraction
Operational
semantics
statement sSet of states
concretization
Set of states
Induced Analysis
(Relatively Optimal)
It is sometimes possible to show that a given analysis is not only sound but optimal w.r.t. the chosen abstraction
– but not necessarily optimal!
Define S# (df) = ({S| (df)})
But this S# may not be computable
Derive (at compiler-generation time) an alternative form for S#
A useful measure to decide if the abstraction must lead to overly imprecise results
Numeric Abstract Domain Examples
y
x
signs
x 0
y
x
intervals
x [a, b]
y
x
octagons
x y c
y
x
polyhedra
ai xi c
Example Dataflow Problem
Formal available expression analysis
Find out which expressions are available at a given program point
Example programx = y + tz = y + rwhile (…) {
t = t + (y + r)}
Lattice
Galois connection
Basic statements
Soundness
Example: May-Be-Garbage
A variable x may-be-garbage at a program point v
if there exists a execution path leading to v in
which x’s value is unpredictable:
– Was not assigned
– Was assigned using an unpredictable expression
Lattice
Galois connection
Basic statements
Soundness
Points-To AnalysisDetermine if a pointer variable p may point
to q
on some path leading to a program point
“Adapt” other optimizations
– Constant propagation
x = 5;
*p = 7 ;
… x …
Pointer aliases
– Variables p and q are may-aliases at v if the points-to
set at v contains entries (p, x) and (q, x)
Side-effect analysis
*p = *q + * * t
The PWhile Programming Language
Abstract Syntax
a := x | *x | &x | n | a1 opa a2
b := true | false | not b | b1 opb b2 | a1 opr a2
S := x := a | *x := a | skip | S1 ; S2 |
if b then S1 else S2 | while b do S
Concrete Semantics 1 for PWhile
For every atomic statement S
S : States1 States1
x := a ()=[loc(x) Aa ]
x := &y ()
x := *y ()
x := y ()
*x := y ()
State1= [LocLocZ]
Abstract Semantics for PWhile
•For every atomic statement S
S #: P(Var* Var*) P(Var* Var*)
x := &y #
x := *y #
x := y #
*x := y #
/* */ t := &a; /* {(t, a)}*//* {(t, a)}*/ y := &b; /* {(t, a), (y, b) }*/
/* {(t, a), (y, b)}*/ z := &c; /* {(t, a), (y, b), (z, c) }*/
if x> 0; then p:= &y; /* {(t, a), (y, b), (z, c), (p, y)}*/
else p:= &z; /* {(t, a), (y, b), (z, c), (p, z)}*/ /* {(t, a), (y, b), (z, c), (p, y), (p, z)}*/
*p := t;
/* {(t, a), (y, b), (y, c), (p, y), (p, z), (y, a), (z, a)}*/
Flow insensitive points-to-analysis
Steengard 1996
Ignore control flow
One set of points-to per program
Can be represented as a directed graph
Conservative approximation
– Accumulate pointers
Can be computed in almost linear time
Precision
We cannot usually have
– (CS) = DF
on all programs
But can we say something about precision in all
programs?
Precision criteria
– Join over all paths
– Induced analysis
Summary
Abstract interpretation Connects Abstract and
Concrete Semantics
Galois Connection
Local Correctness
Global Correctness
Widening
Accelerate the termination of Chaotic iterations by
computing a more conservative solution
Can handle lattices of infinite heights
Specialized Chaotic Iterations+
Chaotic(G(V, E): Graph, s: Node, L: lattice, : L, f: E (L L) ){
for each v in V to n do dfentry[v] :=
In[v] =
WL = {s}
while (WL ) do
select and remove an element u WL
for each v, such that. (u, v) E do
temp = f(e)(dfentry[u])
new := dfentry(v) temp
if (new dfentry[v]) then
dfentry[v] := new;
WL := WL {v}
Example Interval Analysis
Find a lower and an upper bound of the value of a
variable
Usages?
Lattice
L = (Z{-, }Z {-, }, , , , ,)
– [a, b] [c, d] if c a and d b
– [a, b] [c, d] = [min(a, c), max(b, d)]
– [a, b] [c, d] = [max(a, c), min(b, d)]
– =
– =
Example Program
Interval Analysis
[x := 1]1 ;
while [x 1000]2 do
[x := x + 1;]3
IntEntry(1) = [minint,maxint]
IntExit(1) = [1,1]
IntEntry(2) = IntExit(1) IntExit(3)
IntExit(2) = IntEntry(2)
[x:=1]1
[x 1000]2
[x := x+1]3
[exit]4
IntEntry(3) = IntExit(2) [minint,1000]
IntExit(3) = IntEntry(3)+[1,1]
IntEntry(4) = IntExit(2) [1001,maxint]
IntExit(4) = IntEntry(4)
Widening for Interval Analysis
[c, d] = [c, d]
[a, b] [c, d] = [
if a c
then a
else -,
if b d
then b
else
]
Example Program
Interval Analysis
[x := 1]1 ;
while [x 1000]2 do
[x := x + 1;]3
IntEntry(1) = [-, ]
IntExit(1) = [1,1]
IntEntry(2) = InExit(2) (IntExit(1) IntExit(3))
IntExit(2) = IntEntry(2)
[x:=1]1
[x 1000]2
[x := x+1]3
[exit]4
IntEntry(3) = IntExit(2) [-,1000]
IntExit(3) = IntEntry(3)+[1,1]
IntEntry(4) = IntExit(2) [1001, ]
IntExit(4) = IntEntry(4)
Requirements on Widening
For all elements l1 l2 l1 l2
For all ascending chains l0 l1 l2 …the following sequence is finite– y0 = l0
– yi+1 = yi li+1
For a monotonic function f: L Ldefine– x0 = – xi+1 = xi f(xi )
Theorem:– There exits k such that xk+1 = xk
– xk Red(f) = {l: l L, f(l) l}
Narrowing
Improve the result of widening
y x y (x y) x
For all decreasing chains x0 x1 …the following sequence is finite– y0 = x0
– yi+1 = yi xi+1
For a monotonic function f: L L and x Red(f) = {l: l L, f(l) l}define– y0 = x
– yi+1 = yi f(yi )
Theorem:– There exits k such that yk+1 =yk
– yk Red(f) = {l: l L, f(l) l}
Narrowing for Interval Analysis
[a, b] = [a, b]
[a, b] [c, d] = [
if a = -
then c
else a,
if b =
then d
else b
]
Example Program
Interval Analysis
[x := 1]1 ;
while [x 1000]2 do
[x := x + 1;]3
IntEntry(1) = [- , ]
IntExit(1) = [1,1]
IntEntry(2) = InExit(2) ( IntExit(1) IntExit(3))
IntExit(2) = IntEntry(2)
[x:=1]1
[x 1000]2
[x := x+1]3
[exit]4
IntEntry(3) = IntExit(2) [-,1000]
IntExit(3) = IntEntry(3)+[1,1]
IntEntry(4) = IntExit(2) [1001, ]
IntExit(4) = IntEntry(4)
Widening and Narrowing
Summary
Very simple but produces impressive precision
Sometimes non-monotonic
The McCarthy 91 function
Also useful in the finite case
Can be used as a methodological tool
int f(x) [- , ]
if x > 100
then [101, ] return x -10 [91, -10];
else [-, 100] return f(f(x+11)) [91, 91] ;