foundations of data-flow analysis
![Page 1: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/1.jpg)
Foundations of Data-Flow Analysis
![Page 2: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/2.jpg)
Basic Questions
Under what circumstances is the iterative algorithm used in the data-flow analysis correct?
How precise is the solution obtained by the iterative algorithm?
Will the iterative algorithm converge?
What is the meaning of the solution to the equations?
![Page 3: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/3.jpg)
Data-Flow Analysis Framework
A direction of the data flow D, which is either forwards or backwards
A semilattice, which includes a domain of values V and a meet operator $\wedge$
A family F of transfer functions from V to V. This family must include functions suitable for the boundary conditions, which are constant transfer functions for the special nodes ENTRY and EXIT in any control flow graph
![Page 4: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/4.jpg)
Example: Reaching Definitions
The direction: forwards
The domain of values: the set of subsets of the set of all definitions in the program
The meet operator: set union
The family of transfer functions: the set of transfer functions for various statements
![Page 5: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/5.jpg)
Semilattices
A semilattice is a set V and a binary meet operator $\wedge$ such that for all x, y, and z in V:
$x \wedge x = x$ (meet is idempotent)
$x \wedge y = y \wedge x$ (meet is commutative)
$x \wedge (y \wedge z) = (x \wedge y) \wedge z$ (meet is associative)
A semilattice has a top element, denoted $\top$, such that for all x in V, $\top \wedge x = x$
Optionally, a semilattice may have a bottom element, denoted $\bot$, such that for all x in V, $\bot \wedge x = \bot$
![Page 6: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/6.jpg)
Example: Reaching Definitions
The domain of values is the set of all subsets of the universal set U, or the power set of U, denoted $2^U$
The meet operator is the set union
The set union is idempotent, commutative, and associative
The top element is the empty set
The bottom element is the universal set U
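The semilattice axioms for this example can be checked by brute force on a small universe. The following is a minimal sketch, assuming a hypothetical universe of three definitions `d1`, `d2`, `d3`; the helper names `powerset`, `meet`, and `is_semilattice` are illustrative and not part of the slides.

```python
from itertools import combinations

# Hypothetical universe of three definitions, chosen only for illustration.
U = frozenset({"d1", "d2", "d3"})

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def meet(x, y):
    """Meet operator for reaching definitions: set union."""
    return x | y

V = powerset(U)
TOP = frozenset()   # top element: the empty set
BOTTOM = U          # bottom element: the universal set

def is_semilattice(values, meet_op):
    """Brute-force check of idempotence, commutativity, and associativity."""
    idempotent = all(meet_op(x, x) == x for x in values)
    commutative = all(meet_op(x, y) == meet_op(y, x)
                      for x in values for y in values)
    associative = all(meet_op(x, meet_op(y, z)) == meet_op(meet_op(x, y), z)
                      for x in values for y in values for z in values)
    return idempotent and commutative and associative

print(is_semilattice(V, meet))                     # True
print(all(meet(TOP, x) == x for x in V))           # True: top meet x = x
print(all(meet(BOTTOM, x) == BOTTOM for x in V))   # True: bottom meet x = bottom
```

A brute-force check like this only confirms the laws for one finite universe; the general statement follows from the algebraic properties of set union.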
![Page 7: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/7.jpg)
Partial Orders
A relation $\le$ is a partial order on a set V if for all x, y, and z in V:
$x \le x$ (the partial order is reflexive)
If $x \le y$ and $y \le x$, then $x = y$ (the partial order is antisymmetric)
If $x \le y$ and $y \le z$, then $x \le z$ (the partial order is transitive)
The pair $(V, \le)$ is called a poset, or partially ordered set
We define $x < y$ if and only if $x \le y$ and $x \ne y$
![Page 8: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/8.jpg)
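The Partial Order for a Semilattice
It is useful to define a partial order $\le$ for a semilattice $(V, \wedge)$. For all x and y in V, we define $x \le y$ if and only if $x \wedge y = x$
$\le$ is reflexive: $x \wedge x = x \Rightarrow x \le x$
$\le$ is antisymmetric: $x \le y \Rightarrow x \wedge y = x$, $y \le x \Rightarrow y \wedge x = y$, so $x = (x \wedge y) = (y \wedge x) = y$
$\le$ is transitive: $x \le y \Rightarrow x \wedge y = x$, $y \le z \Rightarrow y \wedge z = y$, so $(x \wedge z) = ((x \wedge y) \wedge z) = (x \wedge (y \wedge z)) = (x \wedge y) = x \Rightarrow x \le z$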
![Page 9: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/9.jpg)
Example: Reaching Definitions
The relation $\le$ is the set inclusion $\supseteq$: $x \le y \iff x \cup y = x \iff x \supseteq y$
This says that sets larger in size are smaller in the partial order
The set inclusion is reflexive, antisymmetric, and transitive
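The claim that the order derived from the meet is superset inclusion can be confirmed on a small example. This is a sketch under the assumption of a hypothetical three-definition universe; `leq` is an illustrative name for the relation defined by "x meet y equals x".

```python
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def leq(x, y):
    """x <= y in the semilattice order: x meet y == x, with meet = union."""
    return (x | y) == x

V = powerset({"d1", "d2", "d3"})  # hypothetical definitions

# The derived order coincides with superset inclusion on every pair.
agrees = all(leq(x, y) == x.issuperset(y) for x in V for y in V)
print(agrees)  # True

larger = frozenset({"d1", "d2"})
smaller = frozenset({"d1"})
print(leq(larger, smaller), leq(smaller, larger))  # True False
```

The second print shows the inversion the slide points out: the set with more elements is the smaller one in the partial order.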
![Page 10: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/10.jpg)
Greatest Lower Bounds
A greatest lower bound (or glb) of domain elements x and y is an element g such that
$g \le x$, $g \le y$, and
if z is any element such that $z \le x$ and $z \le y$, then $z \le g$
![Page 11: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/11.jpg)
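Meet and Greatest Lower Bound
The meet of x and y is the greatest lower bound of x and y
Let $g = x \wedge y$
$g \le x$: $g \wedge x = (x \wedge y) \wedge x = x \wedge (y \wedge x) = x \wedge (x \wedge y) = (x \wedge x) \wedge y = x \wedge y = g$
$g \le y$: by the same argument with x and y exchanged
$z \le x$ and $z \le y \Rightarrow z \le g$: $z \wedge g = z \wedge (x \wedge y) = (z \wedge x) \wedge y = z \wedge y = z$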
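The proof above can be mirrored by an exhaustive check on the reaching-definitions semilattice. The sketch below assumes a hypothetical universe of three definitions; `is_glb` is an illustrative helper that tests both parts of the glb definition directly.

```python
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def meet(x, y):
    """Meet for reaching definitions: set union."""
    return x | y

def leq(x, y):
    """Order derived from the meet: x <= y iff x meet y == x."""
    return meet(x, y) == x

def is_glb(g, x, y, values):
    """g is a lower bound of x and y, and every other lower bound is <= g."""
    lower = leq(g, x) and leq(g, y)
    greatest = all(leq(z, g) for z in values if leq(z, x) and leq(z, y))
    return lower and greatest

V = powerset({"d1", "d2", "d3"})  # hypothetical definitions
meet_is_glb = all(is_glb(meet(x, y), x, y, V) for x in V for y in V)
print(meet_is_glb)  # True
```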
![Page 12: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/12.jpg)
Lattice Diagrams
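Lattice diagram for the subsets of {d1, d2, d3}, listed level by level from top to bottom:
$\top$ = {}
{d1}  {d2}  {d3}
{d1, d2}  {d1, d3}  {d2, d3}
$\bot$ = {d1, d2, d3}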
![Page 13: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/13.jpg)
Product Lattices
The product lattice for lattices $(A, \wedge_A)$ and $(B, \wedge_B)$ is defined as follows:
The domain of the product lattice is $A \times B$
The meet $\wedge$ for the product lattice: $(a, b) \wedge (a', b') = (a \wedge_A a', b \wedge_B b')$
The partial order for the product lattice: $(a, b) \le (a', b')$ iff $a \le_A a'$ and $b \le_B b'$
This definition can be extended to the product of any number of lattices
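A componentwise meet is straightforward to express in code. The following is a minimal sketch, assuming each component lattice is a set of definitions with union as its meet; `product_meet` and `product_leq` are hypothetical helper names introduced only for this illustration.

```python
def union(x, y):
    """Meet of a single component lattice (sets under union)."""
    return x | y

def product_meet(component_meets):
    """Build the meet of a product lattice from the meets of its components."""
    def meet(xs, ys):
        return tuple(m(x, y) for m, x, y in zip(component_meets, xs, ys))
    return meet

def product_leq(meet, xs, ys):
    """xs <= ys in the product order iff xs meet ys == xs."""
    return meet(xs, ys) == xs

EMPTY = frozenset()
meet3 = product_meet([union, union, union])  # product of three lattices

a = (frozenset({"d1"}), EMPTY, EMPTY)
b = (EMPTY, frozenset({"d2"}), EMPTY)
g = meet3(a, b)

print(g == (frozenset({"d1"}), frozenset({"d2"}), EMPTY))  # True
print(product_leq(meet3, g, a), product_leq(meet3, a, g))  # True False
```

Because the order is derived from the product meet, it automatically agrees with the componentwise order stated above.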
![Page 14: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/14.jpg)
Example
$\top$ = ({}, {}, {})
({d1}, {}, {})  ({}, {d2}, {})  ({}, {}, {d3})
({d1}, {d2}, {})  ({d1}, {}, {d3})  ({}, {d2}, {d3})
$\bot$ = ({d1}, {d2}, {d3})
![Page 15: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/15.jpg)
Height of a Semilattice
An ascending chain in a poset $(V, \le)$ is a sequence $x_1 < x_2 < \dots < x_n$
The height of a semilattice is the largest number of < relations in any ascending chain
An iterative data-flow analysis algorithm for a monotone framework is guaranteed to converge if the corresponding semilattice has finite height
A lattice consisting of a finite set of values will have a finite height
It is also possible for a lattice with an infinite number of values to have a finite height
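For a small finite semilattice the height can be computed directly by searching for the longest ascending chain. The sketch below does this for the reaching-definitions lattice over a hypothetical universe of three definitions; `lt` and `longest_chain_from` are illustrative names.

```python
from functools import lru_cache
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

V = powerset({"d1", "d2", "d3"})  # hypothetical definitions

def lt(x, y):
    """x < y: x <= y (x meet y == x, with meet = union) and x != y."""
    return (x | y) == x and x != y

@lru_cache(maxsize=None)
def longest_chain_from(x):
    """Largest number of < relations in an ascending chain starting at x."""
    above = [y for y in V if lt(x, y)]
    if not above:
        return 0
    return 1 + max(longest_chain_from(y) for y in above)

height = max(longest_chain_from(x) for x in V)
print(height)  # 3
```

The longest chain climbs from the universal set (bottom) to the empty set (top), dropping one definition per step, so the height equals the number of definitions.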
![Page 16: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/16.jpg)
Transfer Functions
The family of transfer functions $F: V \to V$ in a data-flow framework has the following properties:
F has an identity function I, such that I(x) = x for all x in V
F is closed under composition; that is, for any two functions f and g in F, the function h defined by h(x) = g(f(x)) is in F
![Page 17: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/17.jpg)
Example: Reaching Definitions
The identity function: $\mathit{gen}[B] = \mathit{kill}[B] = \emptyset$
Closure under composition: let $f_1(x) = G_1 \cup (x - K_1)$ and $f_2(x) = G_2 \cup (x - K_2)$. Then
$f_2(f_1(x)) = G_2 \cup ((G_1 \cup (x - K_1)) - K_2) = (G_2 \cup (G_1 - K_2)) \cup (x - (K_1 \cup K_2))$
Let $G = G_2 \cup (G_1 - K_2)$ and $K = K_1 \cup K_2$. Then $f(x) = f_2(f_1(x)) = G \cup (x - K)$
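The closed form for the composition can be compared against direct function application. This is a minimal sketch with hypothetical gen and kill sets over four definitions; `transfer` and `compose_gen_kill` are names introduced here for illustration.

```python
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def transfer(gen, kill):
    """Gen-kill transfer function f(x) = gen ∪ (x − kill)."""
    return lambda x: gen | (x - kill)

def compose_gen_kill(g1, k1, g2, k2):
    """Gen and kill sets of f2 after f1: G = G2 ∪ (G1 − K2), K = K1 ∪ K2."""
    return g2 | (g1 - k2), k1 | k2

# Hypothetical gen/kill sets, chosen only for illustration.
G1, K1 = frozenset({"d1"}), frozenset({"d3"})
G2, K2 = frozenset({"d2"}), frozenset({"d1", "d4"})

f1, f2 = transfer(G1, K1), transfer(G2, K2)
G, K = compose_gen_kill(G1, K1, G2, K2)
f = transfer(G, K)

V = powerset({"d1", "d2", "d3", "d4"})
same = all(f(x) == f2(f1(x)) for x in V)
print(same)  # True

# Empty gen and kill sets give the identity function.
identity = transfer(frozenset(), frozenset())
print(all(identity(x) == x for x in V))  # True
```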
![Page 18: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/18.jpg)
Monotone Frameworks
A framework $(D, F, V, \wedge)$ is monotone if $x \le y$ implies $f(x) \le f(y)$, for all x and y in V, and f in F
Equivalently, a framework $(D, F, V, \wedge)$ is monotone if $f(x \wedge y) \le f(x) \wedge f(y)$, for all x and y in V, and f in F
![Page 19: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/19.jpg)
Proof of Equivalence
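($\Rightarrow$) $x \wedge y \le x$ and $x \wedge y \le y$ $\Rightarrow$ $f(x \wedge y) \le f(x)$ and $f(x \wedge y) \le f(y)$. Since $f(x) \wedge f(y)$ is the glb of $f(x)$ and $f(y)$, $f(x \wedge y) \le f(x) \wedge f(y)$
($\Leftarrow$) $x \le y \Rightarrow x \wedge y = x \Rightarrow f(x \wedge y) = f(x) \le f(x) \wedge f(y) \le f(y) \Rightarrow f(x) \le f(y)$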
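The two formulations of monotonicity can be tested side by side on concrete functions. The sketch below assumes the reaching-definitions semilattice over a hypothetical three-definition universe; `gen_kill` and `complement` are illustrative functions, the second chosen because it is not monotone.

```python
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

U = frozenset({"d1", "d2", "d3"})  # hypothetical definitions
V = powerset(U)

def meet(x, y):
    return x | y

def leq(x, y):
    return meet(x, y) == x

def monotone_by_order(f):
    """First form: x <= y implies f(x) <= f(y)."""
    return all(leq(f(x), f(y)) for x in V for y in V if leq(x, y))

def monotone_by_meet(f):
    """Second form: f(x meet y) <= f(x) meet f(y)."""
    return all(leq(f(meet(x, y)), meet(f(x), f(y))) for x in V for y in V)

def gen_kill(x):
    """A gen-kill function with hypothetical G = {d1}, K = {d2}."""
    return frozenset({"d1"}) | (x - frozenset({"d2"}))

def complement(x):
    """Set complement: reverses the order, so it is not monotone."""
    return U - x

print(monotone_by_order(gen_kill), monotone_by_meet(gen_kill))      # True True
print(monotone_by_order(complement), monotone_by_meet(complement))  # False False
```

Both tests agree on each function, as the equivalence proof predicts.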
![Page 20: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/20.jpg)
Distributive Frameworks
A framework $(D, F, V, \wedge)$ is distributive if $f(x \wedge y) = f(x) \wedge f(y)$
for all x and y in V, and f in F
Distributivity implies monotonicity
![Page 21: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/21.jpg)
Example: Reaching Definitions
Let y and z be sets of definitions, and
$f(x) = G \cup (x - K)$
Then
$G \cup ((y \cup z) - K) = (G \cup (y - K)) \cup (G \cup (z - K))$
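The distributivity identity for gen-kill functions can be verified exhaustively on a small universe. This is a sketch assuming a hypothetical universe of three definitions; it tries every choice of G and K, not just one.

```python
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

V = powerset({"d1", "d2", "d3"})  # hypothetical definitions

def meet(x, y):
    """Meet for reaching definitions: set union."""
    return x | y

def transfer(gen, kill):
    """Gen-kill transfer function f(x) = gen ∪ (x − kill)."""
    return lambda x: gen | (x - kill)

def is_distributive(f):
    """Check f(y meet z) == f(y) meet f(z) for all y, z in V."""
    return all(f(meet(y, z)) == meet(f(y), f(z)) for y in V for z in V)

# Every gen-kill function over this universe distributes over union.
all_distributive = all(is_distributive(transfer(g, k)) for g in V for k in V)
print(all_distributive)  # True
```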
![Page 22: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/22.jpg)
The Iterative Algorithm for General Frameworks: Input
A control flow graph, with specially labeled ENTRY and EXIT nodes,
A direction of the data flow D,
A set of values V,
A meet operator $\wedge$,
A set of functions F, where $f_B$ in F is the transfer function for basic block B, and
A constant value $v_{\mathrm{ENTRY}}$ or $v_{\mathrm{EXIT}}$ in V, representing the boundary condition for forward and backward frameworks, respectively
![Page 23: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/23.jpg)
The Iterative Algorithm for General Frameworks: Output
Values in V for IN[B] and OUT[B] for each basic block B in the control flow graph
![Page 24: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/24.jpg)
The Iterative Algorithm for General Frameworks: Forward
OUT[ENTRY] := v_ENTRY;
for (each basic block B other than ENTRY)
    OUT[B] := ⊤;
while (changes to any OUT occur)
    for (each basic block B other than ENTRY) {
        IN[B] := ∧_{p ∈ pred(B)} OUT[p];
        OUT[B] := f_B(IN[B]);
    }
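The forward algorithm translates almost line for line into runnable code. The following is a minimal sketch, not a definitive implementation: the function name `iterate_forward`, the dictionary-based graph encoding, and the three-block flow graph with its gen and kill sets are all assumptions made for illustration, instantiated for reaching definitions.

```python
def iterate_forward(blocks, preds, transfer, meet, top, v_entry, entry="ENTRY"):
    """Generic forward iterative algorithm; returns the IN and OUT maps."""
    out = {b: top for b in blocks}        # OUT[B] := top for every block
    out[entry] = v_entry                  # boundary condition
    in_ = {}
    changed = True
    while changed:                        # repeat until no OUT changes
        changed = False
        for b in blocks:
            if b == entry:
                continue
            acc = top                     # meet over all predecessors
            for p in preds[b]:
                acc = meet(acc, out[p])
            in_[b] = acc
            new_out = transfer[b](acc)
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return in_, out

def gen_kill(gen, kill):
    """Reaching-definitions transfer function f(x) = gen ∪ (x − kill)."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda x: gen | (x - kill)

# Hypothetical flow graph: ENTRY -> B1 -> B2 -> B3 -> EXIT, with a back edge B3 -> B2.
blocks = ["ENTRY", "B1", "B2", "B3", "EXIT"]
preds = {"B1": ["ENTRY"], "B2": ["B1", "B3"], "B3": ["B2"], "EXIT": ["B3"]}
transfer = {
    "B1": gen_kill({"d1"}, {"d3"}),
    "B2": gen_kill({"d2"}, set()),
    "B3": gen_kill({"d3"}, {"d1"}),
    "EXIT": gen_kill(set(), set()),       # identity
}

IN, OUT = iterate_forward(blocks, preds, transfer,
                          meet=lambda x, y: x | y,
                          top=frozenset(), v_entry=frozenset())
print(sorted(IN["B2"]))    # ['d1', 'd2', 'd3']
print(sorted(OUT["B3"]))   # ['d2', 'd3']
```

Starting the meet accumulator at top relies on top being the identity of the meet, which is exactly the property required of the top element.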
![Page 25: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/25.jpg)
The Iterative Algorithm for General Frameworks: Backward
IN[EXIT] := v_EXIT;
for (each basic block B other than EXIT)
    IN[B] := ⊤;
while (changes to any IN occur)
    for (each basic block B other than EXIT) {
        OUT[B] := ∧_{s ∈ succ(B)} IN[s];
        IN[B] := f_B(OUT[B]);
    }
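The backward algorithm mirrors the forward one with the roles of IN and OUT and of predecessors and successors exchanged. As a sketch, the code below instantiates it for live-variable analysis, which is an assumption of this example and not an analysis discussed in the slides; the function name `iterate_backward`, the straight-line flow graph, and the use and def sets are hypothetical.

```python
def iterate_backward(blocks, succs, transfer, meet, top, v_exit, exit_="EXIT"):
    """Generic backward iterative algorithm; returns the IN and OUT maps."""
    in_ = {b: top for b in blocks}        # IN[B] := top for every block
    in_[exit_] = v_exit                   # boundary condition
    out = {}
    changed = True
    while changed:                        # repeat until no IN changes
        changed = False
        for b in blocks:
            if b == exit_:
                continue
            acc = top                     # meet over all successors
            for s in succs[b]:
                acc = meet(acc, in_[s])
            out[b] = acc
            new_in = transfer[b](acc)
            if new_in != in_[b]:
                in_[b] = new_in
                changed = True
    return in_, out

def use_def(use, defs):
    """Liveness transfer function f(x) = use ∪ (x − def)."""
    use, defs = frozenset(use), frozenset(defs)
    return lambda x: use | (x - defs)

# Hypothetical straight-line program:
#   B1: a = 1      B2: b = a + 1      B3: return b
blocks = ["ENTRY", "B1", "B2", "B3", "EXIT"]
succs = {"ENTRY": ["B1"], "B1": ["B2"], "B2": ["B3"], "B3": ["EXIT"]}
transfer = {
    "ENTRY": use_def(set(), set()),       # identity
    "B1": use_def(set(), {"a"}),
    "B2": use_def({"a"}, {"b"}),
    "B3": use_def({"b"}, set()),
}

IN, OUT = iterate_backward(blocks, succs, transfer,
                           meet=lambda x, y: x | y,
                           top=frozenset(), v_exit=frozenset())
print(sorted(IN["B2"]), sorted(OUT["B2"]))   # ['a'] ['b']
```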
![Page 26: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/26.jpg)
Properties of the Iterative Algorithm
If the algorithm converges, the result is a solution to the data-flow equations
If the framework is monotone, then the solution found is the maximum fixedpoint (MFP) of the data-flow equations. The maximum fixedpoint is a solution with the property that in any other solution, the values of IN[B] and OUT[B] are $\le$ the corresponding values of the MFP
If the framework is monotone and its semilattice has finite height, then the algorithm is guaranteed to converge
![Page 27: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/27.jpg)
The Ideal Solution
Consider any path $P = \mathrm{ENTRY} \to B_1 \to \dots \to B_{k-1} \to B_k$
The transfer function for P is $f_P = f_{B_{k-1}} \circ f_{B_{k-2}} \circ \dots \circ f_{B_1}$
The ideal solution is $\mathrm{IDEAL}[B] = \bigwedge_{P \text{ a possible path from ENTRY to } B} f_P(v_{\mathrm{ENTRY}})$
Any answer that is greater than IDEAL is incorrect
Any value smaller than or equal to IDEAL is conservative, i.e., safe
![Page 28: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/28.jpg)
The Meet-Over-Paths Solution
Finding all possible paths is undecidable
The meet-over-paths solution is $\mathrm{MOP}[B] = \bigwedge_{P \text{ a path from ENTRY to } B} f_P(v_{\mathrm{ENTRY}})$
The paths considered in the MOP solution are a superset of all the paths that are possibly executed
$\mathrm{MOP}[B] \le \mathrm{IDEAL}[B]$
![Page 29: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/29.jpg)
MFP Solution versus MOP Solution
The iterative algorithm visits basic blocks, not necessarily in the order of execution
At each confluence point, the algorithm applies the meet operator to the data-flow values obtained so far. Some of these values were introduced artificially in the initialization process and do not represent the result of any execution from the beginning of the program
![Page 30: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/30.jpg)
Early Meet over Paths
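Flow graph with edges ENTRY to B1, ENTRY to B2, B1 to B3, B2 to B3, and B3 to B4
$\mathrm{MOP}[B_4] = ((f_{B_3} \circ f_{B_1}) \wedge (f_{B_3} \circ f_{B_2}))(v_{\mathrm{ENTRY}})$
$\mathrm{IN}[B_4] = f_{B_3}(f_{B_1}(v_{\mathrm{ENTRY}}) \wedge f_{B_2}(v_{\mathrm{ENTRY}}))$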
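The difference between the two formulas becomes visible with a framework that is monotone but not distributive. The sketch below uses constant propagation, which is an assumption of this example and not an analysis covered in the slides; the block contents (B1 sets x = 2 and y = 3, B2 sets x = 3 and y = 2, B3 computes z = x + y) and the names `UNDEF` and `NAC` ("not a constant") are hypothetical.

```python
UNDEF, NAC = "UNDEF", "NAC"   # top and bottom of the per-variable lattice

def meet_value(a, b):
    """Meet of two constant-propagation values for one variable."""
    if a == UNDEF:
        return b
    if b == UNDEF:
        return a
    if a == NAC or b == NAC:
        return NAC
    return a if a == b else NAC

def meet(m1, m2):
    """Componentwise meet of two variable-to-value maps (a product lattice)."""
    return {v: meet_value(m1[v], m2[v]) for v in m1}

def f_b1(m):                  # B1: x = 2; y = 3
    return {**m, "x": 2, "y": 3}

def f_b2(m):                  # B2: x = 3; y = 2
    return {**m, "x": 3, "y": 2}

def f_b3(m):                  # B3: z = x + y
    x, y = m["x"], m["y"]
    if NAC in (x, y):
        z = NAC
    elif UNDEF in (x, y):
        z = UNDEF
    else:
        z = x + y
    return {**m, "z": z}

v_entry = {"x": UNDEF, "y": UNDEF, "z": UNDEF}

# Meet over paths: apply f_B3 along each path first, then meet.
mop_b4 = meet(f_b3(f_b1(v_entry)), f_b3(f_b2(v_entry)))
# Iterative algorithm: meet at the join before B3, then apply f_B3.
in_b4 = f_b3(meet(f_b1(v_entry), f_b2(v_entry)))

print(mop_b4)   # {'x': 'NAC', 'y': 'NAC', 'z': 5}
print(in_b4)    # {'x': 'NAC', 'y': 'NAC', 'z': 'NAC'}
```

Taking the meet early loses the fact that z is 5 on both paths, so the iterative result is strictly below MOP here while remaining safe.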
![Page 31: Foundations of Data-Flow Analysis](https://reader035.vdocuments.site/reader035/viewer/2022082201/568143a6550346895db02b20/html5/thumbnails/31.jpg)
Comparison of Solutions
Using the iterative algorithm, we have
$\mathrm{IN}[B] \le \mathrm{MOP}[B]$
for monotone frameworks and
$\mathrm{IN}[B] = \mathrm{MOP}[B]$
for distributive frameworks
$\mathrm{MFP} \le \mathrm{MOP} \le \mathrm{IDEAL}$