introduction to data flow analysis
1
Introduction to Data Flow Analysis
2
Data Flow Analysis
• Construct a representation of a program's data flow from the structure of its control flow
• Collect information about the attributes of data at various program points by following that data-flow structure
3
Points
• Within each basic block, a point is assigned between two adjacent statements, before the first statement, and after the last statement
4
An Example
d1: i = m - 1
d2: j = n
d3: a = u1
d4: i = i + 1
d5: j = j - 1
d6: a = u2
[Figure: control flow graph over blocks B1 to B6 containing the definitions above]
5
Paths
• A path from p1 to pn is a sequence of points p1, p2, …, pn such that for each i, 1 ≤ i ≤ n-1, either
• pi is the point immediately preceding a statement and pi+1 is the point immediately following that statement in the same block, or
• pi is the end of some block and pi+1 is the beginning of a successor block
6
An Example
d1: i = m - 1
d2: j = n
d3: a = u1
d4: i = i + 1
d5: j = j - 1
d6: a = u2
d7: i = u3
if e3
[Figure: control flow graph over blocks B1 to B6 containing the statements above]
7
Reaching Definitions
• A definition of a variable x is a statement that assigns or may assign a value to x
• A definition d of some variable x reaches a point p if there is a path from the point immediately following d to p such that no unambiguous definition of x appears on that path
8
An Example
d1: i = m - 1;
d2: j = n;
d3: a = u1;
do
    d4: i = i + 1;
    d5: j = j - 1;
    if e1 then
        d6: a = u2
    else
        d7: i = u3
while e2
9
Ambiguity of Definitions
• Unambiguous definitions (must assign values)
  – assignments to a variable
  – statements that read a value into a variable
• Ambiguous definitions (may assign values)
  – procedure calls that have call-by-reference parameters
  – procedure calls that may access nonlocal variables
  – assignments via pointers
10
Safe or Conservative Information
• Consider all execution paths of the control flow graph
• Allow definitions to pass through ambiguous definitions of the same variables
• The computed set of reaching definitions is a superset of the exact set of reaching definitions
11
Information for Reaching Definitions
• gen[S]: definitions generated within S and reaching the end of S
• kill[S]: definitions killed within S
• in[S]: definitions reaching the beginning of S
• out[S]: definitions reaching the end of S
12
Data Flow Equations
• Data flow information can be collected by setting up and solving systems of equations that relate information at various points
out[S] = gen[S] ∪ (in[S] - kill[S])
The information at the end of a statement is either generated within the statement or enters at the beginning and is not killed as control flows through the statement
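The equation above can be sketched directly with Python sets. This is only an illustration: the function name `transfer` and the choice of statement d4 from the earlier example are assumptions made here, not part of the slides.

```python
def transfer(gen, kill, in_set):
    """out[S] = gen[S] | (in[S] - kill[S]) over sets of definition labels."""
    return gen | (in_set - kill)


# Statement d4: i = i + 1 generates d4 and kills the other definitions of i.
gen_d4 = {"d4"}
kill_d4 = {"d1", "d7"}
in_d4 = {"d1", "d2", "d3"}

out_d4 = transfer(gen_d4, kill_d4, in_d4)
print(sorted(out_d4))  # ['d2', 'd3', 'd4']
```

d1 is removed because d4 redefines i, while d2 and d3 pass through untouched.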
13
The Iterative Algorithm
• Repeatedly compute in and out sets for each node in the control flow graph simultaneously until there is no change
in[B] = ∪_{p ∈ pred(B)} out[p]
out[B] = gen[B] ∪ (in[B] - kill[B])
14
Algorithm: Reaching Definitions
/* Assume in[B] = ∅ for all B */
for each block B do out[B] := gen[B];
change := true;
while change do begin
    change := false;
    for each block B do begin
        in[B] := ∪_{p ∈ pred(B)} out[p];
        oldout := out[B];
        out[B] := gen[B] ∪ (in[B] - kill[B]);
        if out[B] ≠ oldout then change := true
    end
end
15
An Example
B1: d1: i = m - 1; d2: j = n; d3: a = u1
B2: d4: i = i + 1; d5: j = j - 1
B3: d6: a = u2
B4: d7: i = u3

Block  gen[B]    kill[B]
B1     111 0000  000 1111
B2     000 1100  110 0001
B3     000 0010  001 0000
B4     000 0001  100 1000
16
An Example
Block  Initial             Pass 1              Pass 2
       In[B]     Out[B]    In[B]     Out[B]    In[B]     Out[B]
B1     000 0000  111 0000  000 0000  111 0000  000 0000  111 0000
B2     000 0000  000 1100  111 0011  001 1110  111 1111  001 1110
B3     000 0000  000 0010  001 1110  000 1110  001 1110  000 1110
B4     000 0000  000 0001  001 1110  001 0111  001 1110  001 0111
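The iterative algorithm can be sketched in Python and run on the four-block example. The gen and kill sets below are the bit vectors from the example, written as sets of definition numbers. The predecessor lists are an assumption read off the do/while program (B1 to B2, B2 to B3 and B4, B3 and B4 back to B2), and the helper `bits` is a hypothetical name used only to print results in the same bit-vector form.

```python
def reaching_definitions(blocks, pred, gen, kill):
    """Iterate in/out sets to a fixed point; returns (in_sets, out_sets)."""
    in_sets = {b: set() for b in blocks}
    out_sets = {b: set(gen[b]) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # in[B] is the union of out[p] over all predecessors p of B.
            in_sets[b] = set().union(*(out_sets[p] for p in pred[b]))
            new_out = gen[b] | (in_sets[b] - kill[b])
            if new_out != out_sets[b]:
                out_sets[b] = new_out
                changed = True
    return in_sets, out_sets


def bits(defs, n=7):
    """Render a set of definition numbers as a bit vector such as '111 0000'."""
    s = "".join("1" if i in defs else "0" for i in range(1, n + 1))
    return s[:3] + " " + s[3:]


blocks = ["B1", "B2", "B3", "B4"]
# Assumed edges: B1->B2, B2->B3, B2->B4, B3->B2, B4->B2.
pred = {"B1": [], "B2": ["B1", "B3", "B4"], "B3": ["B2"], "B4": ["B2"]}
gen = {"B1": {1, 2, 3}, "B2": {4, 5}, "B3": {6}, "B4": {7}}
kill = {"B1": {4, 5, 6, 7}, "B2": {1, 2, 7}, "B3": {3}, "B4": {1, 4}}

in_sets, out_sets = reaching_definitions(blocks, pred, gen, kill)
for b in blocks:
    print(b, bits(in_sets[b]), bits(out_sets[b]))
```

Under these assumptions the final sets agree with the Pass 2 columns of the example: for instance in[B2] is 111 1111 and out[B4] is 001 0111.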
17
Conservative Computation
• The computed gen set of reaching definitions is a superset of the exact gen set of reaching definitions
• The computed kill set of reaching definitions is a subset of the exact kill set of reaching definitions
• The computed in and out sets of reaching definitions are supersets of the exact in and out sets of reaching definitions
18
Local Data Flow Information
• The gen and kill sets for a basic block are obtained from the gen and kill sets for the statements in the basic block
• Only the in and out sets for the basic blocks are computed in the global data flow analysis
• The in and out sets for the statements in a basic block can be computed locally from the in set for the basic block if necessary
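One way to derive a block's gen and kill sets from its statements is to fold the per-statement sets in program order. The sketch below assumes every definition is unambiguous and is identified by a label; `block_gen_kill` and `all_defs` are names chosen here for illustration.

```python
def block_gen_kill(block_defs, all_defs):
    """Compute (gen, kill) for one basic block.

    block_defs: ordered (label, variable) pairs for the block's definitions.
    all_defs:   mapping label -> variable for every definition in the program.
    """
    gen, kill = set(), set()
    for label, var in block_defs:
        # Every other definition of the same variable is killed here.
        others = {d for d, v in all_defs.items() if v == var and d != label}
        # Earlier definitions of var in this block no longer reach the end.
        gen = (gen - others) | {label}
        # A definition generated later in the block is not killed by the block.
        kill = (kill - {label}) | others
    return gen, kill


all_defs = {"d1": "i", "d2": "j", "d3": "a", "d4": "i",
            "d5": "j", "d6": "a", "d7": "i"}

gen_b1, kill_b1 = block_gen_kill([("d1", "i"), ("d2", "j"), ("d3", "a")], all_defs)
gen_b2, kill_b2 = block_gen_kill([("d4", "i"), ("d5", "j")], all_defs)
print(sorted(gen_b1), sorted(kill_b1))  # ['d1', 'd2', 'd3'] ['d4', 'd5', 'd6', 'd7']
print(sorted(gen_b2), sorted(kill_b2))  # ['d4', 'd5'] ['d1', 'd2', 'd7']
```

For the running example this reproduces gen[B1] = 111 0000, kill[B1] = 000 1111, gen[B2] = 000 1100, and kill[B2] = 110 0001.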
19
UD-Chains and DU-Chains
• A variable is used at statement s if its r-value may be required
• The reaching definitions information is often stored as use-definition chains (or ud-chains)
• The ud-chain for a use u of a variable x is the list of all the definitions of x that reach u
• The definition-use chain (or du-chain) for a definition d of a variable x is the list of all the uses of x that use the value defined at d
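As a rough sketch, ud-chains for the uses inside one block can be built by walking the block's statements while tracking which definitions currently reach. The example uses block B2 of the running program with in[B2] taken to be all seven definitions; the function names and the triple format are assumptions made for this illustration, and the du-chains shown cover only the uses inside this one block.

```python
def ud_chains(block_stmts, in_set, def_var):
    """Map each use (label, variable) to the definitions of that variable reaching it.

    block_stmts: ordered (label, defined_var, used_vars) triples for one block.
    in_set:      definitions reaching the start of the block.
    def_var:     mapping definition label -> variable it defines.
    """
    reaching = set(in_set)
    chains = {}
    for label, target, uses in block_stmts:
        for var in uses:
            chains[(label, var)] = {d for d in reaching if def_var[d] == var}
        # The statement's own definition replaces earlier definitions of target.
        reaching = {d for d in reaching if def_var[d] != target} | {label}
    return chains


def du_chains(chains):
    """Invert ud-chains: map each definition to the uses it reaches."""
    du = {}
    for use, defs in chains.items():
        for d in defs:
            du.setdefault(d, set()).add(use)
    return du


def_var = {"d1": "i", "d2": "j", "d3": "a", "d4": "i",
           "d5": "j", "d6": "a", "d7": "i"}
b2 = [("d4", "i", ["i"]), ("d5", "j", ["j"])]

ud = ud_chains(b2, set(def_var), def_var)
du = du_chains(ud)
print(sorted(ud[("d4", "i")]))  # ['d1', 'd4', 'd7']
print(sorted(ud[("d5", "j")]))  # ['d2', 'd5']
```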
20
A Taxonomy of Data Flow Problems
            Forward-Flow                            Backward-Flow
Any path    in[B] = ∪_{p ∈ pred(B)} out[p]          out[B] = ∪_{s ∈ succ(B)} in[s]
            out[B] = gen[B] ∪ (in[B] - kill[B])     in[B] = gen[B] ∪ (out[B] - kill[B])
All paths   in[B] = ∩_{p ∈ pred(B)} out[p]          out[B] = ∩_{s ∈ succ(B)} in[s]
            out[B] = gen[B] ∪ (in[B] - kill[B])     in[B] = gen[B] ∪ (out[B] - kill[B])
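Because the four classes differ only in direction and in the confluence operator, a single solver can cover all of them. The following is a sketch under stated assumptions, not a definitive implementation: the function `solve`, its parameters, and the convention of passing a `boundary` block (initial or final) for all-path problems are all choices made here.

```python
def solve(blocks, edges, gen, kill, forward=True, any_path=True,
          universe=frozenset(), boundary=None):
    """Return (in_sets, out_sets) for one of the four problem classes.

    'before' is the side that information flows into (in[B] when forward,
    out[B] when backward); 'after' is the side it flows out of.
    """
    neighbours = {b: [] for b in blocks}
    for src, dst in edges:
        if forward:
            neighbours[dst].append(src)  # predecessors feed in[B]
        else:
            neighbours[src].append(dst)  # successors feed out[B]

    # Any-path problems start from the empty set, all-path ones from U.
    start = set() if any_path else set(universe)
    before = {b: set(start) for b in blocks}
    if boundary is not None:
        before[boundary] = set()
    after = {b: gen[b] | (before[b] - kill[b]) for b in blocks}

    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == boundary:
                continue
            if neighbours[b]:
                sets = [after[n] for n in neighbours[b]]
                before[b] = set.union(*sets) if any_path else set.intersection(*sets)
            new_after = gen[b] | (before[b] - kill[b])
            if new_after != after[b]:
                after[b] = new_after
                changed = True
    return (before, after) if forward else (after, before)


# Reaching definitions (forward, any path) on the four-block example.
blocks = ["B1", "B2", "B3", "B4"]
edges = [("B1", "B2"), ("B2", "B3"), ("B2", "B4"), ("B3", "B2"), ("B4", "B2")]
gen = {"B1": {1, 2, 3}, "B2": {4, 5}, "B3": {6}, "B4": {7}}
kill = {"B1": {4, 5, 6, 7}, "B2": {1, 2, 7}, "B3": {3}, "B4": {1, 4}}

rd_in, rd_out = solve(blocks, edges, gen, kill, forward=True, any_path=True)
print(sorted(rd_in["B2"]))  # [1, 2, 3, 4, 5, 6, 7]
```

Switching to `forward=False` swaps the roles of predecessors and successors, and `any_path=False` replaces union with intersection and starts from the universe instead of the empty set.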
21
Available Expressions
• An expression x+y is available at a point p if every path from the initial node to p evaluates x+y, and after the last such evaluation prior to reaching p, there are no subsequent assignments to x or y
22
An Example
[Figure: two flow graphs. Each contains t1 = 4 * i and t2 = 4 * i; one also contains a block with i = … followed by t0 = 4 * i, the other a block marked "?"]
23
The gen and kill Sets
• A block kills expression x+y if it may assign to x or y and does not subsequently reevaluate x+y
• A block generates expression x+y if it definitely evaluates x+y and does not subsequently redefine x or y
24
The gen Set for a Block
• No expressions are available at the beginning
• Assume set A of expressions is available before statement x = y+z. The set of expressions available after the statement is formed by
  – adding to A the expression y+z
  – deleting from A any expression involving x
• At the end, A is the set of generated expressions
25
The kill Set for a Block
• The kill set contains all expressions y+z such that either y or z is defined in the block and y+z is not generated by the block
26
An Example
Statements      Available Expressions
                none
a = b + c
                only b + c
b = a - d
                only a - d
c = b + c
                only a - d
d = a - d
                none
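The procedure for a block's gen set can be traced in a few lines of Python. This sketch assumes three-address statements written as (target, (left, op, right)); the function names are illustrative, and `expr_gen_kill` additionally takes a `universe` of expressions, which the slides do not spell out.

```python
def available_after_each(statements):
    """Return the set of available expressions after each statement,
    assuming nothing is available at the start of the block."""
    available = set()
    trace = []
    for target, expr in statements:
        available.add(expr)  # add y+z
        # Delete every expression that involves the assigned variable x.
        available = {e for e in available if target not in (e[0], e[2])}
        trace.append(set(available))
    return trace


def expr_gen_kill(statements, universe):
    """gen: what is available at the end; kill: expressions with a redefined
    operand that the block does not generate."""
    trace = available_after_each(statements)
    gen = trace[-1] if trace else set()
    defined = {target for target, _ in statements}
    kill = {e for e in universe
            if (e[0] in defined or e[2] in defined) and e not in gen}
    return gen, kill


stmts = [
    ("a", ("b", "+", "c")),
    ("b", ("a", "-", "d")),
    ("c", ("b", "+", "c")),
    ("d", ("a", "-", "d")),
]
trace = available_after_each(stmts)
for (target, expr), avail in zip(stmts, trace):
    print(target, "=", " ".join(expr), "->", sorted(avail))

gen, kill = expr_gen_kill(stmts, {("b", "+", "c"), ("a", "-", "d")})
```

The trace matches the example: only b + c, then only a - d twice, then nothing, so the block generates no expressions and kills both.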
27
The in and out Sets
in[B] = ∅, for B = initial
in[B] = ∩_{p ∈ pred(B)} out[p], for B ≠ initial
out[B] = gen[B] ∪ (in[B] - kill[B])
28
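Initialization of the in Sets
[Figure: flow graph in which B1 precedes B2 and B2 loops back to itself]
With I = in[B2], O = out[B2], G = gen[B2], and K = kill[B2]:
O_{j+1} = G ∪ (I_j - K)
I_{j+1} = out[B1] ∩ O_{j+1}
Starting from I_0 = ∅:
O_1 = G
I_1 = out[B1] ∩ G
O_2 = G
Starting from I_0 = U:
O_1 = U - K
I_1 = out[B1] - K
O_2 = G ∪ (out[B1] - K)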
29
Algorithm: Available Expressions
/* Assume in[B1] = ∅ and in[B] = U for all B ≠ B1 */
in[B1] := ∅;
out[B1] := gen[B1];
for each block B ≠ B1 do out[B] := U - kill[B];
change := true;
while change do begin
    change := false;
    for each block B ≠ B1 do begin
        in[B] := ∩_{p ∈ pred(B)} out[p];
        oldout := out[B];
        out[B] := gen[B] ∪ (in[B] - kill[B]);
        if out[B] ≠ oldout then change := true
    end
end
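A Python sketch of this algorithm makes the effect of the initialization visible. The `optimistic` flag is an addition for illustration only: when it is False, out[B] starts at gen[B] (the ∅-based start) instead of U - kill[B]. The two-block graph with a self-loop on B2 and the expressions "x+y" and "a*b" are made-up example data, and every non-initial block is assumed to have at least one predecessor.

```python
def available_expressions(blocks, pred, gen, kill, universe, initial,
                          optimistic=True):
    """Iterate the available-expressions equations to a fixed point."""
    in_sets = {initial: set()}
    out_sets = {initial: set(gen[initial])}
    for b in blocks:
        if b != initial:
            out_sets[b] = (universe - kill[b]) if optimistic else set(gen[b])
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == initial:
                continue
            # in[B] is the intersection of out[p] over all predecessors p.
            in_sets[b] = set.intersection(*(out_sets[p] for p in pred[b]))
            new_out = gen[b] | (in_sets[b] - kill[b])
            if new_out != out_sets[b]:
                out_sets[b] = new_out
                changed = True
    return in_sets, out_sets


blocks = ["B1", "B2"]
pred = {"B1": [], "B2": ["B1", "B2"]}   # B1 -> B2 and B2 -> B2
universe = {"x+y", "a*b"}
gen = {"B1": {"x+y"}, "B2": set()}
kill = {"B1": set(), "B2": {"a*b"}}

in_u, out_u = available_expressions(blocks, pred, gen, kill, universe, "B1")
in_e, out_e = available_expressions(blocks, pred, gen, kill, universe, "B1",
                                    optimistic=False)
print(sorted(in_u["B2"]))  # ['x+y']
print(sorted(in_e["B2"]))  # []
```

Starting from U keeps x+y available inside the loop, which corresponds to in[B2] = out[B1] - K; starting from the empty set yields out[B1] ∩ G, which is safe but discards x+y.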
30
Conservative Computation
• The computed gen set of available expressions is a subset of the exact gen set of available expressions
• The computed kill set of available expressions is a superset of the exact kill set of available expressions
• The computed in and out sets of available expressions are subsets of the exact in and out sets of available expressions
31
Live Variables
• A variable x is live at a point p if the value of x at p could be used along some path in the control flow graph starting at p; otherwise, x is dead at p
32
The def and use Sets
• def[B]: the set of variables definitely assigned values in B
• use[B]: the set of variables whose values are possibly used in B prior to any definition of the variable
33
The in and out Sets
out[B] = ∪_{s ∈ succ(B)} in[s]
in[B] = use[B] ∪ (out[B] - def[B])
34
Algorithm: Live Variables
/* Assume in[B] = ∅ for all B */
for each block B do in[B] := ∅;
change := true;
while change do begin
    change := false;
    for each block B do begin
        out[B] := ∪_{s ∈ succ(B)} in[s];
        oldin := in[B];
        in[B] := use[B] ∪ (out[B] - def[B]);
        if in[B] ≠ oldin then change := true
    end
end
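The same loop can be sketched in Python and applied to the four-block program used for reaching definitions. Several things here are assumptions made for the illustration: the successor lists, the def and use sets derived by hand from the statements, and the simplification that the conditions e1 and e2 reference no variables and that nothing is live on exit.

```python
def live_variables(blocks, succ, use, defs):
    """Iterate the live-variable equations to a fixed point."""
    in_sets = {b: set() for b in blocks}
    out_sets = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # out[B] is the union of in[s] over all successors s of B.
            out_sets[b] = set().union(*(in_sets[s] for s in succ[b]))
            new_in = use[b] | (out_sets[b] - defs[b])
            if new_in != in_sets[b]:
                in_sets[b] = new_in
                changed = True
    return in_sets, out_sets


blocks = ["B1", "B2", "B3", "B4"]
succ = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": ["B2"]}
use = {"B1": {"m", "n", "u1"}, "B2": {"i", "j"}, "B3": {"u2"}, "B4": {"u3"}}
defs = {"B1": {"i", "j", "a"}, "B2": {"i", "j"}, "B3": {"a"}, "B4": {"i"}}

in_sets, out_sets = live_variables(blocks, succ, use, defs)
for b in blocks:
    print(b, sorted(in_sets[b]), sorted(out_sets[b]))
```

Under these assumptions i is not live on entry to B4, because B4 overwrites it before any use, and a is never live, because no statement in the fragment reads it.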
35
Conservative Computation
• The computed use set of live variables is a superset of the exact use set of live variables
• The computed def set of live variables is a subset of the exact def set of live variables
• The computed in and out sets of live variables are supersets of the exact in and out sets of live variables
36
Busy Expressions
• An expression is busy at a point p if along all paths from p to the final node its value is used before the expression is killed
37
The use and kill Sets
• use[B]: the set of expressions that are used before they are killed in B
• kill[B]: the set of expressions that are killed before they are used in B
38
The in and out Sets
out[B] = ∅, for B = final
out[B] = ∩_{s ∈ succ(B)} in[s], for B ≠ final
in[B] = use[B] ∪ (out[B] - kill[B])
39
Algorithm: Busy Expressions
/* Assume out[Bn] = ∅ and out[B] = U for all B ≠ Bn */
out[Bn] := ∅;
in[Bn] := use[Bn];
for each block B ≠ Bn do in[B] := U - kill[B];
change := true;
while change do begin
    change := false;
    for each block B ≠ Bn do begin
        out[B] := ∩_{s ∈ succ(B)} in[s];
        oldin := in[B];
        in[B] := use[B] ∪ (out[B] - kill[B]);
        if in[B] ≠ oldin then change := true
    end
end
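A Python sketch of this backward, all-path algorithm follows. The diamond-shaped graph (B1 branches to B2 and B3, which both reach the final block B4) and the expressions "a+b" and "c*d" are made-up example data, and every non-final block is assumed to have at least one successor.

```python
def busy_expressions(blocks, succ, use, kill, universe, final):
    """Iterate the busy-expressions equations to a fixed point."""
    out_sets = {final: set()}
    in_sets = {final: set(use[final])}
    for b in blocks:
        if b != final:
            in_sets[b] = universe - kill[b]
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == final:
                continue
            # out[B] is the intersection of in[s] over all successors s.
            out_sets[b] = set.intersection(*(in_sets[s] for s in succ[b]))
            new_in = use[b] | (out_sets[b] - kill[b])
            if new_in != in_sets[b]:
                in_sets[b] = new_in
                changed = True
    return in_sets, out_sets


blocks = ["B1", "B2", "B3", "B4"]
succ = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}
universe = {"a+b", "c*d"}
# B2 uses a+b and kills c*d; B3 uses both expressions.
use = {"B1": set(), "B2": {"a+b"}, "B3": {"a+b", "c*d"}, "B4": set()}
kill = {"B1": set(), "B2": {"c*d"}, "B3": set(), "B4": set()}

in_sets, out_sets = busy_expressions(blocks, succ, use, kill, universe, "B4")
print(sorted(out_sets["B1"]))  # ['a+b']
```

Only a+b is busy at the end of B1: it is used on both branches, whereas c*d is killed before any use on the path through B2.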
40
Conservative Computation
• The computed use set of busy expressions is a subset of the exact use set of busy expressions
• The computed kill set of busy expressions is a superset of the exact kill set of busy expressions
• The computed in and out sets of busy expressions are subsets of the exact in and out sets of busy expressions