Cse322, Programming Languages and Compilers
Lecture #16, May 31, 2007
• Data flow equations
• Available expressions
• Live variables
• Solving data flow equations
• Constant propagation
• Using associativity and commutativity
• Computing dominators
Assignments
• Reminders
– Project #3 is now due Friday, June 7, 2007 at 5:00 PM
» In order to get the course graded, there will be no extensions
– Final exam will be Tuesday, June 12
• Possibility
– Next Tuesday, June 5, limited lecture, and I answer questions in class about the project.
• Project Three Info
– The template has been posted since Wed.
• I have finished grading project 1. But I am missing projects from several people.
Notes about using GCC as an assembler
• Since I last taught this course, 64-bit computers have become much more common.
• Compiling in native 64-bit isn’t always compatible with the strategy we discussed earlier.
– gcc -m32 myfunc.s runtime.c
– I have successfully done this using a dosbox and gcc 3.2.3. This is an emulator set up to emulate a 386 (can you say very old?) architecture. The professor was using CYGWIN on a Windows machine, i.e., very similar to the dosbox.
– OK, that was easier than I thought it would be. Passing "-m32" does the trick for gcc on my x86_64 Ubuntu system, but only after installing 32-bit runtime libs (sudo apt-get install libc6-dev-i386). Fortunately, it looks like the 32-bit libs are installed on the linuxlab boxes.
• The underscore convention can be controlled using the following strategy.
– In the file Phase3.sml, use one of the following
– fun globalName nm = nm;
– fun globalName nm = "_"^nm;
– to control whether the underscore appears.
Data-Flow Equations
• Compute certain sets for each basic block.
• Examples: set of
– available expressions when entering and leaving a block
– variable definitions (assignments) reaching and leaving a block
– live variables on block entry and exit
– etc.
• The sets are computed by solving some recursive equations on the flow graph.
• There are different equations for each kind of set.
• After computing the sets for each block, it is (relatively) easy to propagate the information to each point inside a block.
Available expressions
• Used to eliminate common subexpressions (inside basic blocks this was done using DAGs).
• Definitions: an expression is generated when it is computed and killed when any of its operands is changed.
• An expression is available at a point p if it has been generated but not killed before control reaches p.
• Inside a basic block, the control flow is sequential (no jumps) so it is easy to propagate Available set information.
Available Expressions (cont.)
Example
(1) a := b+c ; generate E1=b+c
(2) d := a*e ; generate E2=a*e
(3) b := c+d ; kill E1 (but not E2!),
; generate E3=c+d
(4) e := e+1 ; kill E2, generate
; and then kill E4 = e+1
• Among the expressions generated inside this block, only E3 is available at the end of the block; E3 is generated by the block.
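The bookkeeping in this example can be sketched in a few lines of Standard ML. This is only an illustration: the record type stmt, the triple type expr, and the function names are assumptions made for the sketch, not part of the course IR. Each statement first generates its right-hand side and then kills every expression that mentions the assigned variable.

```sml
(* Hypothetical statement form "dst := l oper r"; not the course IR. *)
type stmt = {dst : string, l : string, oper : string, r : string}
type expr = string * string * string   (* (left operand, operator, right operand) *)

(* An expression mentions v if v is one of its operands. *)
fun mentions v ((l, _, r) : expr) = (l = v) orelse (r = v)

(* One statement: generate its expression, then kill every expression
   that uses the assigned variable (possibly the new one as well). *)
fun step ({dst, l, oper, r} : stmt, avail : expr list) =
  let val e = (l, oper, r)
      val withNew = if List.exists (fn x => x = e) avail then avail else avail @ [e]
  in List.filter (fn x => not (mentions dst x)) withNew end

(* Expressions generated by the block and still available at its end. *)
fun genBlock (stmts : stmt list) = List.foldl step [] stmts

val block =
  [ {dst = "a", l = "b", oper = "+", r = "c"},
    {dst = "d", l = "a", oper = "*", r = "e"},
    {dst = "b", l = "c", oper = "+", r = "d"},
    {dst = "e", l = "e", oper = "+", r = "1"} ]

val result = genBlock block   (* only c+d survives, as on the slide *)
```

Generating before killing is what makes e := e+1 produce and immediately discard E4.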
Available expressions (cont.)
• Example: suppose E=a+b is generated somewhere in the program
(1) b := 5 ;kill E
• E is killed by some instruction inside the block; we will say that E is killed by the block.
• Definitions:
• genAv[i] = the set of expressions generated by block i.
• killAv[i] = the set of expressions killed by block i.
• inAv[i] = the set of expressions available at the beginning of block i.
• outAv[i] = the set of expressions available at the end of block i.
Available expressions (cont.)
• Equations (forward-flow, all-path):
• inAv[i] = ∩ outAv[k], taken over all predecessors k of i
• outAv[i] = genAv[i] ∪ (inAv[i] - killAv[i])
• inAv[1] = { }
• At compile time we can only compute an approximation of the “real” (run-time) in and out.
• To be safe, in this case “approximation” means a “subset”.
• The above equations have many solutions (e.g., all empty) and they are all safe approximations.
• To be useful we want the largest solution.
[Figure: predecessor blocks k1 . . . kn, each with an edge into block i]
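A minimal sketch of solving the available-expression equations for the largest solution, assuming sets are represented as duplicate-free lists and each block is a tuple (id, predecessors, gen, kill). The three-block graph and all function names are made up for illustration. Every out set starts as the set of all expressions and shrinks until nothing changes.

```sml
fun member x s = List.exists (fn y => y = x) s
fun union (a, b) = List.foldl (fn (x, acc) => if member x acc then acc else acc @ [x]) a b
fun inter (a, b) = List.filter (fn x => member x b) a
fun diff (a, b) = List.filter (fn x => not (member x b)) a
fun sameSet (a, b) = null (diff (a, b)) andalso null (diff (b, a))

(* out set currently recorded for block k *)
fun lookup outs k =
  case List.find (fn (j, _) => j = k) outs of SOME (_, s) => s | NONE => []

fun solveAvail universe entry blocks =
  let
    (* one pass: recompute every out[i] from the previous out sets *)
    fun pass outs =
      map (fn (i, preds, gen, kill) =>
             let val inSet =
                   if i = entry then []        (* inAv[entry] = { } *)
                   else List.foldl (fn (k, acc) => inter (acc, lookup outs k)) universe preds
             in (i, union (gen, diff (inSet, kill))) end) blocks
    fun iterate outs =
      let val next = pass outs
      in if ListPair.all (fn ((_, a), (_, b)) => sameSet (a, b)) (outs, next)
         then next else iterate next end
  in iterate (map (fn (i, _, _, _) => (i, universe)) blocks) end

(* Hypothetical graph: 1 -> 2, 2 -> 3, 3 -> 2; block 3 assigns a, killing a+b. *)
val blocks =
  [ (1, [],     ["a+b"], []),
    (2, [1, 3], ["c+d"], []),
    (3, [2],    [],      ["a+b"]) ]
val outAv = solveAvail ["a+b", "c+d"] 1 blocks
```

In this graph a+b is not available on entry to block 2, because the path through block 3 kills it.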
Live variables (again)
• A variable is live at a certain point if its value at that point could be used later in the program; otherwise it is dead.
• usedlv[i] = the set of variables used before they are assigned a new value in block i.
• killlv[i] = the set of variables defined (assigned) in block i.
• inlv[i] = the set of live variables at the beginning of block i.
• outlv[i] = the set of live variables at the end of block i.
Live variables (cont.)
• Example
• The inlv-set of the following block is {b,c,e}.
(1) a := b+c ; use b,c; kill a; live: a,c,e
(2) d := a*e ; use a,e; kill d; live: c,d,e
(3) b := c+d ; use c,d; kill b; live: e
(4) e := e+1 ; use e; kill e; live:
(5) end
b is no longer live, since it will not be used again
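The "live:" annotations above can be reproduced by scanning the block backwards. The sketch below assumes a statement is a pair of the assigned variable and the variables read on the right-hand side, and that nothing is live at the end of the block; both are assumptions of the sketch, not the course IR.

```sml
(* Hypothetical form: (assigned variable, variables read on the right-hand side) *)
type stmt = string * string list

fun member x s = List.exists (fn y => y = x) s
fun union (a, b) = List.foldl (fn (x, acc) => if member x acc then acc else acc @ [x]) a b
fun remove x s = List.filter (fn y => y <> x) s

(* Backward walk: live-before = uses U (live-after - {dst}).
   foldr visits the last statement first. *)
fun liveIn (stmts : stmt list) (liveOut : string list) =
  List.foldr (fn ((dst, uses), live) => union (uses, remove dst live)) liveOut stmts

val block = [("a", ["b", "c"]), ("d", ["a", "e"]), ("b", ["c", "d"]), ("e", ["e"])]
val result = liveIn block []   (* the set {b,c,e} *)
```

Removing dst before adding the uses matters for e := e+1, where e is both killed and used.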
Live variables (cont.)
Equations (backward-flow, any-path):
• outlv[i] = ∪ inlv[k], taken over all successors k of i
• inlv[i] = usedlv[i] ∪ (outlv[i] - killlv[i])
• Note: there are no initial conditions.
• Here we want the approximations to be supersets of the run-time sets.
• To be useful we want the smallest solutions.
[Figure: block i with an edge to each successor block k1 . . . kn]
Solving Data-Flow Equations
• Arbitrary control flow can be achieved by composing three structures:
– sequence (;)
– conditional (if-then-else)
– loop (while-do)
• Disadvantage: may lead to duplication of code.
• The flow graph of structured programs has special properties.
• Solving data-flow equations on these graphs can be reduced to computing attributes.
• In general a more complicated scheme is necessary.
Solving Data-Flow Equations (cont.)
• The general method for solving data-flow equations is by iteration (sometimes called fixpoint iteration).
• General algorithm:
1. Compute local information (i.e., gen/used, kill) at each basic block.
2. For each block i, assign in[i] and out[i] some initial values.
3. Use the recursive equations as assignment statements to compute new values for in[i] and out[i].
4. If the new values are the same as the old ones then stop, otherwise go to step 3.
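Steps 3 and 4 amount to a generic fixpoint loop. The sketch below is one way to write it in Standard ML; fixpoint, grow, and the small edge list are hypothetical names and data chosen for illustration. The example grows a set of reachable nodes from a small initial value until an iteration adds nothing new.

```sml
fun member x s = List.exists (fn y => y = x) s
fun union (a, b) = List.foldl (fn (x, acc) => if member x acc then acc else acc @ [x]) a b

(* Apply step repeatedly until same reports that nothing changed. *)
fun fixpoint same step x =
  let val next = step x
  in if same (x, next) then x else fixpoint same step next end

(* Hypothetical graph; grow adds the successors of every node already in the set. *)
val edges = [(1, 2), (2, 3), (3, 2), (4, 1)]
fun grow s =
  union (s, map (fn (_, b) => b) (List.filter (fn (a, _) => member a s) edges))

(* The set only ever grows, so comparing sizes is enough to detect a fixpoint. *)
val reach = fixpoint (fn (a, b) => length a = length b) grow [1]
```

The loop terminates because the set can only grow and the number of nodes is finite; the same argument applies to the data-flow sets.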
Solving Data-Flow Equations (cont.)
• In general:
• To compute the smallest solution start with “small” initial values (e.g., empty sets); at each iteration obtain a larger new value.
• To compute the largest solution start with “large” initial values (e.g., “total” sets); at each iteration obtain a smaller new value.
Solving Data-Flow Equations (cont.)
• Example: live variables
[Figure: flow graph with four basic blocks]
Block 1: (1) a := 0
Block 2: (2) b := a+1  (3) c := a+b
Block 3: (4) print a
Block 4: (5) d := c+1  (6) a := c+d
Solving Data-Flow Equations (cont.)
• Local information:
kill[1] = {a} used[1] = { }
kill[2] = {b,c} used[2] = {a}
kill[3] = { } used[3] = {a}
kill[4] = {d,a} used[4] = {c}
• Initial values (want the smallest solution):
in[i] = out[i] = { }, i=1,2,3,4.
Solving Data-Flow Equations (cont.)
• Iteration 1:
out[i] = { }, i=1,2,3,4
in[1] = { }
in[2] = {a}
in[3] = {a}
in[4] = {c}
• Iteration 2:
out[1] = {a}
out[2] = {a,c}
out[3] = { }
out[4] = {a}
in[1] = { }
in[2] = {a}
in[3] = {a}
in[4] = {c}
• There will be no more changes
outlv[i] = ∪ inlv[k], taken over all successors k of i
inlv[i] = usedlv[i] ∪ (outlv[i] - killlv[i])
[Figure: the flow graph of the example, blocks 1 to 4]
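The two iterations can be replayed mechanically. The sketch below uses the used and kill sets from the slides; the edges 1 -> 2, 2 -> 3, 2 -> 4 and 4 -> 2 are an assumption, inferred from the out sets shown in iteration 2 because the graph itself did not survive in this transcript. Sets are duplicate-free lists and all function names are hypothetical.

```sml
fun member x s = List.exists (fn y => y = x) s
fun union (a, b) = List.foldl (fn (x, acc) => if member x acc then acc else acc @ [x]) a b
fun diff (a, b) = List.filter (fn x => not (member x b)) a
fun sameSet (a, b) = null (diff (a, b)) andalso null (diff (b, a))

(* (block id, successors, used, kill); the successor lists are assumed *)
val blocks =
  [ (1, [2],    [],    ["a"]),
    (2, [3, 4], ["a"], ["b", "c"]),
    (3, [],     ["a"], []),
    (4, [2],    ["c"], ["d", "a"]) ]

fun lookup ins k =
  case List.find (fn (j, _) => j = k) ins of SOME (_, s) => s | NONE => []

(* One pass: out[i] is the union of the successors' in sets, then in[i]. *)
fun pass ins =
  map (fn (i, succs, used, kill) =>
         let val out = List.foldl (fn (k, acc) => union (acc, lookup ins k)) [] succs
         in (i, union (used, diff (out, kill))) end) blocks

fun solve ins =
  let val next = pass ins
  in if ListPair.all (fn ((_, a), (_, b)) => sameSet (a, b)) (ins, next)
     then next else solve next end

(* Smallest solution: start from empty sets. *)
val liveIn = solve (map (fn (i, _, _, _) => (i, [])) blocks)

fun liveOut i =
  case List.find (fn (j, _, _, _) => j = i) blocks of
      SOME (_, succs, _, _) =>
        List.foldl (fn (k, acc) => union (acc, lookup liveIn k)) [] succs
    | NONE => []
```

With these edges the in sets stabilise after the second pass, matching the slide.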
Solving Data-Flow Equations (cont.)
• Using the live variables information:
• b is not in out[2] so it is a local variable in block 2. It does not need to be preserved when control leaves block 2.
• d is local in block 4.
Constant Propagation
• Consider
x <- 0
y <- z * x
w <- y+4
q <- w-1
• We know quite a bit about what is going on, even before we execute the first statement.
x is 0
y is also 0 (why?)
w is 4
q is 3
Constant propagation as a dataflow problem
• General algorithm:
1. Compute local information at each basic block.
2. For each block i, assign in[i] and out[i] some initial values.
3. Use the recursive equations as assignment statements to compute new values for in[i] and out[i].
4. If the new values are the same as the old ones then stop, otherwise go to step 3.
• In constant propagation the blocks are individual statements in the IR
• The information is a constant for each variable (or unknown)
• Example information set = { (a,3), (b,4), (c,⊥) }, where ⊥ means unknown or not a constant
Equations
• constants(n) = ∧ Fp(constants(p)), taken over all p ∈ preds(n)
• Here ∧ is the pairwise meet of two information elements, (a,x) ∧ (a,y)
• And Fp is a function specific to each instruction that is explained two slides later.
The Meet
• (a,x) ∧ (a,y) = if x=y then (a,x) else (a,⊥), where ∧ is the meet and ⊥ means unknown
• Note that
– (a,x) ∧ (a,⊥) = (a,⊥)
– (a,⊥) ∧ (a,y) = (a,⊥)
• For a variable a, the meet of two constants is the constant if they are both the same constant, and unknown otherwise.
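The meet is small enough to write out directly. The sketch below assumes the representation used later in the lecture, where a known constant is SOME of a string and unknown is NONE; meetInfo, which lifts the meet to two information sets, is a hypothetical helper that treats a variable missing from the second set as unknown.

```sml
(* NONE plays the role of "unknown or not a constant". *)
fun meet (SOME x, SOME y) = if x = y then SOME x else NONE
  | meet _ = NONE

(* Pairwise meet of two information sets stored as association lists. *)
fun meetInfo (p, q) =
  map (fn (v, c) =>
         case List.find (fn (w, _) => w = v) q of
             SOME (_, d) => (v, meet (c, d))
           | NONE => (v, NONE)) p

val merged =
  meetInfo ([("a", SOME "3"), ("b", SOME "4")],
            [("a", SOME "3"), ("b", SOME "5")])
```

Here a keeps its constant because both sides agree, while b becomes unknown.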
Instruction specific function Fp
Each instruction has its own Fp
• The Fp tracks the effect of that instruction on constants
• x <- y
F(x <- y) (p) = if p has the form {(x,c1),(y,c2), … }
then (p - {(x,c1)}) ∪ {(x,c2)}
• x <- y `op` z
F(x <- y `op` z) (p) = if p has the form {(x,c1),(y,c2),(z,c3), … }
then (p - {(x,c1)}) ∪ {(x,c2 `op` c3)}
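A rough sketch of the two transfer functions, under simplifying assumptions that differ from the course code: variables are plain strings, constants are int option with NONE for unknown, and the datatype instr and all function names are hypothetical. When an operand is unknown, the sketch marks the target as unknown, which the slide does not spell out.

```sml
datatype instr = Copy of string * string                    (* x <- y *)
               | Bin of string * string * string * string   (* x <- y op z *)
type info = (string * int option) list

fun lookup (p : info) v =
  case List.find (fn (w, _) => w = v) p of SOME (_, c) => c | NONE => NONE

(* (p - {(x,c1)}) U {(x,c)} *)
fun update (p : info) v c : info = (v, c) :: List.filter (fn (w, _) => w <> v) p

fun apply ("+", a : int, b) = SOME (a + b)
  | apply ("-", a, b) = SOME (a - b)
  | apply ("*", a, b) = SOME (a * b)
  | apply _ = NONE                       (* operator not modelled in this sketch *)

fun transfer (Copy (x, y)) p = update p x (lookup p y)
  | transfer (Bin (x, y, oper, z)) p =
      (case (lookup p y, lookup p z) of
           (SOME c2, SOME c3) => update p x (apply (oper, c2, c3))
         | _ => update p x NONE)

val p0 : info = [("y", SOME 2), ("z", SOME 5)]
val p1 = transfer (Bin ("x", "y", "*", "z")) p0   (* x becomes 10 *)
val p2 = transfer (Copy ("w", "x")) p1            (* w becomes 10 *)
val p3 = transfer (Bin ("y", "q", "+", "z")) p2   (* q is unknown, so y is too *)
```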
Operation specific rules
• If an operation has an identity element or an erasure element, then we can use special rules
• x <- n + 0
• x <- n * 0
• x <- n * 1
x <- 0 {(x,0)}
y <- z * x {(x,0),(y,z*0)}={(x,0),(y,0)}
w <- y+4 {(x,0),(y,0),(w,4)}
q <- w-1 {(x,0),(y,0),(w,4),(q,3)}
x <- z*3 {(y,0),(w,4),(q,3)}
Once the constants are computed the code can be specialized.
x <- 0
y <- 0
w <- 4
q <- 3
x <- z*3
ML code
• Strategy
– Simplify IR expressions under a mapping of variables to constants.
– Build such a mapping by a data flow analysis.
– Rewrite each IR statement using constant information.
• Complication
– In IR1 constants are stored as strings.
– We need to convert back and forth to integers to do simplification.
exception BadConst of string   (* assumed; the slide does not show its declaration *)
fun constInt "true" = 1
  | constInt "false" = 0
  | constInt s = (case Int.fromString s of
                    SOME n => n
                  | NONE => raise (BadConst s))
val toString = Int.toString;
Optimizing constructions
• We normally build an addition expression by
– plus x y = BINOP(ADD,x,y)
• Use knowledge of ADD and constants to perform some addition at IR.EXP build time.
fun plus(CONST(n,Int),CONST(m,Int))
= CONST(toString(constInt n +
constInt m),Int)
| plus(CONST("0",Int),n) = n
| plus(n,CONST("0",Int)) = n
| plus(x,y) = BINOP(ADD,x,y)
SUB and MUL
fun minus(CONST(n,Int),CONST(m,Int))
      = CONST(toString(constInt n - constInt m),Int)
  | minus(n,CONST("0",Int)) = n
  | minus(x,y) = BINOP(SUB,x,y)
fun times(CONST(n,Int),CONST(m,Int))
      = CONST(toString(constInt n * constInt m),Int)
  | times(CONST("0",Int),n) = CONST("0",Int)
  | times(n,CONST("0",Int)) = CONST("0",Int)
  | times(CONST("1",Int),n) = n
  | times(n,CONST("1",Int)) = n
  | times(x,y) = BINOP(MUL,x,y)
Strategy
• Rebuild every EXP by using the optimizing constructors.
fun simp env (BINOP(MUL,x,y))
= times(simp env x,simp env y)
| simp env (BINOP(ADD,x,y))
= plus(simp env x,simp env y)
| simp env (BINOP(SUB,x,y))
= minus(simp env x, simp env y)
| simp env (BINOP(m,x,y))
= BINOP(m,simp env x, simp env y)
| simp env x = x;
Using associativity and commutativity
• Consider
– V1 * (3 * (P2 * 6))
– P2 + 0 + T3 + 4
• The strategy on the previous page won’t work. Why?
• But the equivalent expressions
– V1 * P2 * 3 * 6
– P2 + T3 + 0 + 4
will work.
• How do we arrange this?
• Flatten into a list
– [V1, 3, P2, 6]
– [P2, 0, T3, 4]
• Rearrange the list
– [V1, P2, 3, 6]
– [P2, T3, 0, 4]
• Then apply the optimizing constructors from right to left
flattening to a list
fun flatADD (BINOP(ADD,x,y)) =
flatADD x @ flatADD y
| flatADD x = [x]
fun flatMUL (BINOP(MUL,x,y)) =
flatMUL x @ flatMUL y
| flatMUL x = [x]
Rearranging and optimizing
fun split xs =
let fun constp (CONST(_,Int)) = true
| constp x = false
in (List.filter (Bool.not o constp) xs)
@
(List.filter constp xs) end;
fun useOper oper unit [x] = x
| useOper oper unit [] = unit
| useOper oper unit (x::xs)
= oper(x,useOper oper unit xs);
Note the order of applying oper: it nests to the right, so the constants at the end of the list are combined first.
Simplifying
fun simp env (BINOP(MUL,x,y)) =
      useOper times ONE (split (flatMUL (BINOP(MUL,simp env x,simp env y))))
  | simp env (BINOP(ADD,x,y)) =
      useOper plus ZERO (split (flatADD (BINOP(ADD,simp env x,simp env y))))
  | simp env (BINOP(SUB,x,y)) = minus(simp env x, simp env y)
  | simp env (BINOP(m,x,y)) = BINOP(m,simp env x, simp env y)
  | simp env (v as (VAR _ | TEMP _ | PARAM _)) =
      (case List.find (fn (a,b) => a=v) env of
         NONE => v
       | SOME(_,SOME m) => CONST(m,Int)
       | SOME(_,NONE) => v)
  | simp env x = x;
note, we look variables up!
Constant propagation
• Create and propagate a mapping of variables bound to constants.
• Use the data flow equations
– constants(n) = ∧ Fp(constants(p)), taken over all p ∈ preds(n), where ∧ is the meet
• In a sequence of straight-line code, like we have in a basic block, every node has exactly one predecessor, except the first, which has none.
• We will use a list of pairs to represent the constants function: (EXP * string option) list
fun overRide v m [] = [(v,m)]
| overRide v m ((x,y)::zs) =
if v=x then ((x,m)::zs)
else (x,y)::overRide v m zs
4 interesting cases
fun constProp info [] = []
  | constProp info (x::xs) =
    (case x of
       MOVE(v as (VAR _ | TEMP _ | PARAM _), n as CONST(m,Int)) =>
         let val info2 = overRide v (SOME m) info
         in (x,info2) :: constProp info2 xs end
     | MOVE(v as (VAR _ | TEMP _ | PARAM _), u as (VAR _ | TEMP _ | PARAM _)) =>
         (case List.find (fn (a,b) => a=u) info of
            SOME(_,w) => let val info2 = overRide v w info
                         in (x,info2) :: constProp info2 xs end
          | NONE => let val info2 = overRide v NONE info
                    in (x,info2) :: constProp info2 xs end)
     | MOVE(v as (VAR _ | TEMP _ | PARAM _), u as BINOP(_,_,_)) =>
         (case simp info u of
            (new as CONST(m,Int)) =>
              let val info2 = overRide v (SOME m) info
              in (MOVE(v,new),info2) :: constProp info2 xs end
          | new => let val info2 = overRide v NONE info
                   in (MOVE(v,new),info2) :: constProp info2 xs end)
     | _ => (x,info) :: constProp info xs)
x <- 5
Move a constant into a variable
MOVE(v as(VAR _ | TEMP _ | PARAM _)
,n as CONST(m,Int)) =>
let val info2 = overRide v (SOME m) info
in (x,info2) :: constProp info2 xs end
x <- y
• Move a variable into another variable
MOVE(v as(VAR _ | TEMP _ | PARAM _)
,u as(VAR _ | TEMP _ | PARAM _)) =>
(case List.find (fn (a,b) => a=u) info of
SOME(_,w) =>
let val info2 = overRide v w info
in (x,info2)::constProp info2 xs end
| NONE =>
let val info2 = overRide v NONE info
in (x,info2)::constProp info2 xs end)
x <- y + 3
• Move an expression into a variable
MOVE(v as(VAR _ | TEMP _ | PARAM _), u as BINOP(_,_,_)) =>
  (case simp info u of
     (new as CONST(m,Int)) =>
       let val info2 = overRide v (SOME m) info
       in (MOVE(v,new),info2) :: constProp info2 xs end
   | new =>
       let val info2 = overRide v NONE info
       in (MOVE(v,new),info2) :: constProp info2 xs end)
Example
T0 := 0
T1 := (V5 * T0)
T2 := T1 + 4
T3 := T2 - 1
T0 := (V5 * 3)
T0 := 0 % T0=0
T1 := 0 % T0=0, T1=0
T2 := 4 % T0=0, T1=0, T2=4
T3 := 3 % T0=0, T1=0, T2=4, T3=3
T0 := (V5 * 3) % T1=0, T2=4, T3=3