Machine-Independent Optimizations CS308 Compiler Theory 1

Upload: eustace-morgan

Post on 22-Dec-2015


Code optimization

• Elimination of unnecessary instructions

• Replacement of one sequence of instructions by a faster sequence of instructions

• Local optimization

• Global optimizations

– based on data-flow analyses


The Principal Sources of Optimization

• Optimization

– Preserves the semantics of the original program

– Applies relatively low-level semantic transformations


Causes of Redundancy

• Redundant operations are

– sometimes present at the source level

– more often a side effect of having written the program in a high-level language

• Each high-level data-structure access expands into a number of low-level arithmetic operations

• Programmers are not aware of these low-level operations and cannot eliminate the redundancies themselves.

• By having a compiler eliminate the redundancies

– the programs are both efficient and easy to maintain.


A Running Example: Quicksort


Semantics-Preserving Transformations

• A number of ways in which a compiler can improve a program without changing the function it computes

– Common-subexpression elimination

– Copy propagation

– Dead-code elimination

– Constant folding


Common Subexpressions

• An occurrence of an expression E is a common subexpression if

– E was previously computed, and

– the values of the variables in E have not changed since the previous computation

• Local:
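As a minimal sketch of the local case, the Python function below removes common subexpressions inside a single basic block. The tuple format `(dest, op, arg1, arg2)` and the name `local_cse` are assumptions made for this illustration; they do not come from the slides.

```python
def local_cse(block):
    """Replace repeated computations in one basic block by copies.

    `block` is a list of (dest, op, arg1, arg2) tuples, a hypothetical
    three-address format with binary operators only.
    A reused value is emitted as (dest, "=", holder, None).
    """
    available = {}  # (op, arg1, arg2) -> variable that currently holds the value
    result = []
    for dest, op, a, b in block:
        key = (op, a, b)
        holder = available.get(key)
        if holder is not None:
            # Previously computed and operands unchanged: reuse the value.
            result.append((dest, "=", holder, None))
        else:
            result.append((dest, op, a, b))
        # Assigning dest invalidates every expression that reads dest
        # and every expression whose value was held in dest.
        available = {k: v for k, v in available.items()
                     if dest not in (k[1], k[2]) and v != dest}
        # Record the new expression unless dest is one of its own operands.
        if holder is None and dest not in (a, b):
            available[key] = dest
    return result


block = [("t6", "*", 4, "i"), ("t7", "*", 4, "i"),
         ("t8", "*", 4, "j"), ("t10", "*", 4, "j")]
optimized = local_cse(block)
```

Here `t7` and `t10` become copies of `t6` and `t8`. A block that reassigns an operand between the two computations is left unchanged.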


Common Subexpressions

• Global


Copy Propagation

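• Copy statements, or copies, are assignments of the form u = v

– Copy propagation uses v in place of u, wherever possible, after the copy statement u = v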


Dead-Code Elimination

• Live variable

– A variable is live at a point in a program if its value can be used subsequently;

– otherwise, it is dead at that point.

• Constant folding

– Deducing at compile time that the value of an expression is a constant and using the constant instead
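Constant folding can be sketched in a few lines of Python. The nested-tuple expression format `(op, left, right)` and the names `OPS` and `fold` are assumptions made for this example, not part of the course material.

```python
import operator

# Hypothetical operator table for the sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(expr):
    """Fold constant subexpressions of a nested (op, left, right) tuple.

    Leaves are ints (constants) or strs (variable names).
    """
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        # Both operands are known at compile time: use the constant instead.
        return OPS[op](left, right)
    return (op, left, right)


partly_constant = fold(("*", ("+", 2, 3), "x"))   # (2 + 3) * x
fully_constant = fold(("-", ("*", 4, 2), 2))      # 4 * 2 - 2
```

The first expression folds to `("*", 5, "x")` because `x` is unknown at compile time; the second folds all the way to the constant `6`.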


Code Motion

• An important modification that decreases the amount of code in a loop

• Loop-invariant computation

– An expression that yields the same result independent of the number of times a loop is executed

• Code motion moves a loop-invariant computation to just before its loop

• Example: limit-2 is loop-invariant, provided the loop body does not change limit

Before:

while (i <= limit-2)

After:

t = limit-2

while (i <= t)

Induction Variables and Reduction in Strength

• Induction variable

– A variable x is an induction variable if there is a positive or negative constant c such that each time x is assigned, its value increases by c

• Induction variables can be computed with a single increment (addition or subtraction) per loop iteration

• Strength reduction

– The transformation of replacing an expensive operation, such as multiplication, by a cheaper one, such as addition

• Induction variables make it possible to

– apply strength reduction

– eliminate computation: often all but one of a group of induction variables whose values remain in lock step can be removed
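A small Python sketch of strength reduction on an induction variable follows. The loop shape (j decreasing by 1, t4 = 4 * j) and the function names are assumptions chosen for illustration.

```python
def offsets_with_multiply(start, stop):
    """Compute t4 = 4 * j inside the loop, one multiplication per iteration."""
    result = []
    j = start
    while j > stop:
        j = j - 1
        t4 = 4 * j
        result.append(t4)
    return result


def offsets_with_addition(start, stop):
    """Same values after strength reduction: t4 decreases by 4 whenever j decreases by 1."""
    result = []
    j = start
    t4 = 4 * j           # computed once, before the loop
    while j > stop:
        j = j - 1
        t4 = t4 - 4      # replaces t4 = 4 * j
        result.append(t4)
    return result
```

Both j and t4 are induction variables that change in lock step, so the multiplication in the loop body can be replaced by a subtraction without changing the values produced.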


Now we have:

Inside-out

Test yourself

• E-9.1.1


Data-Flow Analysis

• Techniques that derive information about the flow of data along program execution paths

• Examples

– One way to implement global common-subexpression elimination requires us to determine whether two identical expressions evaluate to the same value along any possible execution path of the program.

– If the result of an assignment is not used along any subsequent execution path, then we can eliminate the assignment as dead code.


The Data-Flow Abstraction

• Execution paths

– Within one basic block, the program point after a statement is the same as the program point before the next statement.

– If there is an edge from block B1 to block B2, then the program point after the last statement of B1 may be followed immediately by the program point before the first statement of B2.

• Define an execution path from point $P_1$ to point $P_n$ to be a sequence of points $P_1, P_2, \ldots, P_n$ such that for each $i = 1, 2, \ldots, n-1$, either

1. $P_i$ is the point immediately preceding a statement and $P_{i+1}$ is the point immediately following that same statement, or

2. $P_i$ is the end of some block and $P_{i+1}$ is the beginning of a successor block.

• Reaching definitions

– The definitions that may reach a program point along some path


The Data-Flow Analysis Schema

• Data-flow value

– represents an abstraction of the set of all possible program states that can be observed for a program point

• Domain

– The set of possible data-flow values for the application.

– Example: the domain of data-flow values for reaching definitions is the set of all subsets of definitions in the program.

• Denote the data-flow values before and after each statement s by IN[s] and OUT[s]

• Data-flow problem

– to find a solution to a set of constraints on the IN[s]'s and OUT[s]'s, for all statements s.

– Two sets of constraints: those based on the semantics of the statements ("transfer functions") and those based on the flow of control.


Transfer Functions

• The data-flow values before and after a statement are constrained by the semantics of the statement.

• Transfer function

– The relationship between the data-flow values before and after a statement

– Example: after the statement b = a, both a and b have the same value.

– The transfer function of a statement s is denoted $f_s$

• Two flavors of transfer function

– Information propagates forward along execution paths: $\text{OUT}[s] = f_s(\text{IN}[s])$

– Information flows backwards up the execution paths: $\text{IN}[s] = f_s(\text{OUT}[s])$


Control-Flow Constraints

• Simple within a basic block

– If a block B consists of statements $s_1, s_2, \ldots, s_n$ in that order, then the control-flow value out of $s_i$ is the same as the control-flow value into $s_{i+1}$, that is, $\text{IN}[s_{i+1}] = \text{OUT}[s_i]$ for $i = 1, 2, \ldots, n-1$.

• More complicated between basic blocks


Data-Flow Schemas on Basic Blocks

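• IN[B], OUT[B]

– denote the data-flow values immediately before and immediately after basic block B

• Suppose block B consists of statements $s_1, \ldots, s_n$, in that order. Then

– $\text{IN}[B] = \text{IN}[s_1]$ and $\text{OUT}[B] = \text{OUT}[s_n]$

• The transfer function of the block is the composition of the transfer functions of its statements:

$f_B = f_{s_n} \circ \cdots \circ f_{s_2} \circ f_{s_1}$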
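The composition of statement transfer functions into a block transfer function can be sketched in Python as follows. The helper name `compose_block` and the two sample functions are assumptions for this illustration; the sample functions use sets of definition labels as data-flow values.

```python
from functools import reduce


def compose_block(transfer_functions):
    """Return f_B for statement transfer functions listed in program order.

    The first statement's function is applied first, so the result
    corresponds to f_sn o ... o f_s2 o f_s1.
    """
    def f_block(x):
        return reduce(lambda value, f: f(value), transfer_functions, x)
    return f_block


# Two hypothetical statement-level functions on sets of definition labels.
f_s1 = lambda x: {"d1"} | (x - {"d2"})
f_s2 = lambda x: {"d2"} | (x - {"d1"})

f_B = compose_block([f_s1, f_s2])
out_B = f_B({"d0", "d1"})
```

With the input `{"d0", "d1"}`, the first function leaves the set unchanged and the second replaces `d1` by `d2`, so `out_B` is `{"d0", "d2"}`.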


Reaching Definitions

• A definition d reaches a point p if there is a path from the point immediately following d to p, such that d is not "killed" along that path.

• A definition of a variable x is killed if there is any other definition of x anywhere along the path.

• Conservative

– if we do not know whether a statement s is assigning a value to x, we must assume that it may assign to it.


Transfer Equations for Reaching Definitions

• A statement d that assigns to a variable u

– generates a definition d of variable u, and

– kills all other definitions in the program that define variable u

• The transfer function of definition d can be expressed as

$f_d(x) = \text{gen}_d \cup (x - \text{kill}_d)$

• where $\text{gen}_d = \{d\}$, the set of definitions generated by the statement, and $\text{kill}_d$ is the set of all other definitions of u in the program.

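• The composition of two transfer functions of the gen-kill form $f(x) = \text{gen} \cup (x - \text{kill})$ is again of that form. If

$f_1(x) = \text{gen}_1 \cup (x - \text{kill}_1)$ and $f_2(x) = \text{gen}_2 \cup (x - \text{kill}_2)$

• Then

$f_2(f_1(x)) = \text{gen}_2 \cup ((\text{gen}_1 \cup (x - \text{kill}_1)) - \text{kill}_2) = (\text{gen}_2 \cup (\text{gen}_1 - \text{kill}_2)) \cup (x - (\text{kill}_1 \cup \text{kill}_2))$

• Suppose block B has n statements, with transfer functions $f_i(x) = \text{gen}_i \cup (x - \text{kill}_i)$ for $i = 1, 2, \ldots, n$. Then

$f_B(x) = \text{gen}_B \cup (x - \text{kill}_B)$

where

$\text{kill}_B = \text{kill}_1 \cup \text{kill}_2 \cup \cdots \cup \text{kill}_n$

$\text{gen}_B = \text{gen}_n \cup (\text{gen}_{n-1} - \text{kill}_n) \cup (\text{gen}_{n-2} - \text{kill}_{n-1} - \text{kill}_n) \cup \cdots \cup (\text{gen}_1 - \text{kill}_2 - \text{kill}_3 - \cdots - \text{kill}_n)$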

• The gen set contains all the definitions inside the block that are "visible" immediately after the block

• Downwards exposed

– A definition is downwards exposed in a basic block only if it is not "killed" by a subsequent definition to the same variable inside the same basic block.

– A basic block's kill set is simply the union of all the definitions killed by the individual statements.

• Example: a block with two definitions $d_1$ and $d_2$ of the same variable, in that order

– $\text{kill}_B = \text{kill}_1 \cup \text{kill}_2 = \{d_1, d_2\}$

– $\text{gen}_B = \text{gen}_2 \cup (\text{gen}_1 - \text{kill}_2) = \{d_2\}$

– $f_B(x) = \{d_2\} \cup (x - \{d_1, d_2\})$, so the result always includes $d_2$
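The two-definition example can be checked with a short Python sketch that folds per-statement gen and kill sets into block-level sets. The function name `block_gen_kill` and the string labels are assumptions made for this illustration.

```python
def block_gen_kill(statements):
    """Combine (gen, kill) pairs, listed in program order, into block-level sets."""
    gen, kill = set(), set()
    for gen_s, kill_s in statements:
        # A later statement removes earlier definitions it kills,
        # then adds its own definition.
        gen = gen_s | (gen - kill_s)
        # The block kills everything any of its statements kills.
        kill = kill | kill_s
    return gen, kill


# d1 and d2 define the same variable, so each kills the other.
gen_B, kill_B = block_gen_kill([({"d1"}, {"d2"}), ({"d2"}, {"d1"})])


def f_B(x):
    """Block transfer function in gen-kill form."""
    return gen_B | (x - kill_B)


after_block = f_B({"d1", "d3"})
```

The sketch yields `gen_B == {"d2"}` and `kill_B == {"d1", "d2"}`, matching the values above, and any input set comes out containing `d2` but not `d1`.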


Control-Flow Equations

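• $\text{OUT}[P] \subseteq \text{IN}[B]$ whenever there is a control-flow edge from P to B.

• IN[B] needs to be no larger than the union of the reaching definitions of all the predecessor blocks, so it is safe to take

$\text{IN}[B] = \bigcup_{P \text{ a predecessor of } B} \text{OUT}[P]$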


Iterative Algorithm for Reaching Definitions

• The reaching definitions problem is defined by the following equations:

$\text{OUT}[\text{ENTRY}] = \emptyset$

• and, for all basic blocks B other than ENTRY,

$\text{OUT}[B] = \text{gen}_B \cup (\text{IN}[B] - \text{kill}_B)$

$\text{IN}[B] = \bigcup_{P \text{ a predecessor of } B} \text{OUT}[P]$

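Algorithm: Reaching definitions.

INPUT: A flow graph for which kill_B and gen_B have been computed for each block B.

OUTPUT: IN[B] and OUT[B], the sets of definitions reaching the entry and exit of each block B.

METHOD: Iterate to a fixed point. Start with $\text{OUT}[B] = \emptyset$ for every block B. Then, for each block other than ENTRY, recompute IN[B] as the union of OUT[P] over all predecessors P of B and set $\text{OUT}[B] = \text{gen}_B \cup (\text{IN}[B] - \text{kill}_B)$, repeating until no OUT[B] changes.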
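A minimal Python sketch of this iteration is given below. The dictionary-based graph representation, the function name, and the three-block example are assumptions made for illustration; ENTRY is left implicit, so a block without predecessors simply gets an empty IN set.

```python
def reaching_definitions(blocks, preds, gen, kill):
    """Iterate the reaching-definitions equations until nothing changes."""
    in_ = {b: set() for b in blocks}
    out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # IN[B] is the union of OUT[P] over all predecessors P.
            in_[b] = set().union(*(out[p] for p in preds[b]))
            new_out = gen[b] | (in_[b] - kill[b])
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return in_, out


# Hypothetical flow graph: B1 -> B2 -> B3 -> B2 (a loop).
# d1, d3 define i; d2, d4 define j.
blocks = ["B1", "B2", "B3"]
preds = {"B1": [], "B2": ["B1", "B3"], "B3": ["B2"]}
gen = {"B1": {"d1", "d2"}, "B2": {"d3"}, "B3": {"d4"}}
kill = {"B1": {"d3", "d4"}, "B2": {"d1"}, "B3": {"d2"}}

IN, OUT = reaching_definitions(blocks, preds, gen, kill)
```

In this example all four definitions reach the start of B2, because the back edge from B3 carries d3 and d4 around the loop.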


Live-Variable Analysis

• In live-variable analysis we wish to know for variable x and point p whether the value of x at p could be used along some path in the flow graph starting at p. If so, we say x is live at p; otherwise, x is dead at p.

• Definitions:

1. def_B: the set of variables defined in B prior to any use of that variable in B

2. use_B: the set of variables whose values may be used in B prior to any definition of the variable.

• Equations relating def and use:

$\text{IN}[\text{EXIT}] = \emptyset$

• and, for all basic blocks B other than EXIT,

$\text{IN}[B] = \text{use}_B \cup (\text{OUT}[B] - \text{def}_B)$

$\text{OUT}[B] = \bigcup_{S \text{ a successor of } B} \text{IN}[S]$

• Algorithm: Live-variable analysis.

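• INPUT: A flow graph with def_B and use_B computed for each block B.

• OUTPUT: IN[B] and OUT[B], the sets of variables live on entry to and exit from each block B.

• METHOD: Iterate to a fixed point. Start with $\text{IN}[B] = \emptyset$ for every block B. Then, for each block other than EXIT, recompute OUT[B] as the union of IN[S] over all successors S of B and set $\text{IN}[B] = \text{use}_B \cup (\text{OUT}[B] - \text{def}_B)$, repeating until no IN[B] changes.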
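The backward iteration can be sketched in Python in the same style. The graph representation, the function name, and the small loop example are assumptions for illustration; EXIT is implicit, so a block without successors gets an empty OUT set.

```python
def live_variables(blocks, succs, use, defs):
    """Iterate the live-variable equations until no IN set changes."""
    in_ = {b: set() for b in blocks}
    out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # OUT[B] is the union of IN[S] over all successors S.
            out[b] = set().union(*(in_[s] for s in succs[b]))
            new_in = use[b] | (out[b] - defs[b])
            if new_in != in_[b]:
                in_[b] = new_in
                changed = True
    return in_, out


# Hypothetical program:
#   B1: i = 0; s = 0
#   B2: s = s + i; i = i + 1; if i < n goto B2
#   B3: return s
blocks = ["B1", "B2", "B3"]
succs = {"B1": ["B2"], "B2": ["B2", "B3"], "B3": []}
use = {"B1": set(), "B2": {"s", "i", "n"}, "B3": {"s"}}
defs = {"B1": {"i", "s"}, "B2": set(), "B3": set()}

IN, OUT = live_variables(blocks, succs, use, defs)
```

Only n is live on entry to B1, since i and s are defined there before any use. In B2 both s and i are used before they are redefined, so they belong to use rather than def.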


Available Expressions

• An expression x + y is available at a point p

– if every path from the entry node to p evaluates x + y, and after the last such evaluation prior to reaching p, there are no subsequent assignments to x or y.

• A block kills expression x + y

– if it assigns (or may assign) x or y and does not subsequently recompute x + y.

• A block generates expression x + y

– if it definitely evaluates x + y and does not subsequently define x or y.


• The primary use of available-expression information is for detecting global common subexpressions.


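• Computation of the set of generated expressions

– Suppose the set S of expressions is available at point p, and q is the point after p, with the statement x = y + z between them. The set of expressions available at q is formed by two steps, in this order (the order matters because x could be the same as y or z):

1. Add to S the expression y + z.

2. Delete from S any expression involving variable x.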

Example:
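As a minimal sketch, the two steps can be applied statement by statement in Python. The four-statement block and the tuple format `(x, y, op, z)` for x = y op z are assumptions made for illustration, not taken from the slides.

```python
def available_after_each(statements):
    """Return the set of available expressions after each statement.

    Each statement is (x, y, op, z), meaning x = y op z.
    The set starts out empty at the beginning of the block.
    """
    available = set()
    trace = []
    for x, y, op, z in statements:
        # Step 1: the statement evaluates y op z.
        available = available | {(y, op, z)}
        # Step 2: assigning x invalidates every expression that involves x.
        available = {e for e in available if x not in (e[0], e[2])}
        trace.append(available)
    return trace


block = [
    ("a", "b", "+", "c"),   # a = b + c
    ("b", "a", "-", "d"),   # b = a - d
    ("c", "b", "+", "c"),   # c = b + c
    ("d", "a", "-", "d"),   # d = a - d
]
trace = available_after_each(block)
```

After the first statement b + c is available; the assignment to b then removes it and leaves a - d. The third statement adds b + c and immediately deletes it again because it assigns c, and the last statement deletes a - d, so nothing is generated by the block as a whole.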


• Let

IN[B] be the set of expressions that are available before B

OUT[B] be the same for the point following the end of B

e_gen_B be the set of expressions generated by B

e_kill_B be the set of expressions killed in B

• Then

$\text{OUT}[\text{ENTRY}] = \emptyset$

• and, for all basic blocks B other than ENTRY,

$\text{OUT}[B] = \text{e\_gen}_B \cup (\text{IN}[B] - \text{e\_kill}_B)$

$\text{IN}[B] = \bigcap_{P \text{ a predecessor of } B} \text{OUT}[P]$

• Algorithm: Available expressions.

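• INPUT: A flow graph with e_kill_B and e_gen_B computed for each block B. The initial block is B1.

• OUTPUT: IN[B] and OUT[B], the sets of expressions available on entry to and exit from each block B.

• METHOD: Iterate to a fixed point. Nothing is available on entry to the initial block. Start every other block with OUT[B] = U, the set of all expressions. Then, for each block other than the initial one, recompute IN[B] as the intersection of OUT[P] over all predecessors P of B and set $\text{OUT}[B] = \text{e\_gen}_B \cup (\text{IN}[B] - \text{e\_kill}_B)$, repeating until no OUT[B] changes.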
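A Python sketch of this iteration follows. The function signature, the explicit `universe` parameter, and the four-block example are assumptions made for illustration; the sketch also assumes every block other than the initial one has at least one predecessor.

```python
def available_expressions(blocks, preds, e_gen, e_kill, universe, entry):
    """Iterate the available-expressions equations until no OUT set changes."""
    in_ = {b: set() for b in blocks}
    # Optimistic start: every expression is assumed available everywhere...
    out = {b: set(universe) for b in blocks}
    # ...except after the initial block, whose IN set is empty.
    out[entry] = set(e_gen[entry])
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            # IN[B] is the intersection of OUT[P] over all predecessors P.
            in_[b] = set(universe).intersection(*(out[p] for p in preds[b]))
            new_out = e_gen[b] | (in_[b] - e_kill[b])
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return in_, out


# Hypothetical flow graph: B1 -> B2, B2 -> B3, B3 -> B2, B2 -> B4.
universe = {"a+b", "c*d"}
blocks = ["B1", "B2", "B3", "B4"]
preds = {"B1": [], "B2": ["B1", "B3"], "B3": ["B2"], "B4": ["B2"]}
e_gen = {"B1": {"a+b"}, "B2": {"c*d"}, "B3": set(), "B4": set()}
e_kill = {"B1": set(), "B2": set(), "B3": {"c*d"}, "B4": set()}

IN, OUT = available_expressions(blocks, preds, e_gen, e_kill, universe, "B1")
```

Because the meet is an intersection, c*d is not available on entry to B2: it is computed on neither incoming path, even though both B3 and B4 see it after B2 has evaluated it.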


Test yourself

• gen, kill, IN, OUT sets for each block

• e_gen, e_kill, IN, OUT sets for available expressions

• def, use, IN, OUT sets for live-variable analysis
