CPSC 388 – Compiler Design and Construction

Post on 30-Dec-2015


Optimization

Optimization Goal

Produce better code
    Fewer instructions
    Faster execution

Do Not Change Behavior of Program!

Optimization Techniques

Peep-hole optimization
    Done after code generation
    Makes small local changes to assembly

Moving loop invariants
    Done before code generation
    Finds computations in loops that can be moved outside

Strength reduction in for loops
    Done before code generation
    Replaces multiplications with additions

Copy propagation
    Done before code generation
    Replaces use of a variable with a literal or another variable

Peep-hole Optimization

Look through a small window at the assembly code for common cases that can be improved:

1. Redundant load
2. Redundant push/pop
3. Replace a jump to a jump
4. Remove a jump to next instruction
5. Replace a jump around jump
6. Remove useless operations
7. Reduction in strength

Redundant Load

Before:
    store Rx, M
    load M, Rx

After:
    store Rx, M

Redundant Push/Pop

Before:
    push Rx
    pop Rx

After:
    … nothing …

Replace a jump to a jump

Before:
        goto L1
    L1: goto L2

After:
        goto L2
    L1: goto L2

Remove a Jump to next Instruction

Before:
        goto L1
    L1: …

After:
    L1: …

Replace a jump around jump

Before:
        if T0 = 0 goto L1
        else goto L2
    L1: …

After:
        if T0 != 0 goto L2
    L1: …

Remove useless operations

Before:
    add T0, T0, 0
    mul T0, T0, 1

After:
    … nothing …

Reduction in Strength

Before:
    mul T0, T0, 2
    add T0, T0, 1

After:
    shift-left T0
    inc T0

One optimization may lead to another

Before:
    load Tx, M
    add Tx, 0
    store Tx, M

After one optimization:
    load Tx, M
    store Tx, M

After another optimization:
    load Tx, M
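The chain above can be sketched as a tiny peephole pass in Java. This is only an illustration, not the course's implementation: the class name `PeepholeSketch`, the textual instruction format, and the two-instruction window are all assumptions. It also assumes the code contains no labels, because removing an instruction that is a jump target would not be safe.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini peephole optimizer; the textual instruction format is an assumption.
class PeepholeSketch {

    // One pass over the code with a window of two instructions.
    static List<String> onePass(List<String> code) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < code.size(); i++) {
            String cur = code.get(i);
            String next = (i + 1 < code.size()) ? code.get(i + 1) : null;

            if (cur.matches("add \\w+, 0")) {
                continue; // useless operation: adding 0 changes nothing
            }
            if (cur.startsWith("push ") && ("pop " + cur.substring(5)).equals(next)) {
                i++;      // redundant push/pop: drop both instructions
                continue;
            }
            if (cur.startsWith("store ") && ("load " + cur.substring(6)).equals(next)) {
                i++;      // redundant load: the register already holds M's value
            } else if (cur.startsWith("load ") && ("store " + cur.substring(5)).equals(next)) {
                i++;      // redundant store: M already holds the register's value
            }
            out.add(cur);
        }
        return out;
    }

    // Repeat passes until nothing changes, since one rewrite can expose another.
    static List<String> optimize(List<String> code) {
        List<String> current = code;
        while (true) {
            List<String> next = onePass(current);
            if (next.equals(current)) {
                return next;
            }
            current = next;
        }
    }

    public static void main(String[] args) {
        System.out.println(optimize(List.of("load Tx, M", "add Tx, 0", "store Tx, M")));
        // prints [load Tx, M]
    }
}
```

The first pass removes `add Tx, 0`; only then are the load and store adjacent, so the second pass can remove the store. That is why the sketch loops until a pass makes no change.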

You Try It

The code generated from this program contains opportunities for the first two kinds of peep-hole optimization (redundant load, jump to a jump). Can you explain how, just by looking at the source code?

    public class Opt {
        public static void main() {
            int a;
            int b;
            if (true) {
                if (true) {
                    b = 0;
                } else {
                    b = 1;
                }
                return;
            }
            a = 1;
            b = a;
        }
    }

Moving Loop-Invariant Computations Out of the Loop

For greatest gain, optimize “hot spots”, i.e. inner loops.

An expression is loop invariant if the same value is computed on every iteration of the loop

Compute the value once outside the loop and reuse it inside the loop

Example

    for (int i=0;i<100;i++) {
        for (int j=0;j<100;j++) {
            for (int k=0;k<100;k++) {
                A[i][j][k]=i*j*k;
            }
        }
    }

Example

    for (int i=0;i<100;i++) {
        for (int j=0;j<100;j++) {
            for (int k=0;k<100;k++) {
                T0=i*j*k;
                T1=FP+<offset of A>-i*40000-j*400-k*4;
                store T0, 0(T1)
            }
        }
    }

    FP+<offset of A>: invariant to the i loop
    i*40000: invariant to the j loop
    i*j and j*400: invariant to the k loop

Example

    tmp0=FP + <offset of A>
    for (int i=0;i<100;i++) {
        tmp1=tmp0-i*40000;
        for (int j=0;j<100;j++) {
            tmp2=tmp1-j*400;
            tmp3=i*j;
            for (int k=0;k<100;k++) {
                T0=tmp3*k;
                T1=tmp2-k*4;
                store T0, 0(T1)
            }
        }
    }

Comparison before and after of innermost loop (executed 1 million times)

Original code
    5 multiplications (3 for lvalue, 2 for rvalue)
    3 subtractions (for lvalue)
    1 indexed store

New code
    2 multiplications (1 for lvalue, 1 for rvalue)
    1 subtraction (for lvalue)
    1 indexed store
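The same transformation can be tried at the source level. The sketch below is an assumption-laden illustration in Java: the class name `HoistSketch` and both method names are made up, and Java array references (`plane`, `row`) stand in for the address temporaries `tmp1` and `tmp2` from the slides. It hoists the invariant work by hand and checks that both versions fill the array identically.

```java
import java.util.Arrays;

// Hypothetical source-level illustration of moving loop-invariant computations.
class HoistSketch {
    static final int N = 100;

    // Straightforward version: everything is recomputed in the innermost loop.
    static int[][][] original() {
        int[][][] a = new int[N][N][N];
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                for (int k = 0; k < N; k++) {
                    a[i][j][k] = i * j * k;
                }
            }
        }
        return a;
    }

    // Hand-hoisted version: invariant parts are computed once per outer iteration.
    static int[][][] hoisted() {
        int[][][] a = new int[N][N][N];
        for (int i = 0; i < N; i++) {
            int[][] plane = a[i];        // invariant to the j and k loops
            for (int j = 0; j < N; j++) {
                int[] row = plane[j];    // invariant to the k loop
                int ij = i * j;          // invariant to the k loop
                for (int k = 0; k < N; k++) {
                    row[k] = ij * k;     // one multiplication left in the inner loop
                }
            }
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.deepEquals(original(), hoisted()));
    }
}
```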

Questions

How do you recognize loop-invariant expressions?

When and where do we move the computations of those expressions?

Recognizing Loop Invariants

An expression is invariant with respect to a loop if for every operand, one of the following holds:
    It is a literal
    It is a variable that gets its value only from outside the loop
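The rule above can be written as a small check. This is a minimal sketch in Java, assuming operands are given as strings and that the set of variables assigned inside the loop has already been collected by some earlier analysis; `InvariantCheckSketch` and its method names are hypothetical. It needs Java 9 or later for `List.of` and `Set.of`.

```java
import java.util.List;
import java.util.Set;

// Hypothetical check for the invariance rule: every operand is a literal
// or a variable that is never assigned inside the loop.
class InvariantCheckSketch {

    // Assumption: an operand is a literal if it looks like an integer.
    static boolean isLiteral(String operand) {
        return operand.matches("-?\\d+");
    }

    static boolean isInvariant(List<String> operands, Set<String> assignedInLoop) {
        for (String operand : operands) {
            if (!isLiteral(operand) && assignedInLoop.contains(operand)) {
                return false; // this operand can change from one iteration to the next
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Variables assigned inside the k loop of the running example.
        Set<String> assignedInKLoop = Set.of("k", "T0", "T1");
        System.out.println(isInvariant(List.of("i", "j"), assignedInKLoop)); // i*j
        System.out.println(isInvariant(List.of("k", "4"), assignedInKLoop)); // k*4
    }
}
```

With these inputs, `i*j` is reported as invariant to the k loop and `k*4` is not, which matches the hand analysis of the example.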

When and Where to move invariant expressions

Must consider safety of move

Must consider profitability of move

Safety of moving invariants

Moving an expression is unsafe if evaluating it might cause an error and the loop might not get executed:

    b=a;
    while (a != 0) {
        x = 1/b; //possible “/0” if moved
        a--;
    }
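The hazard can be reproduced directly. In this Java sketch the class and method names are hypothetical, and `x` is initialized to 0 (an assumption) so that both methods have something to return. With `a = 0` the original loop never runs and nothing fails, while the hoisted version divides by zero before the loop is even tested.

```java
// Hypothetical demonstration of an unsafe hoist.
class UnsafeHoistSketch {

    // Original code: 1/b is only evaluated if the loop body runs.
    static int original(int a) {
        int b = a;
        int x = 0;
        while (a != 0) {
            x = 1 / b;
            a--;
        }
        return x;
    }

    // "Optimized" code: 1/b was moved in front of the loop.
    static int hoisted(int a) {
        int b = a;
        int t = 1 / b; // throws ArithmeticException when b == 0
        int x = 0;
        while (a != 0) {
            x = t;
            a--;
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(original(0)); // loop is skipped, no division happens
        try {
            hoisted(0);
        } catch (ArithmeticException e) {
            System.out.println("hoisted version failed: " + e.getMessage());
        }
    }
}
```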

Safety of moving invariants

What about preserving the order of events?
    If the unoptimized code performed output THEN had a runtime error, is it valid for the optimized code to simply have a runtime error?
    Changing the order of computations may change the result for floating-point computations due to differing precisions

Profitability of moving invariants

If the computation might NOT execute in the original program then moving the computation might actually slow down the program!

Moving is Safe and Profitable If

    Loop will execute at least once
    Code will execute if loop does
        Isn't inside any condition
        Is on all paths through loop (both if and else portions)
    Expression is in non short-circuited part of the loop test
        E.g. while (x < i+j*100)

You Try It

What are some examples of loops for which the compiler can be sure that the loop will execute at least once?

Strength Reduction

Concentrate on “hot spots”

Replace expensive operations (*) with cheaper ones (+)

Example Strength Reduction

    For i from low to high do
        …i*k1+k2

Where
    i is the loop index
    k1 and k2 are constant with respect to the loop

Consider the sequence of values for i and the expression

Examples Strength Reduction

Iteration #   i         i*k1+k2
1             low       low*k1+k2
2             low+1     (low+1)*k1+k2 = low*k1+k2+k1
3             low+1+1   (low+1+1)*k1+k2 = low*k1+k2+k1+k1

Example Strength Reduction

    Compute low*k1+k2 once before the loop
    Store the value in a temporary
    Use the temporary instead of the expression inside the loop
    Increment the temporary by k1 at the end of the loop

Example Strength Reduction

    temp=low*k1+k2
    For i from low to high do
        …temp…
        temp=temp+k1
    end
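The pseudocode above can be checked against the multiplying version. This Java sketch is only an illustration: the class and method names are hypothetical, the loop steps by one as in the slides, and it assumes `high >= low` and values small enough not to overflow `int`.

```java
import java.util.Arrays;

// Hypothetical comparison of i*k1+k2 computed with and without multiplication.
class StrengthReductionSketch {

    // Original form: one multiplication and one addition per iteration.
    static int[] withMultiply(int low, int high, int k1, int k2) {
        int[] out = new int[high - low + 1];
        for (int i = low; i <= high; i++) {
            out[i - low] = i * k1 + k2;
        }
        return out;
    }

    // Strength-reduced form: one addition per iteration.
    static int[] withAdd(int low, int high, int k1, int k2) {
        int[] out = new int[high - low + 1];
        int temp = low * k1 + k2;  // computed once before the loop
        for (int i = low; i <= high; i++) {
            out[i - low] = temp;   // use the temporary instead of the expression
            temp = temp + k1;      // increment by k1 at the end of the body
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(withMultiply(3, 10, 7, 5), withAdd(3, 10, 7, 5)));
    }
}
```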

Another Example

    tmp0 = FP + offset A
    for (i=0; i<100; i++) {
        tmp1 = tmp0 - i*40000          // i * -40000 + tmp0
        for (j=0; j<100; j++) {
            tmp2 = tmp1 - j*400        // j * -400 + tmp1
            tmp3 = i*j                 // j * i + 0
            for (k=0; k<100; k++) {
                T0 = tmp3 * k          // k * tmp3 + 0
                T1 = tmp2 - k*4        // k * -4 + tmp2
                store T0, 0(T1)
            }
        }
    }

Now Perform Strength Reduction

    tmp0 = FP + offset A
    temp1 = tmp0                       // temp1 = 0*-40000+tmp0
    for (i=0; i<100; i++) {
        tmp1 = temp1
        temp2 = tmp1                   // temp2 = 0*-400+tmp1
        temp3 = 0                      // temp3 = 0*i+0
        for (j=0; j<100; j++) {
            tmp2 = temp2
            tmp3 = temp3
            temp4 = 0                  // temp4 = 0*tmp3+0
            temp5 = tmp2               // temp5 = 0*-4+tmp2
            for (k=0; k<100; k++) {
                T0 = temp4
                T1 = temp5
                store T0, 0(T1)
                temp4 = temp4 + tmp3
                temp5 = temp5 - 4
            }
            temp2 = temp2 - 400
            temp3 = temp3 + i
        }
        temp1 = temp1 - 40000
    }

You Try It

Suppose that the index variable is incremented by something other than one each time around the loop. For example, consider a loop of the form:

    for (i=low; i<=high; i+=2) ...

Can strength reduction still be performed? If yes, what changes must be made to the proposed algorithm?

Copy Propagation

A statement of the form “x=y” (call it d) is a copy statement. For every use, u, of variable x reached by copy statement d such that:
    No other definition of x reaches u, and
    y can’t change between d and u
you can replace the use of x at u with a use of y.

Examples of Copy Propagation

    x=y            x=y              x=y
    a=x+z          if (…) x=2       if (…) y=3
                   a=x+z            a=x+z

    Yes            No               No
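For straight-line code, the two conditions can be tracked with a single map. The Java sketch below is an assumption-heavy illustration rather than a full dataflow analysis: it handles only one basic block with no branches, encodes each statement as a hypothetical array `{dst, operand...}`, and treats a two-element array `{x, y}` as the copy `x = y`. The class name `CopyPropagationSketch` is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical copy propagation over one basic block (no branches).
class CopyPropagationSketch {

    static List<String[]> propagate(List<String[]> block) {
        Map<String, String> copyOf = new HashMap<>(); // x -> y for copies that are still valid
        List<String[]> out = new ArrayList<>();
        for (String[] stmt : block) {
            String[] s = stmt.clone();
            // Replace each used variable by the source of a still-valid copy.
            for (int i = 1; i < s.length; i++) {
                s[i] = copyOf.getOrDefault(s[i], s[i]);
            }
            String dst = s[0];
            // Assigning dst invalidates copies of dst and copies whose source is dst.
            copyOf.remove(dst);
            copyOf.values().removeIf(src -> src.equals(dst));
            // Record a new copy x = y (ignoring the degenerate x = x).
            if (s.length == 2 && !s[1].equals(dst)) {
                copyOf.put(dst, s[1]);
            }
            out.add(s);
        }
        return out;
    }

    public static void main(String[] args) {
        // x = y; a = x + z   (the operator is omitted in this encoding)
        List<String[]> result = propagate(List.of(
                new String[] {"x", "y"},
                new String[] {"a", "x", "z"}));
        System.out.println(Arrays.toString(result.get(1))); // prints [a, y, z]
    }
}
```

Because the sketch has no branches, it cannot model the `if (…)` in the second and third examples. An unconditional `y=3` between the copy and the use is the closest it gets: the copy is invalidated and `a=x+z` is left unchanged.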

Question

Why is this a useful transformation?

If ALL uses of x reached by definition d are replaced, then definition d is useless and can be removed.

    tmp0 = FP + offset A
    temp1 = tmp0                       // cannot be propagated
    for (i=0; i<100; i++) {
        tmp1 = temp1
        temp2 = tmp1                   // cannot be propagated
        temp3 = 0                      // cannot be propagated
        for (j=0; j<100; j++) {
            tmp2 = temp2
            tmp3 = temp3
            temp4 = 0                  // cannot be propagated
            temp5 = tmp2               // cannot be propagated
            for (k=0; k<100; k++) {
                T0 = temp4
                T1 = temp5
                store T0, 0(T1)
                temp4 = temp4 + tmp3
                temp5 = temp5 - 4
            }
            temp2 = temp2 - 400
            temp3 = temp3 + i
        }
        temp1 = temp1 - 40000
    }

    tmp0 = FP + offset A
    temp1 = tmp0
    for (i=0; i<100; i++) {
        temp2 = temp1
        temp3 = 0
        for (j=0; j<100; j++) {
            temp4 = 0
            temp5 = temp2
            for (k=0; k<100; k++) {
                store temp4, 0(temp5)
                temp4 = temp4 + temp3
                temp5 = temp5 - 4
            }
            temp2 = temp2 - 400
            temp3 = temp3 + i
        }
        temp1 = temp1 - 40000
    }
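One way to gain confidence in the final code is to simulate it. The Java sketch below makes several assumptions that are not in the slides: memory is modeled as an `int` array of 4-byte words, `tmp0` is given an arbitrary concrete byte address chosen so that every access stays in range, and the names `FinalLoopSketch`, `reference`, and `optimized` are hypothetical. It compares the fully optimized loops with the direct address computation.

```java
import java.util.Arrays;

// Hypothetical simulation of the optimized loop nest against the direct computation.
class FinalLoopSketch {
    static final int WORDS = 100 * 100 * 100;
    static final int BASE = 4 * (WORDS - 1); // assumed byte address standing in for FP + offset A

    // Direct version: address and value are recomputed for every element.
    static int[] reference() {
        int[] mem = new int[WORDS];
        for (int i = 0; i < 100; i++) {
            for (int j = 0; j < 100; j++) {
                for (int k = 0; k < 100; k++) {
                    mem[(BASE - i * 40000 - j * 400 - k * 4) / 4] = i * j * k;
                }
            }
        }
        return mem;
    }

    // Version after invariant motion, strength reduction, and copy propagation.
    static int[] optimized() {
        int[] mem = new int[WORDS];
        int temp1 = BASE;
        for (int i = 0; i < 100; i++) {
            int temp2 = temp1;
            int temp3 = 0;
            for (int j = 0; j < 100; j++) {
                int temp4 = 0;
                int temp5 = temp2;
                for (int k = 0; k < 100; k++) {
                    mem[temp5 / 4] = temp4; // store temp4, 0(temp5)
                    temp4 = temp4 + temp3;
                    temp5 = temp5 - 4;
                }
                temp2 = temp2 - 400;
                temp3 = temp3 + i;
            }
            temp1 = temp1 - 40000;
        }
        return mem;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(reference(), optimized()));
    }
}
```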

Comparison before and after

Before
    5 *, 3 +/-, 1 indexed store in innermost loop

After
    2 +/-, 1 indexed store in innermost loop
    2 +/-, 2 copy statements in middle loop
    1 +/-, 1 copy in outer loop
