Post on 20-Jan-2016

Chapter 10 Code Optimization

Zhang Jing, Wang HaiLing

College of Computer Science & Technology

Harbin Engineering University


Ideally, the target code produced by a compiler should run faster, take less space, or both.

In practice, this goal is hard to achieve and can be reached only in limited cases.

To approach it, we apply code-improving transformations, which are called optimizations. Code optimization improves the code, but it cannot guarantee that the result is the best possible.


There are two types of code optimization. The first is machine-independent optimization, which does not depend on the properties of the target machine; optimizations of this type work at the level of the intermediate code or the source program.

The second is machine-dependent optimization, which works at the level of the target code. The position of code optimization in the compiler is shown below.


A compiler is a program that reads a source program in a high-level language and translates it, typically, into machine language. This is a complicated process involving a number of stages. In an optimizing compiler, one of these stages "optimizes" the machine language code so that it takes less time to run, occupies less memory, or sometimes both.


Whatever optimizations the compiler performs, it must not affect the logic of the program; that is, the optimization must preserve the meaning of the program. What kinds of optimizations does a compiler use to produce efficient machine code? Since the meaning of the program being compiled must never change, the compiler must inspect the program thoroughly to find the optimizations that can safely be applied.


10.1 Classifications of optimizations

Optimizations, whether performed automatically by a compiler or manually by the programmer, can be classified by various characteristics.

The scope of the optimization:

(1) Local optimizations - Performed in a part of one procedure.

1) Common sub-expression elimination.

2) Using registers for temporary results, and if possible for variables.

3) Replacing multiplication and division by shift and add operations.

(2) Global optimizations - Performed with the help of data flow analysis.

1) Code motion (hoisting) outside of loops.

2) Constant propagation.

3) Strength reductions.


The kind of improvement:

(1) Space optimizations - Reduce the size of the executable or object code.

1) Constant folding.

2) Dead-code elimination.

3) Redundant code elimination.

4) Unreachable code elimination.

(2) Speed optimizations - Most optimizations belong to this category.

The level of code being optimized:

(1) Source program optimization.

(2) Three-address code optimization.

(3) Quadruple code optimization.

(4) Target code optimization.


10.2 Source program optimizations

Source program optimizations are applied to the source program itself and work regardless of the processor or compiler.

1. Eliminating common sub-expressions

Register operations are much faster than memory operations, so compilers try to keep heavily used data, such as temporary variables and array indexes, in registers.

To facilitate such register scheduling, the largest sub-expressions may be computed before the smaller ones. This is an old optimization trick that compilers are able to perform quite well:

Example 1:

Before:

    X = A * LOG(Y) + (LOG(Y) ** 2)

After:

    t = LOG(Y)
    X = A * t + (t ** 2)

Example 2:

Before:

    /* Sum neighbors of i,j */
    up = val[(i-1)*n + j];
    down = val[(i+1)*n + j];
    left = val[i*n + j-1];
    right = val[i*n + j+1];
    sum = up + down + left + right;

After:

    int inj = i*n + j;
    up = val[inj - n];
    down = val[inj + n];
    left = val[inj - 1];
    right = val[inj + 1];
    sum = up + down + left + right;
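As a quick check of Example 2, the two versions can be compared directly. The sketch below is a Python transcription, not the original C; the function names and the 4x4 test grid are assumptions made for illustration. The naive version performs four multiplications per call, while the rewritten one performs a single multiplication.

```python
def sum_neighbors_naive(val, n, i, j):
    """Recompute the index expression for every neighbor (4 multiplications)."""
    up = val[(i - 1) * n + j]
    down = val[(i + 1) * n + j]
    left = val[i * n + j - 1]
    right = val[i * n + j + 1]
    return up + down + left + right


def sum_neighbors_cse(val, n, i, j):
    """Compute the common sub-expression i*n + j once and reuse it."""
    inj = i * n + j
    return val[inj - n] + val[inj + n] + val[inj - 1] + val[inj + 1]


n = 4
grid = list(range(n * n))  # hypothetical row-major 4x4 grid holding 0..15

# Both versions must agree at every interior cell.
for i in range(1, n - 1):
    for j in range(1, n - 1):
        assert sum_neighbors_naive(grid, n, i, j) == sum_neighbors_cse(grid, n, i, j)
```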

2. Redundant Code Elimination

In the example below, s:=m-1 recomputes the value already held in i, so the use of s is replaced by i.

Before:

    i:=m-1
    j:=n
    t:=4*n
    v:=a[t]
    s:=m-1
    u:=a[s]
    i:=i+1

After:

    i:=m-1
    j:=n
    t:=4*n
    v:=a[t]
    u:=a[i]
    i:=i+1

3. Unreachable Code Elimination

A common example of unreachable code elimination involves an if statement. If the compiler can determine that the condition of the if is never true, the body of the if statement will never be executed. The compiler can then eliminate this unreachable code completely, saving the memory space it would occupy.

In the example below, assume the compiler has determined that j>0 always holds, so the jump to L1 is always taken and the statements between the jump and L1 can never execute.

Before:

    i:=m-1
    if (j>0) goto L1
    j:=n
    t:=4*n
    v:=a[t]
    L1: v:=a[i]
    i:=i+1
    ……..

After:

    i:=m-1
    v:=a[i]
    i:=i+1
    ……..
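The example above can be sketched as a small pass over a list of instructions. This is only an illustration under strong assumptions: instructions are (label, text) pairs, the set of always-true conditions is supplied by the caller rather than derived by analysis, jumps go forward, and no other jump targets a label inside the skipped region. The name eliminate_unreachable is hypothetical.

```python
def eliminate_unreachable(code, always_true):
    """Drop instructions skipped by a branch whose condition is always true.

    code        -- list of (label, text) pairs; label is None or a string
    always_true -- set of condition strings assumed to hold on every run
    """
    result = []
    skip_until = None  # label we are skipping towards, if any
    for label, text in code:
        if skip_until is not None:
            if label != skip_until:
                continue           # unreachable: jumped over on every execution
            skip_until = None      # reached the jump target, resume copying
        if text.startswith("if ") and " goto " in text:
            cond, target = text[3:].split(" goto ")
            if cond.strip("() ") in always_true:
                skip_until = target.strip()
                continue           # branch always taken, so drop the test too
        result.append((label, text))
    return result


block = [
    (None, "i:=m-1"),
    (None, "if (j>0) goto L1"),
    (None, "j:=n"),
    (None, "t:=4*n"),
    (None, "v:=a[t]"),
    ("L1", "v:=a[i]"),
    (None, "i:=i+1"),
]
optimized = eliminate_unreachable(block, always_true={"j>0"})
```

Here the texts of optimized are i:=m-1, v:=a[i] and i:=i+1, matching the block shown above.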

4. Dead Code Elimination

Dead code is code that is never executed for any input, or whose result is never used. An example is a value that is assigned but never read. In the block below, the first assignment i:=m-1 is overwritten by i:=v+1 before i is read, so it can be removed.

Before:

    i:=m-1
    j:=n
    t:=4*n
    v:=a[t]
    i:=v+1
    ……..

After:

    j:=n
    t:=4*n
    v:=a[t]
    i:=v+1
    ……..
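One common way to find such assignments in straight-line code is a backward scan that tracks which names are still needed. The sketch below is an assumption-laden illustration, not the method of these slides: statements are "target:=expression" strings, expressions have no side effects, and the caller states which names are needed after the block (live_out).

```python
import re


def eliminate_dead_assignments(block, live_out):
    """Remove assignments whose target is not read before being overwritten.

    block    -- list of "target:=expression" strings (straight-line code)
    live_out -- names assumed to be needed after the block ends
    """
    live = set(live_out)
    kept = []
    for stmt in reversed(block):
        target, expr = stmt.split(":=")
        used = set(re.findall(r"[A-Za-z_]\w*", expr))
        if target in live:
            kept.append(stmt)
            live.discard(target)  # this definition satisfies the later reads
            live |= used          # its operands are needed before it
        # otherwise the value is never read, so the statement is dropped
    kept.reverse()
    return kept


block = ["i:=m-1", "j:=n", "t:=4*n", "v:=a[t]", "i:=v+1"]
# Assumption: i and j are read by the code that follows the block.
result = eliminate_dead_assignments(block, live_out={"i", "j"})
```

With i and j assumed live after the block, only the first assignment i:=m-1 is removed.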

5. Strength Reduction

Strength reduction replaces a "costly" operation with an equivalent but cheaper (shorter) one. For example, evaluating x squared is much more efficient if we multiply x by x rather than call the exponentiation routine. One place where this optimization can be applied is in loops.

A common case is to use shift and add instead of multiply or divide:

    16*x  ->  x << 4

The utility of this optimization is machine dependent: it depends on the cost of the multiply or divide instruction, whereas shift or add is usually a single-cycle operation. The compiler can also recognize a sequence of products and turn it into a sequence of adds.

Example 1

Before:

    i:=m-1
    j:=i
    i:=3*i
    ……..

After:

    i:=m-1
    j:=i
    i:=i+i
    i:=j+i
    ……..

Example 2

    r1:=r2*2  ->  r1:=r2+r2  ->  r1:=r2<<1

    r1:=r2/2  ->  r1:=r2>>1

    r1:=r2*0  ->  r1:=0

Replacing division by a right shift is exact only when r2 is non-negative (or unsigned). For negative values an arithmetic shift rounds toward negative infinity, while integer division in many languages truncates toward zero.
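The rewrites in Example 2 can be sketched as a small pattern-matching function. This is a minimal illustration assuming integer operands and instructions written as (operator, source, constant) tuples; the function name reduce_strength and the tuple layout are assumptions, not part of the original notation.

```python
def reduce_strength(op, src, const):
    """Rewrite (op, src, const) into a cheaper instruction when possible.

    Only multiplication by a non-negative integer constant is handled;
    anything else is returned unchanged. Integer arithmetic is assumed.
    """
    if op == "*" and isinstance(const, int):
        if const == 0:
            return (":=", 0, None)       # r*0 -> 0
        if const == 1:
            return (":=", src, None)     # r*1 -> r
        if const == 2:
            return ("+", src, src)       # r*2 -> r+r
        if const > 0 and (const & (const - 1)) == 0:
            # const is a power of two: multiply becomes a left shift
            return ("<<", src, const.bit_length() - 1)
    return (op, src, const)


# The shift form computes the same value as the multiplication.
for x in range(-8, 9):
    assert 16 * x == x << 4
```

For instance, reduce_strength("*", "x", 16) gives ("<<", "x", 4), while a multiplication by 3 is left unchanged by this sketch.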

6. Constant folding

Constant folding is the simplest code optimization to understand. Suppose that you write the statement x = 45 * 88; in your C program. A non-optimizing compiler will generate code to multiply 45 by 88 and store the value in x. An optimizing compiler will detect that both 45 and 88 are constants, so their product is also a constant. Hence it will compute 45 * 88 = 3960 and generate code that simply copies 3960 into x. This is constant folding: the calculation is done just once at compile time, rather than every time the program is run.

    r2:=3*2  ->  r2:=6

Elimination of redundant loads and stores:

Before:

    r2:=6
    i:=r2
    r3:=i
    r4:=r3*3

After:

    r2:=6
    i:=r2
    r4:=r2*3

Constant propagation:

Example 1

Before:

    r2:=4
    r3:=r1+r2
    r2:=….

After propagating the constant:

    r2:=4
    r3:=r1+4
    r2:=….

After removing the assignment that is no longer used:

    r3:=r1+4
    r2:=….

Example 2

Before:

    r1:=3
    r2:=r1*2

After propagating the constant:

    r1:=3
    r2:=3*2

After folding the constant:

    r1:=3
    r2:=6
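The two examples combine propagation with folding, which can be sketched in one forward pass over straight-line code. The representation below, (dest, op, a, b) tuples with ":=" for a plain copy, is an assumption chosen for illustration, and only "+", "-" and "*" on integers are handled.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def propagate_and_fold(code):
    """Propagate known constants through straight-line code and fold them.

    code -- list of (dest, op, a, b); op is ":=" for a plain copy (b unused)
            or one of "+", "-", "*". Operands are ints or register names.
    """
    known = {}  # register name -> constant value it currently holds
    out = []
    for dest, op, a, b in code:
        # Replace register operands whose value is a known constant.
        a = known.get(a, a) if isinstance(a, str) else a
        b = known.get(b, b) if isinstance(b, str) else b
        if op != ":=" and isinstance(a, int) and isinstance(b, int):
            op, a, b = ":=", OPS[op](a, b), None  # fold at "compile time"
        if op == ":=" and isinstance(a, int):
            known[dest] = a
        else:
            known.pop(dest, None)  # dest no longer holds a known constant
        out.append((dest, op, a, b))
    return out


example1 = propagate_and_fold([("r2", ":=", 4, None), ("r3", "+", "r1", "r2")])
example2 = propagate_and_fold([("r1", ":=", 3, None), ("r2", "*", "r1", 2)])
```

Removing the assignment r2:=4 once nothing reads it is a separate dead-code step and is not done by this sketch.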

Copy propagation:

Before:

    r2:=r1
    r3:=r1+r2
    r2:=5

After propagating the copy:

    r2:=r1
    r3:=r1+r1
    r2:=5

After removing the copy that is no longer used:

    r3:=r1+r1
    r2:=5

Elimination of useless instructions:

    r1:=r1+0
    r1:=r1*1

Both instructions leave r1 unchanged, so they can be removed.

7. Loop optimizations

Loops are a very important target for optimization. If we decrease the number of instructions in a loop, the running time of the program improves, even though this sometimes increases the amount of code outside the loop.

There are three techniques for loop optimization: code motion, induction-variable elimination and reduction in strength. Code motion moves code outside a loop; induction-variable elimination removes extra variables from a loop; reduction in strength replaces a complicated operation with a simpler one.

Code motion reduces the frequency with which a computation is performed: if the computation always produces the same result, it is moved out of the loop.

Example 1

Before:

    While (i<= 2*4)
    ……

After:

    j=2*4
    While (i<=j)
    ……

Example 2

Before:

    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            a[n*i + j] = b[j];

After:

    for (i = 0; i < n; i++) {
        int ni = n*i;
        for (j = 0; j < n; j++)
            a[ni + j] = b[j];
    }

Example 3

Moving as many computations as possible outside loops saves computing time. In the following example, (2.0 * PI) is an invariant expression, so there is no reason to recompute it 100 times.

Before:

    DO I = 1, 100
      ARRAY(I) = 2.0 * PI * I
    ENDDO

After:

    t = 2.0 * PI
    DO I = 1, 100
      ARRAY(I) = t * I
    ENDDO

So we can conclude that code motion in a loop consists of two steps:

(1) Find an expression whose result is the same on every iteration, that is, independent of the number of times the loop has run.

(2) Place the expression before the loop.
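The effect of these two steps on Example 2 can be measured with a small experiment. The sketch below is a Python transcription of the C loops with an added counter for the n*i multiplications; the counter and function names are assumptions introduced only for the comparison.

```python
def fill_naive(b, n):
    """Evaluate n*i inside the inner loop, once per element."""
    a = [0] * (n * n)
    multiplications = 0
    for i in range(n):
        for j in range(n):
            a[n * i + j] = b[j]
            multiplications += 1
    return a, multiplications


def fill_hoisted(b, n):
    """Evaluate the invariant n*i once per outer iteration."""
    a = [0] * (n * n)
    multiplications = 0
    for i in range(n):
        ni = n * i  # invariant with respect to j, so moved out of the j loop
        multiplications += 1
        for j in range(n):
            a[ni + j] = b[j]
    return a, multiplications


n = 5
row = list(range(n))
naive_result, naive_count = fill_naive(row, n)
hoisted_result, hoisted_count = fill_hoisted(row, n)
assert naive_result == hoisted_result
```

Both versions fill the array identically, but the number of multiplications drops from n*n to n.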

10.3 Optimizations of three-address code

A program consists of several blocks. A block is a part of a program with one entrance and one exit, in which the statements run in sequence. For example, figure 10.2a is a block; figure 10.2b is not a block, because it has two exits, so it is only a part of a program.

Constants are combined across expressions that lie in the same block. The procedure is first to calculate the result of the constants in an expression and then to use that result to replace every calculation that depends on those constants. The steps in detail are as follows.

Step 1: Recognize constant expressions.

Step 2: Replace each constant expression by the result of computing its constants.

Step 3: Generate target code according to the combined results.

Example 10.1

Because the value of "a" is known at compile time, we can compute the known values and replace them by their results to obtain the optimized three-address code.

Before combining constants, we first create a symbol table "Tab" with two fields: field N stores the variable name and field V stores the variable value. The format of three-address code is shown below: the first part is the operator ω; the second is operand 1, which we name P1; the third is operand 2, which we call P2.

    operator   operand1   operand2
    ω          P1         P2

We combine constants from the top of the three-address code in a block to its end; the pointer to the current three-address code is "i".

The algorithm of combining constants is:

1. If the operator is not ":=": if P1 or P2 is a variable name in the symbol table "Tab", replace P1 or P2 in the three-address code by its value V.

If the operator is ":=":

(1) If P1 is a variable in "Tab", replace P1 in the three-address code by its value V in "Tab".

(2) If P1 is a constant, look up P2 in "Tab". If P2 is in "Tab", replace the value of P2 by the value of P1; if P2 is not in "Tab", store (P2, P1) in "Tab" and add 1 to the pointer of "Tab".

(3) If P1 is not a constant and P2 is in "Tab", delete P2 and its value from "Tab".

2. If both P1 and P2 are constants, combine them: replace (ω, P1, P2) by (α, P1ωP2, 0). Here the P1 field holds the result of P1ωP2, the P2 field holds 0, and the operator α marks the code as holding an already computed result.

3. If P1 or P2 is the number of a three-address code whose operator is α, replace P1 or P2 in the present three-address code by the P1 field of that numbered code.

4. If i is the last number of the three-address code, exit; otherwise set i := i+1 and return to step 1.

5. The three-address codes whose operator is not α form the optimized three-address code.

We can optimize Example 10.1 with the algorithm above.

    No.   three-address code   after optimizing   symbol table
    (1)   (:=, 10, a)          (:=, 10, a)        (a,10)
    (2)   (+, a, 20)           (α, 30, 0)         (a,10)
    (3)   (:=, (2), b)         (:=, 30, b)        (a,10) (b,30)
    (4)   (/, b, a)            (α, 3, 0)          (a,10) (b,30)
    (5)   (:=, (4), c)         (:=, 3, c)         (a,10) (b,30) (c,3)

The optimized three-address code is

    (1) (:=, 10, a)
    (3) (:=, 30, b)
    (5) (:=, 3, c)
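The algorithm can be sketched in a few lines and run on Example 10.1. The encoding is an assumption made for illustration: each code is an (op, P1, P2) tuple, a reference to code number k is written ("ref", k), and division is taken to be integer division. The sketch also resolves references to α codes (step 3) before handling the operator, so that (:=, (2), b) already sees the constant 30; this ordering is a choice made here to reproduce the table above.

```python
ALPHA = "α"  # marks a code whose result was computed at compile time


def combine_constants(code):
    """Combine constants in one block of three-address code.

    code -- list of (op, p1, p2); for ":=" p1 is the source and p2 the
            target. An operand is an int constant, a variable name (str),
            or ("ref", k) for the result of code number k (1-based).
    Returns the (number, code) pairs that remain after optimization.
    """
    ops = {"+": lambda x, y: x + y, "-": lambda x, y: x - y,
           "*": lambda x, y: x * y, "/": lambda x, y: x // y}
    tab = {}  # symbol table Tab: variable name -> known constant value
    out = []  # rewritten codes, numbered like the input

    def resolve(p):
        if isinstance(p, tuple):              # step 3: reference to a code
            op_k, p1_k, _ = out[p[1] - 1]
            return p1_k if op_k == ALPHA else p
        if isinstance(p, str) and p in tab:   # step 1: known variable
            return tab[p]
        return p

    for op, p1, p2 in code:
        p1 = resolve(p1)
        if op == ":=":
            if isinstance(p1, int):
                tab[p2] = p1                  # case (2): record or update
            else:
                tab.pop(p2, None)             # case (3): value now unknown
            out.append((op, p1, p2))
        else:
            p2 = resolve(p2)
            if isinstance(p1, int) and isinstance(p2, int):
                out.append((ALPHA, ops[op](p1, p2), 0))  # step 2: combine
            else:
                out.append((op, p1, p2))
    # step 5: keep only the codes whose operator is not α
    return [(k, c) for k, c in enumerate(out, start=1) if c[0] != ALPHA]


example_10_1 = [(":=", 10, "a"), ("+", "a", 20), (":=", ("ref", 2), "b"),
                ("/", "b", "a"), (":=", ("ref", 4), "c")]
optimized = combine_constants(example_10_1)
```

On Example 10.1 this keeps codes (1), (3) and (5) with the constants 10, 30 and 3, as in the listing above.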

10.4 Optimizations of quadruples

Elimination of extra (redundant) code is done within a block. We take a block of quadruples as an example to show which instructions are extra code and should be eliminated.

Example 10.2

A block:

    a := b*c+a;
    d := b*c+a;
    c := b*c+a

Four-address code (quadruples) of the block:

    (1) (*, b, c, T1)
    (2) (+, T1, a, T2)
    (3) (:=, T2, , a)
    (4) (*, b, c, T3)
    (5) (+, T3, a, T4)
    (6) (:=, T4, , d)
    (7) (*, b, c, T5)
    (8) (+, T5, a, T6)
    (9) (:=, T6, , c)

From the code above, we can see that instructions 4 and 7 are the same as instruction 1 and produce the same result. Instruction 8 performs the same calculation as instruction 5. To optimize the code, instructions 4, 7 and 8 should not be computed, because their results can be taken from the earlier instructions.

How can the extra instructions be detected automatically? We can use the depending algorithm to recognize them.

Depending algorithm

(1) At first, the depending number of every variable is 0, namely dep(X)=0.

(2) If the four-address code has the form (ω, A, B, Ti), then dep(Ti) = max(dep(A), dep(B)) + 1.

(3) If variable "a" is assigned a value by instruction i, that is (:=, b, , a), then dep(a) = i.

(4) If two instructions i and j (i<j) have the same form (ω, P1, P2, ) and their depending numbers are also the same, then instruction j is extra and need not be computed again; it can be changed to

    (Same, Ti, Tj, 0)

In particular, if ω is an operator whose operands can be exchanged, (ω, P1, P2, ) is considered to have the same form as (ω, P2, P1, ).

With the help of the depending algorithm, the optimized code of Example 10.2 is shown in table 10.1.
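The four rules can be sketched as follows and run on Example 10.2. The tuple encoding and the name find_redundant are assumptions. One further assumption is added so that instructions 5 and 8 can be matched: once a temporary is found to be the "Same" as an earlier one, later operands that name it are compared through the earlier temporary.

```python
COMMUTATIVE = {"+", "*"}  # operators whose operands can be exchanged


def find_redundant(quads):
    """Mark redundant quadruples in one block using depending numbers.

    quads -- list of (op, p1, p2, result); an assignment is written
             (":=", source, None, target). Operands are name strings.
    Returns a new list in which each redundant instruction j is replaced
    by ("Same", Ti, Tj, 0), where Ti is the earlier equivalent temporary.
    """
    dep = {}    # depending number of each name; missing names count as 0
    alias = {}  # redundant temporary -> earlier equivalent temporary
    seen = {}   # (op, operands, depending number) -> temporary
    out = []
    for number, (op, p1, p2, res) in enumerate(quads, start=1):
        if op == ":=":
            dep[res] = number                            # rule (3)
            out.append((op, p1, p2, res))
            continue
        a, b = alias.get(p1, p1), alias.get(p2, p2)
        depth = max(dep.get(a, 0), dep.get(b, 0)) + 1    # rule (2)
        dep[res] = depth
        if op in COMMUTATIVE:
            operands = tuple(sorted((a, b), key=str))
        else:
            operands = (a, b)
        key = (op, operands, depth)
        if key in seen:                                  # rule (4)
            alias[res] = seen[key]
            out.append(("Same", seen[key], res, 0))
        else:
            seen[key] = res
            out.append((op, p1, p2, res))
    return out


example_10_2 = [
    ("*", "b", "c", "T1"), ("+", "T1", "a", "T2"), (":=", "T2", None, "a"),
    ("*", "b", "c", "T3"), ("+", "T3", "a", "T4"), (":=", "T4", None, "d"),
    ("*", "b", "c", "T5"), ("+", "T5", "a", "T6"), (":=", "T6", None, "c"),
]
marked = find_redundant(example_10_2)
redundant = [k for k, q in enumerate(marked, start=1) if q[0] == "Same"]
```

For Example 10.2 the sketch marks instructions 4, 7 and 8 as redundant, in agreement with the discussion above.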

10.5 Optimizations of target code

The aim of optimizing code is ultimately to generate better target code, so in this section we take an expression as an example to explain how to optimize target code.

Example 10.3 An expression:

    a*b+c/d+a*(a*b+c/d)-a*(c/d+b*a)/d

The target code of the expression is:

    CLA a    /* push "a" to stack */
    MUL b    /* multiply "a" from stack by "b", then push the result to stack */
    STO T1   /* store the result of stack to T1 */
    CLA c    /* push "c" to stack */
    DIV d    /* divide "c" from stack by "d", then push the result to stack */
    ADD T1   /* add T1 to the value from stack, then push the result to stack */
    STO T1
    CLA a
    MUL b
    STO T2   /* store the result of stack to T2 */
    CLA c
    DIV d
    ADD T2
    MUL a    /* multiply the value from stack by "a", then push the result to stack */
    ADD T1
    STO T1
    CLA c
    DIV d
    STO T2
    CLA b
    MUL a
    ADD T2   /* add T2 to the value from stack, then push the result to stack */
    MUL a
    DIV d    /* divide the value from stack by "d", then push the result to stack */
    RUB T1   /* subtract the value from stack from T1, then push the result to stack */

The three-address code of the expression is:

    (1) (*, a, b)
    (2) (/, c, d)
    (3) (+, (1), (2))
    (4) (*, a, (3))
    (5) (+, (3), (4))
    (6) (/, (4), d)
    (7) (-, (5), (6))

The number of times a three-address code is used can be judged by how many times its number appears as an operand. For example, codes 3 and 4 are each used 2 times; the others are used once.

Before optimizing target code, we define a flag AC that labels the state of the stack. At the beginning, AC=0; after an instruction completes, AC≠0. If a three-address code is used more than once, its result is stored to a temporary, and then AC=0.

Now we generate the optimized target code of Example 10.3 from its three-address code.

AC=0, three-address code (*, a, b), the target code:

    CLA a
    MUL b

AC≠0, three-address code (/, c, d), the target code:

    STO T1
    CLA c
    DIV d

AC≠0, three-address codes (+, (1), (2)) and (*, a, (3)), the target code:

    ADD T1
    STO T2
    MUL a
    STO T3

AC=0, three-address code (+, (3), (4)), the target code:

    ADD T2

AC≠0, three-address code (/, (4), d), the target code:

    STO T4
    CLA T3
    DIV d

AC≠0, three-address code (-, (5), (6)), the target code:

    RUB T4

So the optimized target code is:

    CLA a
    MUL b
    STO T1
    CLA c
    DIV d
    ADD T1
    STO T2
    MUL a
    STO T3
    ADD T2
    STO T4
    CLA T3
    DIV d
    RUB T4

From the example above, we can see that there are 25 instructions before optimization, but only 14 after optimization.
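To check that the two programs compute the same value, both can be run on a tiny simulator. The sketch below assumes a single working value (called ac here) and instruction meanings read off the comments in the original listing: CLA loads, ADD/MUL/DIV combine the working value with the operand, STO stores, and RUB replaces the working value by operand minus working value. The function run and the sample inputs are hypothetical; exact fractions are used to avoid rounding differences.

```python
from fractions import Fraction


def run(program, env):
    """Execute a ';'-separated program on a hypothetical one-value machine."""
    mem = {name: Fraction(value) for name, value in env.items()}
    ac = Fraction(0)
    for line in program.split(";"):
        opcode, operand = line.split()
        if opcode == "CLA":
            ac = mem[operand]
        elif opcode == "ADD":
            ac += mem[operand]
        elif opcode == "MUL":
            ac *= mem[operand]
        elif opcode == "DIV":
            ac /= mem[operand]
        elif opcode == "STO":
            mem[operand] = ac
        elif opcode == "RUB":
            ac = mem[operand] - ac   # operand minus the current value
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return ac


ORIGINAL = ("CLA a;MUL b;STO T1;CLA c;DIV d;ADD T1;STO T1;CLA a;MUL b;STO T2;"
            "CLA c;DIV d;ADD T2;MUL a;ADD T1;STO T1;CLA c;DIV d;STO T2;CLA b;"
            "MUL a;ADD T2;MUL a;DIV d;RUB T1")
OPTIMIZED = ("CLA a;MUL b;STO T1;CLA c;DIV d;ADD T1;STO T2;MUL a;STO T3;"
             "ADD T2;STO T4;CLA T3;DIV d;RUB T4")

env = {"a": 3, "b": 5, "c": 7, "d": 2}  # arbitrary sample inputs
a, b, c, d = (Fraction(env[k]) for k in "abcd")
expected = a*b + c/d + a*(a*b + c/d) - a*(c/d + b*a)/d
assert run(ORIGINAL, env) == run(OPTIMIZED, env) == expected
```

Under these assumptions both listings evaluate the expression of Example 10.3 to the same value, with 25 and 14 instructions respectively.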