Compiler Lecture Note, Intermediate Language Page 1
PL&C Lab, DongGuk University
Chapter 9: Intermediate Language
Introduction to Compilers
Contents
• Introduction
• Polish Notation
• Three Address Code
• Tree Structured Code
• Abstract Machine Code
• Concluding Remarks
Introduction
• Compiler Model
Front-End (language-dependent part):
  Source Program → Lexical Analyzer → (tokens) → Syntax Analyzer → (AST) → Semantic Analyzer → Intermediate Code Generator
Back-End (machine-dependent part):
  Code Optimizer → Target Code Generator → Object Program
Intermediate code (labeled IC and IL in the figure) flows from the front-end to the back-end.
• Why an IL is needed
– Modular Construction
– Automatic Construction
– Easy Translation
– Portability
– Optimization
– Bootstrapping
• Classification of ILs
– Polish Notation --- Postfix, IR
– Three Address Code --- Quadruple, Triple, Indirect triple
– Tree Structured Code --- PT, AST, TCOL
– Abstract Machine Code --- P-code, EM-code, U-code, Bytecode
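• Two-level Code Generation
• ILS
– a form that can be obtained automatically from the source
– source-language dependent and high level.
• ILT
– a form that an automated back-end can translate to the target machine very easily
– target-machine dependent and low level.
• ILS to ILT
– translation from ILS to ILT is the main task.
Source → Front-End → ILS → ILS-ILT → ILT → Back-End → Target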
Polish Notation
☞ The Polish mathematician Łukasiewicz invented the parenthesis-free notation.
• Postfix (Suffix) Polish Notation
• earliest IL
• popular for interpreted languages - SNOBOL, BASIC
– general form :
e1 e2 ... ek OP (k ≥ 1)
where, OP : k-ary operator
ei : any postfix expression (1 ≤ i ≤ k)
– example : if a then if c-d then a+c else a*c else a+b
⇒ a L1 BZ c d - L2 BZ a c + L3 BR
L2: a c * L3 BR L1: a b + L3:
– note
1) high level: source to IL - fast & easy translation
IL to target - difficult
2) easy evaluation - operand stack
3) unsuitable for optimization - translation into another IL is needed
4) parentheses-free notation - arithmetic expressions
– suitable for interpretive languages
Source → Translator → Postfix → Evaluator → Result
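The operand-stack evaluation in note 2 can be sketched in Python. This is only an illustration under stated assumptions: the name eval_postfix, the blank-separated token format, and the reading of BZ as "branch to the label if the condition is zero" and BR as "branch unconditionally" are choices made for this example, not definitions taken from the notes.

```python
def eval_postfix(code, env):
    """Evaluate a postfix token string with an operand stack.

    Assumed conventions: tokens are blank-separated, 'X:' defines label X,
    'lab BR' jumps to lab, and 'cond lab BZ' jumps to lab when cond is zero.
    """
    tokens = code.split()
    # First pass: record the position of every label definition.
    labels = {tok[:-1]: i for i, tok in enumerate(tokens) if tok.endswith(":")}
    binary = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
    }
    stack = []
    pc = 0
    while pc < len(tokens):
        tok = tokens[pc]
        pc += 1
        if tok.endswith(":"):          # label definition: nothing to execute
            continue
        if tok in binary:              # pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(binary[tok](left, right))
        elif tok == "BR":              # unconditional branch
            pc = labels[stack.pop()]
        elif tok == "BZ":              # branch if the condition is zero
            target = stack.pop()
            cond = stack.pop()
            if cond == 0:
                pc = labels[target]
        elif tok in labels:            # label reference used as an operand
            stack.append(tok)
        else:                          # variable reference
            stack.append(env[tok])
    return stack.pop()


CODE = "a L1 BZ c d - L2 BZ a c + L3 BR L2: a c * L3 BR L1: a b + L3:"
print(eval_postfix(CODE, {"a": 1, "b": 2, "c": 5, "d": 3}))  # 6
```

With a = 1 and c - d = 2, both conditions are nonzero, so the a+c branch runs and the result 6 is left on the stack.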
• Internal Representation (IR)
– low-level prefix Polish notation - addressing structure of the target machine
• compiler-compiler IL - table-driven code generation
– IR program - a sequence of root-level IR expressions
– IR expression: OP e1 e2 ... ek (k ≥ 1)
where, OP : k-ary operator - 1-1 correspondence with a target machine instruction.
  root-level operator - does not appear in an operand ⇒ root-level IR expression.
  internal operator - appears in an operand ⇒ internal IR expression.
ei : operand --- single symbol or internal IR expression.
– example
D := E ⇔ := + d r ↑ + e r
where, r : local base register
d, e : locations of variables D and E
+ : additive operator
↑ : unary operator giving the value of the location
:= : assignment operator (root-level)
– example
FOR D := E TO F DO Loop body;
Equivalent lower-level source:
D := E; TEMP := F; GOTO 2
1: Loop body
D := D + 1;
2: IF D <= TEMP THEN GOTO 1;
IR:
:= + d r ↑ + e r
:= + temp r ↑ + f r
j L2
:L1
Loop body
:= + d r + ↑ + d r 1
:L2
<= L1 ? ↑ + d r ↑ + temp r
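A prefix IR expression is a preorder listing of an operator tree, so the assignment examples above can be reproduced with a few lines of Python. This is a sketch only: the nested-tuple tree encoding and the helper names prefix, address, value and assign are hypothetical, introduced just for this illustration.

```python
def prefix(node):
    """Flatten a nested (operator, operand, ...) tuple into prefix order."""
    if isinstance(node, str):          # single symbol
        return [node]
    op, *operands = node
    out = [op]
    for operand in operands:
        out.extend(prefix(operand))
    return out


def address(offset, base="r"):
    """Location of a variable: offset plus the local base register r."""
    return ("+", offset, base)


def value(location):
    """Apply the unary operator that gives the value stored at a location."""
    return ("↑", location)


def assign(target, source_expr):
    """Root-level assignment of an internal IR expression to a location."""
    return (":=", address(target), source_expr)


# D := E
print(" ".join(prefix(assign("d", value(address("e"))))))
# D := D + 1
print(" ".join(prefix(assign("d", ("+", value(address("d")), "1")))))
```

The two printed lines are `:= + d r ↑ + e r` and `:= + d r + ↑ + d r 1`, matching the IR shown for D := E and for the loop increment.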
– Note
1) Shift-reduce parser --- prefix : fewer states than postfix
2) Several addressing modes
  prefix : decided by looking at the operator alone (no backup)
  postfix : backup is needed
ex) assumption: first operand computed in register r.
r.1 ::= (/ d.1 r.2)
r.1 ::= (+ r.1 r.2)
  prefix - [r -> / . d r] : first operand changed to d and continue
  postfix - [r -> . d r /] [r -> . r r +] : shift r, shift r and block ([r -> r r . +]) ⇒ backup
3) Easy translation
IR to target - easy
source to IR - difficult
Three Address Code
• most popular IL, used in optimizing compilers
• General form:
A := B op C
where, A : result address
B, C : operand addresses
op : operator
(1) Quadruple - 4-tuple notation <operator>,<operand1>,<operand2>,<result>
(2) Triple - 3-tuple notation <operator>,<operand1>,<operand2>
(3) Indirect triple - execution order table & triples
– example
• A ← B + C * D / E
• F ← C * D

Quadruple
  op   operand1  operand2  result
  *    C         D         T1
  /    T1        E         T2
  +    B         T2        T3
  ←    T3                  A
  *    C         D         T4
  ←    T4                  F

Triple
  (1)  *   C    D
  (2)  /   (1)  E
  (3)  +   B    (2)
  (4)  ←   A    (3)
  (5)  *   C    D
  (6)  ←   F    (5)

Indirect Triple
  Operations    Triples
  1. (1)        (1)  *   C    D
  2. (2)        (2)  /   (1)  E
  3. (3)        (3)  +   B    (2)
  4. (4)        (4)  ←   A    (3)
  5. (1)        (5)  ←   F    (1)
  6. (5)
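The quadruple and triple columns for the first statement can be generated by a post-order walk of the expression tree. The Python below is a minimal sketch, not the method of any particular compiler: the function names, the tuple encoding of the tree, and the use of "←" as the assignment operator are assumptions made for this example.

```python
def gen_quadruples(tree, target):
    """Quadruples (op, operand1, operand2, result) for `target ← tree`."""
    quads = []

    def walk(node):
        if isinstance(node, str):      # leaf: a variable name
            return node
        op, left, right = node
        arg1 = walk(left)
        arg2 = walk(right)
        temp = f"T{len(quads) + 1}"    # fresh temporary holds this result
        quads.append((op, arg1, arg2, temp))
        return temp

    value = walk(tree)
    quads.append(("←", value, "", target))
    return quads


def gen_triples(tree, target):
    """Triples (op, operand1, operand2); a result is named by position."""
    triples = []

    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        arg1 = walk(left)
        arg2 = walk(right)
        triples.append((op, arg1, arg2))
        return f"({len(triples)})"     # refer to the triple just emitted

    value = walk(tree)
    triples.append(("←", target, value))
    return triples


# A ← B + C * D / E, with * and / grouped left to right
EXPR = ("+", "B", ("/", ("*", "C", "D"), "E"))

for quad in gen_quadruples(EXPR, "A"):
    print(quad)
for number, triple in enumerate(gen_triples(EXPR, "A"), start=1):
    print(f"({number})", *triple)
```

The triple form needs no temporaries because each result is referenced by its position. That is also why triples cannot be reordered freely, which is the problem the separate operations table of the indirect triple form addresses. The sketch handles one statement at a time and does not share the common subexpression C * D across statements.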
• Note
• Quadruple vs. Triple
– quadruple - easy to optimize
– triple - removes temporary addresses
⇒ Indirect Triple
• extensive code optimization is easy
– the IL can be rearranged (except for triples)
• easy translation - source to IL
• difficult to generate good code
– quadruple to two-address machine
– triple to three-address machine
Tree Structured Code
• Abstract Syntax Tree
– removes redundant information from the parse tree.
• leaf node --- variable name, constant
  internal node --- operator
– [Example 8] --- Text p.377
{ x = 0;
  y = z + 2 * y;
  while ((x<n) and (v[x] != z)) x = x+1;
  return x;
}
• Tree Structured Common Language (TCOL)
– Variants of AST - containing the result of semantic analysis.
– TCOL operator - type & context specific operator
– Context
  value ----- rhs of assignment statement
  location ----- lhs of assignment statement
  boolean ----- conditional control statement
  statement ----- statement
ex) . : operand --- location, result --- value
while : operand --- boolean, statement; result --- statement
– Representation ----- graph orientation
  internal notation ------ efficient
  external notation ------ debug, interface; linear graph notation
Example) int a; float b;
...
b = a + 1;
AST:
  assign
    b
    add
      a
      1
TCOL:
  assign
    b
    float
      addi
        .
          a
        1
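How semantic information turns the AST into a TCOL-like tree can be sketched in Python. This is a hedged illustration only: the tuple encoding, the function names, and the operator name "addf" are hypothetical, while "addi", "float", and "." (location to value) are the operators named in these notes. It reads the example as dereferencing a with "." before the integer addition and then coercing the int result to float.

```python
DECLARED = {"a": "int", "b": "float"}   # int a; float b;


def to_value(node):
    """Translate an AST expression into a (TCOL tree, type) pair, value context."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "float"
    if isinstance(node, str):
        # A variable names a location; "." dereferences it to a value.
        return (".", node), DECLARED[node]
    op, left, right = node
    if op != "add":
        raise ValueError(f"unsupported operator: {op}")
    left_tree, left_type = to_value(left)
    right_tree, right_type = to_value(right)
    if left_type == right_type == "int":
        return ("addi", left_tree, right_tree), "int"   # type-specific operator
    # Mixed or float operands: coerce any int side, then add as floats.
    if left_type == "int":
        left_tree = ("float", left_tree)
    if right_type == "int":
        right_tree = ("float", right_tree)
    return ("addf", left_tree, right_tree), "float"     # "addf" is a made-up name


def to_tcol(stmt):
    """Translate an AST assignment into a TCOL-like tree."""
    op, target, expr = stmt
    if op != "assign":
        raise ValueError(f"unsupported statement: {op}")
    tree, expr_type = to_value(expr)
    if DECLARED[target] == "float" and expr_type == "int":
        tree = ("float", tree)            # coercion made explicit
    return ("assign", target, tree)       # target stays in location context


print(to_tcol(("assign", "b", ("add", "a", 1))))
```

For b = a + 1 this prints ('assign', 'b', ('float', ('addi', ('.', 'a'), 1))): the left-hand side stays a location, the right-hand side is built in value context, and the coercion is an explicit node.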
• Note
– AST ----- automatic AST generation (output of parser)
  Parser Generator: leaf node specification, operator node specification
– TCOL ----- automatic code generation : PQCC
(1) intermediate level:
  high level --- parse-tree-like notation, control structure
  low level --- data access
(2) semantic specification: dereferencing, coercion, type-specific operators, dynamic subscript and type checking
(3) loop optimization ----- high-level control structure, easy reconstruction
(4) extensibility ----- define new TCOL operators
Abstract Machine Code
• Motivation
• rapid development of machine architectures
  proliferation of programming languages
– portable & adaptable compiler design --- P_CODE
• porting --- rewriting only the back-end
– compiler building system --- EM_CODE
M front-ends + N back-ends --- M compilers for N target machines
• Model
source program → front-end → abstract machine code (interface) → back-end → target code → target machine
abstract machine code → abstract machine interpreter
• Pascal-P Code
• Pascal P Compiler --- portable compiler producing P_CODE for an abstract machine (P_Machine).
• P_Machine ----- hypothetical stack machine designed for the Pascal language.
(1) Instruction --- closely related to the Pascal language.
(2) Registers
  PC --- program counter
  NP --- new pointer
  SP --- stack pointer
  MP --- mark pointer
(3) Memory
  CODE --- instruction part
  STORE --- data part (constant area, stack, heap)
[Figure: P_Machine memory layout. PC points into CODE; STORE contains the stack (MP marks the current activation record; SP), the heap (NP), and the constant area.]
Ucode
• the intermediate form used by the Stanford Portable Pascal compiler.
• stack-based and defined in terms of a hypothetical stack machine.
• Ucode Interpreter : Appendix B.
• Addressing
– stack addressing ===> a tuple : (B, O)
  B : the block number containing the address
  O : the offset in words from the beginning of the block; offsets start at 1.
– label
  Any Ucode instruction can be labeled through its label field.
  All targets of jumps and procedures must be labeled.
  All labels must be unique for the entire program.
Example :
Consider the following skeleton :
program main
  procedure P
    procedure Q
      var i : integer;
          j : integer;
block number --- main : 1, P : 2, Q : 3
variable addressing --- i : (3,1), j : (3,2)
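The (B, O) numbering in this example can be reproduced with a short Python sketch. It assumes, purely for illustration, that blocks are numbered from 1 in declaration order and that every variable occupies one word; the function name assign_addresses and the input format are hypothetical.

```python
def assign_addresses(blocks):
    """Map (block name, variable) to a (B, O) tuple.

    `blocks` lists (block name, variables) in declaration order. Assumes
    blocks are numbered from 1 in that order and one word per variable.
    """
    table = {}
    for block_number, (block_name, variables) in enumerate(blocks, start=1):
        for offset, variable in enumerate(variables, start=1):  # offsets start at 1
            table[(block_name, variable)] = (block_number, offset)
    return table


SKELETON = [("main", []), ("P", []), ("Q", ["i", "j"])]
addresses = assign_addresses(SKELETON)
print(addresses[("Q", "i")], addresses[("Q", "j")])  # (3, 1) (3, 2)
```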
Ucode Operations (35 operations)
Unary --- notop, neg
Binary --- add, sub, mult, divop, modop, swp, andop, orop, gt, lt, ge, le, eq, ne
Stack Operations --- lod, str, ldr, ldp
Immediate Operation --- ldc
Control Flow --- ujp, tjp, fjp, cal, ret
Range Checking --- chkh, chkl
Indirect Addressing --- ixa, sta
Procedure Specification --- proc, endop
Program Specification --- bgn
Procedure Calling Sequence --- cal
Symbol Table Information --- sym
Example :
x = a + b * c;
           lod 1 1     /* a */
           lod 1 2     /* b */
           lod 1 3     /* c */
           mult
           add
           str 1 4     /* x */
if (a>b) a = a + b;
           lod 1 1     /* a */
           lod 1 2     /* b */
           gt
           fjp next
           lod 1 1     /* a */
           lod 1 2     /* b */
           add
           str 1 1     /* a */
next
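To see how these instruction sequences behave on a stack machine, the following Python sketch interprets the handful of opcodes used in the two examples. It is not the ucodei interpreter from Appendix B. The operand order of the binary operations, the 0/1 encoding of comparison results, and the reading of fjp as "jump if the popped value is false" are assumptions for this illustration, as are the names parse and run.

```python
import re

BINARY = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "mult": lambda x, y: x * y,
    "gt": lambda x, y: int(x > y),
    "lt": lambda x, y: int(x < y),
}


def parse(source):
    """Split Ucode text into an instruction list and a label table."""
    program, labels = [], {}
    for line in source.splitlines():
        line = re.sub(r"/\*.*?\*/", "", line)      # drop /* ... */ comments
        if not line.strip():
            continue
        if not line[0].isspace():                  # a label starts in column 1
            label, _, line = line.partition(" ")
            labels[label] = len(program)           # index of the next instruction
        fields = line.split()
        if fields:
            program.append((fields[0], fields[1:]))
    return program, labels


def run(source, memory):
    """Execute Ucode text; `memory` maps (block, offset) tuples to values."""
    program, labels = parse(source)
    stack, pc = [], 0
    while pc < len(program):
        op, args = program[pc]
        pc += 1
        if op == "lod":                            # push the value at (B, O)
            stack.append(memory[(int(args[0]), int(args[1]))])
        elif op == "str":                          # pop into (B, O)
            memory[(int(args[0]), int(args[1]))] = stack.pop()
        elif op in BINARY:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY[op](left, right))
        elif op == "fjp":                          # jump when stacktop is false
            if not stack.pop():
                pc = labels[args[0]]
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return memory


ASSIGN = """
           lod 1 1     /* a */
           lod 1 2     /* b */
           lod 1 3     /* c */
           mult
           add
           str 1 4     /* x */
"""

IF_STMT = """
           lod 1 1     /* a */
           lod 1 2     /* b */
           gt
           fjp next
           lod 1 1     /* a */
           lod 1 2     /* b */
           add
           str 1 1     /* a */
next
"""

print(run(ASSIGN, {(1, 1): 2, (1, 2): 3, (1, 3): 4})[(1, 4)])   # 14
print(run(IF_STMT, {(1, 1): 5, (1, 2): 3})[(1, 1)])             # 8
```

With a = 2, b = 3, c = 4 the first program stores 2 + 3 * 4 = 14 in x. In the second, a > b holds for a = 5, b = 3, so the branch is not taken and a becomes 8.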
Indirect Addressing
is used to access both array elements and var parameters.
ixa --- indirect load
replaces stacktop by the value of the item at location stacktop.
to retrieve A[i] :
lod i /* actually (Bi, Oi) */
ldr A /* also (block number, offset) */
add /* effective address */
ixa /* indirect load gets contents of A[i] */
to retrieve var parameter x :
lod x /* loads address of actual - since x is var */
ixa /* indirect load */
• sta --- indirect store
– sta stores stacktop into the address at stack[stacktop-1]; both items are popped.
– A[i] = j;
lod i
ldr A
add
lod j
sta
– x := y, where x is a var parameter
lod x
lod y
sta
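The stack effects of ixa and sta described above can be modeled in a few lines of Python. This is a sketch under simplifying assumptions: memory is a flat list of words indexed by integer addresses rather than (B, O) tuples, and A_BASE is a hypothetical address for A[0].

```python
def ixa(stack, memory):
    """Indirect load: replace stacktop by the value stored at that address."""
    address = stack.pop()
    stack.append(memory[address])


def sta(stack, memory):
    """Indirect store: store stacktop at the address below it; pop both."""
    value = stack.pop()
    address = stack.pop()
    memory[address] = value


memory = [0] * 16
A_BASE = 10                                # hypothetical address of A[0]
i, j = 2, 7

# A[i] = j :  lod i / ldr A / add / lod j / sta
stack = [i, A_BASE]                        # lod i, ldr A
stack.append(stack.pop() + stack.pop())    # add gives the effective address
stack.append(j)                            # lod j
sta(stack, memory)

# retrieve A[i] :  lod i / ldr A / add / ixa
stack = [i, A_BASE]
stack.append(stack.pop() + stack.pop())
ixa(stack, memory)
print(memory[A_BASE + i], stack)           # 7 [7]
```

After sta the stack is empty and word 12 holds 7; the following ixa sequence leaves that value on top of the stack.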
Procedure Calling Sequence
procedure definition : procedure A(var a : integer; b,c : integer);
procedure call : A(x, expr1, expr2);
calling sequence :
ldp
ldr x /* load the address of actual for var parameter */
… /* code to evaluate expr1 --- left on the stack */
… /* code to evaluate expr2 --- left on the stack */
cal A
Ucode Interpreter
The Ucode interpreter is called ucodei; its source is on plac.dongguk.ac.kr.
The interpreter uses the following files :
*.ucode : file containing the Ucode program.
*.lst : Ucode listing and output from the program.
Ucode format
label-field   op-code   operand-field
1-10          12-m      m+2
m is exactly enough to hold the op-code.
label field --- a 10-character label (make sure it is exactly 10 characters, padded with blanks).
op-code --- starts at column 12.
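The column layout above is easy to get wrong by hand, so a small helper can produce it. The Python below is a sketch: format_ucode_line is a hypothetical name, and separating multiple operands with single blanks is an assumption based on the examples such as "lod 1 1".

```python
def format_ucode_line(label, opcode, operands=()):
    """Lay out one Ucode line: label in columns 1-10, op-code from column 12,
    operand field starting at column m+2, where the op-code ends at column m."""
    if len(label) > 10:
        raise ValueError("label must fit in 10 characters")
    line = label.ljust(10) + " " + opcode   # blank-padded label, column 11 blank
    if operands:
        line += " " + " ".join(str(operand) for operand in operands)
    return line


print(repr(format_ucode_line("", "lod", (1, 2))))
print(repr(format_ucode_line("next", "str", (1, 1))))
```

An unlabeled instruction gets ten blanks in the label field, so the op-code still begins in column 12.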
Programming Assignment #3
• Install the Ucode interpreter listed in Appendix B on your own PC and write a Ucode program that finds the prime numbers up to 100.
– You may write and submit a program for a different problem instead.
– Submit the output listing of the Ucode interpreter.
• Reference :
– #1 : recursive-descent parser
– #2 : MiniPascal LR parser
Concluding Remarks
• IL criteria
– intermediate level
  – input language --- high level
  – output machine --- low level
– efficient processing
  – translation --- source to IL, IL to target
  – interpretation
  – optimization
– extensibility
– external representation
– clean separation
  – language dependence & machine dependence
                              Polish Notation   Three Address Code   Tree Structured Code   Abstract
IL Criteria                   Postfix   IR      Quadruple   Triple   AST     TCOL           Machine Code
intermediate level            C         B       B           B        C       A              B
efficient processing
  source to IL translation    A         C       B           B        A       B              C
  IL to target translation    C         A       B           B        C       A              A
  interpretation              B         B       B           B        C       C              A
  optimization                C         B       A           C        A       A              B
external representation       A         A       A           A        C       B              A
extensibility                 A         A       A           A        A       A              B
clean separation              C         B       B           B        C       A              A

A : good   B : average   C : poor