052-syntaxdirectedtranslation.ppt
TRANSCRIPT
-
7/27/2019 052-SyntaxDirectedTranslation.ppt
1/57
Syntax Directed Translation
Professor Yihjia Tsai, Tamkang University
-
Phases of a Compiler
1. Lexical Analyzer (Scanner)
Takes the source program and converts it into tokens.
2. Syntax Analyzer (Parser)
Takes tokens and constructs a parse tree.
3. Semantic Analyzer
Takes a parse tree and constructs an abstract syntax tree with attributes.
-
Phases of a Compiler (Contd.)
4. Syntax Directed Translation
Takes an abstract syntax tree and produces interpreter code (translation output).
5. Intermediate-code Generator
Takes an abstract syntax tree and produces unoptimized intermediate code.
-
Motivation: Parser as Translator
Syntax-directed translation:
Stream of tokens -> Parser -> ASTs, byte code, assembly code, etc.
The parser is driven by syntax + translation rules (often hardcoded in the parser).
-
Important
Syntax directed translation: attaching actions to the grammar rules (productions).
The actions are executed during compilation (not during the generation of the compiler, and not during the run time of the program!), either when replacing a nonterminal with its rhs (LL, top-down) or when replacing a handle with a nonterminal (LR, bottom-up).
The compiler-compiler generates a parser that knows how to parse the program (LR, LL). The actions are embedded in the parser and are executed according to the parsing mechanism.
-
Example: Expressions
E -> E + T
E -> T
T -> T * F
T -> F
F -> ( E )
F -> num
-
Synthesized Attributes
The attribute value of the nonterminal on the left-hand side of a grammar rule depends on the values of the attributes on the right-hand side.
Typical for LR (bottom-up) parsing.
Example: T -> T * F {$$.val = $1.val * $3.val}
[Figure: T.val at the parent node is computed from T.val and F.val at its children]
-
Example: Expressions in LEX
E -> E + T {$$.val:=$1.val+$3.val;}
E -> T {$$.val:=$1.val;}
T -> T * F {$$.val:=$1.val*$3.val;}
T -> F {$$.val:=$1.val;}
F -> ( E ) {$$.val:=$2.val;}
F -> num {$$.val:=$1.val;}
-
Example 2: Type definitions
D -> T L
T -> int
T -> real
L -> id , L
L -> id
-
Inherited attributes
The value of an attribute of one of the symbols on the right-hand side of a grammar rule depends on the attributes of the other symbols (on the left or on the right).
Typical for LL (top-down) parsing.
D -> T {$2.type:=$1.type} L
L -> id , {$3.type:=$$.type} L
[Figure: parse tree with nodes annotated D.type, T.type and L.type; the type flows from T.type to each L.type]
-
Type definitions
D -> T {$2.type:=$1.type} L
T -> int {$$.type:=int;}
T -> real {$$.type:=real;}
L -> id , {gen(id.name,$$.type); $3.type:=$$.type;} L
L -> id {gen(id.name,$$.type);}
[Figure: T.type receives the value int]
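The flow of the inherited type attribute can be sketched with a small recursive-descent routine. This is only an illustration under assumptions: the token list format, the function name declare and the dictionary standing in for gen(id.name, type) are hypothetical and do not come from the slides.

```python
def declare(tokens):
    """Parse `T id , id , ...` and return a symbol table {name: type}.

    T.type is synthesized (returned by parse_T); L.type is inherited
    (passed as a parameter to parse_L).
    """
    symtab = {}
    pos = 0

    def parse_T():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] not in ("int", "real"):
            raise SyntaxError("expected 'int' or 'real'")
        type_name = tokens[pos]
        pos += 1
        return type_name                      # $$.type := int / real

    def parse_L(inherited_type):
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isidentifier():
            raise SyntaxError("expected an identifier")
        symtab[tokens[pos]] = inherited_type  # stands in for gen(id.name, $$.type)
        pos += 1
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1
            parse_L(inherited_type)           # $3.type := $$.type

    parse_L(parse_T())                        # $2.type := $1.type
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return symtab
```

For example, declare(["int", "a", ",", "b"]) records both a and b with type int, because the type is handed down unchanged to every nested L.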
-
Type definitions: LL(1)
D -> T {$2.type:=$1.type} L
T -> int {$$.type:=int;}
T -> real {$$.type:=real;}
L -> id {gen(id.name,$$.type); $2.type:=$$.type;} R
R -> , id {gen(id.name,$$.type);} R
[Figure: T.type receives the value int]
-
How to arrange things for LL(1) on the stack?
In addition to the grammar symbols, the stack also holds the actions and a shadow copy of each nonterminal.
Each time an action appears on top of the stack, execute it.
Shadow copies are used to collect synthesized values and pass them further to the right of the rule.
-
LR parser
[Figure: LR(k) parser with action and goto tables, input a + b $, and a stack holding states s0 ... sn-1, sn together with grammar symbols X0 ... Xn-1]
Given the current state on top of the stack and the current token, consult the action table.
Either shift, i.e., read a new token, put it on the stack, and push a new state,
or reduce, i.e., remove some elements from the stack and then, given the newly exposed top of the stack and the nonterminal to be put on top of the stack, consult the goto table for the new state to push.
-
LR parser adapted
[Figure: LR(k) parser as before (action and goto tables, input a + b $, stack of states s0 ... sn and symbols X0 ... Xn-1), with an attribute record next to each stack entry]
Same as before, plus:
Whenever a reduce step is performed, execute the action associated with the grammar rule.
If left-to-right inherited attributes exist, actions can also be executed in the middle of a rule.
A record of attributes, associated with a grammar symbol, can be put on the stack.
-
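LL parser
[Figure: LL(k) parser with a parsing table, input a + b $, and a stack holding grammar symbols (X, Y, Z) above the bottom marker $]
If the top symbol X is a terminal, it must match the current token m.
If not, pop the top of the stack. Then look at table entry T[X, m] and push the grammar rule found there in reverse order.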
-
Stack        Input     Step       Attribute
$            2+3*4$    shift      num.val:=2
$num         +3*4$     F->num     F.val:=2
$F           +3*4$     T->F       T.val:=2
$T           +3*4$     E->T       E.val:=2
$E           +3*4$     shift
$E+          3*4$      shift      num.val:=3
$E+num       *4$       F->num     F.val:=3
$E+F         *4$       T->F       T.val:=3
$E+T         *4$       shift
$E+T*        4$        shift      num.val:=4
$E+T*num     $         F->num     F.val:=4
$E+T*F       $         T->T*F     T.val:=12
$E+T         $         E->E+T     E.val:=14
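A trace like this one can be reproduced with a small table-driven shift-reduce parser that keeps a value stack next to the state stack and runs a semantic action on every reduce. The sketch below is an assumption-laden illustration: the ACTION and GOTO tables are the usual hand-built SLR tables for this expression grammar and are not taken from the slides, and the names PRODS, tokenize and lr_eval are hypothetical.

```python
import re

# production number -> (lhs, length of rhs, action on the popped attribute values)
PRODS = {
    1: ("E", 3, lambda v: v[0] + v[2]),   # E -> E + T
    2: ("E", 1, lambda v: v[0]),          # E -> T
    3: ("T", 3, lambda v: v[0] * v[2]),   # T -> T * F
    4: ("T", 1, lambda v: v[0]),          # T -> F
    5: ("F", 3, lambda v: v[1]),          # F -> ( E )
    6: ("F", 1, lambda v: v[0]),          # F -> num
}
OPERAND = {"num": "s5", "(": "s4"}

def reduce_by(n):
    return {"+": f"r{n}", "*": f"r{n}", ")": f"r{n}", "$": f"r{n}"}

ACTION = {
    0: OPERAND, 4: OPERAND, 6: OPERAND, 7: OPERAND,
    1: {"+": "s6", "$": "acc"},
    2: {**reduce_by(2), "*": "s7"},
    3: reduce_by(4),
    5: reduce_by(6),
    8: {"+": "s6", ")": "s11"},
    9: {**reduce_by(1), "*": "s7"},
    10: reduce_by(3),
    11: reduce_by(5),
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}

def tokenize(text):
    tokens = []
    for lexeme in re.findall(r"\d+|[+*()]", text):
        tokens.append(("num", int(lexeme)) if lexeme.isdigit() else (lexeme, None))
    return tokens + [("$", None)]

def lr_eval(text):
    tokens, pos = tokenize(text), 0
    states, values = [0], []              # state stack and parallel attribute stack
    while True:
        kind, value = tokens[pos]
        act = ACTION[states[-1]].get(kind)
        if act is None:
            raise SyntaxError(f"unexpected {kind!r}")
        if act == "acc":
            return values[-1]
        if act[0] == "s":                 # shift: push the token's attribute and the new state
            states.append(int(act[1:]))
            values.append(value)
            pos += 1
        else:                             # reduce: pop the rhs, run the action, consult goto
            lhs, length, action = PRODS[int(act[1:])]
            args = values[-length:]
            del states[-length:]
            del values[-length:]
            values.append(action(args))
            states.append(GOTO[states[-1]][lhs])
```

Calling lr_eval("2+3*4") performs the same sequence of shifts and reductions as the trace and ends with the value 14 on the attribute stack.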
-
LL parser adapted
[Figure: LL(k) parser with a parsing table, input a + b $, and a stack that holds grammar symbols together with attributes and actions]
If the top symbol X is a terminal, it must match the current token m.
Put actions on the stack as part of the rules. Hold a record with attributes for each nonterminal.
If X is a nonterminal, replace the top of the stack with its shadow copy. Then look at table entry T[X, m] and push the grammar rule found there in reverse order.
If the top is a shadow copy, remove it. This way a nonterminal can deliver values down and up.
-
On stack            To be read   Rule        Action
$D                  int a,b$
$(D)L{}T            int a,b$     D->T{}L
$(D)L{}(T)int{}     int a,b$     T->int{}
$(D)L{}(T)          a,b$                     T.type:=int
$(D)L               a,b$                     L.type:=int
$(D)(L)R{}id        a,b$         L->id{}R
$(D)(L)R            ,b$                      Gen(a,int), R.type:=int
$(D)(L)(R){}        b            R->id
-
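Expressions in LL: Eliminating left recursion
Original grammar:
E -> E + T
E -> T
T -> T * F
T -> F
F -> ( E )
F -> num
After eliminating left recursion:
E -> T E'
E' -> + T E'
E' -> ε
T -> F T'
T' -> * F T'
T' -> ε
F -> ( E )
F -> num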
-
Parse tree for (2+3)*4
[Figure: parse tree for (2+3)*4 built with the grammar below]
E -> T E'
E' -> + T E'
E' -> ε
T -> F T'
T' -> * F T'
T' -> ε
F -> ( E )
F -> num
-
Actions in LL
E -> T {$2.down:=$1.up;} E' {$$.up:=$2.up;}
E' -> + T {$3.down:=$$.down+$2.up;} E' {$$.up:=$3.up;}
E' -> ε {$$.up:=$$.down;}
T -> F {$2.down:=$1.up;} T' {$$.up:=$2.up;}
T' -> * F {$3.down:=$$.down*$2.up;} T' {$$.up:=$3.up;}
T' -> ε {$$.up:=$$.down;}
F -> ( E ) {$$.up:=$2.up;}
F -> num {$$.up:=$1.up;}
[Figure: parse tree for (2+3)*4 using these productions]
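One way to read the down and up attributes is as parameters and return values of recursive-descent procedures: the inherited value is passed in, the synthesized value is returned. The following is a minimal sketch under that reading; the class name Parser, the helper tokenize and the function evaluate are hypothetical, and characters the tokenizer does not recognize are simply skipped.

```python
import re

def tokenize(text):
    return re.findall(r"\d+|[+*()]", text) + ["$"]

class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, expected):
        token = self.peek()
        ok = token.isdigit() if expected == "num" else token == expected
        if not ok:
            raise SyntaxError(f"expected {expected}, found {token}")
        self.pos += 1
        return token

    def E(self):                       # E -> T E'
        t_up = self.T()
        return self.E_rest(t_up)       # E'.down := T.up ; E.up := E'.up

    def E_rest(self, down):            # E' -> + T E' | empty
        if self.peek() == "+":
            self.match("+")
            t_up = self.T()
            return self.E_rest(down + t_up)
        return down                    # empty rule: up := down

    def T(self):                       # T -> F T'
        f_up = self.F()
        return self.T_rest(f_up)

    def T_rest(self, down):            # T' -> * F T' | empty
        if self.peek() == "*":
            self.match("*")
            f_up = self.F()
            return self.T_rest(down * f_up)
        return down

    def F(self):                       # F -> ( E ) | num
        if self.peek() == "(":
            self.match("(")
            e_up = self.E()
            self.match(")")
            return e_up
        return int(self.match("num"))

def evaluate(text):
    parser = Parser(text)
    result = parser.E()
    parser.match("$")
    return result
```

Passing the running value down into the tail procedures keeps + and * left-associative even though the grammar itself is right-recursive.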
-
Syntax Directed Translation Scheme
A syntax directed translation scheme is a syntax directed definition in which the net effect of the semantic actions is to print out a translation of the input to a desired output form.
This is accomplished by including emit statements in semantic actions that write out text fragments of the output, as well as string-valued attributes that compute text fragments to be fed into emit statements.
-
Syntax-Directed Translation
1. Grammar symbols are associated with attributes to associate information with the programming language constructs that they represent.
2. Values of these attributes are evaluated by the semantic rules associated with the production rules.
3. Evaluation of these semantic rules:
may generate intermediate code
may put information into the symbol table
may perform type checking
may issue error messages
may perform some other activities
in fact, they may perform almost any activity.
4. An attribute may hold almost anything: a string, a number, a memory location, a complex record.
-
Syntax-Directed Definitions and Translation Schemes
1. When we associate semantic rules with productions, we use two notations:
Syntax-Directed Definitions
Translation Schemes
-
Syntax-Directed Definitions and Translation Schemes
A. Syntax-Directed Definitions:
give high-level specifications for translations
hide many implementation details, such as the order of evaluation of semantic actions.
We associate a production rule with a set of semantic actions, and we do not say when they will be evaluated.
B. Translation Schemes:
indicate the order of evaluation of the semantic actions associated with a production rule.
In other words, translation schemes give some information about implementation details.
-
Syntax-Directed Definitions
1. A syntax-directed definition is a generalization of a context-free grammar in which:
Each grammar symbol is associated with a set of attributes.
This set of attributes for a grammar symbol is partitioned into two subsets, called the synthesized and the inherited attributes of that grammar symbol.
Each production rule is associated with a set of semantic rules.
2. Semantic rules set up dependencies between attributes, which can be represented by a dependency graph.
3. This dependency graph determines the evaluation order of these semantic rules.
4. Evaluation of a semantic rule defines the value of an attribute. But a semantic rule may also have some side effects, such as printing a value.
-
Annotated Parse Tree
1. A parse tree showing the values of attributes at each node is called an annotated parse tree.
2. The process of computing the attribute values at the nodes is called annotating (or decorating) the parse tree.
3. Of course, the order of these computations depends on the dependency graph induced by the semantic rules.
-
Syntax-Directed Definition
In a syntax-directed definition, each production A -> α is associated with a set of semantic rules of the form:
b = f(c1, c2, ..., cn)
where f is a function and b can be one of the following:
b is a synthesized attribute of A, and c1, c2, ..., cn are attributes of the grammar symbols in the production (A -> α),
OR
b is an inherited attribute of one of the grammar symbols in α (on the right side of the production), and c1, c2, ..., cn are attributes of the grammar symbols in the production (A -> α).
-
Attribute Grammar
So, a semantic rule b = f(c1, c2, ..., cn) indicates that the attribute b depends on the attributes c1, c2, ..., cn.
In a syntax-directed definition, a semantic rule may just evaluate the value of an attribute, or it may have some side effects such as printing values.
An attribute grammar is a syntax-directed definition in which the functions in the semantic rules cannot have side effects (they can only evaluate values of attributes).
-
Syntax-Directed Definition -- Example
Production        Semantic Rules
L -> E return     print(E.val)
E -> E1 + T       E.val = E1.val + T.val
E -> T            E.val = T.val
T -> T1 * F       T.val = T1.val * F.val
T -> F            T.val = F.val
F -> ( E )        F.val = E.val
F -> digit        F.val = digit.lexval
1. Symbols E, T, and F are associated with a synthesized attribute val.
2. The token digit has a synthesized attribute lexval (it is assumed that it is evaluated by the lexical analyzer).
-
Annotated Parse Tree -- Example
Input: 5+3*4
L
  E.val=17
    E.val=5
      T.val=5
        F.val=5
          digit.lexval=5
    +
    T.val=12
      T.val=3
        F.val=3
          digit.lexval=3
      *
      F.val=4
        digit.lexval=4
  return
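Because every attribute in this example is synthesized, the tree can be annotated in a single post-order walk: children first, then the node itself. The sketch below assumes a hypothetical Node class and builds the parse tree for 5+3*4 by hand; neither the class nor the helper names come from the slides.

```python
class Node:
    def __init__(self, symbol, children=(), lexval=None):
        self.symbol = symbol
        self.children = list(children)
        self.lexval = lexval
        self.val = None

def annotate(node):
    """Fill in node.val bottom-up and return it."""
    for child in node.children:
        annotate(child)
    kids = node.children
    if node.symbol == "digit":
        node.val = node.lexval                  # supplied by the lexical analyzer
    elif not kids:
        node.val = None                         # operator or parenthesis leaf
    elif len(kids) == 1:
        node.val = kids[0].val                  # E -> T, T -> F, F -> digit
    elif kids[1].symbol == "+":
        node.val = kids[0].val + kids[2].val    # E -> E1 + T
    elif kids[1].symbol == "*":
        node.val = kids[0].val * kids[2].val    # T -> T1 * F
    else:
        node.val = kids[1].val                  # F -> ( E )
    return node.val

def factor(n):
    return Node("F", [Node("digit", lexval=n)])

# Parse tree for 5+3*4
tree = Node("E", [
    Node("E", [Node("T", [factor(5)])]),
    Node("+"),
    Node("T", [Node("T", [factor(3)]), Node("*"), factor(4)]),
])
result = annotate(tree)
```

After the walk, every interior node carries its val, so the tree matches the annotated picture with E.val = 17 at the root.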
-
Dependency Graph
Input: 5+3*4
digit.lexval=5 -> F.val=5 -> T.val=5 -> E.val=5
digit.lexval=3 -> F.val=3 -> T.val=3
digit.lexval=4 -> F.val=4
T.val=3, F.val=4 -> T.val=12
E.val=5, T.val=12 -> E.val=17 -> L
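Any topological order of the dependency graph is a valid evaluation order for the semantic rules. The following sketch uses Python's standard graphlib module (Python 3.9 or later) for the sort; the attribute names with numeric suffixes (F1.val, T3.val and so on) and the function evaluate_attributes are hypothetical labels chosen to tell the tree nodes apart.

```python
from graphlib import TopologicalSorter

# attribute -> set of attributes it depends on (for input 5+3*4)
deps = {
    "F1.val": {"digit1.lexval"},
    "T1.val": {"F1.val"},
    "E1.val": {"T1.val"},
    "F2.val": {"digit2.lexval"},
    "T2.val": {"F2.val"},
    "F3.val": {"digit3.lexval"},
    "T3.val": {"T2.val", "F3.val"},
    "E.val": {"E1.val", "T3.val"},
}

# semantic rule for each attribute, reading already-computed values
rules = {
    "F1.val": lambda v: v["digit1.lexval"],
    "T1.val": lambda v: v["F1.val"],
    "E1.val": lambda v: v["T1.val"],
    "F2.val": lambda v: v["digit2.lexval"],
    "T2.val": lambda v: v["F2.val"],
    "F3.val": lambda v: v["digit3.lexval"],
    "T3.val": lambda v: v["T2.val"] * v["F3.val"],
    "E.val": lambda v: v["E1.val"] + v["T3.val"],
}

def evaluate_attributes(deps, rules, leaf_values):
    values = dict(leaf_values)
    # static_order() yields every attribute after all of its dependencies
    for attr in TopologicalSorter(deps).static_order():
        if attr not in values:
            values[attr] = rules[attr](values)
    return values

values = evaluate_attributes(
    deps, rules,
    {"digit1.lexval": 5, "digit2.lexval": 3, "digit3.lexval": 4},
)
```

If the graph contained a cycle, TopologicalSorter would raise graphlib.CycleError, which corresponds to a definition whose attributes cannot be evaluated in any order.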
-
Syntax-Directed Definition -- Example 2
Production        Semantic Rules
E -> E1 + T       E.loc = newtemp(), E.code = E1.code || T.code || add E1.loc,T.loc,E.loc
E -> T            E.loc = T.loc, E.code = T.code
T -> T1 * F       T.loc = newtemp(), T.code = T1.code || F.code || mult T1.loc,F.loc,T.loc
T -> F            T.loc = F.loc, T.code = F.code
F -> ( E )        F.loc = E.loc, F.code = E.code
F -> id           F.loc = id.name, F.code = ""
1. Symbols E, T, and F are associated with synthesized attributes loc and code.
2. The token id has a synthesized attribute name (it is assumed that it is evaluated by the lexical analyzer).
3. It is assumed that || is the string concatenation operator.
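The loc and code attributes can be computed by a function that returns both as a pair. This is a sketch under assumptions: expressions are given as nested tuples (op, left, right) with plain strings for identifiers, code is a list of instruction strings instead of one concatenated string, and make_translator is a hypothetical name.

```python
import itertools

def make_translator():
    counter = itertools.count(1)

    def newtemp():
        return f"t{next(counter)}"

    def translate(node):
        """Return (loc, code) for an identifier string or an (op, left, right) tuple."""
        if isinstance(node, str):
            return node, []                    # F -> id: loc = id.name, empty code
        op, left, right = node
        left_loc, left_code = translate(left)
        right_loc, right_code = translate(right)
        loc = newtemp()
        mnemonic = {"+": "add", "*": "mult"}[op]
        # code of the operands first, then the instruction that combines them
        code = left_code + right_code + [f"{mnemonic} {left_loc},{right_loc},{loc}"]
        return loc, code

    return translate

translate = make_translator()
loc, code = translate(("+", "a", ("*", "b", "c")))
```

For a + b * c this yields the two instructions mult b,c,t1 and add a,t1,t2, with the result held in t2.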
-
Syntax-Directed Definition -- Inherited Attributes
Production        Semantic Rules
D -> T L          L.in = T.type
T -> int          T.type = integer
T -> real         T.type = real
L -> L1 id        L1.in = L.in, addtype(id.entry,L.in)
L -> id           addtype(id.entry,L.in)
1. Symbol T is associated with a synthesized attribute type.
2. Symbol L is associated with an inherited attribute in.
-
A Dependency Graph -- Inherited Attributes
Input: real p q
Parse tree: D -> T L; T -> real; L -> L1 id (id.entry=q); L1 -> id (id.entry=p)
Dependency graph: T.type=real -> L.in=real -> L1.in=real; L.in and id.entry=q feed addtype(q,real); L1.in and id.entry=p feed addtype(p,real)
-
Syntax Trees
1. Decoupling translation from parsing: trees.
2. Syntax tree: an intermediate representation of the compiler's input.
3. Example procedures: mknode, mkleaf
4. Employment of the synthesized attribute nptr (pointer)
PRODUCTION        SEMANTIC RULE
E -> E1 + T       E.nptr = mknode(+, E1.nptr, T.nptr)
E -> E1 - T       E.nptr = mknode(-, E1.nptr, T.nptr)
E -> T            E.nptr = T.nptr
T -> ( E )        T.nptr = E.nptr
T -> id           T.nptr = mkleaf(id, id.lexval)
T -> num          T.nptr = mkleaf(num, num.val)
-
Draw the Syntax Tree
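a-4+c
[Figure: syntax tree for a-4+c. The root + has a - node and an id leaf (to entry for c) as children; the - node has an id leaf (to entry for a) and the leaf num 4 as children]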
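The tree for a-4+c can be built bottom-up with the two constructors, in the order in which the reductions would call them. The dataclasses Leaf and Interior and the helper show are hypothetical; identifier names stand in for pointers to symbol-table entries.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    kind: str        # "id" or "num"
    value: object    # symbol-table entry (here: the name) or numeric value

@dataclass
class Interior:
    op: str
    left: object
    right: object

def mkleaf(kind, value):
    return Leaf(kind, value)

def mknode(op, left, right):
    return Interior(op, left, right)

# a - 4 + c, built in the order the semantic rules would fire
p1 = mkleaf("id", "a")
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", "c")
p5 = mknode("+", p3, p4)

def show(node):
    """Render a tree as a fully parenthesized string."""
    if isinstance(node, Leaf):
        return str(node.value)
    return f"({show(node.left)} {node.op} {show(node.right)})"
```

Here show(p5) gives ((a - 4) + c), which makes the left-associative grouping of the tree visible.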
-
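Directed Acyclic Graphs for Expressions
a + a * ( b - c ) + ( b - c ) * d
[Figure: DAG for the expression, with the leaf a and the subexpression b - c each appearing once and shared by their parents; leaves a, b, c, d; interior nodes +, +, *, *, -]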
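A DAG is obtained from the same constructors if each one first checks whether an identical node already exists and returns that node instead of creating a new one. The sketch below keeps nodes in a table indexed by their signature; the class DagBuilder and its fields are hypothetical names.

```python
class DagBuilder:
    def __init__(self):
        self.index = {}    # node signature -> node number
        self.table = []    # node number -> node signature

    def _intern(self, signature):
        # Reuse an existing node with the same signature, otherwise create one
        if signature not in self.index:
            self.index[signature] = len(self.table)
            self.table.append(signature)
        return self.index[signature]

    def mkleaf(self, name):
        return self._intern(("leaf", name))

    def mknode(self, op, left, right):
        return self._intern((op, left, right))

g = DagBuilder()
# a + a * (b - c) + (b - c) * d, constructed as a parser would, left to right
first_diff = g.mknode("-", g.mkleaf("b"), g.mkleaf("c"))
left_sum = g.mknode("+", g.mkleaf("a"), g.mknode("*", g.mkleaf("a"), first_diff))
second_diff = g.mknode("-", g.mkleaf("b"), g.mkleaf("c"))
root = g.mknode("+", left_sum, g.mknode("*", second_diff, g.mkleaf("d")))
```

The second request for b - c returns the node created by the first, so the DAG has 9 nodes where the corresponding tree would have 13.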
-
Examples
1. Postfix and prefix notations: we have already seen how to generate them. Let us generate Java byte code.
E -> E + E { emit(iadd); }
E -> E * E { emit(imul); }
E -> T
T -> ICONST { emit(sipush ICONST.string); }
T -> E
-
Abstract Stack Machine
The abstract machine code for an expression simulates a stack evaluation of the postfix representation of the expression. Expression evaluation proceeds by processing the postfix representation from left to right.
-
Evaluation
1. Pushing each operand onto the stack when encountered.
2. Evaluating a k-ary operator by using the value located k-1 positions below the top of the stack as the leftmost operand, and so on, until the value on the top of the stack is used as the rightmost operand.
3. After the evaluation, all k operands are popped from the stack, and the result is pushed onto the stack (or there could be a side effect).
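These three steps can be sketched as a small interpreter for a handful of stack instructions. It is an illustration only: the function run is hypothetical, variables are addressed by name rather than by the numeric local-variable index the real Java VM uses, and only a few integer instructions are covered.

```python
def run(program, variables=None):
    """Execute a list of instruction strings; return (stack, variables)."""
    env = dict(variables or {})
    stack = []
    binary = {
        "iadd": lambda left, right: left + right,
        "isub": lambda left, right: left - right,
        "imul": lambda left, right: left * right,
    }
    for instruction in program:
        op, *args = instruction.split()
        if op in ("bipush", "sipush"):
            stack.append(int(args[0]))          # step 1: push operand
        elif op == "iload":
            stack.append(env[args[0]])          # step 1: push variable value
        elif op == "istore":
            env[args[0]] = stack.pop()          # side effect instead of a result
        elif op in binary:
            right = stack.pop()                 # top of stack is the rightmost operand
            left = stack.pop()                  # one position below is the leftmost
            stack.append(binary[op](left, right))   # step 3: push the result
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack, env

# a = 3*b - c with b = 5 and c = 2
program = ["bipush 3", "iload b", "imul", "iload c", "isub", "istore a"]
stack, env = run(program, {"b": 5, "c": 2})
```

After the run the operand stack is empty again and a holds 13, since the store pops the final result.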
-
Example
Stmt -> ID = expr { stmt.t = expr.t || istore a }
Applied to a = 3*b - c:
bipush 3
iload b
imul
iload c
isub
istore a
-
Java Virtual Machine
Analogous to the abstract stack machine, the Java Virtual Machine is an abstract processor architecture that defines the behavior of Java bytecode programs.
The stack (in the JVM) is referred to as the operand stack or value stack. Operands are fetched from the stack and the result is pushed back onto the stack.
Advantages: VM code is compact, as the operands need not be explicitly named.
-
Data Types
The int data type can hold 32-bit signed integers in the range -2^31 to 2^31 - 1.
The long data type can hold 64-bit signed integers.
Integer instructions in the Java VM are also used to operate on Boolean values.
Other data types that the Java VM supports are byte, short, float, double. (Your project should handle at least three data types.)
-
Selected Java VM Instructions
Java VM instructions are typed, i.e., the operator explicitly specifies what operand types it expects.
Expression Evaluation
sipush n    push a 2-byte signed int onto the stack
iload v     load/push a local variable v
istore v    store top of stack into local variable v
iadd        pop two elements and push their sum
isub        pop two elements and push their difference
-
Selected Java VM Instructions
imul        pop two elements and push their product
iand        pop two elements and push their bitwise and
ior         pop two elements and push their bitwise or
ineg        pop top element and push its negation
lcmp        pop two elements (64-bit integers), push the comparison result. 1 if Vs[0]
-
Selected Java VM Instructions
Branches:
goto L      unconditional transfer to label L
ifeq L      transfer to label L if top of stack is 0
ifne L      transfer to label L if top of stack != 0
Call/Return: Each method/procedure has memory space allocated to hold local variables (vars register), an operand stack (optop register) and an execution environment (frame register).
-
Selected Java VM Instructions
invokestatic p    invoke method p; pop args from the stack as initial values of the formal parameters (actual parameters are pushed before calling).
return            return from the current procedure
ireturn           return from the current procedure with an integer value on top of the stack.
areturn           return from the current procedure with an object reference return value on top of the stack.
-
Selected Java VM Instructions
Array Manipulation: the Java VM has an object data type: a reference to arrays and objects.
newarray int    create a new array of integers using the top of the stack as the size. Pop the stack and push a reference to the newly created array.
iaload          pop the array subscript expression on top of the stack and the array pointer (next stack element). Push the value contained in this array element.
iastore
-
Selected Java VM Instructions
Object Manipulation
new c         create a new instance of class C (using the heap) and push the reference onto the stack.
getfield f    push the value of field f of the object pointed to by the object reference at the top of the stack.
putfield f    store the value from vs[1] into field f of the object pointed to by the object reference vs[0].
-
Selected Java VM Instructions
Simplifying Instructions:
ldc constant is a macro which will generate either bipush or sipush depending on c.
-
Byte Code (JVM Instructions)
No-arg operand (instructions needing no arguments, hence taking only one byte):
examples: aaload, aastore, aconst_null, aload_0, aload_1, areturn, arraylength, astore_0, athrow, baload, iaload, imul, etc.
One-arg operand: bipush, sipush, ldc, etc.
methodref op:
invokestatic, invokenonvirtual,
-
Byte Code (JVM Instructions)
Fieldref_arg_op:
getfield, getstatic, putfield, putstatic.
Class_arg_op:
checkcast, instanceof, new
labelarg_op (instructions that use labels):
goto, ifeq, ifne, jsr, jsr_w, etc.
Localvar_arg_op:
iload, fload, aload, istore
-
Translating an If statement
Stmt -> if expr then stmt1 {
out = newlabel();
stmt.t = expr.t || ifnonnull || out || stmt1.t || label out: nop }
example:
if ( a + 90 == 7 ) { x = x+1; x = x+3; }
-
Translating a while statement
Stmt -> WHILE (expr) stmt1 {
in = newlabel();
out = newlabel();
emit( stmt.t = label || in || nop || expr.t || ifnonnull || out || stmt1.t || goto || in || label || out ) }
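Both translations follow the same pattern: allocate fresh labels, then concatenate the code of the condition, a conditional branch, the code of the body, and the label definitions. The sketch below makes one deliberate substitution, stated here as an assumption: it branches with ifeq (listed earlier as "transfer to label L if top of stack is 0") rather than the ifnonnull written on the slides, on the assumption that the condition leaves an integer on the stack. The names make_generator, gen_if and gen_while and the "L1: nop" label syntax are hypothetical.

```python
import itertools

def make_generator():
    counter = itertools.count(1)

    def newlabel():
        return f"L{next(counter)}"

    def gen_if(cond_code, body_code):
        out = newlabel()
        # skip the body when the condition evaluates to 0
        return cond_code + [f"ifeq {out}"] + body_code + [f"{out}: nop"]

    def gen_while(cond_code, body_code):
        top = newlabel()
        out = newlabel()
        return (
            [f"{top}: nop"]
            + cond_code
            + [f"ifeq {out}"]          # leave the loop when the condition is 0
            + body_code
            + [f"goto {top}", f"{out}: nop"]
        )

    return gen_if, gen_while

gen_if, gen_while = make_generator()
# while (i) i = i - 1;
loop_code = gen_while(
    ["iload i"],
    ["iload i", "sipush 1", "isub", "istore i"],
)
```

Creating the label generator inside make_generator keeps the numbering local, so nested statements translated by the same generator never reuse a label.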
-
References
Compilers: Principles, Techniques, and Tools, Aho, Sethi, and Ullman, Chapter 5
Appel, Chapter 4 and 5