Translation
code -> translation -> converted code
In compilers, this is the task of creating an executable from source code.
How is it done?
So far, we have analysed the code, identifying the "words" and the syntactical form.
Together, these help us understand the meaning of the code, and so we will use the structure we have identified to create the target code.
Postfix Notation
Postfix notation is a method for writing expressions which is unambiguous, and corresponds to the processing order we use in bottom-up parsing.
a+b is written as ab+
a*b is written as ab*
: :
In general, we define it recursively:
postfix for E1 <op> E2 is (postfix for E1)(postfix for E2)<op>
and
postfix for (E) is (postfix for E)
Postfix example
We can write a+(b*c)*(b*(a+b)) as
abc*bab+**+
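The recursive definition can be tried out directly. The sketch below assumes the expression has already been parsed into a nested tuple (op, left, right), with single letters as operands; that representation and the name postfix are choices made for this illustration only.

```python
# Expression trees as nested tuples: a leaf is a string such as "a",
# an operator node is (op, left, right). Parentheses only shape the tree.

def postfix(expr):
    """Return the postfix form of an expression tree."""
    if isinstance(expr, str):          # a single operand
        return expr
    op, left, right = expr
    # postfix for E1 <op> E2 is (postfix for E1)(postfix for E2)<op>
    return postfix(left) + postfix(right) + op

# a+(b*c)*(b*(a+b)), with * binding tighter than +
example = ("+", "a", ("*", ("*", "b", "c"), ("*", "b", ("+", "a", "b"))))
print(postfix(example))   # abc*bab+**+
```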
To read a postfix expression, start from the left and move right. By the time we reach an operator, we take the correct number of operands we have most recently recognised, to get a new expression.
Above, b and c are the operands of the first *, a and b are the operands of the first +, etc.
Labelling the operators, we get:
a +1 (b *1 c) *2 (b *3 (a +2 b))
which translates to
a b c *1 b a b +2 *3 *2 +1
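Reading left to right can be done with a stack of the operands recognised so far. This is a minimal sketch, assuming single-character operands and binary operators only; read_postfix is a name made up for the example.

```python
def read_postfix(text, operators="+*"):
    """Rebuild a fully parenthesised infix string from single-character postfix."""
    stack = []
    for ch in text:
        if ch in operators:
            # a binary operator takes the two most recently recognised operands
            right = stack.pop()
            left = stack.pop()
            stack.append("(" + left + ch + right + ")")
        else:
            stack.append(ch)
    if len(stack) != 1:
        raise ValueError("not a well-formed postfix expression")
    return stack[0]

print(read_postfix("abc*bab+**+"))   # (a+((b*c)*(b*(a+b))))
```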
Translating while parsing

1) S -> S + T      print("+")
2) S -> T
3) T -> T * F      print("*")
4) T -> F
5) F -> ( S )
6) F -> a          print(a)
Parsing a+a*a+a produces the following sequence:
a+a*a+a <=6 F+a*a+a <= T+a*a+a <= S+a*a+a <=6
S+F*a+a <= S+T*a+a <=6 S+T*F+a <=3 S+T+a <=1
S+a <=6 S+F <= S+T <=1 S
Leaving out productions 2 and 4, which have no action, the order in which the productions were applied is 6, 6, 6, 3, 1, 6, 1, which causes the output of
aaa*+a+
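The same translation can be reproduced with a small hand-written shift-reduce loop. This is only a sketch: the reduce decisions are coded by hand for this one grammar rather than taken from a generated parse table, "#" is assumed as the end marker, and translate is a name invented here.

```python
def translate(text):
    """Shift-reduce parse of `text`, running the print actions on each reduction.

    Returns (output, rules): the translated string and the rule numbers used.
    """
    tokens = list(text) + ["#"]        # "#" marks the end of the input
    stack, output, rules = [], [], []
    pos = 0

    def reduce(rule, length, lhs, emit=None):
        if emit is not None:
            output.append(emit)        # the action attached to the rule
        rules.append(rule)
        del stack[len(stack) - length:]
        stack.append(lhs)

    while True:
        look = tokens[pos]
        top = stack[-1] if stack else None
        if top == "a":
            reduce(6, 1, "F", "a")
        elif top == ")" and stack[-3:] == ["(", "S", ")"]:
            reduce(5, 3, "F")
        elif top == "F":
            if stack[-3:] == ["T", "*", "F"]:
                reduce(3, 3, "T", "*")
            else:
                reduce(4, 1, "T")
        elif top == "T" and look != "*":
            if stack[-3:] == ["S", "+", "T"]:
                reduce(1, 3, "S", "+")
            else:
                reduce(2, 1, "S")
        elif stack == ["S"] and look == "#":
            return "".join(output), rules
        elif look == "#":
            raise SyntaxError("unexpected end of input")
        else:
            stack.append(look)         # shift
            pos += 1

output, rules = translate("a+a*a+a")
print(output, rules)
```

The output string is the postfix form of the input, and the rule list includes the action-free rules 2 and 4 as well as 6, 3 and 1.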
The Value Stack
All we are able to do using the previous method is execute an action whenever a rule is used. We can't store up actions for future use.
However, we can extend the idea, by associating values with each symbol on the stack. The actions we then carry out can use those values that have been stored.
Suppose that we are about to reduce by
A -> x1 x2 ... xn
This means that the furthest right symbols on the symbol stack are:
x1 x2 ... xn
Call the values associated with those symbols $1, $2, ..., $n
When we carry out the reduction, we remove those symbols, and replace by A.
Changing the Value Stack
Remove the top n values from the value stack, and replace them by a new value for A, which we will call $$. The value we want to store for A will depend on the values we stored for the xi.
That is, $$ = f($1, $2, ..., $n), for some function f.
The only other case we need to consider is when we place a terminal on the stack. Where do we get its value?
Generally, we expect the lexical analyser to find the value for us.
This means that in your Lex script, every time you recognise an integer or a real, you must translate it into a number of the appropriate form.
Computing the value of expressions

1) S -> S + T      $$ := $1 + $3
2) S -> T          $$ := $1
3) T -> T * F      $$ := $1 * $3
4) T -> F          $$ := $1
5) F -> ( S )      $$ := $2
6) F -> 1          $$ := 1
7) F -> 2          $$ := 2
8) F -> 3          $$ := 3

Note: this is a simplification
In practice, we would have 6) F -> a, and expect the lexical analyser to return the different integer values.
Parsing the expressions

Symbol    Values    Stack            Input     Action
                    0                1+2*3#    S5
1         1         0 5              +2*3#     R6
F         1         0 3              +2*3#     R4
T         1         0 2              +2*3#     R2
S         1         0 1              +2*3#     S6
S+        1•        0 1 6            2*3#      S5
S+2       1•2       0 1 6 5          *3#       R6
S+F       1•2       0 1 6 3          *3#       R4
S+T       1•2       0 1 6 9          *3#       S7
S+T*      1•2•      0 1 6 9 7        3#        S5
S+T*3     1•2•3     0 1 6 9 7 5      #         R6
S+T*F     1•2•3     0 1 6 9 7 10     #         R3
S+T       1•6       0 1 6 9          #         R1
S         7         0 1              #         A
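The Values column can be reproduced by replaying the Action column against a value stack. This is a sketch under assumptions: the parser's state stack is left out, None stands in for the placeholder value of an operator, rule 6 is taken to be the "F -> integer token" form, and RULES, replay and ACTIONS are names chosen for the example.

```python
# Rule number -> (number of right-hand-side symbols, function computing $$).
# Each function receives the list [$1, ..., $n].
RULES = {
    1: (3, lambda v: v[0] + v[2]),   # S -> S + T   $$ := $1 + $3
    2: (1, lambda v: v[0]),          # S -> T       $$ := $1
    3: (3, lambda v: v[0] * v[2]),   # T -> T * F   $$ := $1 * $3
    4: (1, lambda v: v[0]),          # T -> F       $$ := $1
    5: (3, lambda v: v[1]),          # F -> ( S )   $$ := $2
    6: (1, lambda v: v[0]),          # F -> integer token, value from the lexer
}

def replay(actions):
    """Apply shift/reduce actions to a value stack; return each snapshot."""
    values, snapshots = [], []
    for kind, arg in actions:
        if kind == "shift":
            values.append(arg)               # arg: token value (None for + and *)
        else:
            n, f = RULES[arg]                # arg: rule number
            args = values[len(values) - n:]  # $1 .. $n are the top n values
            del values[len(values) - n:]
            values.append(f(args))           # push $$
        snapshots.append(list(values))
    return snapshots

# The actions for 1+2*3 (without the final accept), with the lexer's values.
ACTIONS = [("shift", 1), ("reduce", 6), ("reduce", 4), ("reduce", 2),
           ("shift", None), ("shift", 2), ("reduce", 6), ("reduce", 4),
           ("shift", None), ("shift", 3), ("reduce", 6), ("reduce", 3),
           ("reduce", 1)]

for snapshot in replay(ACTIONS):
    print(snapshot)
```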
Value Stack in Lex
Lex must place the values in yylval
1. digit string - compute the value, place in yylval
2. char string - copy to a string array, place the index of its start point in yylval
3. real string - convert to a floating-point number, store it in an array of reals, place its index in yylval
4. identifier - store as for strings
Lex and yylval

%{
#include "y.tab.h"
#include <stdlib.h>
extern int yylval;
%}
%%
[0-9]+    { yylval = atoi(yytext);   /* convert string to integer */
            return INT_T; }
[ \t]     ;   /* ignore space */
: :
%%
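For comparison, the four cases for yylval can be sketched outside Lex. The code below is an illustration in Python, not a Lex script: INT_T is the token name used above, while REAL_T, STR_T, ID_T and the function scan are hypothetical, and strings are kept in a list (so the index is a list position, not a character offset).

```python
import re

# INT_T comes from the Lex script above; REAL_T, STR_T and ID_T are
# hypothetical token names chosen for this sketch.
TOKEN_RE = re.compile(r"""
    (?P<REAL_T>[0-9]+\.[0-9]+)
  | (?P<INT_T>[0-9]+)
  | (?P<STR_T>"[^"]*")
  | (?P<ID_T>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<SKIP>[ \t]+)
""", re.VERBOSE)

def scan(text):
    """Return (tokens, strings, reals); each token is a (name, yylval) pair."""
    tokens, strings, reals = [], [], []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character at position %d" % pos)
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind == "SKIP":
            continue                           # ignore spaces and tabs
        if kind == "INT_T":
            value = int(lexeme)                # compute the value directly
        elif kind == "REAL_T":
            reals.append(float(lexeme))        # store in the array of reals
            value = len(reals) - 1             # yylval is its index
        else:
            if kind == "STR_T":
                lexeme = lexeme[1:-1]          # drop the quotes
            strings.append(lexeme)             # store in the string table
            value = len(strings) - 1           # yylval is its index
        tokens.append((kind, value))
    return tokens, strings, reals

print(scan('x 12 3.5 "hi"'))
```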
Value Stack in Yacc
Yacc allows an action after each production. The action will be performed immediately before the reduction.
Values are represented using the $$ and $i notation.
When the statement is reached by Yacc, it will translate the different $i's into their appropriate types.
Using Yacc's Value Stack

%%
Finish : Expr                  { printf("%d", $1); }
       ;
Expr   : Expr PLUS_T Term      { $$ = $1 + $3; }
       | Term
       ;
Term   : Term MUL_T Factor     { $$ = $1 * $3; }
       | Factor
       ;
Factor : OB_T Expr CB_T        { $$ = $2; }
       | INT_T
       ;
%%
Syntax-Directed Translation
Yacc allows us to use the value stack.
However, this method only allows us to associate a single value with each symbol.
We may want to record more information:
data types
places in the symbol table
code fragments
We will extend the idea of the value stack by associating multiple values with symbols.
Attributes
With each symbol in the grammar, associate a set of attributes.
The attributes can be of any type, and represent any information we can express.
With each production in the grammar, associate a set of semantic rules, determining how the values of the attributes are to be computed.
The computation can modify the values of the attributes, or can have side-effects, modifying some external structure - e.g. the symbol table - or can output results to the screen or to a file.
Formal attribute definition
p) A -> α is a grammar rule.
p) has associated with it a set of semantic functions of the form
b := f(c1, c2, ..., cn)
where b, c1, c2, ..., cn are attributes of the symbols appearing in p).
If b is an attribute of A, then b is a synthesised attribute.
If b is an attribute of one of the symbols in α, then b is an inherited attribute.
Syntax-directed Definition: Example

1) S -> E            print(E.val)
2) E1 -> E2 + T      E1.val := E2.val + T.val
3) E -> T            E.val := T.val
4) T1 -> T2 * F      T1.val := T2.val * F.val
5) T -> F            T.val := F.val
6) F -> ( E )        F.val := E.val
7) F -> digit        F.val := digit.lexval
Synthesised Attributes
The value of a synthesised attribute either comes from the child nodes, or from the properties of the symbol itself.
As soon as a symbol is recognised in bottom-up parsing, the values of its synthesised attributes can be obtained.
Thus, if a derivation of a string uses only symbols with synthesised attributes, we can evaluate all the attributes as we carry out the parse.
A syntax-directed definition which uses only synthesised attributes is called an S-attributed definition.
Annotated Parse Tree for 6 + 2 * 3

S                       (prints 12)
  E                     val = 12
    E                   val = 6
      T                 val = 6
        F               val = 6
          6             lexval = 6
    +
    T                   val = 6
      T                 val = 2
        F               val = 2
          2             lexval = 2
      *
      F                 val = 3
        3               lexval = 3
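Because every attribute here is synthesised, one bottom-up pass over the parse tree is enough to annotate it. The sketch below assumes a hand-built tree; the Node class and the helpers evaluate and digit are invented for the example.

```python
class Node:
    def __init__(self, symbol, children=(), lexval=None):
        self.symbol = symbol          # grammar symbol, e.g. "E", "+", "digit"
        self.children = list(children)
        self.lexval = lexval          # set by the lexer for digit leaves
        self.val = None               # synthesised attribute

def evaluate(node):
    """Compute val bottom-up: children first, then the node itself."""
    for child in node.children:
        evaluate(child)
    c = node.children
    kids = [child.symbol for child in c]
    if node.symbol == "S":
        print(c[0].val)                           # print(E.val)
    elif kids == ["digit"]:
        node.val = c[0].lexval                    # F.val := digit.lexval
    elif kids == ["(", "E", ")"]:
        node.val = c[1].val                       # F.val := E.val
    elif kids == ["E", "+", "T"]:
        node.val = c[0].val + c[2].val            # E1.val := E2.val + T.val
    elif kids == ["T", "*", "F"]:
        node.val = c[0].val * c[2].val            # T1.val := T2.val * F.val
    elif len(c) == 1:
        node.val = c[0].val                       # E.val := T.val, T.val := F.val
    return node.val

def digit(n):
    return Node("F", [Node("digit", lexval=n)])

# Parse tree for 6 + 2 * 3
e = Node("E", [Node("E", [Node("T", [digit(6)])]),
               Node("+"),
               Node("T", [Node("T", [digit(2)]), Node("*"), digit(3)])])
evaluate(Node("S", [e]))    # prints 12
```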
Inherited Attributes
An inherited attribute has its value determined by the attribute values of its parent or siblings.
Inherited attributes are useful for describing the way in which the meaning of a symbol depends upon the context in which it appears.
For example, the meaning of the identifier "num" is different in the two cases below:
real num;
int num;
Thus a "type" attribute cannot be determined from the symbol alone, but must be derived from the attribute of parent or sibling symbols.
Inherited Attribute Example

1) D -> T L          L.t := T.t
2) T -> int          T.t := integer
3) T -> real         T.t := real
4) L1 -> L2, id      L2.t := L1.t, addtype(id.entry, L1.t)
5) L -> id           addtype(id.entry, L.t)
Augmented Parse Tree for real id1 , id2 , id3

D
  T                     t = real
    real
  L                     t = real
    L                   t = real
      L                 t = real
        id1             entry, addtype(...)
      ,
      id2               entry, addtype(...)
    ,
    id3                 entry, addtype(...)
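The flow of the inherited attribute t can be imitated by passing it down as a function argument. This is a sketch only: the symbol table is a plain dictionary, identifier names stand in for id.entry, and decl and ids are names made up for the example.

```python
symbol_table = {}

def addtype(entry, t):
    """Record type t for the identifier stored under `entry`."""
    symbol_table[entry] = t

def decl(type_name, id_list):
    """D -> T L: compute T.t, then hand it to L as an inherited attribute."""
    t = {"int": "integer", "real": "real"}[type_name]   # T.t
    ids(id_list, t)                                     # L.t := T.t

def ids(id_list, t):
    """L -> L , id | id, with t inherited from the parent."""
    if len(id_list) > 1:
        ids(id_list[:-1], t)         # L2.t := L1.t
    addtype(id_list[-1], t)          # addtype(id.entry, L.t)

decl("real", ["id1", "id2", "id3"])
print(symbol_table)
```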
Information Flow
(Figure: the augmented parse tree for real id1 , id2 , id3, with arrows showing how the attribute values flow between the nodes.)
Dependency Graph
The augmented parse tree on the previous slide is called a dependency graph.
We use dependency graphs to determine the order in which we must evaluate the attributes to get a completely evaluated parse tree.
A topological sort of the graph is an ordering of its attributes in which every attribute comes after all the attributes it depends on, so it is a valid order in which to evaluate them.
Topological Sort
(Figure: the augmented parse tree for real id1 , id2 , id3, with its ten attribute instances numbered 1 to 10 in a valid evaluation order.)
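A valid evaluation order can be computed mechanically from the dependencies. The sketch below uses Python's standard graphlib module; the node labels (L1 for the top L node down to L3 for the bottom one) are made up for the example, and the dependencies are read off the semantic rules for real id1 , id2 , id3.

```python
from graphlib import TopologicalSorter   # standard library, Python 3.9+

# Each attribute instance maps to the set of instances it depends on.
# L1 is the top L node, L3 the bottom one; the names are labels for this sketch.
depends_on = {
    "T.t": set(),
    "L1.t": {"T.t"},                          # L.t := T.t
    "L2.t": {"L1.t"},                         # L2.t := L1.t
    "L3.t": {"L2.t"},
    "id1.entry": set(),
    "id2.entry": set(),
    "id3.entry": set(),
    "addtype(id3)": {"id3.entry", "L1.t"},    # addtype(id.entry, L1.t)
    "addtype(id2)": {"id2.entry", "L2.t"},
    "addtype(id1)": {"id1.entry", "L3.t"},    # addtype(id.entry, L.t)
}

order = list(TopologicalSorter(depends_on).static_order())
for number, attribute in enumerate(order, start=1):
    print(number, attribute)
```

Several orders are valid, so the numbering printed here need not match the one on the slide.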
Evaluation methods
parse-tree based: At compile time, construct a parse tree, then a dependency graph, then a topological sort. Evaluate the attributes in that order.
rule based: When the compiler is constructed, analyse the rules for dependencies between attributes, and fix the order of evaluation before compilation begins.
oblivious: Use a fixed evaluation order without analysing the dependencies. This limits the class of grammars that can be implemented.
Syntax Trees
A syntax tree is a condensed parse tree, where the operators and keywords do not appear as leaves, but are attached to the interior nodes that would have been their parents in the parse tree.
Example:
S -> if B then S1 else S2
has the syntax tree:
if-then-else
  B
  S1
  S2

Parse tree for 6 + 2 * 3:
E
  E
    T
      F
        6
  +
  T
    T
      F
        2
    *
    F
      3

Syntax tree for 6 + 2 * 3:
+
  6
  *
    2
    3
Using Syntax Trees
A syntax tree allows the translation process to be separated from the parsing process.
A grammar that is best for parsing might not explicitly represent the hierarchical nature of the programs it describes.
The parsing method imposes an order in which the nodes are considered, which might not be the best order for translation.
Constructing Syntax Trees
We can use a syntax-directed definition to create syntax trees in a similar way to the way we created postfix expressions.
We will represent each node as a simple data structure.
Operator structures will have a name and a number of fields containing pointers to each operand.
Simple operand structures will have a type and a value.
E.g. 2+3 will be represented by:
+
  num 2
  num 3
Functions
We require the following three functions:
mknode(op,left,right): creates an internal node for the operator "op", with two fields for pointers to the left and right operands.
mkleaf_id(id,entry): creates a leaf node for the identifier "id", and a field for a pointer to the symbol table entry for "id".
mkleaf_num(num,val): creates a leaf node, labelled "num", with a field for the value of the number.
Each function returns a pointer to the node just created.
Example Definition

1) E1 -> E2 + T      E1.ptr := mknode("+", E2.ptr, T.ptr)
2) E -> T            E.ptr := T.ptr
3) T1 -> T2 * F      T1.ptr := mknode("*", T2.ptr, F.ptr)
4) T -> F            T.ptr := F.ptr
5) F -> ( E )        F.ptr := E.ptr
6) F -> id           F.ptr := mkleaf_id(id, id.entry)
7) F -> num          F.ptr := mkleaf_num(num, num.val)
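The three functions are small enough to sketch directly. In the version below a node is a Python dictionary and a "pointer" is an ordinary object reference; the field names, and the use of the string "x" as the symbol table entry, are assumptions made for the illustration.

```python
def mknode(op, left, right):
    """Internal node for operator `op` with pointers to both operands."""
    return {"op": op, "left": left, "right": right}

def mkleaf_id(label, entry):
    """Leaf for an identifier; `entry` refers to its symbol-table entry."""
    return {"op": label, "entry": entry}

def mkleaf_num(label, val):
    """Leaf labelled `label` holding the value of a number."""
    return {"op": label, "val": val}

# Build the tree for 6+2*x in the order a bottom-up parse would reduce.
six = mkleaf_num("num", 6)                     # F -> num
two = mkleaf_num("num", 2)                     # F -> num
x = mkleaf_id("id", "x")                       # F -> id ("x" stands for id.entry)
product = mknode("*", two, x)                  # T1 -> T2 * F
tree = mknode("+", six, product)               # E1 -> E2 + T
print(tree)
```

The copy rules (E.ptr := T.ptr and so on) need no code here: they pass the same reference upwards without creating a node.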
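Constructing 6+2*x
(Figure: the parse tree for 6+2*x, with the ptr attribute of each E, T and F node pointing to the corresponding node of the syntax tree below.)
+
  num 6
  *
    num 2
    id (symbol table entry for x)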
Compound Statements

CStat -> Stat ; CStat
CStat -> Stat
Stat -> s

Parse Tree for s ; s ; s ; s

CStat
  Stat
    s
  ;
  CStat
    Stat
      s
    ;
    CStat
      Stat
        s
      ;
      CStat
        Stat
          s
CStat1.ptr := mknode(";", Stat.ptr, CStat2.ptr)
CStat.ptr := Stat.ptr
Stat.ptr := mkleaf_id(id, s)

Syntax tree

;
  s
  ;
    s
    ;
      s
      s
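CStat1.ptr := CStat2.ptr; addChild(CStat1.ptr, Stat.ptr)
CStat.ptr := mkXnode(Stat.ptr)
Stat.ptr := mkleaf_id(id, s)
Seq -> CStat      Seq.ptr := CStat.ptr;

(Figure: a single seq node whose children are the four leaves s s s s; each leaf is a node of the form id ...(s)... and the root is a node of the form seq ...)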
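CStat1.ptr := Stat.ptr; addSib(Stat.ptr, CStat2.ptr)
CStat.ptr := Stat.ptr
Stat.ptr := mkleaf_id(id, s)
Seq -> CStat      Seq.ptr := CStat.ptr;

(Figure: a seq node above the four leaves s s s s, which are chained from left to right by sibling links; each leaf is a node of the form id ...(s)...)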
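The sibling-link version can be sketched as follows. The code assumes the statements arrive as a list of names, follows the right-recursive grammar with a recursive function, and wraps the result in a seq node; the Stat class, the dictionary used for the seq node and the helper statements are all inventions for this example.

```python
class Stat:
    def __init__(self, name):
        self.name = name
        self.sibling = None        # next statement in the sequence

def add_sib(stat, rest):
    """Link `rest` as the sibling of `stat` (addSib in the rules above)."""
    stat.sibling = rest

def cstat(names):
    """CStat -> Stat ; CStat | Stat, for a non-empty list of statement names."""
    stat = Stat(names[0])                  # Stat.ptr := mkleaf_id(id, s)
    if len(names) > 1:
        add_sib(stat, cstat(names[1:]))    # addSib(Stat.ptr, CStat2.ptr)
    return stat                            # CStat.ptr := Stat.ptr

def seq(names):
    """Seq -> CStat: the seq node (an assumption here) points at the first statement."""
    return {"op": "seq", "first": cstat(names)}

def statements(seq_node):
    """Follow the sibling links and list the statement names in order."""
    out, node = [], seq_node["first"]
    while node is not None:
        out.append(node.name)
        node = node.sibling
    return out

print(statements(seq(["s1", "s2", "s3", "s4"])))   # ['s1', 's2', 's3', 's4']
```

Compared with the nested ";" tree, the sibling chain keeps the statements at one level, so a later pass can walk them with a simple loop.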
Sample Program

int a b c;
int g[5];

int testFunc(int x) {
  real y;
  y := (x+a)/2;
  print(y);
  return a;
}

main() {
  a := 1;
  while (a < 3) do {
    testFunc(a);
    a := a + 1;
  }
}