dhanalaksmi college of engineering, chennai … · dhanalaksmi college of engineering, chennai...


Upload: lambao

Post on 21-Apr-2018


DHANALAKSMI COLLEGE OF ENGINEERING, CHENNAI

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

CS6660- COMPILER DESIGN

UNIT – I: Introduction to Compilers

1. What is a compiler? (M – 10)

A compiler is a program that reads a program written in one language (source language) and translates it

into an equivalent program in another language (target language). The compiler reports to its user the presence of

errors in the source program.

2. What are the two parts of a compilation? Explain briefly.

Analysis and Synthesis are the two parts of compilation. The analysis part breaks up the source program into

constituent pieces and creates an intermediate representation of the source program. The synthesis part constructs

the desired target program from the intermediate representation.

3. List the subparts or phases of analysis part.

Analysis consists of three phases:

Linear Analysis.

Hierarchical Analysis.

Semantic Analysis.

4. Depict diagrammatically how a language is processed.

Skeletal source program
  ↓ Preprocessor
Source program
  ↓ Compiler
Target assembly program
  ↓ Assembler
Relocatable machine code
  ↓ Loader/link editor (← library, relocatable object files)
Absolute machine code

5. What is linear analysis?

Linear analysis is one in which the stream of characters making up the source program is read from left to

right and grouped into tokens that are sequences of characters having a collective meaning. Also called lexical

analysis or scanning.

6. List the various phases of a compiler.

The following are the various phases of a compiler:

Lexical Analyzer

Syntax Analyzer

Semantic Analyzer

Intermediate code generator

Code optimizer

Code generator

7. What are the classifications of a compiler?

Compilers are classified as:

Single- pass

Multi-pass

Load-and-go

Debugging or optimizing

8. What is a symbol table? (M – 14)

A symbol table is a data structure containing a record for each identifier, with fields for the attributes of the

identifier. The data structure allows us to find the record for each identifier quickly and to store or retrieve data from

that record quickly. Whenever an identifier is detected by a lexical analyzer, it is entered into the symbol table. The

attributes of an identifier cannot be determined by the lexical analyzer.
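The record-per-identifier idea can be sketched as follows. This is a minimal illustration that assumes a hash table (a Python dict) as the underlying structure; the class and method names are hypothetical and not taken from any particular compiler.

```python
class SymbolTable:
    """One record per identifier, stored in a dict for fast lookup."""

    def __init__(self):
        self._records = {}

    def insert(self, name):
        # Return the existing record, or create an empty one the first time
        # the lexical analyzer reports this identifier.
        return self._records.setdefault(name, {"name": name})

    def lookup(self, name):
        # None signals that the identifier has not been entered yet.
        return self._records.get(name)

    def set_attribute(self, name, key, value):
        # Later phases fill in attributes (type, offset, ...) that the
        # lexical analyzer cannot determine.
        self.insert(name)[key] = value
```

The lexical analyzer would call insert when it detects an identifier, while later phases would call set_attribute once the type or storage offset is known.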


9. Mention some of the cousins of a compiler. (M – 10)

Cousins of the compiler are:

Preprocessors

Assemblers

Loaders and Link-Editors

10. List the phases that constitute the front end of a compiler.

The front end consists of those phases or parts of phases that depend primarily on the source language and are largely independent of the target machine. These include:

Lexical and syntactic analysis

The creation of the symbol table

Semantic analysis

Generation of intermediate code

A certain amount of code optimization can be done by the front end as well. The front end also includes the error handling that goes along with each of these phases.

11. Mention the back-end phases of a compiler.

The back end of compiler includes those portions that depend on the target machine and generally

those portions do not depend on the source language, just the intermediate language. These include

Code optimization

Code generation, along with error handling and symbol- table operations.

12. Define compiler-compiler. (N – 11)

Systems that help with the compiler-writing process are often referred to as compiler-compilers, compiler-generators or translator-writing systems. Largely they are oriented around a particular model of languages, and they are suitable for generating compilers of languages of a similar model.

13. List the various compiler construction tools. (M - 12)

The following is a list of some compiler construction tools:

Parser generators

Scanner generators

Syntax-directed translation engines

Automatic code generators

Data-flow engines


14. Differentiate tokens, patterns and lexeme. (M – 14)

Tokens- Sequence of characters that have a collective meaning.

Patterns- There is a set of strings in the input for which the same token is produced as output. This

set of strings is described by a rule called a pattern associated with the token

Lexeme- A sequence of characters in the source program that is matched by the pattern for a token.

15. List the operations on languages.

Union – L ∪ M = {s | s is in L or s is in M}

Concatenation – LM ={st | s is in L and t is in M}

Kleene Closure – L* (zero or more concatenations of L)

Positive Closure – L+ ( one or more concatenations of L)
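The four operations can be tried out on small finite languages. The sketch below represents a language as a Python set of strings; because L* and L+ are infinite in general, the closures are cut off at a caller-supplied max_power, which is an assumption made only so the result stays finite.

```python
def union(L, M):
    return set(L) | set(M)

def concat(L, M):
    # LM = {st | s is in L and t is in M}
    return {s + t for s in L for t in M}

def power(L, n):
    # L^0 = {""} (the empty string), L^n = L^(n-1) L
    result = {""}
    for _ in range(n):
        result = concat(result, L)
    return result

def kleene_closure(L, max_power):
    # Approximation of L*: union of L^0 .. L^max_power
    result = set()
    for i in range(max_power + 1):
        result |= power(L, i)
    return result

def positive_closure(L, max_power):
    # Approximation of L+: union of L^1 .. L^max_power
    result = set()
    for i in range(1, max_power + 1):
        result |= power(L, i)
    return result
```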

16. Write a regular expression for an identifier. (N - 13)

An identifier is defined as a letter followed by zero or more letters or digits. The regular expression for an

identifier is given as

letter (letter | digit)*
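As a quick check, the same pattern can be written with Python's re module. The sketch assumes that "letter" means an ASCII letter and "digit" means 0-9; the function name is illustrative.

```python
import re

# letter (letter | digit)* in Python's regular expression syntax.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

def is_identifier(text: str) -> bool:
    """Return True when the whole string matches letter (letter | digit)*."""
    return IDENTIFIER.fullmatch(text) is not None
```

Under these assumptions is_identifier("rate") and is_identifier("x1") hold, while "1x" is rejected because it starts with a digit.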

17. Mention the various notational shorthands for representing regular expressions.

One or more instances (+)

Zero or one instance (?)

Character classes ([abc] where a,b,c are alphabet symbols denotes the regular expressions

a | b | c.)

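Non-regular sets are not a shorthand: they are languages that cannot be described by any regular expression (for example, the set of strings of balanced parentheses).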

18. What is the function of a hierarchical analysis?

Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with

collective meaning. Also termed as Parsing.

19. What does a semantic analysis do?

Semantic analysis is one in which certain checks are performed to ensure that components of a program fit

together meaningfully. Mainly performs type checking.


20. List the various error recovery strategies for a lexical analysis. (M – 2015)

Possible error recovery actions are:

Panic mode recovery

Deleting an extraneous character

Inserting a missing character

Replacing an incorrect character by a correct character

Transposing two adjacent characters


PART - B (16 Marks)

1. Explain the phases of compiler and how the following statement will be translated in every

phase. (N – 13)

a. Position: = initial + rate * 60.

b. 4 : * + = c b a

2. (i).Write in detail about the cousins of the compiler.

(ii).Explain in detail about the role of Lexical analyzer with the possible error recovery

actions. (M – 13)

3. Compare NFA and DFA.Construct a DFA directly from an augmented regular expression

(a|b)* abb. (M – 15)

4. (i).Explain compiler construction tools (M–13) (N – 14)

(ii).Discuss the Input buffering techniques in detail.

5. (i).What are the issues in lexical analysis? (ii).Elaborate specification of tokens.

6. Convert the following regular expression into minimized DFA

(i).(a/b)*baa

(ii).(0+1)*(0+1) 10.

7. Explain in detail about lexical analyzer generator.

8. Explain the phases of compiler and how the following statement will be translated in every

phase

a) a:=b+c*50.

b) a=b*c-d

9. Draw the DFA for the augmented regular expression (a|b)*# directly using syntax tree.

10. (i).Elaborate Recognition of tokens.

(ii).Explain in detail about the language for specifying lexical analyzer.


UNIT – II: Lexical Analysis

1. Define parser. (M – 15)

A parser is the phase that performs hierarchical analysis, in which the tokens are grouped hierarchically into nested collections with collective meaning. Hierarchical analysis is also termed parsing.

2. Mention the basic issues in parsing.

There are two important issues in parsing.

Specification of syntax

Representation of input after parsing.

3. Why lexical and syntax analyzers are separated out?

Reasons for separating the analysis phase into lexical and syntax analyzers:

Simpler design.

Compiler efficiency is improved.

Compiler portability is enhanced.

4. Define a context free grammar.

A context free grammar G is a collection of the following

V is a set of non-terminals

T is a set of terminals

S is a start symbol

P is a set of production rules

G can be represented as G = (V,T,S,P)

Production rules are given in the following form

Non terminal → (V ∪ T)*

5. Briefly explain the concept of derivation.

Derivation from S means generation of string w from S. For constructing derivation two things are

important.

a) Choice of non-terminal from several others.

b) Choice of rule from production rules for corresponding non terminal.

Instead of choosing an arbitrary non-terminal one can choose

i) Either leftmost derivation – the leftmost non-terminal in a sentential form

Or

ii) Rightmost derivation – the rightmost non-terminal in a sentential form


6. Define ambiguous grammar. (M – 14)

A grammar G is said to be ambiguous if it generates more than one parse tree for some sentence of the language L(G), i.e. the sentence has more than one leftmost derivation or more than one rightmost derivation.

7. What is an operator precedence parser?

An operator precedence parser is a shift-reduce parser built for an operator grammar. A grammar is said to be an operator grammar if it possesses the following properties:

1. No production has ε on its right side.

2. No production has two adjacent non-terminals on its right side.

8. List the properties of LR parser.

1. LR parsers can be constructed to recognize most of the programming languages for

which the context free grammar can be written.

2. The class of grammar that can be parsed by LR parser is a superset of class of grammars

that can be parsed using predictive parsers.

3. LR parsers use a non-backtracking shift-reduce technique, yet they are efficient.

9. Mention the types of LR parser.

SLR parser- simple LR parser

LALR parser- lookahead LR parser

Canonical LR parser

10. What are the problems with top down parsing?

The following are the problems associated with top down parsing:

Backtracking

Left recursion

Common prefixes (which require left factoring)

Ambiguity

11. Write the algorithm for FIRST and FOLLOW. (M – 10)

FIRST ( ):

1. If X is a terminal, then FIRST(X) is {X}.

2. If X → ε is a production, then add ε to FIRST(X).

3. If X is a non-terminal and X → Y1 Y2 … Yk is a production, then place a in FIRST(X) if for some i, a is in FIRST(Yi) and ε is in all of FIRST(Y1), …, FIRST(Yi-1). If ε is in FIRST(Yj) for all j = 1, …, k, then add ε to FIRST(X).


FOLLOW ( ):

1. Place $ in FOLLOW(S), where S is the start symbol and $ is the input right end marker.

2. If there is a production A → αBβ, then everything in FIRST (β) except for ε is placed in FOLLOW (B).

3. If there is a production A → αB, or a production A→ αBβ where FIRST (β) contains ε , then everything in

FOLLOW(A) is in FOLLOW(B).
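The FIRST and FOLLOW rules can be applied repeatedly until no set changes. The sketch below is one way to do that; the grammar encoding (a dict from non-terminal to a list of right sides, with an empty list standing for ε) and all function names are assumptions made for illustration. The example grammar is the usual non-left-recursive expression grammar.

```python
EPS = "ε"
END = "$"

def first_of_string(symbols, first):
    """FIRST of a string of grammar symbols, using the sets found so far."""
    result = set()
    for sym in symbols:
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)  # every symbol can derive ε (or the string is empty)
    return result

def compute_first(grammar, terminals):
    first = {t: {t} for t in terminals}            # rule 1
    first.update({nt: set() for nt in grammar})
    changed = True
    while changed:                                  # rules 2 and 3 to a fixed point
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                          # rule 1
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:          # only non-terminals have FOLLOW
                        continue
                    tail = first_of_string(body[i + 1:], first)
                    new = tail - {EPS}              # rule 2
                    if EPS in tail:                 # rule 3
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

# Example grammar: E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | id
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
TERMINALS = {"+", "*", "(", ")", "id"}
```

Calling compute_first(GRAMMAR, TERMINALS) and then compute_follow(GRAMMAR, first, "E") gives the sets needed to fill a predictive parsing table.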

12. List the advantages and disadvantages of operator precedence parsing.

Advantages

This type of parsing is simple to implement.

Disadvantages

i. An operator like minus has two different precedences (unary and binary). Hence it is hard to handle tokens like the minus sign.

ii. This kind of parsing is applicable to only small class of grammars.

13. What is dangling else problem?

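The dangling else problem is the ambiguity in the grammar shown below: a nested statement such as if E1 then if E2 then S1 else S2 has two parse trees, because the else can be matched with either then. The ambiguity is usually resolved by matching each else with the closest unmatched then.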

stmt → if expr then stmt

| if expr then stmt else stmt

| Other

14. Write short notes on YACC.

YACC is an automatic tool for generating the parser program.

YACC stands for Yet Another Compiler Compiler, a utility available on UNIX.

Basically YACC is LALR parser generator.

It can report conflict or ambiguities in the form of error messages.

15. What is meant by handle pruning?

A rightmost derivation in reverse can be obtained by handle pruning.

If w is a sentence of the grammar at hand, then w = γn, where γn is the nth right-sentential form of some as

yet unknown rightmost derivation

S = γ0 => γ1…=> γn-1 => γn = w


16. Define LR(0) items.

An LR(0) item of a grammar G is a production of G with a dot at some position of the right side.

Thus, production A → XYZ yields the four items

A→.XYZ

A→X.YZ

A→XY.Z

A→XYZ.
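The items of a production can be listed mechanically by moving the dot across the right side. A minimal sketch follows; the function name and the string rendering of an item are illustrative choices.

```python
def lr0_items(head, body):
    """Return every LR(0) item of the production head → body as a string."""
    items = []
    for dot in range(len(body) + 1):
        before = "".join(body[:dot])   # symbols already seen
        after = "".join(body[dot:])    # symbols still expected
        items.append(f"{head} → {before}.{after}")
    return items
```

A production with k symbols on its right side yields k + 1 items, so an ε-production A → ε yields the single item A → .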

17. What is meant by viable prefixes?

The set of prefixes of right sentential forms that can appear on the stack of a shift-reduce parser are

called viable prefixes. An equivalent definition of a viable prefix is that it is a prefix of a right sentential form that

does not continue past the right end of the rightmost handle of that sentential form.

18. Define − Handle

A handle of a string is a substring that matches the right side of a production, and whose reduction to

the nonterminal on the left side of the production represents one step along the reverse of a rightmost derivation.

A handle of a right – sentential form γ is a production A→β and a position of γ where the string β

may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ. That

is, if S =>* αAw => αβw, then A→β in the position following α is a handle of αβw.

19. What are kernel and non-kernel items?

Kernel items, which include the initial item, S'→ .S, and all items whose dots are not at the left end.

Non-kernel items, which have their dots at the left end.

20. What is phrase level error recovery?

Phrase level error recovery is implemented by filling in the blank entries in the predictive parsing

table with pointers to error routines. These routines may change, insert, or delete symbols on the input and

issue appropriate error messages. They may also pop from the stack.

21. What are the components of LR parser?

An input.

An output.

A stack.

A driver program.

A parsing table.


22. List the different techniques to construct an LR parsing table?

Simple LR (SLR).

Canonical LR.

Look ahead LR (LALR).

23. Define LR (0) item.

An LR(0) item of a grammar G is a production of G with a dot at some position of the right side.

Eg: A → .XYZ, A → X.YZ, A → XY.Z, A → XYZ.

24. Do left factoring in the following grammar.

A-->aBcC | aBb | aB | a

B-->ε

C-->ε

After applying left factoring,

A-->aA′

A′-->BA″ | ε

A″-->cC | b | ε

B-->ε

C-->ε


PART – B (16 Marks)

1. Consider the following grammar

S → AS|b

A→SA|a.

Construct the SLR parse table for the grammar. Show the actions of the parser for the input

String “abab”

2. (i). What is an ambiguous grammar? Is the following grammar ambiguous?

Prove E→E+E | E*E | (E)|id.

(ii). Draw NFA for the regular expression ab*/ab.

3. (i).Construct the predictive parser for the following grammar. (D – 13, D – 14)

S→ (L) | a

L→ L, S | S

(ii). Write the short notes on role of the parser.

4. Construct predictive parsing table and parse the string id+id*id. (D – 14)

E→E+T | T

T→T*F | F

F→ (E) | id

5. Construct SLR parsing table for the grammar. (M – 15)

E→E+T|T

T→TF|F F→F*|a|b

6. Explain in detail about the different storage allocation strategies.

7. (i).Describe the conflicts that may occur during shift reduce parsing. (ii).Explain the detail about the specification of

a simple type checker.

8. Explain in detail about run time storage management.

9. Explain the importance of type checking.

10. Construct a canonical parsing table for the grammar given below.

E→E+T |T

T→T*F | F

F→ (E) | id


UNIT – III: Syntax Analysis

1. What are the benefits of intermediate code generation? (M – 14)

A compiler for a different machine can be created by attaching a different back end to the existing front end.

A compiler for a different source language can be created by providing a different front end for that source language to the existing back end.

A machine independent code optimizer can be applied to intermediate code in order to optimize the

code generation.

2. What are the various types of intermediate code representation? (M – 14)

There are mainly three types of intermediate code representations.

1. Syntax tree

2. Post fix

3. Three address code

3. Define − Backpatching

Backpatching is the activity of filling in unspecified label information using appropriate semantic actions during the code generation process. The functions used in the semantic actions are mklist(i), merge_list(p1,p2) and backpatch(p,i).

4. Mention the functions that are used in backpatching. (N – 13)

mklist(i) creates a new list. The index i is passed as an argument to this function, where i is an index into the array of quadruples.

merge_list(p1,p2) this function concatenates two lists pointed by p1 and p2. It returns the pointer to

the concatenated list.

backpatch(p,i) inserts i as target label for the statement pointed by pointer p.
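The three functions can be sketched over a simple array of quadruples. The representation below (a module-level list named quads whose fourth field holds the jump target) and the helper emit are assumptions made for illustration; only the three function names come from the answer above.

```python
quads = []  # each entry is [op, arg1, arg2, target]; target is None until patched

def emit(op, arg1=None, arg2=None, target=None):
    """Append a quadruple and return its index (hypothetical helper)."""
    quads.append([op, arg1, arg2, target])
    return len(quads) - 1

def mklist(i):
    # New list containing only i, an index into the array of quadruples.
    return [i]

def merge_list(p1, p2):
    # Concatenation of the two lists.
    return p1 + p2

def backpatch(p, i):
    # Insert i as the target label of every quadruple on list p.
    for index in p:
        quads[index][3] = i
```

A jump whose destination is not yet known is emitted with an empty target and its index is kept on a list; once the destination is known, one backpatch call fills in every jump on that list.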

5. What is the intermediate code representation for the expression a or b and not c?

(M – 15)

The intermediate code representation for the expression a or b and not c is the three address

sequence

t1 := not c

t2 := b and t1

t3 := a or t2
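The sequence above can be reproduced by a bottom-up walk over the expression tree, creating one temporary per operator. In this sketch the tree is written as nested tuples (operator first), which is an assumed encoding; the precedence of not over and over or is already reflected in the shape of the tree.

```python
from itertools import count

def three_address(expr):
    """Return the three address statements for a nested-tuple expression."""
    code = []
    temps = count(1)

    def walk(node):
        if isinstance(node, str):          # a plain name needs no code
            return node
        op, *operands = node
        places = [walk(x) for x in operands]
        temp = f"t{next(temps)}"
        if len(places) == 1:               # unary operator
            code.append(f"{temp} := {op} {places[0]}")
        else:                              # binary operator
            code.append(f"{temp} := {places[0]} {op} {places[1]}")
        return temp

    walk(expr)
    return code

# a or b and not c, with the usual precedence made explicit
EXPR = ("or", "a", ("and", "b", ("not", "c")))
```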


6. What are the various methods of implementing three address statements? (M– 14, M – 15)

The three address statements can be implemented using the following methods.

1. Quadruples: a structure with at most four fields: operator (OP), arg1, arg2 and result.

2. Triples: the use of temporary variables is avoided by referring to the position of the statement that computes a value, or to a symbol table pointer for a name.

3. Indirect triples: a separate list of pointers to triples is kept, and statements are listed through these pointers instead of the triples themselves.

7. Give the syntax-directed definition for if-else statement.

1. S → if E then S1

E.true := new_label()
E.false := S.next
S1.next := S.next
S.code := E.code || gen_code(E.true ':') || S1.code

2. S → if E then S1 else S2

E.true := new_label()
E.false := new_label()
S1.next := S.next
S2.next := S.next
S.code := E.code || gen_code(E.true ':') || S1.code || gen_code('goto', S.next) || gen_code(E.false ':') || S2.code

8. Draw syntax tree for the expression a=b*-c+b*-c (N – 13, M – 14)

=
├── a
└── +
    ├── *
    │   ├── b
    │   └── uminus
    │       └── c
    └── *
        ├── b
        └── uminus
            └── c


10. What is postfix notation? (N – 13)

It is the linearized representation of syntax tree .It is a list of nodes of the syntax tree in which a node

appears immediately after its children.

Eg, for the above syntax tree, a b c uminus * b c uminus * + =
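Since a node appears immediately after its children, postfix notation is a postorder traversal of the syntax tree. The sketch below encodes the tree for a=b*-c+b*-c as nested tuples (label first, then children), which is an assumed representation.

```python
def postfix(node):
    """List the nodes of a syntax tree so that each node follows its children."""
    if isinstance(node, str):      # leaf
        return [node]
    label, *children = node
    out = []
    for child in children:
        out.extend(postfix(child))
    out.append(label)              # the node itself comes last
    return out

# Syntax tree for a = b * -c + b * -c
TREE = ("=", "a", ("+", ("*", "b", ("uminus", "c")),
                        ("*", "b", ("uminus", "c"))))
```

Joining postfix(TREE) with spaces gives a b c uminus * b c uminus * + =.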

11. What are the different types of three address statements?

1. Assignment statements of the form x=y op z.

2. Assignment statements of the form x= op y.

3. Copy statements of the form x=y.

4. An unconditional jump, go to L.

5. Conditional jumps, if x relop y goto L.

6. Procedure call statements, param x and call p , n.

7. Indexed assignments of the form, x=y[i] , x[i]=y.

8. Address and pointer assignments.

12. In compiler, how three address statements can be represented as records with fields for the operator and

operands? (M – 14)

Quadruples.

Triples.

Indirect triples.

13. Define quadruple and give one example.

A quadruple is a data structure with 4 fields like operator, argument-1, argument-2 and result.

Example: a=b*-c

      operator   Argument-1   Argument-2   Result
(0)   uminus     c                         T1
(1)   *          b            T1           T2
(2)   =          T2                        a


14. Define triple and give one example.

A triple is a data structure with 3 fields: operator, argument-1 and argument-2.

Example: a=b*-c

      operator   Argument-1   Argument-2
(0)   uminus     c
(1)   *          b            (0)
(2)   =          a            (1)
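The relation between the two forms can be seen by converting quadruples into triples: each temporary is replaced by the position of the triple that computes it. The tuple layout and the function name below are assumptions made for this sketch.

```python
def quads_to_triples(quads):
    """Convert (op, arg1, arg2, result) quadruples into (op, arg1, arg2) triples."""
    position = {}   # temporary name -> index of the triple that computes it
    triples = []

    def ref(arg):
        # A temporary becomes a reference such as "(0)"; other operands are kept.
        return f"({position[arg]})" if arg in position else arg

    for op, arg1, arg2, result in quads:
        if op == "=":
            # Assignment: the target name is the first argument of the triple.
            triples.append((op, result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            position[result] = len(triples) - 1
    return triples

# Quadruples for a = b * -c
QUADS = [
    ("uminus", "c", None, "T1"),
    ("*", "b", "T1", "T2"),
    ("=", "T2", None, "a"),
]
```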

15. What is the merit of quadruples?

If we move a statement computing a variable 'A', then the statements using 'A' require no change. That is, 'A' can be computed anywhere before the utilization of A.

16. What are the two methods of translation to find three-address statements in Boolean expression?

• Numerical method

• Control flow method

17. Generate three address code for “if A<B then 1 else 0”, using numerical method.

1. if A<B go to (4)

2. T1=0

3. go to (5)

4. T1=1

5. …

18. What are the different storage allocation strategies?

Static allocation. It lays out storage for all data objects at compile time.

Stack allocation: It manages the runtime storage as a stack.

Heap allocation: It allocates and deallocates storage as needed at run time from a data area known as the heap.

19. What is cross compiler? (M – 14)

A cross compiler is a compiler capable of creating executable code for a platform other than the one

on which the compiler is running. For example, a compiler that runs on a Windows 7 PC but generates code

that runs on Android smartphone is a cross compiler.


PART-B (16 Marks)

1. Define three address code. Describe the various methods of implementing three address statements with an

example. (M – 14, M – 15, N – 13)

2. Discuss the various methods for translating Boolean expression. (M – 12)

3. Explain about back patching with an example. (M – 14)

4. Explain Procedure calls with a neat example. (M – 12)

5. Give the translation scheme for converting the assignments into three address code.

6. How would you generate intermediate code for the flow of control statements?

7. Explain the sequence of stack allocation processes for a function call.

8. How would you convert the following into intermediate code? Give suitable example. (M– 14)

(i). Assignment statements

(ii). Case statements. (N– 11)

9. (i).How can Back patching be used to generate code for Boolean expression and flow control

statements?

(ii). Explain the need for annotated parse tree.

10. Describe the method of generating syntax-directed definition for control statements.


UNIT – IV: Syntax Directed Translation & Run Time Environment

1. Mention the properties that a code generator should possess.

The code generator should produce the correct and high quality code. In other words, the code

generated should be such that it should make effective use of the resources of the target machine.

Code generator should run efficiently.

2. List the terminologies used in basic blocks.

Define and use – the three address statement a := b+c is said to define a and to use b and c.

Live and dead – the name in the basic block is said to be live at a given point if its value is used after

that point in the program. And the name in the basic block is said to be dead at a given point if its

value is never used after that point in the program.

3. What is a flow graph?

A flow graph is a directed graph in which the flow control information is added to the basic blocks.

The nodes to the flow graph are represented by basic blocks

The block whose leader is the first statement is called initial block.

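There is a directed edge from block B1 to block B2 if B2 immediately follows B1 in the given sequence and B1 does not end in an unconditional jump, or if there is a jump from the last statement of B1 to the first statement of B2. We can say that B1 is a predecessor of B2.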

4. What is a DAG? Mention its applications. (M – 14)

Directed acyclic graph (DAG) is a useful data structure for implementing transformations on basic blocks.

DAG is used in

Determining the common sub-expressions.

Determining which names are used inside the block but computed outside the block.

Determining which statements of the block could have their computed value used outside the block.

Simplifying the list of quadruples by eliminating the common sub-expressions and not performing assignments of the form x := y unless they are necessary.

5. Define − Peephole Optimization

Peephole optimization is a simple and effective technique for locally improving target code. This technique is

applied to improve the performance of the target program by examining the short sequence of target instructions and

replacing these instructions by shorter or faster sequence.
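One classic peephole rule is redundant load/store elimination: if MOV R0,a is immediately followed by MOV a,R0, the second instruction can be deleted. The sketch below assumes two-operand instructions written as "OP source,destination" and assumes the second instruction carries no label; both function names are hypothetical.

```python
def parse(instruction):
    """Split 'OP src,dst' into (op, src, dst)."""
    op, operands = instruction.split(None, 1)
    src, dst = [x.strip() for x in operands.split(",")]
    return op, src, dst

def peephole(instructions):
    """Drop a MOV that only copies back the value moved by the previous MOV."""
    out = []
    for ins in instructions:
        if out:
            p_op, p_src, p_dst = parse(out[-1])
            op, src, dst = parse(ins)
            if p_op == op == "MOV" and src == p_dst and dst == p_src:
                continue   # the value is already in place, so skip this MOV
        out.append(ins)
    return out
```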


6. List the characteristics of peephole optimization.

Redundant instruction elimination

Flow of control optimization

Algebraic simplification

Use of machine idioms

7. How do you calculate the cost of an instruction? (M– 14)

The cost of an instruction can be computed as one plus cost associated with the source and destination

addressing modes given by added cost.

MOV R0,R1 = 1

MOV R1,M = 2

SUB 5(R0),*10(R1) = 3
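The three costs above can be recomputed with a small function. The added costs used here are an assumption following the usual textbook convention: register and indirect register modes add 0, while absolute, indexed, indirect indexed and literal modes add 1 because they need an extra word for the address or constant.

```python
import re

def addressing_cost(operand):
    """Added cost of one operand under the assumed addressing-mode table."""
    operand = operand.strip()
    if re.fullmatch(r"\*?R\d+", operand):
        return 0   # register (R0) or indirect register (*R0)
    return 1       # absolute (M), indexed (5(R0)), indirect indexed (*10(R1)), literal (#1)

def instruction_cost(instruction):
    """One for the instruction itself plus the added cost of each operand."""
    _, operands = instruction.split(None, 1)
    return 1 + sum(addressing_cost(o) for o in operands.split(","))
```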

8. What is a basic block?

A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halt or possibility of branching except at the end.

Eg. t1:=a*5

t2:=t1+7

t3:=t2-5

t4:=t1+t3

t5:=t2+b
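A sequence of three address statements can be split into basic blocks by first finding the leaders: the first statement, every target of a jump, and every statement that follows a jump. The sketch below assumes jumps are written as "goto (n)" with 1-based statement numbers, a notation chosen only for this example.

```python
import re

def basic_blocks(statements):
    """Partition a list of statement strings into basic blocks."""
    leaders = {0} if statements else set()
    for i, stmt in enumerate(statements):
        match = re.search(r"goto \((\d+)\)", stmt)
        if match:
            target = int(match.group(1)) - 1
            if 0 <= target < len(statements):
                leaders.add(target)          # the target of a jump is a leader
            if i + 1 < len(statements):
                leaders.add(i + 1)           # so is the statement after a jump
    starts = sorted(leaders)
    ends = starts[1:] + [len(statements)]
    # Each block runs from its leader up to, but not including, the next leader.
    return [statements[s:e] for s, e in zip(starts, ends)]
```

The five statements of the example above contain no jump, so they form a single basic block.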

9. What is meant by syntax directed definition?

It is a generalization of a CFG in which each grammar symbol has an associated set of attributes like,

synthesized attribute and inherited attribute.

10. How is the value of a synthesized attribute computed?

It is computed from the values of the attributes at the children of that node in the parse tree.


11. Define annotated parse tree and give one example.

A parse tree showing the values of the attributes at each node is called an annotated parse tree.

Example: 3*5+4

E.val=19
├── E.val=15
│   └── T.val=15
│       ├── T.val=3
│       │   └── F.val=3
│       │       └── digit.lexval=3
│       ├── *
│       └── F.val=5
│           └── digit.lexval=5
├── +
└── T.val=4
    └── F.val=4
        └── digit.lexval=4

12. Define − Backpatching

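Generating three address code for jumps in a straightforward way needs two passes, because the target labels of forward jumps are not known when the jumps are generated. In the first pass the jumps are generated with their labels left unspecified and are placed in a list. Filling in these labels once the targets are known is called backpatching.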

13. What are the three functions used for back patching?

1. Makelist (i)

2. Merge (p1, p2)

3. Backpatch (p, i)

14. When a procedure call occurs, what are the steps that take place?

1. The state of the calling procedure must be saved, so that it can resume after the called procedure completes.

2. The return address is saved; the called routine must transfer control to this location after it completes.


15. What are the demerits in generating 3-address code for procedure calls?

Consider, b=abc (I, J) .It is very difficult to identify whether it is array reference or it is a call to

the Procedure abc.

16. What information is entered into the symbol table? (M – 14)

The string of characters denoting the name. Attributes of the name.

Parameters.

An offset describing the position in storage to be allocated for the

name

17. What is the demerit in structure of symbol table?

The length of a name must not exceed the upper bound of the name field.

If the name is short, the remaining space in the field is wasted.

18. What are the different data structures used for symbol table? (M – 14)

Lists.

Self-organizing lists.

Search tree.

Hash table

19. What is meant by scope of declaration?

The portion of the program to which a declaration applies is called the scope of that declaration. An

occurrence of a name in a procedure is said to be local to the procedure if it is in the scope of declaration within the

procedure; otherwise the occurrence is said to be nonlocal.


PART – B (16 Marks)

1. Explain in detail about the various issues in design of code generator. (N– 14 , M– 14)

2. Write an algorithm to partition a sequence of three address statements into basic blocks.

3. Explain code generation algorithm and various issues in code generation algorithm in detail.

4. Construct the DAG for the following basic block (M – 14)

d:= b*c

e:= a+b

b: = b*c

a:= e-d

5. Explain the concept of register allocation and assignment. . (M – 12)

6. Explain labeling algorithm with an example.

7. Generate code for the following assignment using code generator algorithms

t:=(a-b) + (a-c) + (a-c)

8. How to generate a code for a basic block from its dag representation? Explain.

9. Define a Directed Acyclic Graph. Construct a DAG and write the sequences of instructions for

the expression a+ a*(b-c) + (b-c) *d.

10. (i) Write short notes on runtime storage management of a code generator.

(ii) Explain in detail about primary structure preserving transformations on basic blocks.

UNIT – V: Code Optimization and Code Generation

1. Mention the issues to be considered while applying the techniques for code optimization.

The semantic equivalence of the source program must not be changed.

The improvement over the program efficiency must be achieved without changing the algorithm of

the program.

2. What are the basic goals of code movement?

To reduce the size of the code, i.e. to improve space complexity.

To reduce the frequency of execution of code, i.e. to improve time complexity.

3. What do you mean by machine dependent and machine independent optimization?

Machine-dependent optimization is based on the characteristics of the target machine, such as its instruction set and addressing modes, and aims to produce efficient target code.

Machine-independent optimization is based on the characteristics of the programming language, such as choosing appropriate program structures and exploiting efficient arithmetic properties, in order to reduce the execution time.

4. What are the different data flow properties?

Available expressions

Reaching definitions

Live variables

Busy variables

5. List the different storage allocation strategies.

The strategies are:

Static allocation

Stack allocation

Heap allocation

6. What are the contents of activation record? (N–13, M – 14)

The activation record is a block of memory used for managing the information needed by a single execution

of a procedure. Various fields of the activation record are:

Temporary variables

Local variables

Saved machine registers

Control link

Access link

Actual parameters

Return values

7. What is dynamic scoping?

In dynamic scoping, a use of a nonlocal variable refers to the nonlocal data declared in the most recently called and still active procedure. Therefore, each time a procedure is called, new bindings are set up for its local names. With dynamic scoping, symbol tables may be required at run time.
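
Python itself is statically scoped, so the following sketch only simulates dynamic scoping with an explicit run-time table of binding stacks. The helper names `bind`, `unbind`, `lookup`, `show`, and `caller` are hypothetical.

```python
bindings = {}  # name -> stack of values; the top is the most recent active binding


def bind(name, value):
    bindings.setdefault(name, []).append(value)


def unbind(name):
    bindings[name].pop()


def lookup(name):
    return bindings[name][-1]


def show():
    # 'x' is nonlocal here: it resolves to the binding made by the most
    # recently called procedure that is still active.
    return lookup("x")


def caller():
    bind("x", "caller's x")   # new binding set up when caller is entered
    try:
        return show()
    finally:
        unbind("x")           # binding removed when caller returns


bind("x", "global x")
from_main = show()        # sees the outermost binding
from_caller = caller()    # sees the binding set up by caller
after_return = show()     # caller has returned, so its binding is gone
```

The same use of `x` inside `show` refers to different data depending on which procedures are active, which is why the table of bindings has to exist at run time.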

8. Define − Symbol Table (M – 14)

A symbol table is a data structure used by the compiler to keep track of the semantics of variables. It stores scope and binding information about names.

9. What is code motion?

Code motion is an optimization technique in which the amount of code in a loop is decreased. The transformation applies to an expression that yields the same result regardless of how many times the loop is executed. Such an expression is placed before the loop.
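
A small before/after sketch of code motion is given below. The function names and the expression `limit - 2` are hypothetical examples; a compiler would perform this transformation on intermediate code rather than on source text.

```python
def before(values, limit):
    out = []
    for v in values:
        bound = limit - 2      # loop-invariant: same result on every iteration
        if v <= bound:
            out.append(v)
    return out


def after(values, limit):
    bound = limit - 2          # moved in front of the loop, evaluated once
    out = []
    for v in values:
        if v <= bound:
            out.append(v)
    return out


same_result = before([1, 5, 9], 8) == after([1, 5, 9], 8)
```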

10. What are the properties of an optimizing compiler? (N – 13)

The source code should be such that it produces the minimum amount of target code. There should not be any unreachable code. Dead code should be completely removed from the source language. An optimizing compiler should apply the following code-improving transformations to the source language:

i) Common sub expression elimination

ii) Dead code elimination

iii) Code movement

iv) Strength reduction
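
Strength reduction, the last transformation in the list above, replaces an expensive operation with a cheaper one. The sketch below is only an illustration with hypothetical function names: a multiplication by the loop index is replaced by a running addition.

```python
def before(n):
    out = []
    for i in range(n):
        out.append(i * 4)      # one multiplication per iteration
    return out


def after(n):
    out = []
    t = 0                      # t always equals i * 4
    for _ in range(n):
        out.append(t)
        t += 4                 # the multiplication is replaced by an addition
    return out
```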

11. What are the various ways to pass a parameter in a function?

Call by value

Call by reference

Copy-restore

Call by name
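
The difference between the first three methods shows up when the same variable is passed for two formal parameters. The sketch below simulates this with a hypothetical `Cell` class standing in for a storage location; call by name is left out because it needs the actual parameter to be re-evaluated on every use.

```python
class Cell:
    """A storage location holding one value."""

    def __init__(self, value):
        self.value = value


def procedure(x, y):
    # Body of the called procedure: x := x + 1; y := y + 1
    x.value += 1
    y.value += 1


def call_by_value(a):
    procedure(Cell(a.value), Cell(a.value))   # callee works on private copies


def call_by_reference(a):
    procedure(a, a)                           # both formals alias the actual


def call_by_copy_restore(a):
    x, y = Cell(a.value), Cell(a.value)       # copy in
    procedure(x, y)
    a.value = x.value                         # copy out, left to right
    a.value = y.value


results = {}
for method in (call_by_value, call_by_reference, call_by_copy_restore):
    a = Cell(10)
    method(a)
    results[method.__name__] = a.value
```

Starting from 10, the actual parameter ends at 10 under call by value, 12 under call by reference, and 11 under copy-restore.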

12. Suggest a suitable approach for computing hash function.

The hash function should map each name to its location in the symbol table. It should distribute names uniformly over the table, and it should keep the number of collisions to a minimum. A collision is a situation in which the hash function yields the same location for two different names.
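
One possible approach is sketched below. It assumes a polynomial hash over the characters of the name, a prime table size of 211, and chaining to resolve collisions; all three choices, and the function names, are illustrative assumptions rather than a prescribed method.

```python
TABLE_SIZE = 211  # assumed prime table size, which helps spread names evenly


def hash_name(name, table_size=TABLE_SIZE):
    """Map a name to a bucket index in the range 0 .. table_size - 1."""
    h = 0
    for ch in name:
        h = (h * 31 + ord(ch)) % table_size
    return h


buckets = [[] for _ in range(TABLE_SIZE)]


def insert(name):
    bucket = buckets[hash_name(name)]
    if name not in bucket:
        bucket.append(name)    # names that collide share one bucket (chaining)


def contains(name):
    return name in buckets[hash_name(name)]


for ident in ["count", "total", "i", "j"]:
    insert(ident)
```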

13. Define − Code Generation (M - 15)

The code generation is the final phase of the compiler. It takes an intermediate representation of the source

program as the input and produces an equivalent target program as the output.

14. Define −Target Machine (M - 15)

The target computer is a byte-addressable machine with four bytes to a word and n general-purpose registers, R0, R1, ..., Rn-1. It has two-address instructions of the form op source, destination, in which op is an op-code, and source and destination are data fields.

15. How do you calculate the cost of an instruction? (M – 14)

We take the cost of an instruction to be one plus the costs associated with the source and destination

address modes. This cost corresponds to the length (in words) of the instruction. Address modes involving registers

have cost zero, while those with a memory location or literal in them have cost one, because such operands have to

be stored with the instruction.
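
The rule can be sketched as a small function. The operand syntax used here (`R0` for a register, `*R0` for an indirect register, `M` for a memory location, `#1` for a literal, `4(R0)` for an indexed address) is an assumption borrowed from the usual textbook target machine, and indexed modes are assumed to cost one because the constant is stored with the instruction.

```python
import re

# Register and indirect-register modes need no extra word in the instruction.
REGISTER_MODE = re.compile(r"\*?R\d+$")


def mode_cost(operand):
    """Added cost of one address mode: 0 for register modes, 1 otherwise."""
    return 0 if REGISTER_MODE.match(operand) else 1


def instruction_cost(source, destination):
    """Cost = 1 + cost of the source mode + cost of the destination mode."""
    return 1 + mode_cost(source) + mode_cost(destination)


costs = {
    "MOV R0, R1": instruction_cost("R0", "R1"),
    "MOV R5, M": instruction_cost("R5", "M"),
    "ADD #1, R3": instruction_cost("#1", "R3"),
    "SUB 4(R0), *12(R1)": instruction_cost("4(R0)", "*12(R1)"),
}
```

Under these assumptions the four instructions cost 1, 2, 2, and 3 respectively.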

16. What is meant by optimization?

Optimization is a program transformation that makes the code produced by compiling algorithms run faster or take less space.

17. Define − Optimizing Compilers (N – 13)

Compilers that apply code-improving transformations are called optimizing compilers.

18. When do you say a transformation of a program is local?

A transformation of a program is called local if it can be performed by looking only at the statements in a basic block.

19. Write a note on function preserving transformation.

A compiler can improve a program by transformations that do not change the function it computes.

20. List the function-preserving transformations.

Common sub expression elimination.

Copy propagation.

Dead code elimination.

Constant folding.
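
As a small illustration of the last item, constant folding can be sketched on three-address statements. The tuple layout `(dest, op, arg1, arg2)` and the `const` marker are hypothetical conventions chosen for this sketch.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(block):
    """Evaluate at compile time every operation whose operands are both constants."""
    folded = []
    for dest, op, a, b in block:
        if op in OPS and isinstance(a, int) and isinstance(b, int):
            folded.append((dest, "const", OPS[op](a, b), None))
        else:
            folded.append((dest, op, a, b))   # left unchanged
    return folded


block = [("x", "*", 2, 3), ("y", "+", "x", 1)]
result = fold_constants(block)
```

The first statement becomes a constant assignment of 6 to `x`, while the second still needs `x` at run time and is kept as it is.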

21. Define − Common Sub Expression

An occurrence of an expression E is called a common sub expression if E was previously

computed and the values of variables in E have not changed since the previous computation.
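
The definition can be turned into a local elimination pass over one basic block. The sketch below is a simplified illustration: statements are hypothetical `(dest, op, arg1, arg2)` tuples, expressions are matched textually (so `b+c` and `c+b` are not recognized as equal), and the example block is chosen only for demonstration.

```python
def eliminate_common_subexpressions(block):
    """Replace a recomputation of an unchanged expression by a copy."""
    available = {}  # (op, arg1, arg2) -> name currently holding that value
    result = []
    for dest, op, a, b in block:
        key = (op, a, b)
        if key in available:
            # E was computed earlier and its operands have not changed since.
            result.append((dest, "copy", available[key], None))
        else:
            result.append((dest, op, a, b))
        # Assigning dest invalidates expressions that read dest or are held in dest.
        available = {k: v for k, v in available.items()
                     if dest not in k[1:] and v != dest}
        if dest not in (a, b):
            available[key] = dest
    return result


block = [
    ("a", "+", "b", "c"),
    ("b", "-", "a", "d"),
    ("c", "+", "b", "c"),   # not common: b has changed since the first b + c
    ("d", "-", "a", "d"),   # common: a and d are unchanged since b := a - d
]
optimized = eliminate_common_subexpressions(block)
```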

22. Define − Live Variable (M – 15)

A variable is live at a point in a program if its value can be used subsequently.
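
Within a single basic block, liveness can be computed by scanning the statements backward. The following is a minimal sketch assuming hypothetical `(dest, op, arg1, arg2)` statements whose operands are all variable names, and a given set of variables that are live on exit from the block.

```python
def live_before_each(block, live_out):
    """Return, for each statement, the set of variables live just before it."""
    live = set(live_out)
    result = []
    for dest, op, a, b in reversed(block):
        live.discard(dest)                                # dest is redefined here
        live.update(x for x in (a, b) if x is not None)   # operands are used here
        result.append(set(live))
    result.reverse()
    return result


block = [
    ("t", "+", "a", "b"),
    ("u", "*", "t", "c"),
    ("t", "-", "u", "a"),
]
live_sets = live_before_each(block, live_out={"t"})
```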

23. What is meant by loop optimization?

The running time of a program may be improved if we decrease the number of instructions in an inner loop, even if we increase the amount of code outside that loop.

24. Define activation tree.

An activation tree depicts the way control enters and leaves activations.

25. What is the use of control stack?

A control stack keeps track of live procedure activations. The node for an activation is pushed onto the control stack as the activation begins and popped when the activation ends.
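
The push and pop discipline can be sketched with a decorator that records each activation. The decorator `activation` and the example procedure `fact` are hypothetical; the snapshots in `trace` show the stack contents at the moment each activation begins.

```python
control_stack = []
trace = []


def activation(func):
    def wrapper(*args):
        node = f"{func.__name__}({', '.join(map(str, args))})"
        control_stack.append(node)          # push as the activation begins
        trace.append(list(control_stack))   # path from the root to this node
        try:
            return func(*args)
        finally:
            control_stack.pop()             # pop as the activation ends
    return wrapper


@activation
def fact(n):
    return 1 if n == 0 else n * fact(n - 1)


result = fact(2)
```

Each snapshot is a path from the root of the activation tree to the currently executing activation, and the stack is empty again once the outermost call returns.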

26. Define − Activation Record (N – 13, M – 14)

Information needed by a single execution of a procedure is managed using a contiguous block of

storage called an activation record.

PART – B (16 Marks)

1. Discuss briefly about peephole optimization. (N – 14)

2. Discuss in detail the process of optimization of basic blocks. Give an example. (N – 14)

3. What is data flow analysis? Explain data flow abstraction with examples. (N – 14)

4. Explain in detail about code improving transformations.

5. Write in detail about function-preserving transformations.

6. Explain the principal sources of optimization in detail. (N – 14)

7. Explain the common sub expression elimination, copy propagation, and transformation for

improving loop invariant computations in detail. (N – 14)

8. Explain the three techniques for loop optimization with examples.

9. Discuss about the Dead code elimination and code motion.

10. Discuss about the Loops in flow graphs.