Bhaskar Bagchi (11CS10058) Lecture Slides (9th Sept. 2013)

Upload: corey-adams

Post on 18-Dec-2015


Bhaskar Bagchi (11CS10058)

Lecture Slides (9th Sept. 2013)


In an LR parser, L stands for a left-to-right scan of the input and R stands for a rightmost derivation (constructed in reverse).

LR parsers are also known as shift-reduce parsers.

On each input symbol the parser performs either a SHIFT or a REDUCE.

To decide between shift and reduce we need a finite state machine (FSM). To construct it we need:
1. States
2. Transitions
3. A small augmentation of the grammar


There are various types of LR parsers, namely:
1. SLR (Simple LR) parser
2. CLR (Canonical LR) parser
3. LALR (Look-Ahead LR) parser

All of these parsers need a state transition machine (STM).

The construction of the STM is discussed next.


Augment the grammar G to G':
1. Let S be the start symbol of G.
2. Introduce a new non-terminal S' with the production S' → S.

Find CLOSURE()

Find GOTO()
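The augmentation step can be sketched in Python. This is only an illustrative sketch: the dictionary representation of a grammar, the function name `augment`, and the toy grammar `G` are assumptions made here, not part of the lecture.

```python
def augment(grammar, start):
    """Return (augmented grammar, new start symbol) without changing the input.

    A grammar is assumed to be a dict mapping each non-terminal to a list
    of production bodies, each body being a tuple of symbols.
    """
    new_start = start + "'"
    while new_start in grammar:      # keep adding primes until the name is unused
        new_start += "'"
    augmented = {new_start: [(start,)]}   # the single new production S' -> S
    augmented.update(grammar)
    return augmented, new_start


# Hypothetical toy grammar, used only to exercise the function.
G = {"S": [("a", "S", "b"), ("c",)]}
G_prime, s_prime = augment(G, "S")
print(s_prime, "->", " ".join(G_prime[s_prime][0]))   # prints: S' -> S
```

The new start symbol appears in exactly one production, so reducing by S' → S marks the point at which the whole input has been recognized.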


Let there be a production A → XYZ.

We can construct 4 items from this production, as listed below:
1. A → .XYZ
2. A → X.YZ
3. A → XY.Z
4. A → XYZ.
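The four items above can be generated mechanically, one per dot position. The following is a minimal sketch; representing an item as a `(head, body, dot)` tuple and the helper names `items_of` and `show` are assumptions made for illustration.

```python
def items_of(head, body):
    """Return every LR(0) item of the production head -> body.

    An item is modelled as (head, body, dot), where dot counts the body
    symbols that lie to the left of the dot.
    """
    body = tuple(body)
    return [(head, body, dot) for dot in range(len(body) + 1)]


def show(item):
    """Format an item with an explicit dot, e.g. 'A -> X . Y Z'."""
    head, body, dot = item
    symbols = list(body[:dot]) + ["."] + list(body[dot:])
    return f"{head} -> {' '.join(symbols)}"


for item in items_of("A", ["X", "Y", "Z"]):
    print(show(item))
```

A production with n symbols on its right-hand side always yields n + 1 items, which is why A → XYZ gives 4.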


Each item tells us how much of a production has been recognized so far.

A → X.YZ implies that a prefix of the input has already been recognized as X, and the remaining input is expected to match YZ.

Similarly, A → XY.Z says that XY has been recognized and Z is expected.

Also, A → .XYZ means that nothing has been matched yet, but the upcoming input may match XYZ.


In an LR parser, to know whether to shift or to reduce we need a parsing table, and to construct this parsing table we first need a state transition machine, the LR(0) automaton.

Each state of this automaton is a set of items.

As mentioned earlier, to construct this automaton we need two functions:
1. CLOSURE()
2. GOTO()


CLOSURE() helps us find the states of the automaton.

Algorithm to find CLOSURE(I0):

1. Put all items of I0 in CLOSURE(I0).

2. If there is an item of the form A → B.CD in CLOSURE(I0), where C is a non-terminal, and there is a production of the form C → E, then add the item C → .E to CLOSURE(I0). Repeat this step until no new items can be added.

Consider the following example grammar:
E → E + T | T
T → T * F | F
F → id


First augment the grammar by introducing a new non-terminal E' with the production E' → E.

To find the state S0, add the item E' → .E to the state. Now, for all productions in the augmented grammar G' with E as head, namely E → E + T | T, add the corresponding items E → .E + T and E → .T to the state S0.

Now we have an item with T appearing immediately to the right of the dot, so add the corresponding items T → .T * F and T → .F.

Proceed similarly for F, which adds F → .id. After that no new items can be added.
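The closure computation for the example grammar can be sketched in Python. This is a minimal sketch, not the lecture's own code: the `(head, body, dot)` item representation and the names `GRAMMAR`, `closure` and `show` are assumptions made for illustration.

```python
# Augmented example grammar: each non-terminal maps to its production bodies.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("id",)],
}


def closure(items):
    """Return the closure of a set of (head, body, dot) items."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        # Only a non-terminal directly after the dot contributes new items.
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                new_item = (body[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def show(item):
    head, body, dot = item
    return f"{head} -> {' '.join(body[:dot] + ('.',) + body[dot:])}"


S0 = closure({("E'", ("E",), 0)})
for line in sorted(show(item) for item in S0):
    print(line)
```

Starting from the single item E' → .E, the sketch should produce the six items of S0 described above. The worklist makes sure each newly added item is itself examined, which is the "proceed similarly" step.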


GOTO(Ii, X) = Ij, where Ii is the current state (set of items), X is a terminal or non-terminal symbol, and Ij is the next state.

If there is an item of the form A → α.Xβ in Ii, then Ij contains the item A → αX.β; if the next symbol matches X while the parser is in state Ii, the GOTO() function returns Ij.

Let the input string be of the form x1 x2 … xm xm+1 … xn-1 xn. If x1 x2 … xm has already been matched to α, then the remaining xm+1 … xn-1 xn is expected to match Xβ.


Let I0 be the start state for the example grammar above. Then the GOTO() functions are as follows:

GOTO(I0, E) = I1

I0 contains the items E' → .E and E → .E + T, so we add the items E' → E. and E → E. + T to I1 and take the CLOSURE of these items. So we get state I1 as E' → E. and E → E. + T

GOTO(I0, T) = I2

For the items E → .T and T → .T * F we add E → T. and T → T. * F to I2 and find the CLOSURE. So we get I2 as E → T. and T → T. * F


Similarly we get:
GOTO(I0, F) = I3 : T → F.
GOTO(I0, id) = I4 : F → id.
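The four GOTO() computations from I0 can be checked with a short Python sketch. The `(head, body, dot)` item representation and the names `GRAMMAR`, `closure`, `goto` and `show` are assumptions made for illustration, not the lecture's notation.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("id",)],
}


def closure(items):
    """Return the closure of a set of (head, body, dot) items."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                new_item = (body[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {
        (head, body, dot + 1)
        for head, body, dot in items
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved)


def show(state):
    """Return the items of a state as a set of readable strings."""
    return {
        f"{head} -> {' '.join(body[:dot] + ('.',) + body[dot:])}"
        for head, body, dot in state
    }


I0 = closure({("E'", ("E",), 0)})
for symbol in ["E", "T", "F", "id"]:
    print(f"GOTO(I0, {symbol}) =", sorted(show(goto(I0, symbol))))
```

In each of these four cases the closure adds nothing, because no non-terminal ends up immediately to the right of the dot after the move.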

Each of the above sets of items I1, I2, I3, I4 corresponds to a new state of the LR(0) automaton. For each of these states we apply the GOTO() function to every terminal and non-terminal that appears immediately to the right of a dot in the state's items and shift the dot one position to the right, i.e. match one symbol in at least one of the items present in the state. We repeat this until no new states are produced.
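Repeating GOTO() on every new state until nothing new appears yields the full LR(0) automaton. The following Python sketch illustrates this under stated assumptions: the `(head, body, dot)` item representation and the names `GRAMMAR`, `closure`, `goto` and `build_automaton` are chosen here for illustration, and the state numbers it assigns need not match I1 to I4 above, since they depend on the order in which symbols are tried.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("id",)],
}


def closure(items):
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                new_item = (body[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol):
    moved = {
        (head, body, dot + 1)
        for head, body, dot in items
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved)


def build_automaton(start_item):
    """Return (states, transitions) of the LR(0) automaton."""
    start = closure({start_item})
    states = [start]
    index = {start: 0}          # state (frozenset of items) -> state number
    transitions = {}            # (state number, symbol) -> state number
    for state in states:        # the list grows while we iterate over it
        # Only symbols directly to the right of a dot can be matched.
        symbols = {body[dot] for _, body, dot in state if dot < len(body)}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in index:
                index[target] = len(states)
                states.append(target)
            transitions[(index[state], symbol)] = index[target]
    return states, transitions


states, transitions = build_automaton(("E'", ("E",), 0))
print(len(states), "states,", len(transitions), "transitions")
```

For the example grammar this sketch should report 9 states and 12 transitions. States are stored as frozensets so that a set of items reached by two different paths is recognized as the same state.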