LR Parsers
1 Aditi Raste, CCOEW
LR Parsers
The most powerful shift-reduce parsers, and yet
efficient.
LR(k) parsing
L : left to right scanning of input
R : constructing rightmost derivation in reverse
k : number of lookahead symbols used in making
parsing decisions (when k is omitted, k is
assumed to be 1)
Why LR parsers?
The most general table-driven, non-backtracking shift-reduce parsing method, and still efficient.
The class of grammars that can be parsed using the LR method is a proper superset of the class of grammars that can be parsed with predictive parsers:
LL(1) grammars ⊂ LR(1) grammars
An LR parser detects a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.
LR parsers can be constructed to recognize virtually all programming language constructs for which CFGs can be written.
LR parser
The principal drawback of the LR method is that it is too much work to construct an LR parser by hand for a typical programming language grammar.
Fortunately this process can be automated. Many parser generators are available.
Parser generators can locate ambiguous constructs in the grammar or constructs that are difficult to parse in a left-to-right scan of the input and also can provide detailed diagnostic messages.
Flavours of LR parsers
SLR :- Simple LR parser
LR(1) :- Canonical LR. Most general LR parser
LALR :- Lookahead LR parser. Intermediate in power between SLR and canonical LR
SLR, LR(1) and LALR use the same parsing algorithm
and differ only in their parsing tables.
Powers relative to each other:
SLR(1) ≤ LALR(1) ≤ LR(1)
SLR (Simple LR)
SLR Parser
• How does a shift-reduce parser know when
to shift and when to reduce?
Right sentential form   Handle   Production
id * id                 id       F -> id
F * id                  F        T -> F
T * id                  id       F -> id      (T cannot be a handle here)
T * F                   T * F    T -> T * F
T                       T        E -> T
SLR Parser
An LR parser makes shift-reduce decisions by
maintaining states to keep track of where we
are in a parse.
States represent set of items.
An item indicates how much of a production
we have seen at a given point in the parsing
process.
Model of LR Parser
[Figure: model of an LR parser. Labels: input (a + b $); LR parsing program (driver program); stack (X, Y, Z, $); output; LR parsing table with ACTION and GOTO parts for states s0 … sn.]
Tasks of the driver program
Invoke lexical analyzer for next token
Initialize the stack with the start state s0
Act like a finite automaton (FA)
Determine the state sj on top of the stack (TOS) and
the current input symbol ai
Determine the action corresponding to [sj, ai]
Actions of LR parser
si : Shift the current input symbol and push state i onto the stack
rj : Reduce by the production numbered j
Accept
Error
LR Parser Example
Consider expression grammar
1. E -> E + T
2. E -> T
3. T -> T * F
4. T -> F
5. F -> (E)
6. F -> id
Parsing table for expression grammar
State | ACTION                        | GOTO
      | id   +    *    (    )    $    | E   T   F
0     | s5             s4             | 1   2   3
1     |      s6                  acc  |
2     |      r2   s7        r2   r2   |
3     |      r4   r4        r4   r4   |
4     | s5             s4             | 8   2   3
5     |      r6   r6        r6   r6   |
6     | s5             s4             |     9   3
7     | s5             s4             |         10
8     |      s6             s11       |
9     |      r1   s7        r1   r1   |
10    |      r3   r3        r3   r3   |
11    |      r5   r5        r5   r5   |
Parse the input string
id + id * id
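The parse of this string can be traced with a small table-driven driver. The following is a minimal sketch in Python, not the course's own implementation: the names PRODUCTIONS, ACTION, GOTO and lr_parse are illustrative assumptions, and the table entries are copied from the parsing table for the expression grammar.

```python
# Productions numbered as in the expression grammar:
# production number -> (head, number of symbols in the body).
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}

# ACTION part of the table; a missing entry means "error".
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

# GOTO part of the table (nonterminal columns).
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Return the production numbers used, in order of reduction."""
    tokens = list(tokens) + ["$"]   # append the end marker
    stack = [0]                     # the stack holds states; start in state 0
    pos = 0
    output = []
    while True:
        state, symbol = stack[-1], tokens[pos]
        action = ACTION[state].get(symbol)
        if action is None:
            raise SyntaxError(f"unexpected {symbol!r} in state {state}")
        if action == "acc":
            return output
        if action[0] == "s":        # shift: push the new state, advance input
            stack.append(int(action[1:]))
            pos += 1
        else:                       # reduce: pop |body| states, then take GOTO
            number = int(action[1:])
            head, body_len = PRODUCTIONS[number]
            del stack[len(stack) - body_len:]
            stack.append(GOTO[stack[-1]][head])
            output.append(number)


print(lr_parse(["id", "+", "id", "*", "id"]))
```

For id + id * id the sketch reports the reductions 6, 4, 2, 6, 4, 6, 3, 1, which is the rightmost derivation in reverse.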
Working of the LR parser
[Figure: CFG -> Construction of LR(0) items -> Construction of LR parsing table -> Parsing of input string -> Output. The input string feeds the parsing step.]
LR(0) Items
An LR(0) item is a string [α] where α is a
production of the grammar with a dot (.) at some
position in the RHS.
The dot indicates how much of the production we have
seen at a given state in the parse.
[A -> . XYZ] indicates that the parser is looking
for a string that can be derived from XYZ.
[A -> XY . Z] indicates that the parser has seen a
string derived from XY and is looking for one
derivable from Z.
LR(0) Items (no lookahead)
A -> XYZ generates 4 LR(0) items
1. [A -> . XYZ]
2. [A -> X . YZ]
3. [A -> XY . Z]
4. [A -> XYZ .]
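A production with n symbols in its body has n + 1 items, one per dot position. A minimal Python sketch of this enumeration follows; the (head, body, dot) triple and the names lr0_items and show are illustrative assumptions, not notation from the slides.

```python
def lr0_items(head, body):
    """Return every LR(0) item of head -> body as a (head, body, dot) triple."""
    return [(head, tuple(body), dot) for dot in range(len(body) + 1)]


def show(item):
    """Format an item with the dot written as '.'."""
    head, body, dot = item
    symbols = list(body[:dot]) + ["."] + list(body[dot:])
    return f"[{head} -> {' '.join(symbols)}]"


for item in lr0_items("A", ["X", "Y", "Z"]):
    print(show(item))
```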
Canonical LR(0) Collection of Items
The SLR table construction algorithm uses a
specific set of sets of LR(0) items.
This collection is called the canonical collection of
sets of LR(0) items for grammar G.
The canonical collection represents the set of
valid states for an LR parser.
Canonical LR(0) Collection of Items
To construct the canonical LR(0) collection for
a grammar, we define
Augmented grammar
Two functions:-
CLOSURE function
GOTO function
Canonical LR(0) Collection of Items
Augmented Grammar
If G is a grammar with start symbol S, then G', the augmented grammar for G, is G with a new
start symbol S' and production S' -> S.
Purpose: The augmented grammar tells the parser
when to stop parsing and announce the
acceptance of the input. Acceptance occurs
when and only when the parser is about to reduce
by S' -> S.
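Augmenting a grammar is a one-line transformation. A minimal Python sketch, assuming a grammar is stored as a dict from each nonterminal to a list of bodies (tuples of symbols); the function name augment and this representation are assumptions made for illustration.

```python
def augment(grammar, start):
    """Return (augmented grammar, new start symbol) with S' -> S added."""
    new_start = start + "'"
    while new_start in grammar:      # make sure the new symbol is fresh
        new_start += "'"
    augmented = {new_start: [(start,)]}
    augmented.update(grammar)        # original productions are unchanged
    return augmented, new_start


G = {"S": [("a", "S"), ("b",)]}
G_aug, S_prime = augment(G, "S")
print(S_prime, "->", " ".join(G_aug[S_prime][0]))
```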
States of the PDA
(Closure of Item sets)
Each LR(0) item corresponds to a point in the
parse.
To generate a parser state from an LR(0) item
we take its closure.
Closure of set of items
Suppose I is a set of items. We define CLOSURE(I)
as follows:
(i) Every item in I is in CLOSURE(I)
(ii) If A -> α . Bβ is in CLOSURE(I) and B -> γ is a
production, then add the item B -> . γ to CLOSURE(I) (if not
already there)
Apply rule (ii) until no more new items can be
added to CLOSURE(I).
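The two rules above can be sketched as a worklist loop. This is a minimal Python sketch under stated assumptions: items are (head, body, dot) triples, GRAMMAR is the augmented expression grammar stored as a dict, and the names closure and I0 are illustrative.

```python
# Augmented expression grammar: nonterminal -> list of bodies.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Return CLOSURE(items) as a frozenset of (head, body, dot) triples."""
    result = set(items)              # rule (i): every item of I is included
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        # Rule (ii): the dot stands immediately before a nonterminal B.
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                new_item = (body[dot], production, 0)   # B -> . gamma
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


I0 = closure({("E'", ("E",), 0)})
print(len(I0))
```

Starting from the single item E' -> . E, the sketch adds one item with the dot at the left end for every production of E, T and F.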
Set of Items
Two classes
a) Kernel items:-
(i) Initial item S' -> . S
(ii) All items whose dots are not at the left
end
b) Non Kernel items:-
All items with their dots at the left end
except for S' -> . S
Transitions of PDA ( GOTO function)
There is a transition from one state to
another for each grammar symbol that
immediately follows the dot in some item
of that state.
If an item in the state is [A -> α . Xβ] then
a transition from this state occurs when X is
processed.
The transition is to the state that is the closure of the
item [A -> αX . β]
GOTO ( I,X) function
GOTO(I, X)
I : set of items
X : grammar symbol
GOTO(I, X) is defined to be the closure of the
set of all items [A -> αX . β] such that
[A -> α . Xβ] is in I.
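GOTO can be sketched by moving the dot over X in every matching item and closing the result. Repeating this from the initial state yields the canonical collection. The Python sketch below is an illustration under the same assumptions as before ((head, body, dot) triples, a dict GRAMMAR for the augmented expression grammar); the names goto and canonical_collection are hypothetical.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """CLOSURE of a set of (head, body, dot) items."""
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                new_item = (body[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol):
    """GOTO(I, X): advance the dot over X wherever X follows it, then close."""
    moved = {(h, b, d + 1) for (h, b, d) in items
             if d < len(b) and b[d] == symbol}
    return closure(moved)


def canonical_collection(start_item):
    """Collect every item set reachable from CLOSURE({start_item})."""
    states = [closure({start_item})]
    for state in states:             # the list grows while we iterate over it
        symbols = {b[d] for (_, b, d) in state if d < len(b)}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
    return states


I0 = closure({("E'", ("E",), 0)})
states = canonical_collection(("E'", ("E",), 0))
print(len(states))
```

For the augmented expression grammar the sketch finds 12 item sets, matching the 12 states (0 to 11) of the parsing table, although the order in which it discovers them need not match the table's state numbering.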