yu-chen kuo1 chapter 4 syntax analysis. yu-chen kuo2 4.1 the role of the parser a parser obtains a...

135
Yu-Chen Kuo 1 Chapter 4 Syntax Analysis

Upload: kenneth-williams

Post on 13-Jan-2016

232 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 1

Chapter 4

Syntax Analysis

Page 2: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 2

4.1 The Role of The Parser

• A parser obtains a string of tokens from the lexical analyzer and verifies the string can be generated by the grammar for the source language.

• We expect the parser to report any syntax errors in an intelligible fashion. It should also recover from commonly occurring errors so that it can continue processing the remainder of its input.

Page 3: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 3

4.1 The Role of The Parser

Page 4: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 4

Three Types of Parsers

1. CYK algorithm and Early’s algorithm: inefficient to use in production compilers

2. Top-down method

3. Bottom-up method

Page 5: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 5

Syntax Error Handling

• Lexical error: – misspelling an identifier, keyword, or operator

• Syntactic error:– an arithmetic expression with unbalanced

parentheses

• Semantic error: – an operator applied to an incompatible operand

• Logical error: – an infinitely recursive call

Page 6: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 6

Syntax Error Handling (Cont.)

• The error handler in a parser has simple-to-state goals:– It should report the presence of errors clearly

and accurately– It should recover from each error quickly

enough to be able to detect subsequence errors– It should not significantly slow down the

processing of correct programs

Page 7: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 7

Error-Recovery Strategies

• Panic mode– Discard the input symbol until one of a designated set

of synchronizing tokens is found– synchronizing token: ; end – Guarantee not to go into an infinite loop

• Phrase level– Parser may perform local correction– replace a prefix of the remaining input by some allowed

string; – replace , by ; – delete an extraneous ; or insert missing ; – May lead to an infinite loop if we always insert

something on the input ahead the current input symbol

Page 8: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 8

Error-Recovery Strategies (cont.)

• Error production– Grammars to produce errors

• Global correction– Given an incorrect input string x and grammar

G, find a parse tree for a related string y, such that the number of insertions, deletions, and changes of tokens required to transform x into y is as small as possible

– Too costly

Page 9: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 9

4.2 Context-Free Grammars

• stmt if expr then stmt else stmt1. Terminals: tokens• if, then, else

2. Noterminals: set of strings• expr, stmt

3. Start symbol• stmt

4. Productions

Page 10: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 10

Example 4.2

expr expr op exprexpr (expr)expr - exprexpr id op +op -op *op /op • Terminals: id, +, -, *, / . • Noterminals: expr, op• Start symbol: expr

Page 11: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 11

Notational Conventions

1. There symbols are terminals:i) Lower-case letters: a, b, c

ii) Operator symbols: +, -

iii) Punctuation symbols: parentheses, comma

iv) Digits: 0, 1, …,9

v) Boldface strings: id, if

2. There symbols are nonterminal:i) Upper-case letters: A, B, C

ii) The letter S: start symbol

iii) Lower-case italic names: expr, stmt

Page 12: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 12

Notational Conventions (cont.)

3. Upper-case letters late in alphabet: X, Y, Z, represent grammar symbols (terminals or nonterminal)

4. Lower-case letters late in alphabet: u, v,…z, represent string of terminals

5. Lower-case Greek letters : , , , represent string of grammar symbols

6. A-productions (all productions): A 1| 2|…| k

7. Start symbol: the left side of the first production

Page 13: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 13

Example 4.3

E EAE | (E) | - E | id

A + | - | * | / | By notational conventions −

Nonterminals: E, A

Terminals: remaining symbols

Page 14: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 14

Derivations

E E+E | E*E| (E) | - E | id• E derives -E

– E - E

• The derivation of -(id) from E− E - E - (E) -(id)

A , if A : one step derivation• : zero or more steps derivations• : one or more steps derivations

*

Page 15: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 15

Derivations (cont.)

, for any string • If , , then • L(G) denotes the language generated by G. L(G)

contains all terminal symbols. w L(G), if S w. String w is call a sentence of G.

• S , may contain nonterminals. We call is a sentential form of G.

• E.g., -(id + id) is a sentence of the grammar, because E -(id + id)

*

*

*

*

Page 16: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 16

Leftmost & Rightmost Derivations

• Leftmost derivation ( )– -(E+E) -(id+E) -(id+id)

• Rightmost derivation ( )– -(E+E) -(E+id) -(id+id)

• S . We call is a left-sentential form of G.• S . We call is a cannonical-sentential for

m of G.

lm

lm

rm

rm

Page 17: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 17

Parse Tree and Derivations

Page 18: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 18

Parse Tree and Derivations (cont.)

Page 19: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 19

Ambiguity

• More than one parse tree for some sentences• More than leftmost derivation for some

sentences• More than rightmost derivation for some

sentences

Page 20: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 20

4.3 Regular Expression vs. Context-free grammar

• Every language that can be described by a regular expression can also be described by a context-free grammar

– (a|b)*abb

– A0 aA0 | bA0 | aA1

A1 bA2

A2 bA3

A3

• Every regular set is a context-free language

Page 21: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 21

Why use regular expression to define the lexical syntax of a language ?

• Why not use CFG for the lexical syntax

1. Lexical rules of a language are frequently quite simple. We do not need a powerful grammar.

2. Regular expression provide a more concise and easier to understand notation for tokens

3. An efficient lexical analysis can be constructed automatically from regular expressions

4. Separating the syntactic structure of a language into lexical and nonlexical parts

Page 22: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 22

Why use regular expression to define the lexical syntax of a language ?

• Regular expressions are most useful for describing structure of lexical constructs such as identifies, constants, keywords

• Grammars are most useful for describing nested structure of lexical constructs such balanced parentheses, matching begin-end’s, corresponding if-then-else’s.

• Nested structures can not be described by regular expressions.

Page 23: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 23

Verifying the Language Generated by a Grammar

• Proof that L(G) = L– Every string generated by G is in L– Every string in L can be generated by G

• S (S)S | , generates all string of balanced ( )– Every sentence derived from S is balanced by induction

• S (S)S * (x)S * (x)y (n steps)• S * x (less than n setps and must be balanced)• S * y (less than n setps and must be balanced)

– Every balanced string length 2n is derivable from S• w = (x)y of length 2n• x and y are length of less than 2n. They are both balanced and deri

vable from S• S (S)S * (x)S * (x)y =w

Page 24: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 24

Eliminating Ambiguity

stmt if expr then stmt

| if expr then stmt else stmt

| other

Page 25: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 25

Eliminating Ambiguity (cont.)

• Disambiguating rule: match each else with the closest previous unmatched then

• The statement between a then and an else must be matched

stmt matched_stmt | unmatched_stmtmatched_stmt if expr then matched_stmt else matched_stmt

| otherunmatched_stmt if expr then matched_stmt else unmatched_stmt

| if expr then stmt

Page 26: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 26

Eliminating Immediate Left Recursion

• A grammar is left recursive if it has a production A+A

• Top-down parsing methods cannot handle left-recursion grammars because top-down parsing is corresponding to the leftmost derivation.

nmAAAA |...||||...|| 2121

|'|...|'|''

'|...|'|'

21

21

AAAA

AAAA

m

n

Page 27: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 27

Eliminating Immediate Left Recursion (cont.)

• Non-immediate left recursionS Aa | b

A Ac | Sd |

S Aa Sda

Page 28: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 28

Eliminating General Left Recursion

• Input Grammar G with no cycle (A+A) or -production

Page 29: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 29

Eliminating General Left Recursion (cont.)

• Non-immediate left recursion S Aa | b A Ac | Sd |

A Ac | Aad | bd | S Aa | b

A bdA’ A’ cA’ | adA’ |

Page 30: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 30

Eliminating Left Factoring

• When it is not clear which of two alternative productions to use to expand a nonterminal A. We rewrite A-production to defer the decision until we have seen enough of the input.

stmt if expr then stmt | if expr then stmt else stmt

stmt if expr then stmt S’ S’ else stmt |

• A 1 | 2 |…| n | A A’| A’ 1 | 2 |…| n

Page 31: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 31

Non-Context-Free Language Constructs

• L1={wcw | w is in (a|b)*} is not context-free• L1’={wcwR | w is in (a|b)*} is context-free

– S aSa | bSb | c• L2 = is not context-free• L2’ = is context-free

– S aSd | aAd

A bAc | bc• L2’’= is context-free

}1,1|{ mndcba mnmn

}1,1|{ mndcba nmmn

}1,1|{ mndcba mmnn

Page 32: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 32

Non-Context-Free Language Constructs

• L3 = is not context-free• L3’= is context-free

– S aSb | ab

• Context-free grammar can keep count of two items but not three.

• Regular expression cannot keep count.

}0|{ ncba nnn

}1|{ nba nn

Page 33: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 33

Top-Down Parsing

• Top-down parsing can be viewed as an attempt to find a leftmost derivation for an input string.

• It constructs a parser tree for the input string to root and creating the nodes of the parser tree in preorder.

Page 34: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 34

Recursive Descent Parsing

• A general top-down parsing that may involve backtracking

• E.g., S cAd

A ab| a , w=cad

Page 35: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 35

Predictive Parsers

• By carefully writing a grammar, eliminating left recursion, and left factoring, we obtain a grammar that can be parsed by a recursive-descent parser that needs no backtracking. (predictive parser)

• Predictive Parser is implemented by recursive procedures

Page 36: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 36

Predictive Parsers (cont.)

type simple | id | array [simple] of type

simple integer | char | num dotdot num

Page 37: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 37

Transition Diagrams for Predictive Parsers

• We can create a transition diagram for a predictive parsers

• For each nonterminal A:1. Create an initial and final state2. For each production A X1X2…Xn, create a path

from the initial to the final state, with edges labeled X

1, X2, …, Xn

• Based on transition diagram to match terminals again lookahead input symbols

Page 38: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 38

Transition Diagrams for Predictive Parsers (cont.)

Page 39: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 39

Transition Diagrams for Predictive Parsers (cont.)

Page 40: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 40

Transition Diagrams for Predictive Parsers (cont.)

Page 41: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 41

Nonrecursive Predictive Parsing

• It is possible to build a nonrecursive predictive parser by maintaining a stack explicitly, rather than via recursive calls.

• The key problem during predictive parsing is that of determining the production to be applied for a nonterminal. The nonrecursive parser looks up the production to be applied in a parsing table.

Page 42: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 42

Nonrecursive Predictive Parsing(Cont.)

Page 43: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 43

Nonrecursive Predictive Parsing(Cont.)

• The parser has an input buffer, a stack, a parsing table, and an output stream.

• The input buffer contains the strings to be parsed followed by $, a symbol used to indicate the end of the input string.

• The stack contains a sequence of grammar symbol with $ on the bottom, indicating the bottom of the stack, Initially, the stack contains the start symbol S of the grammar on top of $.

Page 44: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 44

Nonrecursive Predictive Parsing(Cont.)

• The output stream show the derivation steps for the grammar to produce the input string.

• The parser table is a two-dimensional array M[A, a] to show the stack action for a nonterminal A in the top of stack to meet a terminal a or the symbol $.

Page 45: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 45

Predictive Parsing Algorithm

• Input. A string w and a parsing table M for G

• Output. A leftmost derivation of w, if wL(G)

• Method. – Put $S on stack where S is the start symbol of G– Put w$ in the input buffer– Execute the predictive parsing program (Fig. 4.14)

Page 46: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 46

Predictive Parsing Program

Page 47: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 47

Example

• Consider non-left-recursive grammar for arithmetic expression

E TE’

E’ + TE’ |

T FT’

T’ * FT’ | F (E) | id

Page 48: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 48

Example (parsing table M)

Page 49: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 49

Example (Stack Moves)

Page 50: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 50

FIRST and FOLLOW

• The construction of a predictive parser is aided by FIRST and FOLLOW functions.

• These functions help us to construction the predictive parser table.

• FOLLOW function can also be used as synchronizing tokens during panic-mode error recovery.

Page 51: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 51

FIRST function

• If is a string of a grammar symbols, FIRST() be the set of terminals that begin the strings derived from .

• If * , FIRST().

Page 52: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 52

FIRST Sets

• Compute FIRST(X) for all grammar symbols X, by the following rules until no terminals or can be added to any FIRST(X)

1. If X is a terminal, then FIRST(X)={X}.

2. If X , FIRST(X)

3. If XY1Y2…Yk, then aFIRST(X) where a FIRST(Yi) and FIRST(Y1), FIRST(Y2) ,…,FIRST(Yi-1), Y1Y2…Yi-1 *

Page 53: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 53

FIRST sets (cont.)

3. If FIRST(Yj), for all j=1,2,..,k, then FIRST(X).

• Everything in FIRST(Y1) is also in FIRST(X). If Y1 does not derive , nothing more added to FIRST(X). Otherwise, we add FIRST(Y2), and so on.

• For FIRST(X1X2…Xn), FIRST(X1) FIRST(X1X2…Xn), FIRST(X2) FIRST(X1X2…Xn), if FIRST(X1) and

so on. FIRST(X1X2…Xn) if FIRST(Xi) for all i.

Page 54: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 54

FOLLOW function

• Define FOLLOW(A), for nonterminal A, to be the set of terminal a that can appear immediately to right of A.

• S*Aa, aFOLLOW(A).

• If A can be the rightmost symbol in some sentential form, then $ FOLLOW(A).

Page 55: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 55

FOLLOW Sets

• Compute FOLLOW(A), for nonterminal A, by the following rules until nothing can be added to any FOLLOW set.

1. If S is a start symbol, $FOLLOW(S).2. If AB, FIRST()FOLLOW(B).3. If AB or AB and FIRST(), FO

LLOW(A) FOLLOW(B). (FOLLOW(B) may not FOLLOW(A))

Page 56: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 56

Example

E TE’E’ + TE’ | T FT’

T’ * FT’ | F (E) | id

• FIRST(E)=FIRST(T)=FIRST(F)={(, id}• FIRST(E’)={+, }• FIRST(T’)={*, }• FOLLOW(E)=FOLLOW(E’)={ ), $}• FOLLOW(T)=FOLLOW(T’)={+, ), $}• FOLLOW(F)={+, *, ), $}

Page 57: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 57

Construction of Predictive Parsing Table

• Suppose A , aFIRST(). Then, the parser will expand A by when the current input symbol is a.

• If A , = or *, then the parser will expand A by when the input symbol is in FOLLOW(A) or if $ on the input has been reached and $FOLLOW(A).

Page 58: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 58

Construction of Predictive Parsing Table (Algorithm)

• Input. Grammar G.• Output. Parsing table M.• Method.1. For each production A , do steps 2 and 3.2. For each terminal aFIRST(), add A to

M[A, a]3. If FIRST(), add A to M[A, b] for each ter

minal bFOLLOW(A). If FIRST(), $ FOLLOW(A), add A to M[A, $].

4. Make each undefined entry of M be error.

Page 59: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 59

Example

Page 60: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 60

LL(1)

• A grammar whose parsing table has no multiply-defined entries is said to be LL(1).

• The first “L” means scanning the input form left to right.

• The second “L”, means a leftmost derivation.

• And, “1” means using one input symbol of lookahead at each step.

Page 61: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 61

Example (multiply-defined entry)

S iEtSS’ | a (ambiguous)

S’ eS | FIRST(S)={i,a}, FIRST(S’)={e, }

E b FOLLOW(S)={e,$}, FOLLOW(S’}={e,$}

Page 62: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 62

LL(1) Properties

• No ambiguous or left-recursive grammar can be LL(1).

• A grammar G is LL(1) if and only if A | , the following conditions hold:

1. FIRST() FIRST()=.

2. At most one of * or *.

3. If *, then FIRST() FOLLOW(A) )=. • If-then-else statement violates the condition 3, so

not LL(1).

Page 63: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 63

Error Recovery in Predictive Parsing

• In nonrecursive predictive parsing, an error is detected in one of the following two situations:

1. When the terminal on top of the stack does not match the next input symbol

2. When nonterminal A is on top of the stack, a is the next input symbol, and the parsing table entry M[A,a] is empty.

Page 64: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 64

Error Recovery in Predictive Parsing (cont.)

• Panic-mode error recovery is based on the idea of skipping symbols on the input until a token in a selected set of synchronizing tokens appears.

• Its effectiveness depends on the choice on the synchronizing set.

• Some heuristics are as follows.

Page 65: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 65

Error Recovery in Predictive Parsing (cont.)

1. We place all symbols in FOLLOW(A) into the synchronizing set for nonterminal A. If we skip tokens until an element of FOLLOW(A) is seen and pop A from the stack, it is likely that parsing can continue.

2. There is hierarchical structure on constructs in a language; e.g., expressions within blocks, and so on. We can add to the synchronizing set of a lower construct the symbols that begin higher constructs.

Page 66: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 66

Error Recovery in Predictive Parsing (cont.)

3. If we add symbols in FIRST(A) to the synchronizing set of nonterminal A, then it may be possible to resume parsing according to A if a symbol in FIRST(A) appears in the input.

4. If a nonterminal can generate the empty string, then the production deriving can be used as a default. Doing so may postpone some error detection, but cannot cause an error to be missed.

Page 67: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 67

Error Recovery in Predictive Parsing (cont.)

5. If a terminal on top of the stack, cannot be matched, a simple idea is to pop the terminal, issue a message saying that terminal was inserted, and continue parsing.

Page 68: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 68

Example

• Add “sync” to indicate synchronizing tokens obtained from FOLLOW sets.

Page 69: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 69

Example

• If M[A,a]=, skip a.

• If M[A,a]=sync, A is popped.

• If a token on top of stack does not match input, pop it.

Page 70: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 70

Bottom-Up Parsing

• Shift-reduce parsing is a general style of bottom-up parsing.

• It attempts to construct a parse tree for an input string beginning at the leaves and working up towards the root.

• At each reduction step a particular substring matching the right side of a production is replaced by the nonterminal on the left side of that production.

Page 71: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 71

Bottom-Up Parsing (cont.)

• If the substring is chosen correctly at each reduction step, a rightmost derivation is traced out in reverse.

Page 72: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 72

Example

• Consider the following grammar

S aABe

A Abc | b

B d

• The sentence “a b b c d e” cab be reduced to S by the following reduction steps:

Page 73: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 73

Example

• Consider the following grammar

S aABe

A Abc | b

B d

• The sentence “a b b c d e” cab be reduced to S by the following reduction steps:

Page 74: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 74

Example

1. a b b c d e (A b) (handle at position 2)2. a A b c d e (A Abc)3. a A d e (B d)4. a A B e (S aABe)5. S• The reductions trace out the following rightmost de

rivation in reverse:

S rm a A B e a A d e a A b c d e a b b c d e

Page 75: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 75

Handles

• Informally, a handle of a string is a substring that matched the right side of a production, and whose reduction to the nonterminal on the left side of the production represents one step along the reverse of a rightmost derivation.

• Formally, a handle of a right-sentential form is a production A and a position of where the string may be found and replace by A to produce the previous right-sentential form in a rightmost derivation .

Page 76: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 76

Handles (cont.)

• If S * Aw * w, then A in the position following is a handle of w.

• Note:1. The string w to the right of a handle contains only

terminal symbols.2. If a grammar is unambiguous, then every right-

sentential form of the grammar has exactly one handle; otherwise, some right-sentential forms may have more than one handle.

Page 77: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 77

Example

• Consider the following ambiguous grammar• EE+E | E*E | (E) | id• Two rightmost derivation of id1+id2*id31. E E+E E+ E*E E+ E*id3 E+ id2*id3

id1+ id2*id3 – id1 is a handle of the right-sentential form id1+id2*id3– E id, replace id1 by E becomes E+ id2*id3

2. E E*E E*id3 E+ E*id3 E+ id2*id3 id1+ id2*id3 two possible handles

Page 78: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 78

Handles (cont.)

Page 79: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 79

Handles (cont.)

• The handle represents the leftmost complete subtree consisting of a node and all its children.

• Reducing to A in w can be thought of as “pruning the handle”, removing the children of A from the parse tree.

Page 80: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 80

Stack Implementation of Shift-Reduce Parsing

• We implement shift-reduce parsing by using a stack to hold grammar symbols and an input buffer to hold input string w.

• We use $ to mark the end of the stack and the input buffer

STACK INPUT

$ w$

Page 81: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 81

Stack Implementation of Shift-Reduce Parsing

• The parser shifts input symbols onto the stack until a handle is on top of stack.

• It reduces to the left side of production A.• Repeats this cycle until it has an error or the

stack contains S and the input buffer is empty.

STACK INPUT$S $

Page 82: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 82

Example

Page 83: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 83

Conflicts during Shift-Reduce Parsing

• There are context-free grammars for which shift-reduce parsing cannot be used.

• It’s possible to reach a configuration such that knowing the entire stack contents and the next input symbol, cannot decide whether to shift or to reduce (a shift/reduce conflict), or which of several reductions to make (a reduce/reduce conflict)

Page 84: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 84

Example of Shift/Reduce Conflict

stmt if expr then stmt

| if expr then stmt else stmt

| other

STACK INPUT

$ … if expr then stmt else …$

• Note that if we resolve the conflict in favor of shifting, the parser will behave naturally.

Page 85: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 85

Example of Reduce/Reduce Conflict

(1) stmt → id (parameter_list)(2) stmt → expr := expr(3) parameter_list→ parameter_list , parameter(4) parameter_list→ parameter(5) parameter → id(6) expr → id (expr_list)(7) expr → id(8) expr_list → expr_list , expr(9) expr_list → expr

STACK INPUT$ … id ( id , id )…$

Page 86: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 86

LR Parsers

• LR(k) parsing is an efficient, bottom-up parsing technique.

• The “L” stands for left-to-right scanning of the input, the “R” for constructing a rightmost derivation in reverse, and the “k” for the number of input symbols of lookahead that are used in making parsing decisions.

• When (k) is omitted, k is assumed to be 1.

Page 87: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 87

LR Parsers (cont.)

• LR parsing can be used to parse a large class of context-free grammars than LL parsing.

• The principal drawback of LR parsing is that is too much work to construct an LR parser by hand for a typical programming language grammar.

• We need a specialized tool an LR parser generator. Fortunately, many such generators are available.

Page 88: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 88

The LR Parsing Algorithm

• The schematic form of an LR parser:

Page 89: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 89

The LR Parsing Algorithm (cont.)

• An LR parser consists of an input, an output, a stack, a driver program, and a parsing table that has two parts( action and goto)

• The driver program is the same for all LR parsers.• The parsing table changes from one parser to anothe

r.• The parsing program reads tokens from an input buf

fer one at a time.

Page 90: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 90

The LR Parsing Algorithm (cont.)

• The parser uses a stack to store a string of the form s0X1s1X2s2…Xmsm, where sm is on top. Each Xi is a grammar symbol and each si is a state symbol. Each state symbol summarizes that information contained in the stack below it.

• The combination of the state symbol on top of the stack and the current input symbol are used to index the parsing table to determine the parsing shift-reduce decision.

Page 91: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 91

The LR Parsing Algorithm (cont.)

• The parsing table consists of two parts: a parsing action function and a goto function

• An action table entry can have one of four values:1. shift s, where s is a state2. reduce by a grammar production A 3. accept4. error• The function goto takes a state and a grammar sym

bol arguments and produces a state.

Page 92: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 92

The LR Parsing Algorithm (cont.)

• A configuration of an LR parser is a pair whose first component is the stack contents and whose second component is the unexpended input:

(s0X1s1X2s2…Xmsm , ai ai+1…an$)

• The next move of the parser is determined by reading ai, the current input symbol, and sm, the state on top of the stack, and then consulting the parsing action table entry action[sm, ai]

Page 93: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 93

The LR Parsing Algorithm (cont.)

• A configurations resulting after each of the four types of move are as follows:

1. If action[sm , ai ] = shift s, the parser executes a shift move, entering the configuration

(s0X1s1X2s2…Xmsm ai s , ai+1…an$)

Here the parser has shifted both the current input symbol ai, and the next state s, which is given in action[sm , ai ], onto the stack; ai+1 becomes the current input symbol.

Page 94: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 94

The LR Parsing Algorithm (cont.)

2. If action[sm , ai ] = reduce A, the parser executes a reduce move, entering the configuration

(s0X1s1X2s2…Xm-rsm-r A s, ai ai+1…an$)

where s =goto[sm-r, A] and r is the length of .

Here the parser first popped 2r symbols off the stack ( r state symbols and r grammar symbols), exposing state sm-r. The parser then pushed both A and s, the entry for goto[sm-r, A], onto the stack.

Page 95: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 95

The LR Parsing Algorithm (cont.)

3. If action[sm , ai ] = accept, parsing is completed.

4. If action[sm , ai ] = error, the parser has discovered an error and calls an error recovery routine.

Page 96: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 96

The LR Parsing Program

Page 97: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 97

Example

(1) E E + T

(2) E T

(3) T T * F

(4) T F

(5) F (E)

(6) F id

Page 98: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 98

Example (cont.)

Page 99: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 99

Example (cont.)

See p.220

Page 100: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 100

Constructing LR Parsing Tables

• There are three methods for constructing an LR parsing table for a grammar.

(1) Simple LR (SLR) is the easiest to implement, but least powerful.

(2) Canonical LR is the most powerful, and the most expensive.

(3) Lookahead LR(LALR) is intermediate in power and cost.

Page 101: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 101

Constructing SLR Parsing Tables

1) FOLLOW(A) for every nonterminal A in G.

2) The augmented grammar G’

3) The canonical collection of sets of LR(0) items C

4) The transition diagram for viable prefixes

5) The parsing table action and goto function

Page 102: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 102

Example

(1) E E + T

(2) E T

(3) T T * F

(4) T F

(5) F (E)

(6) F id

Page 103: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 103

Step 1: FOLLOW sets for Nonterminals

• FOLLOW(E) = { +, ), $}

• FOLLOW(T)=FOLLOW(F)={+,*,),$}

Page 104: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 104

Step 2: The Augment Grammar

• If G is a grammar with start symbol S, then G’, the argument grammar for G, is G with a new start symbol S’ and productions S’S.

• The argument grammar is as follows:

E’ E

E E + T | T

T T * F | F

F (E) | id

Page 105: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 105

Step 3: Sets of LR(0) Items

• An LR(0) item( item for short) of a grammar G is a production of G with a dot at some position of the right side.

• The production AXYZ yields the four items• A •XYZ A X • YZ

A XY•Z A XYZ •• The production A generates only one item

A •

Page 106: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 106

The Closure Operation

• If I is a set of LR(0) items for a grammar G, then closure(I) is the set of items constructed from I by the following two rules:

1. Initinally, every item in I is added to closure(I).

2. If A•B is closure(I) and B is a production in G, then add the item B • to closure(I), if it is not already there.

3. We apply this rule until no more new items can be added to closure(I).

Page 107: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 107

Example

• If I is a set of one item {[E’•E]}, then closure(I) contains the items

E’ • E E • E + T E • T T • T * F T • F F • (E) F • id

Page 108: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 108

The Goto Operation

• If I is a set of items and X is a grammar symbol, then goto(I,X) is the closure of the set of all items [AX•] such that [A •X] is in I.

Page 109: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 109

The Goto Operation

• If I is a set of items {[E’•E], [E E •+T]}, then goto(I,+) consists of

E E + •T

T •T * F

T • F

F • (E)

F • id

Page 110: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 110

The Sets-of-Items Construction

• The algorithm to construct C, the canonical collection of sets of LR(0) items (all possible items) for a augumented grammar G’, is shown below.

Page 111: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 111

Example

• closure ({[E’•E]})=I0: E’ •E E • E + T

E • TT • T * FT • FF • (E) F • id

• goto(I0, E)=I1: E’ E •

E E • + T

Page 112: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 112

Example (cont.)

• goto(I0, T)= I2: E T •

T T • * F

• goto(I0, F)= I3: T F •

• goto(I0, ( )= I4: F (• E)

E • E + T E • T T • T * F

F • (E) F • id

• goto(I0, id)= I5: F id •

Page 113: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 113

Example (cont.)

• goto(I1, +)= I6: E E + • T

T • T * F T •F F • (E)

F • id• goto(I2, *)= I7: T T * • F

F • (E) F • id

Page 114: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 114

Example (cont.)

• goto(I4, E)= I8 : F • (E)

E E • + T

• goto(I4, T)= I2

• goto(I4, F)= I3

• goto(I4, ( )= I4

• goto(I4, id)= I5

• goto(I6, T)= I9 : E E + T •

T T • * F

Page 115: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 115

Example (cont.)

• goto(I6, F)= I3

• goto(I6, ( )= I4

• goto(I6, id)= I5

• goto(I7, F )= I10 : T T * F •

• goto(I7, ( )= I4

• goto(I7, id )= I5

• goto(I8, ) )= I11 : F (E) •

• goto(I8, + )= I6

• goto(I9, * )= I7

Page 116: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 116

Step 4: The Transition Diagram

• The goto functions for the canonical collection of sets of items be shown as a transition diagram.

Page 117: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 117

Step 4: The Transition Diagram (cont.)

Page 118: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 118

Step 4: The Transition Diagram (cont.)

Page 119: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 119

Step 5: The Parsing Table

• State i is construction form Ii.

1) The parsing actions for state i are determinated as follows:

a) If [A•a] is in Ii and goto(Ii, a)=Ij, set action[i,a] to “shift j”. Here a must be a terminal.

b) If [A•] is in Ii, set action [i, a] to “reduce A” for all a in FOLLOW (A). Here A must not S’.

c) If [ S’S•] in Ii, set action[i,$] to “accept”.

Page 120: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 120

Step 5: The Parsing Table (cont.)

2) The goto transactions for state i are constructed for all nonterminals A using the rule:

• If goto(Ii, A)=Ij, the goto[i,A]= j.

3) All entries not defined by 1) and 2) are set “error”

4) The start state of the parser is the one constructed from the set of items containing [S’• S]

Page 121: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 121

Step 5: The Parsing Table (cont.)

Page 122: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 122

Example for a not SLR(1) and unambiguous grammar

• S L = R

• S R

• L * R

• L id• R L

Page 123: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 123

Example for a not SLR(1) and unambiguous grammar (cont.)

Page 124: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 124

Example for a not SLR(1) and unambiguous grammar (cont.)

• Consider the set of items I2.• I2: S L = R

R L – action[2, =]= “shift 6”– FOLLOW(R) contains =,

set action[2, =]= “reduce R L”• Shift/Reduce Conflict but not ambiguous• In fact, no right-sentential form that begins R=

….. when the viable prefix L only. (*L)• Spilt the state accord to FOLLOW set.

Page 125: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 125

Construction Canonical LR Parsing Tables

• An LR(1) item is of the form [A, a] where A is a production and a is a terminal or $.

• The “1” refers to the length of the second component, called the lookahead of the item.

• The lookahead has no effect in an item of the form [A, a], where is not , but an item of the form [A, a] calls for a reduction by A only if the input symbol is a.

Page 126: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 126

Construction Canonical LR Parsing Tables

• Thus, we are compelled to reduce by A only on those input symbols a for which [A , a] is an LR(1) item in the state on top of the stack.

• The set of such a’s will always be a subset of FOLLOW(A), but it could be a proper subset.

• The method for constructing the collection of sets of LR(1) items is essentially the same as the way we built the collection of sets of LR(0) items. We only to modify two procedures closure and goto.

Page 127: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 127

Construction Canonical LR Parsing Tables (closure function )

Page 128: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 128

Construction Canonical LR Parsing Tables (item & goto)

Page 129: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 129

Example (cont.)

• Consider the following augmented grammarS’ SS CCC cC | d

• closure ( {[S’ S, $]})= I0: S’ S, $ S CC, $ C cC, c/d C d, c/d

• goto(I0, S)= I1: S’ S , $

Page 130: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 130

Example (cont.)

• goto (I0, C)= I2: S C C, $ C cC, $

C d, $• goto (I0, c)= I3: C c C, c/d

C cC, c/d C d, c/d

• goto(I0, d)= I4: C d , c/d• goto (I2, C)= I5: S C C , $

Page 131: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 131

Example (cont.)

• goto (I2, c)= I6: C c C, $ C cC, $ C d, $

• goto (I2, d)= I7: C d , $• goto(I3, C)=I8: C cC, c/d• goto(I3, c)=I3

• goto(I3, d)=I4

• goto(I6, C)=I9: C cC, $• goto(I6, c)=I6

• goto(I6,d)=I7

Page 132: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 132

Example (Transition Diagram)

• Compare I6 & I3 with different FOLLOW

Page 133: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 133

Example (Transition Diagram)

Page 134: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 134

Example (Parsing Table)

Page 135: Yu-Chen Kuo1 Chapter 4 Syntax Analysis. Yu-Chen Kuo2 4.1 The Role of The Parser A parser obtains a string of tokens from the lexical analyzer and verifies

Yu-Chen Kuo 135

Canonical LR Parser vs. SLR Parser

• Every SLR(1) grammar is a LR(1) grammar.• Canonical LR Parser (LR(1) grammar) may have

more states than SLR parser (SLR(1)) form the same grammar.

• Exercise: check the following grammar is a LR(1) grammar or not.S L = R

S R

L * R

L id

R L