
Chapter 3

Chang Chi-Chung

2007.4.12

The Role of the Lexical Analyzer

[Figure: interaction between the lexical analyzer and the parser. The lexical analyzer reads the source program and returns a token to the parser each time the parser calls getNextToken. Both components use the symbol table, and both can report errors.]

The Reason for Using the Lexical Analyzer

Simplifies the design of the compiler
  A parser that had to deal with comments and white space as syntactic units would be more complex.
  If lexical analysis were not separated from parsing, LL(1) or LR(1) parsing with one token of lookahead would not be possible, because several characters or tokens would have to be examined to decide a match.

Compiler efficiency is improved
  Systematic techniques exist to implement lexical analyzers by hand or automatically from specifications.
  Stream buffering methods speed up scanning of the input.

Compiler portability is enhanced
  Input-device-specific peculiarities can be restricted to the lexical analyzer.

Lexical Analyzer

A lexical analyzer is sometimes divided into a cascade of two processes.

Scanning
  Consists of the simple processes that do not require tokenization of the input.
    Deletion of comments.
    Compaction of consecutive whitespace characters into one.

Lexical analysis
  Produces the sequence of tokens as output.

Tokens, Patterns, and Lexemes

Token
  A pair consisting of a token name and an optional attribute value.
  Example: num, id

Pattern
  A description of the form that the lexemes of a token may take.
  Example: "non-empty sequence of digits", "letter followed by letters and digits"

Lexeme
  A sequence of characters that matches the pattern for a token.
  Example: 123, abc

Examples: Tokens, Patterns, and Lexemes

Token        Pattern                                   Lexeme
if           characters i, f                           if
else         characters e, l, s, e                     else
comparison   < or > or <= or >= or == or !=            <=, !=
id           letter followed by letters and digits     pi, score, D2
number       any numeric constant                      3.14, 0, 6.23
literal      anything but ", surrounded by "s          "core dump"

An Example

E = M * C ** 2

The lexical analyzer produces the following sequence of pairs:

  <id, pointer to symbol-table entry for E>
  <assign_op>
  <id, pointer to symbol-table entry for M>
  <mult_op>
  <id, pointer to symbol-table entry for C>
  <exp_op>
  <number, integer value 2>
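The token sequence for E = M * C ** 2 can be reproduced with a small sketch. Everything here is an assumption made for illustration: the pattern table TOKEN_SPEC, the function name tokenize, and the use of a list index in place of a "pointer to symbol-table entry".

```python
import re

# Hypothetical token patterns. Order matters: ** must be tried before *.
TOKEN_SPEC = [
    ("number",    r"\d+"),
    ("id",        r"[A-Za-z_][A-Za-z_0-9]*"),
    ("exp_op",    r"\*\*"),
    ("mult_op",   r"\*"),
    ("assign_op", r"="),
    ("ws",        r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return (tokens, symbol_table) for the source string."""
    symbol_table = []   # identifier lexemes; the index stands in for a pointer
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "ws":
            continue                      # white space produces no token
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append((kind, symbol_table.index(lexeme)))
        elif kind == "number":
            tokens.append((kind, int(lexeme)))
        else:
            tokens.append((kind, None))   # operators carry no attribute
    return tokens, symbol_table

tokens, table = tokenize("E = M * C ** 2")
```

Here tokens holds seven pairs in the same order as the slide, and table lists the identifiers E, M, and C in order of first appearance.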

Input Buffering

[Figure: a buffer pair holding the input E = M * C ** 2. An eof sentinel ends each buffer half, and an eof inside a half marks the end of the input. The pointer lexemeBegin marks the start of the current lexeme, and the pointer forward scans ahead.]

Lookahead Code with Sentinels

switch (*forward++) {
    case eof:
        if (forward is at end of first buffer) {
            reload second buffer;
            forward = beginning of second buffer;
        }
        else if (forward is at end of second buffer) {
            reload first buffer;
            forward = beginning of first buffer;
        }
        else /* eof within a buffer marks the end of input */
            terminate lexical analysis;
        break;
    cases for the other characters;
}
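A runnable sketch of the same idea, under stated assumptions: the class name TwoBufferReader, the sentinel character "\0" (assumed never to occur in the source text), and the tiny half size N = 4 are all chosen for the demonstration and do not come from the slides.

```python
EOF = "\0"   # assumed sentinel; must not occur in the source text
N = 4        # deliberately tiny half size so that reloads happen in the demo

class TwoBufferReader:
    def __init__(self, text):
        self.text = text
        self.offset = 0                      # next unread position in the source
        self.buf = [EOF] * (2 * (N + 1))     # two halves: N characters + sentinel each
        self._load(0)
        self.forward = 0

    def _load(self, half):
        """Fill one half with up to N characters, then place the sentinel."""
        start = half * (N + 1)
        chunk = self.text[self.offset:self.offset + N]
        self.offset += len(chunk)
        for i, ch in enumerate(chunk):
            self.buf[start + i] = ch
        self.buf[start + len(chunk)] = EOF   # end of half, or real end of input

    def next_char(self):
        """Return the next character, or None at the end of the input."""
        while True:
            ch = self.buf[self.forward]
            self.forward += 1
            if ch != EOF:
                return ch                      # common case: one test per character
            if self.forward == N + 1:          # sentinel at end of first half
                self._load(1)                  # forward already points at second half
            elif self.forward == 2 * (N + 1):  # sentinel at end of second half
                self._load(0)
                self.forward = 0
            else:
                return None                    # eof within a half: end of input

reader = TwoBufferReader("E = M * C ** 2")
chars = []
while (c := reader.next_char()) is not None:
    chars.append(c)
result = "".join(chars)
```

The point of the sentinel is visible in next_char: the buffer-boundary tests run only after an eof has been seen, so ordinary characters cost a single comparison.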

Strings and Languages

Alphabet
  An alphabet Σ is a finite set of symbols (characters).

String
  A string is a finite sequence of symbols from Σ.
  |s| denotes the length of string s.
  ε denotes the empty string, thus |ε| = 0.

Language
  A language is a countable set of strings over some fixed alphabet.
  Abstract languages such as ∅ (the empty set) and {ε} are also languages.

String Operations

Concatenation
  The concatenation of two strings x and y is denoted by xy.

Identity
  The empty string is the identity under concatenation: εs = sε = s.

Exponentiation
  Define
    s^0 = ε
    s^i = s^(i-1) s   for i > 0
  By this definition
    s^1 = s
    s^2 = ss

Language Operations

Union
  L ∪ M = { s | s ∈ L or s ∈ M }

Concatenation
  LM = { xy | x ∈ L and y ∈ M }

Exponentiation
  L^0 = { ε }
  L^i = L^(i-1) L

Kleene closure
  L* = ∪ (i = 0, …, ∞) L^i

Positive closure
  L+ = ∪ (i = 1, …, ∞) L^i
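For finite languages these operations can be tried out directly with Python sets. This is only a sketch: the function names are invented here, the empty Python string plays the role of ε, and kleene_upto is a finite approximation, since L* itself is infinite for any L that contains a nonempty string.

```python
from itertools import product

def union(L, M):
    return L | M

def concat(L, M):
    # LM = { xy | x in L and y in M }
    return {x + y for x, y in product(L, M)}

def power(L, i):
    result = {""}                      # L^0 = {ε}
    for _ in range(i):
        result = concat(result, L)     # L^i = L^(i-1) L
    return result

def kleene_upto(L, n):
    """Finite approximation of L*: the union of L^0 through L^n."""
    out = set()
    for i in range(n + 1):
        out |= power(L, i)
    return out

L = {"a", "b"}
M = {"0", "1"}
```

With these two sample languages, concat(L, M) contains the four strings a0, a1, b0, and b1, and power(L, 2) contains aa, ab, ba, and bb.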

Regular Expressions

Regular expressions
  A convenient means of specifying certain simple sets of strings.
  We use regular expressions to define the structure of tokens.
  Tokens are built from symbols of a finite vocabulary.

Regular sets
  The sets of strings defined by regular expressions.

Regular Expressions

Basis symbols:
  ε is a regular expression denoting the language L(ε) = {ε}
  a ∈ Σ is a regular expression denoting L(a) = {a}

If r and s are regular expressions denoting languages L(r) and L(s) respectively, then
  r | s is a regular expression denoting L(r) ∪ L(s)
  rs is a regular expression denoting L(r)L(s)
  r* is a regular expression denoting L(r)*
  (r) is a regular expression denoting L(r)

A language defined by a regular expression is called a regular set.

Operator Precedence

Operator        Precedence   Associativity
*               highest      left
concatenation   second       left
|               lowest       left

Algebraic Laws for Regular Expressions

Law                             Description
r | s = s | r                   | is commutative
r | ( s | t ) = ( r | s ) | t   | is associative
r(st) = (rs)t                   concatenation is associative
r(s|t) = rs | rt                concatenation distributes over |
(s|t)r = sr | tr                concatenation distributes over |
εr = rε = r                     ε is the identity for concatenation
r* = ( r | ε )*                 ε is guaranteed in a closure
r** = r*                        * is idempotent

Regular Definitions

If Σ is an alphabet of basic symbols, then a regular definition is a sequence of definitions of the form:
  d1 → r1
  d2 → r2
  …
  dn → rn

Each di is a new symbol, not in Σ and not the same as any other of the d's.
Each ri is a regular expression over the alphabet Σ ∪ {d1, d2, …, di-1}.
Any dj in ri can be textually substituted in ri to obtain an equivalent set of definitions.

Example: Regular Definitions

Regular definitions
  letter_ → A | B | … | Z | a | b | … | z | _
  digit   → 0 | 1 | … | 9
  id      → letter_ ( letter_ | digit )*

Regular definitions are not recursive:
  digits → digit digits | digit     (wrong)
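Textual substitution of earlier definitions into later ones can be sketched as follows. The {name} placeholder syntax and the function expand are conventions invented for this example; they are not part of Python's re module, which only sees the fully expanded pattern.

```python
import re

# Definitions in order. Each body may mention only earlier names, in braces.
definitions = [
    ("letter_", r"[A-Za-z_]"),
    ("digit",   r"[0-9]"),
    ("id",      r"{letter_}({letter_}|{digit})*"),
]

def expand(defs):
    """Substitute earlier definitions into later ones, yielding plain regexes."""
    expanded = {}
    for name, body in defs:
        for earlier, regex in expanded.items():
            body = body.replace("{" + earlier + "}", "(?:" + regex + ")")
        expanded[name] = body
    return expanded

patterns = expand(definitions)
is_id = re.compile(patterns["id"]).fullmatch
```

Because each body can refer only to names that were expanded before it, a recursive definition such as the wrong digits rule above has nothing to substitute and cannot be expressed this way.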

Extensions of Regular Definitions

One or more instances
  r+ = rr* = r*r
  r* = r+ | ε

Zero or one instance
  r? = r | ε

Character classes
  [a-z] = a | b | c | … | z
  [A-Za-z] = A | B | … | Z | a | … | z

Example
  digit → [0-9]
  num   → digit+ (. digit+)? ( E (+ | -)? digit+ )?
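The num definition translates almost symbol for symbol into Python's regular expression syntax. The sketch below is one possible spelling; the names NUM and is_num are made up for the example, and the dot is escaped because an unescaped dot matches any character in Python.

```python
import re

# Python spelling of: digit+ (. digit+)? ( E (+ | -)? digit+ )?
NUM = re.compile(r"[0-9]+(\.[0-9]+)?(E[+-]?[0-9]+)?")

def is_num(text):
    """True if the whole text is a lexeme of num."""
    return NUM.fullmatch(text) is not None
```

Under this pattern 3.14, 0, and 1E-5 are accepted, while .5 and 1. are rejected because digit+ requires at least one digit on each side of the dot.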

Regular Definitions and Grammars

Context-free grammar
  stmt → if expr then stmt
       | if expr then stmt else stmt
  expr → term relop term
       | term
  term → id
       | num

Regular definitions
  digit  → [0-9]
  letter → [A-Za-z]
  if     → if
  then   → then
  else   → else
  relop  → < | <= | <> | > | >= | =
  id     → letter ( letter | digit )*
  num    → digit+ (. digit+)? ( E (+ | -)? digit+ )?
  ws     → ( blank | tab | newline )+

Lexemes      Token name   Attribute value
Any ws       -            -
if           if           -
then         then         -
else         else         -
Any id       id           Pointer to table entry
Any number   number       Pointer to table entry
<            relop        LT
<=           relop        LE
=            relop        EQ
<>           relop        NE
>            relop        GT
>=           relop        GE

Transition Diagrams

relop → < | <= | <> | > | >= | =

[Figure: transition diagram for relop]
  state 0 (start):  on < go to 1;  on = go to 5;  on > go to 6
  state 1:          on = go to 2;  on > go to 3;  on other go to 4
  state 6:          on = go to 7;  on other go to 8
  state 2:   return(relop, LE)
  state 3:   return(relop, NE)
  state 4 *: return(relop, LT)
  state 5:   return(relop, EQ)
  state 7:   return(relop, GE)
  state 8 *: return(relop, GT)
  A * marks an accepting state in which the input must be retracted by one character.

Transition Diagrams

id → letter ( letter | digit )*

[Figure: transition diagram for id]
  state 9 (start):  on letter go to 10
  state 10:         on letter or digit stay in 10;  on other go to 11
  state 11 *:       return( getToken(), installID() )

An Example: Implementation of RELOP

TOKEN getRelop()
{
    TOKEN retToken = new(RELOP);
    while (1) {   /* repeat until a return or a failure occurs */
        switch (state) {
            case 0: c = nextChar();
                    if (c == '<') state = 1;
                    else if (c == '=') state = 5;
                    else if (c == '>') state = 6;
                    else fail();   /* lexeme is not a relop */
                    break;
            case 1: ...
            ...
            case 8: retract();
                    retToken.attribute = GT;
                    return(retToken);
        }
    }
}
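The elided cases can be filled in as a complete state machine. The following is a sketch rather than the author's code: the function name get_relop, the (token, attribute) tuple, and returning the next input position in place of calling retract() are all choices made for this example. A starred state simply does not consume the lookahead character.

```python
def get_relop(text, pos=0):
    """Return ((token, attribute), next_pos), or None if no relop starts at pos."""
    state = 0
    i = pos
    while True:
        c = text[i] if i < len(text) else ""   # "" acts as "other" at end of input
        if state == 0:
            if c == "<":
                state = 1
            elif c == "=":
                return ("relop", "EQ"), i + 1   # state 5
            elif c == ">":
                state = 6
            else:
                return None                     # fail(): lexeme is not a relop
            i += 1
        elif state == 1:
            if c == "=":
                return ("relop", "LE"), i + 1   # state 2
            if c == ">":
                return ("relop", "NE"), i + 1   # state 3
            return ("relop", "LT"), i           # state 4: lookahead not consumed
        elif state == 6:
            if c == "=":
                return ("relop", "GE"), i + 1   # state 7
            return ("relop", "GT"), i           # state 8: lookahead not consumed
```

For the input "<= 1" the call returns the pair (relop, LE) with next position 2, and for "<a" it returns (relop, LT) with next position 1, leaving the a for the next token.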

Finite Automata

Finite automata are recognizers.
  An FA simply says "yes" or "no" about each possible input string.
  An FA can be used to recognize the tokens specified by a regular expression.
  FAs are used in the design of a lexical analyzer generator.

Two kinds of finite automata
  Nondeterministic finite automata (NFA)
  Deterministic finite automata (DFA)

Both DFAs and NFAs are capable of recognizing the same languages.

NFA Definitions

NFA = { S, Σ, δ, s0, F }

A finite set of states S
A set of input symbols Σ
  The input alphabet; ε is not in Σ
A transition function δ
  δ : S × (Σ ∪ {ε}) → 2^S
  Each state and each symbol (or ε) maps to a set of next states.
A special start state s0
A set of final (accepting) states F, F ⊆ S

Transition Graph for FA

[Figure: legend for transition graphs, showing the symbols used for a state, a transition, the start state, and a final state]

Example

[Figure: transition graph with states 0 to 3, start state 0, and final state 3. The edges, as implied by the labels and the expression below: 0 to 1 on a, 1 to 2 on b, 2 to 3 on c, 3 to 3 on c, and 3 to 1 on a.]

This machine accepts abccabc, but it rejects abcab.

This machine accepts (abc+)+.

Transition Table

[Figure: NFA with start state 0 and states 0 to 3; state 0 loops on a and b, 0 goes to 1 on a, 1 goes to 2 on b, and 2 goes to 3 on b]

The mapping δ of an NFA can be represented in a transition table.

STATE   a        b      ε
0       {0, 1}   {0}    -
1       -        {2}    -
2       -        {3}    -
3       -        -      -

δ(0, a) = {0, 1}
δ(0, b) = {0}
δ(1, b) = {2}
δ(2, b) = {3}
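One way to hold such a table in a program is a dictionary keyed by (state, symbol). This is a sketch; the names delta and move are chosen for the example, and a missing key stands for a "-" entry in the table.

```python
# NFA transition table as a dictionary: (state, symbol) -> set of next states.
delta = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}

def move(states, symbol):
    """Union of delta(s, symbol) over all s in states; absent entries are empty."""
    result = set()
    for s in states:
        result |= delta.get((s, symbol), set())
    return result
```

Storing only the non-empty entries keeps the table small, which matters for NFAs because most (state, symbol) pairs have no transition.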

DFA

A DFA is a special case of an NFA:
  There are no moves on input ε.
  For each state s and input symbol a, there is exactly one edge out of s labeled a.

Both DFAs and NFAs are capable of recognizing the same languages.

Simulating a DFA

Input
  An input string x terminated by an end-of-file character eof.
  A DFA D with start state s0, accepting states F, and transition function move.

Output
  Answer "yes" if D accepts x; "no" otherwise.

s = s0;
c = nextChar();
while ( c != eof ) {
    s = move(s, c);
    c = nextChar();
}
if ( s is in F )
    return "yes";
else
    return "no";
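The loop carries over to Python almost unchanged. The sketch below assumes a five-state DFA for (a | b)*abb with states named A to E and accepting state E; the table DTRAN and the function dfa_accepts are names chosen for the example, and the end of the Python string takes the place of eof. Input is assumed to contain only a and b.

```python
# Assumed DFA for (a|b)*abb: DTRAN[state][symbol] -> next state.
DTRAN = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
START = "A"
ACCEPTING = {"E"}

def dfa_accepts(x):
    """Run the DFA over x; exactly one table lookup per input character."""
    s = START
    for c in x:
        s = DTRAN[s][c]
    return s in ACCEPTING
```

The running time is proportional to the length of x and does not depend on the size of the automaton.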

NFA vs DFA

Both automata recognize (a | b)*abb.

[Figure: NFA with S = {0, 1, 2, 3}, Σ = {a, b}, s0 = 0, F = {3}; state 0 loops on a and b, 0 goes to 1 on a, 1 goes to 2 on b, and 2 goes to 3 on b]

[Figure: an equivalent DFA with states 0 to 3 and edges labeled a and b]

The Regular Language

The regular language defined by an NFA is the set of input strings it accepts.
  Example: (a | b)*abb for the example NFA

An NFA accepts an input string x if and only if there is some path in the transition graph, from the start state to some accepting state, whose edge labels spell out the symbols of x in sequence.

A state transition from one state to another on the path is called a move.

Theorem

The following are equivalent:
  Regular expression
  NFA
  DFA
  Regular language
  Regular grammar

Conversion Concept

Regular Expression
  → Nondeterministic Finite Automata
  → Deterministic Finite Automata
  → (minimization) Deterministic Finite Automata

Construction of an NFA from a Regular Expression

Use Thompson's construction.

[Figure: the NFA fragments that Thompson's construction builds for the basis cases ε and a, and for the composite cases s | t (from N(s) and N(t)), st (N(s) followed by N(t)), and s* (from N(s))]
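Thompson's construction can be sketched as four small functions, one per case. All names here (Frag, symbol, union, concat, star, accepts) are invented for the example, edges are stored as (source, label, target) triples, and None is used as the label of an ε-edge. State numbers are handed out by a counter, so they will not match the numbering in any figure.

```python
import itertools

EPS = None                       # label for ε-edges in this sketch
_new_state = itertools.count()

class Frag:
    """NFA fragment with a single start state and a single accepting state."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges

def symbol(a):
    s, f = next(_new_state), next(_new_state)
    return Frag(s, f, [(s, a, f)])

def union(n, m):                 # s | t
    s, f = next(_new_state), next(_new_state)
    extra = [(s, EPS, n.start), (s, EPS, m.start),
             (n.accept, EPS, f), (m.accept, EPS, f)]
    return Frag(s, f, n.edges + m.edges + extra)

def concat(n, m):                # st: merge accept of N(s) with start of N(t)
    def rename(q):
        return n.accept if q == m.start else q
    merged = [(rename(p), a, rename(q)) for p, a, q in m.edges]
    return Frag(n.start, m.accept, n.edges + merged)

def star(n):                     # s*
    s, f = next(_new_state), next(_new_state)
    extra = [(s, EPS, n.start), (n.accept, EPS, f),
             (n.accept, EPS, n.start), (s, EPS, f)]
    return Frag(s, f, n.edges + extra)

def eps_closure(nfa, states):
    stack, seen = list(states), set(states)
    while stack:
        p = stack.pop()
        for src, label, dst in nfa.edges:
            if src == p and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, text):
    current = eps_closure(nfa, {nfa.start})
    for c in text:
        step = {dst for src, label, dst in nfa.edges
                if src in current and label == c}
        current = eps_closure(nfa, step)
    return nfa.accept in current

# (a|b)*abb
nfa = concat(concat(concat(star(union(symbol("a"), symbol("b"))),
                           symbol("a")), symbol("b")), symbol("b"))
states = {q for p, _, r in nfa.edges for q in (p, r)}
```

Each operator adds at most two new states, which is why the NFA for (a | b)*abb ends up with eleven states.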

Example

( a | b )* a b b

[Figure: parse tree that decomposes the expression into subexpressions r1 through r11, starting from the leaves a and b and combining them with |, parentheses, *, and concatenation up to the root r11]

Example

( a | b )* a b b

[Figure: the NFA produced by Thompson's construction for this expression, with states 0 to 10, start state 0, and accepting state 10]

Conversion of an NFA to a DFA

The subset construction algorithm converts an NFA into a DFA using the following operations.

Operation       Description
ε-closure(s)    Set of NFA states reachable from NFA state s on ε-transitions alone.
ε-closure(T)    Set of NFA states reachable from some NFA state s in set T on ε-transitions alone; equal to ∪ (s in T) ε-closure(s).
move(T, a)      Set of NFA states to which there is a transition on input symbol a from some state s in T.

Subset Construction (1)

Initially, ε-closure(s0) is the only state in Dstates and it is unmarked;
while ( there is an unmarked state T in Dstates ) {
    mark T;
    for ( each input symbol a ) {
        U = ε-closure( move(T, a) );
        if ( U is not in Dstates )
            add U as an unmarked state to Dstates;
        Dtran[T, a] = U;
    }
}
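The algorithm can be run on a concrete NFA. The sketch below assumes the usual Thompson NFA for (a | b)*abb with states 0 to 10; the edge list is an assumption of this example, written out by hand. Removing a state from the unmarked list plays the role of marking it.

```python
EPS = "ε"
# Assumed Thompson NFA for (a|b)*abb: state -> {label: set of next states}.
NFA = {
    0: {EPS: {1, 7}}, 1: {EPS: {2, 4}}, 2: {"a": {3}}, 3: {EPS: {6}},
    4: {"b": {5}}, 5: {EPS: {6}}, 6: {EPS: {1, 7}}, 7: {"a": {8}},
    8: {"b": {9}}, 9: {"b": {10}}, 10: {},
}

def eps_closure(T):
    stack, closure = list(T), set(T)
    while stack:
        t = stack.pop()
        for u in NFA[t].get(EPS, ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return frozenset(closure)

def move(T, a):
    return {u for t in T for u in NFA[t].get(a, ())}

def subset_construction(start, alphabet):
    start_state = eps_closure({start})
    dstates, unmarked, dtran = [start_state], [start_state], {}
    while unmarked:
        T = unmarked.pop(0)                  # mark T
        for a in alphabet:
            U = eps_closure(move(T, a))
            if U not in dstates:
                dstates.append(U)            # add U as an unmarked state
                unmarked.append(U)
            dtran[T, a] = U
    return dstates, dtran

dstates, dtran = subset_construction(0, "ab")
```

For this NFA the construction stops after five DFA states, each of which is a frozenset of NFA states; the first is {0, 1, 2, 4, 7}.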

Computing ε-closure(T)
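A common way to compute ε-closure(T) is a stack-based graph search: start with T itself, then keep following ε-edges from states not yet seen. The sketch below is one such formulation, not necessarily the one shown on the original slide. The table EPS_EDGES is an assumption: it lists only the ε-edges of a Thompson NFA for (a | b)*abb with states 0 to 10.

```python
# Assumed ε-edges only: state -> set of states reachable by one ε-move.
EPS_EDGES = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}

def eps_closure(T):
    """All states reachable from some state in T using ε-transitions alone."""
    stack = list(T)          # states whose ε-edges still need to be followed
    closure = set(T)         # every state is in its own ε-closure
    while stack:
        t = stack.pop()
        for u in EPS_EDGES.get(t, ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return closure
```

Each state is pushed at most once, so the work is bounded by the number of states plus the number of ε-edges.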

Example

( a | b )* a b b

[Figure: the NFA with states 0 to 10 for this expression, and the DFA with states A to E obtained from it by the subset construction; A is the start state]

NFA states                DFA state   a   b
{0, 1, 2, 4, 7}           A           B   C
{1, 2, 3, 4, 6, 7, 8}     B           B   D
{1, 2, 4, 5, 6, 7}        C           B   C
{1, 2, 4, 5, 6, 7, 9}     D           B   E
{1, 2, 4, 5, 6, 7, 10}    E           B   C

Example

Patterns: a, abb, a*b+

[Figure: an NFA with states 0 to 8 that combines the three patterns under the common start state 0, together with the DFA obtained from it by the subset construction]

Dstates
  A = {0, 1, 3, 7}
  B = {2, 4, 7}
  C = {8}
  D = {7}
  E = {5, 8}
  F = {6, 8}

Simulation of an NFA

Input
  An input string x terminated by an end-of-file character eof.
  An NFA N with start state s0, accepting states F, and transition function move.

Output
  Answer "yes" if N accepts x; "no" otherwise.

S = ε-closure(s0);
c = nextChar();
while ( c != eof ) {
    S = ε-closure( move(S, c) );
    c = nextChar();
}
if ( S ∩ F != ∅ )
    return "yes";
else
    return "no";

Minimizing the DFA

Step 1
  Start with an initial partition Π with two groups: F and S - F (the accepting and nonaccepting states).

Step 2
  Apply the split procedure to Π to obtain Πnew.

Step 3
  if ( Πnew = Π )
      Πfinal = Π and continue with step 4
  else
      Π = Πnew and go to step 2

Step 4
  Construct the minimum-state DFA from the groups of Πfinal.
  Delete the dead state.

Split Procedure

Initially, let Πnew = Π;
for ( each group G of Π ) {
    partition G into subgroups such that two states s and t are in the same
    subgroup if and only if, for all input symbols a, states s and t have
    transitions on a to states in the same group of Π;
    /* at worst, a state will be in a subgroup by itself */
    replace G in Πnew by the set of all subgroups formed;
}

Example

Initially, two sets: {1, 2, 3, 5, 6} and {4, 7}.
{1, 2, 3, 5, 6} splits into {1, 2, 5} and {3, 6} on c.
{1, 2, 5} splits into {1} and {2, 5} on b.

Minimizing the DFA

Major operation: partition the states into equivalence classes according to
  final / non-final states
  transition functions

( A B C D E )
( A B C D ) ( E )
( A B C ) ( D ) ( E )
( A C ) ( B ) ( D ) ( E )

Original DFA
State   a   b
A       B   C
B       B   D
C       B   C
D       B   E
E       B   C

Minimized DFA
State   a   b
A C     B   A C
B       B   D
D       B   E
E       B   A C
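The refinement can be checked mechanically. The sketch below applies the split procedure to the five-state DFA in the table until the partition stops changing; the names split and minimize, and the use of frozensets for groups, are choices made for this example.

```python
# The DFA from the table: DTRAN[state][symbol] -> next state; E is accepting.
DTRAN = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
ACCEPTING = {"E"}
ALPHABET = "ab"

def split(partition):
    """One pass of the split procedure over every group of the partition."""
    group_of = {s: i for i, g in enumerate(partition) for s in g}
    new_partition = []
    for g in partition:
        subgroups = {}
        for s in sorted(g):
            # States stay together only if each symbol leads to the same group.
            key = tuple(group_of[DTRAN[s][a]] for a in ALPHABET)
            subgroups.setdefault(key, set()).add(s)
        new_partition.extend(frozenset(sub) for sub in subgroups.values())
    return new_partition

def minimize():
    partition = [frozenset(set(DTRAN) - ACCEPTING), frozenset(ACCEPTING)]
    while True:
        new_partition = split(partition)
        if set(new_partition) == set(partition):
            return new_partition
        partition = new_partition

groups = minimize()
```

The first pass separates D from A, B, and C because only D reaches E on b; the second pass separates B because only B reaches D on b. After that A and C can no longer be told apart.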

Important States of an NFA

The "important states" of an NFA are those with a non-ε out-transition; that is, if move({s}, a) ≠ ∅ for some input symbol a, then s is an important state.

The subset construction algorithm uses only the important states when it determines ε-closure( move(T, a) ).

Augment the regular expression r with a special end symbol # to make accepting states important: the new expression is r#.

Converting a RE Directly to a DFA

Construct a syntax tree for (r)#.
Traverse the tree to construct the functions nullable, firstpos, lastpos, and followpos.
Construct the DFA D by algorithm 3.62.

Functions Computed from the Syntax Tree

nullable(n)
  True if the subtree at node n generates a language that includes the empty string.

firstpos(n)
  The set of positions that can match the first symbol of a string generated by the subtree at node n.

lastpos(n)
  The set of positions that can match the last symbol of a string generated by the subtree at node n.

followpos(i)
  The set of positions that can follow position i in the tree.

Rules for Computing the Functions

A leaf labeled ε
  nullable(n) = true
  firstpos(n) = ∅
  lastpos(n)  = ∅

A leaf with position i
  nullable(n) = false
  firstpos(n) = {i}
  lastpos(n)  = {i}

n = c1 | c2
  nullable(n) = nullable(c1) or nullable(c2)
  firstpos(n) = firstpos(c1) ∪ firstpos(c2)
  lastpos(n)  = lastpos(c1) ∪ lastpos(c2)

n = c1 c2
  nullable(n) = nullable(c1) and nullable(c2)
  firstpos(n) = if ( nullable(c1) ) firstpos(c1) ∪ firstpos(c2) else firstpos(c1)
  lastpos(n)  = if ( nullable(c2) ) lastpos(c1) ∪ lastpos(c2) else lastpos(c2)

n = c1*
  nullable(n) = true
  firstpos(n) = firstpos(c1)
  lastpos(n)  = lastpos(c1)

Computing followpos

for ( each node n in the tree ) {
    if ( n is a cat-node with left child c1 and right child c2 )
        for ( each i in lastpos(c1) )
            followpos(i) = followpos(i) ∪ firstpos(c2);
    else if ( n is a star-node )
        for ( each i in lastpos(n) )
            followpos(i) = followpos(i) ∪ firstpos(n);
}
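The four functions can be computed in a single bottom-up pass over the syntax tree. The sketch below does this for (a | b)*abb# with positions 1 to 6. The tuple encoding of tree nodes and the function analyze are assumptions of the example, and ε-leaves are left out because this expression has none.

```python
from collections import defaultdict

# Tree nodes as tuples; a leaf carries its symbol and its position.
def leaf(sym, pos): return ("leaf", sym, pos)
def cat(left, right): return ("cat", left, right)
def alt(left, right): return ("or", left, right)
def star(child): return ("star", child)

# Syntax tree for (a|b)*abb#
tree = cat(cat(cat(cat(star(alt(leaf("a", 1), leaf("b", 2))),
                       leaf("a", 3)), leaf("b", 4)), leaf("b", 5)),
           leaf("#", 6))

followpos = defaultdict(set)

def analyze(n):
    """Return (nullable, firstpos, lastpos) of node n and fill in followpos."""
    kind = n[0]
    if kind == "leaf":
        return False, {n[2]}, {n[2]}
    if kind == "or":
        n1, f1, l1 = analyze(n[1])
        n2, f2, l2 = analyze(n[2])
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":
        n1, f1, l1 = analyze(n[1])
        n2, f2, l2 = analyze(n[2])
        for i in l1:                      # cat-node rule
            followpos[i] |= f2
        first = f1 | f2 if n1 else f1
        last = l1 | l2 if n2 else l2
        return n1 and n2, first, last
    if kind == "star":
        _, f1, l1 = analyze(n[1])
        for i in l1:                      # star-node rule: lastpos(n) = lastpos(c1)
            followpos[i] |= f1
        return True, f1, l1
    raise ValueError(kind)

nullable, firstpos, lastpos = analyze(tree)
```

For the root this yields firstpos {1, 2, 3} and lastpos {6}; position 6, the end marker #, gets no followpos entry.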

Converting a RE Directly to a DFA

Initialize Dstates to contain only the unmarked state firstpos(n0), where n0 is the root of the syntax tree T for (r)#;
while ( there is an unmarked state S in Dstates ) {
    mark S;
    for ( each input symbol a ) {
        let U be the union of followpos(p) for all p in S that correspond to a;
        if ( U is not in Dstates )
            add U as an unmarked state to Dstates;
        Dtran[S, a] = U;
    }
}
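Given followpos, the loop above is short to implement. The sketch below hard-codes the position data for (a | b)*abb# as an assumption (positions 1 to 6, with # at position 6); the names SYMBOL_AT, FOLLOWPOS, and build_dfa are invented for the example. A DFA state is accepting when it contains the position of #.

```python
# Assumed position data for (a|b)*abb#.
SYMBOL_AT = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}
FOLLOWPOS = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
FIRSTPOS_ROOT = frozenset({1, 2, 3})
END_POSITION = 6

def build_dfa(alphabet="ab"):
    dstates, unmarked, dtran = [FIRSTPOS_ROOT], [FIRSTPOS_ROOT], {}
    while unmarked:
        S = unmarked.pop(0)                          # mark S
        for a in alphabet:
            # Union of followpos(p) for all p in S that correspond to a.
            U = frozenset(q for p in S if SYMBOL_AT[p] == a
                          for q in FOLLOWPOS[p])
            if U not in dstates:
                dstates.append(U)
                unmarked.append(U)
            dtran[S, a] = U
    accepting = [S for S in dstates if END_POSITION in S]
    return dstates, dtran, accepting

dstates, dtran, accepting = build_dfa()
```

This route produces four DFA states for the expression, one fewer than the subset construction applied to the Thompson NFA.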

Example

( a | b )* a b b #

[Figure: syntax tree for the augmented expression. The leaves a and b under the | and * nodes have positions 1 and 2; the leaves a, b, b, and # that follow have positions 3, 4, 5, and 6 and are attached by cat-nodes.]

For the node n = ( a | b )* a:
  nullable(n) = false
  firstpos(n) = { 1, 2, 3 }
  lastpos(n) = { 3 }
  followpos(1) = { 1, 2, 3 }

Example

( a | b )* a b b #

[Figure: the syntax tree with firstpos and lastpos written beside each node; the star-node is the only nullable node]

Node                          firstpos     lastpos
cat-node ending at # (root)   {1, 2, 3}    {6}
cat-node ending at b (5)      {1, 2, 3}    {5}
cat-node ending at b (4)      {1, 2, 3}    {4}
cat-node ending at a (3)      {1, 2, 3}    {3}
*                             {1, 2}       {1, 2}
|                             {1, 2}       {1, 2}
a (position 1)                {1}          {1}
b (position 2)                {2}          {2}
a (position 3)                {3}          {3}
b (position 4)                {4}          {4}
b (position 5)                {5}          {5}
# (position 6)                {6}          {6}

Example

( a | b )* a b b #

Node   followpos
1      {1, 2, 3}
2      {1, 2, 3}
3      {4}
4      {5}
5      {6}
6      -

[Figure: directed graph of the followpos relation over positions 1 to 6]

[Figure: the resulting DFA; its start state is {1, 2, 3} and its accepting state is {1, 2, 3, 6}]

State           a               b
{1, 2, 3}       {1, 2, 3, 4}    {1, 2, 3}
{1, 2, 3, 4}    {1, 2, 3, 4}    {1, 2, 3, 5}
{1, 2, 3, 5}    {1, 2, 3, 4}    {1, 2, 3, 6}
{1, 2, 3, 6}    {1, 2, 3, 4}    {1, 2, 3}

Time and Space Complexity

Automaton   Space (worst case)   Time (worst case)
NFA         O(|r|)               O(|r| × |x|)
DFA         O(2^|r|)             O(|x|)

Here |r| is the length of the regular expression and |x| is the length of the input string.