4b 4b lexical analysis finite automata. finite automata (fa) fa also called finite state machine...

12
4b 4b Lexical analysis Finite Automata

Upload: alban-dawson

Post on 02-Jan-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

4b4b

Lexical analysis Finite Automata

Page 2: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

Finite Automata (FA)• FA also called Finite State Machine (FSM)

– Abstract model of a computing entity.– Decides whether to accept or reject a string.– Every RE can be represented as a FA and vice versa

• Two types of FAs:– Non-deterministic (NFA): more than one action for same

input symbol – Deterministic (DFA): at most one action for a given input

symbol• Example: how do we write a program to recognize the Java

keyword “int”?

q0 q3tq2q1i n

Page 3: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

RE and Finite State Automaton (FA)• REs are a declarative way to describe the tokens

– Describes what is a token, but not how to recognize the token• FAs are used to describe how the token is recognized

– FAs are easy to simulate in a programs• A 1-1 correspondence between FAs & REs

– A scanner generator (e.g., lex) bridges the gap between regular expressions and FAs.

Scanner generator

Finiteautomaton

Regularexpression

scanner program

String stream

Tokens

Page 4: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

Transition Diagram• FA can be represented using transition diagram• A transition diagram has:

– States represented by circles;– An Alphabet (Σ) represented by labels on edges;– Transitions represented by labeled directed edges

between states. The label is the input symbol;– One Start State shown as having an arrow head;– One or more Final State(s) represented by double circles.

• Example transition diagram to recognize (a|b)*abb

q0 q3bq2q1 ba

a

b

Page 5: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

Simple examples of FA

a

a*

a+

(a|b)*

start

a

0

start

a

1a0

start

a

0

b

start

a, b

0

start1

a0

Page 6: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

Defining a DFA/NFA• Define input alphabet and initial state• Draw the transition diagram• Check

– Do all states have out-going arcs labeled with all the input symbols (DFA)

– Any missing final states?– Any duplicate states?– Can all strings in the language can be accepted?– Are any strings not in the language accepted?

• Optionally name the states

Page 7: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

Example of constructing a FA

• Construct a DFA accepting a language L over the alphabet {0, 1} where L is set of strings with any number of 0s followed by any number of 1s

• Regular expression: 0*1* = {0, 1}• Draw initial state of the transition diagram

Start

Page 8: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

Example of constructing a FA

• Draft the transition diagram

Start 1

0 1

0

Start 1

0 1

0

1

• Is 111 accepted?• Leftmost state has missed an arc for input 1

Page 9: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

Example of constructing a FA

• Is 00 accepted? • The leftmost two states are also final states

– First state from the left: is also accepted– Second state from the left:

strings with “0”s only are also accepted

Start 1

0 1

0

1

Page 10: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

Example of constructing a FA

• The leftmost two states are duplicate– their arcs point to the same states with same symbols

Start 10 1

• Check that they are correct– All strings in the language can be accepted

, the empty string, is accepted» strings with 0s/1s only are accepted

– No strings not in language are accepted

• Naming all the states

Start 1

0 1

q0 q1

Page 11: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

Transition table• A transition table is a good way to implement a FSA

– One row for each state, S– One column for each symbol, A– Cell (S,A) is set of states can reachable from S on input A

• NFAs have at least one cell with more than one state• DFAs have a singe state in every cell

STATES

INPUT

a b

>Q0 {q0, q1} q0

Q1 q2

Q2 q3

*Q3

q0 q3bq2q1 ba

a

b

(a|b)*abb

Page 12: 4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides

DFA to program• NFA is more concise but not as

easy to implement;• DFAs easily simulated via

algorithm• Every NFA can be converted to an

equivalent DFA– What does equivalent mean?

• There are general algorithms to ‘minimize’ a DFA

– Minimal in what sense?• There are systems to convert REs

to programs using a minimal DFA to recognize strings defined by the RE

• Learn more in 451 (automata theory) and 431 (Compiler design)

RE

NFA

DFA

Minimized DFA

Program

Thompson construction

Subset construction

DFA simulationScanner generator

Minimization