CS490 Presentation: Automata & Language Theory. Thong Lam, Ran Shi


Upload: gordon-tyler

Post on 12-Jan-2016


TRANSCRIPT

Page 1: CS490 Presentation: Automata & Language Theory Thong Lam Ran Shi

CS490 Presentation: Automata & Language Theory

Thong Lam

Ran Shi

Page 2:

Components of a Formal Language

Symbol
Alphabet
String
Grammar

Page 3:

Components of a Formal Language

Symbol

A character, glyph, or mark. A symbol is an abstract entity that has no meaning by itself and is often called uninterpreted. Letters from various alphabets, digits, and special characters are the most commonly used symbols.

Page 4:

Components of a Formal Language

Alphabet

A finite set of symbols. An alphabet is often denoted by sigma (Σ), yet can be given any name.

B = {0, 1} says B is an alphabet of two symbols, 0 and 1.

C = {a, b, c} says C is an alphabet of three symbols, a, b and c.

Sometimes space and comma are in an alphabet, while other times they are meta symbols used for descriptions.

Page 5:

Components of a Formal Language

String or Word

A finite sequence of symbols from an alphabet.

01110 and 111 are strings from the alphabet B above. aaabccc and b are strings from the alphabet C above.

A null string is a string with no symbols, usually denoted by epsilon (ε). The null string has length zero.

Vertical bars around a string indicate the length of the string expressed as a natural number. For example, |00100| = 5, |aab| = 3, |ε| = 0.
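The definitions of alphabet, string, and length above can be tried out directly. The following is a minimal sketch, assuming each symbol is a single character; the helper name `is_string_over` is made up for illustration and is not from the slides.

```python
# Alphabets B and C from the slides, modeled as sets of one-character symbols.
B = {"0", "1"}
C = {"a", "b", "c"}

EPSILON = ""  # the null string has no symbols


def is_string_over(word, alphabet):
    """Return True if every symbol of word belongs to alphabet."""
    return all(symbol in alphabet for symbol in word)


print(is_string_over("01110", B))    # a string over B
print(is_string_over("aaabccc", C))  # a string over C
print(is_string_over("012", B))      # '2' is not a symbol of B
print(len("00100"), len("aab"), len(EPSILON))  # |00100|, |aab|, |ε|
```

Note that the null string counts as a string over every alphabet, since it contains no symbol that could fall outside the alphabet.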

Page 6:

Components of a Formal Language

Grammar

A way to define a language by giving a finite set of rules that describe how valid strings may be constructed.

A grammar G consists of an alphabet of terminals (Σ), a set of variables (V), a set of production rules (P), and a start symbol (S).

G = (V, Σ, P, S).

Page 7:

Formal Language

A set of strings from an alphabet. The set may be empty, finite or infinite.

L(M) is the notation for a language defined by a machine M. The machine M accepts a certain set of strings, thus a language.

L(G) is the notation for a language defined by a grammar G. The grammar G generates a certain set of strings, thus a language.

M(L) is the notation for a machine that accepts a language. The language L is a certain set of strings.

G(L) is the notation for a grammar that generates a language. The language L is a certain set of strings.

Page 8:

Formal Languages (cont.)

The union of two languages is a language. L = L1 ∪ L2

The intersection of two languages is a language. L = L1 ∩ L2

The complement of a language is a language. L = Σ* - L1

The difference of two languages is a language. L = L1 - L2
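Because a language is just a set of strings, these four operations map onto ordinary set operations. The sketch below uses two small made-up languages over {0, 1}. Since Σ* is infinite, the complement is only computed against the strings of length at most 2, which is an assumption made to keep the example finite.

```python
from itertools import product


def all_strings(alphabet, max_len):
    """Every string over alphabet of length 0..max_len (a finite slice of sigma*)."""
    return {
        "".join(symbols)
        for n in range(max_len + 1)
        for symbols in product(sorted(alphabet), repeat=n)
    }


SIGMA = {"0", "1"}
L1 = {"", "0", "00"}   # hypothetical example language
L2 = {"0", "1", "11"}  # hypothetical example language

union = L1 | L2
intersection = L1 & L2
difference = L1 - L2
# Complement of L1 restricted to strings of length <= 2.
complement_up_to_2 = all_strings(SIGMA, 2) - L1

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
print(sorted(complement_up_to_2))
```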

Page 9:

Regular Language

A set of strings from an alphabet. The set may be empty, finite or infinite.

It must be possible to recognize the language with a finite automaton.

Page 10:

Regular Language

Uses:
• Symbols/Operations

Concatenation: represented by appending one symbol to another. The concatenation of a and b is ab.

Union (+): represents "or". a+b = a or b.

Kleene star (*): represents 0 or more occurrences. For example, a*b = {b, ab, aab, ...}.

?: represents 0 or 1 occurrence. For example, a? = a or ε.

Page 11:

Regular Expressions

A regular expression describes a regular language.

Example:

1+0*011 is a regular expression.
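The operators and the example expression can be checked with Python's standard `re` module. This is only a sketch: in Python's syntax the union operator is written `|` (there, `+` means "one or more"), so the slide's 1+0*011 is written here as `1|0*011`. The helper name `matches` is an assumption for illustration.

```python
import re


def matches(pattern, word):
    """Return True if the whole word is described by the pattern."""
    return re.fullmatch(pattern, word) is not None


A_STAR_B = r"a*b"          # Kleene star followed by concatenation with b
A_OPTIONAL = r"a?"         # zero or one occurrence of a
SLIDE_EXAMPLE = r"1|0*011"  # the slide's 1+0*011, with '|' as union

for word in ["b", "ab", "aab", "ba"]:
    print(word, matches(A_STAR_B, word))

for word in ["1", "011", "00011", "10"]:
    print(word, matches(SLIDE_EXAMPLE, word))
```

Concatenation binds more tightly than union, so the example expression reads as "either the single string 1, or any number of 0s followed by 011".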

Page 12:

Finite Automata

A finite automaton is also known as a finite state machine.

It is a 5-tuple (Q, Σ, δ, q0, F), where

Q is a finite set of states in the machine
Σ is the input alphabet
δ is the transition function from one state to the next state
q0 is the initial state
F is the set of all accepting states

Page 13:

Finite Automata (cont.)

For example: the language consisting of all strings that have an even number of 1's.

L = {ε, 0, 00…00, 11, 011, 101, 110, 0011, 00…011, …}

[State diagram: two states, even and odd. Each state loops to itself on 0, and a 1 moves the machine from even to odd or from odd to even.]

Page 14:

Finite Automata (cont.)

Finite Automaton, M:

M = (Q, Σ, δ, q0, F)
Q = {even, odd}
Σ = {0, 1}
δ is described as

          0         1
even    {even}    {odd}
odd     {odd}     {even}

q0 = even
F = {even}
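The automaton M can be simulated in a few lines. The following is a minimal sketch that stores the transition table as a dictionary; the names `DELTA` and `dfa_accepts` are chosen here for illustration.

```python
# Transition table of M: (state, input symbol) -> next state.
DELTA = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}
START = "even"
ACCEPTING = {"even"}


def dfa_accepts(word):
    """Run M on word and report whether it ends in an accepting state."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]  # exactly one move per symbol
    return state in ACCEPTING


for word in ["", "0", "11", "011", "1", "0111"]:
    print(repr(word), dfa_accepts(word))
```

The null string is accepted because the machine starts in the accepting state even, which matches ε being listed in L.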

Page 15:

Finite Automata (cont.)

The type of finite automaton described in the preceding example is a deterministic finite automaton (DFA).

An easier model to work with is the nondeterministic finite automaton (NFA).

Page 16:

Finite Automata (cont.)

While in a DFA every state must have exactly one transition path for each symbol in the automaton's alphabet, an NFA can have none, one, or as many transition paths as it needs for each symbol in the alphabet.

An NFA can also have transition paths for the empty input, ε.

DFAs and NFAs are equivalent: every NFA can be converted into a DFA that accepts the same language.
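One way to see why the two models are equivalent is to simulate an NFA by tracking the set of states it could be in, which is the idea behind the subset construction. The sketch below uses a hypothetical NFA (not one from the slides) that accepts strings over {0, 1} ending in 01, with one ε-transition added to exercise the empty-input case.

```python
EPS = ""  # key used for epsilon (empty input) transitions

# Hypothetical NFA: (state, symbol) -> set of possible next states.
NFA_DELTA = {
    ("start", EPS): {"p"},
    ("p", "0"): {"p", "q"},  # two choices on the same symbol
    ("p", "1"): {"p"},
    ("q", "1"): {"r"},       # q has no transition on 0
}
NFA_START = "start"
NFA_ACCEPTING = {"r"}


def epsilon_closure(states):
    """All states reachable from states using only epsilon transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for nxt in NFA_DELTA.get((state, EPS), set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def nfa_accepts(word):
    """Track every state the NFA could be in after each symbol."""
    current = epsilon_closure({NFA_START})
    for symbol in word:
        moved = set()
        for state in current:
            moved |= NFA_DELTA.get((state, symbol), set())
        current = epsilon_closure(moved)
    return bool(current & NFA_ACCEPTING)


for word in ["01", "1101", "10", ""]:
    print(repr(word), nfa_accepts(word))
```

Each distinct value of `current` plays the role of a single state in the equivalent DFA.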

Page 17:

Finite Automata

Example of NFA:

[State diagram: states A, B, C and D, with edges labeled 0,1; 0; 1; and 0,1.]

Page 18:

Regular Languages: Finite Automata

Finite Automaton, N:

N = (Q, Σ, δ, q0, F)
Q = {A, B, C, D}
Σ = {0, 1}
δ is described as:

        0       1
A      {C}     {B}
B      {C}
C      {D}     {D}
D      {D}     {D}

Row B has a single entry, {C}; the transcript does not show which input column it belongs to.

q0 = A
F = {D}

Page 19:

Context-free Languages

A set of strings from an alphabet. The set may be empty, finite or infinite.

Has its own pumping lemma.
Recognized by pushdown automata (PDA).
Generated by context-free grammars (CFG).

Page 20:

Context-Free Languages: Pushdown Automaton

A pushdown automaton is a 6-tuple (Q, S, U, P, I, F), where:

Q is a finite set of states
S is the input alphabet
U is the stack alphabet
P is the set of transitions
I is the initial state
F is the set of final states

Page 21:

Context-Free Languages: Pushdown Automata

For example, we have a string aabba.

Input string (read in the opposite order to the string that we have): a b b a a

[Figure: a control unit (CU) reads the input string, and a transition occurs. The CU can read from or write to the stack.]

Page 22:

Context-Free Language: Pushdown Automata

For example: L = {a^n b^n | n > 0}

[State diagram: states 1 to 6, with edges labeled Push B, Scan a, Push a, Scan b, Pop a, Scan b, and Pop B.]

Page 23:

Show the configuration sequence on input aabb

Q: {1, 2, 3, 4, 5, 6}
S: {a, b}
U: {B, a}
I: {1}
F: {6}
P: 1) push (B, 2)
   2) scan (a, 3) (b, 4)
   3) write (a, 2)
   4) read (a, 5) (B, 6)
   5) scan (b, 4)
   6)

Configuration sequence, where each configuration is (state, input scanned so far, stack contents):

(1, ε, ε)
(2, ε, B)
(3, a, B)
(2, a, Ba)
(3, aa, Ba)
(2, aa, Baa)
(4, aab, Baa)
(5, aab, Ba)
(4, aabb, Ba)
(5, aabb, B)
(6, aabb, ε)
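The stack discipline behind this example can be sketched in code. The function below is a simplified recognizer for {a^n b^n | n > 0} and does not reproduce the six-state program from the slide. It borrows only the idea of pushing a bottom marker B, pushing one a per scanned a, and popping one a per scanned b. The name `pda_accepts` and the `phase` variable are assumptions for illustration.

```python
def pda_accepts(word):
    """Accept exactly the strings a^n b^n with n > 0, using an explicit stack."""
    stack = ["B"]   # B marks the bottom of the stack
    phase = "push"  # switches to "pop" once the first b is scanned
    for symbol in word:
        if symbol == "a" and phase == "push":
            stack.append("a")  # remember one a
        elif symbol == "b" and stack[-1] == "a":
            phase = "pop"
            stack.pop()        # match this b against a remembered a
        else:
            return False       # no move available: reject
    # Accept only if some b was scanned and only the marker remains.
    return phase == "pop" and stack == ["B"]


for word in ["ab", "aabb", "aab", "abb", "aba", ""]:
    print(repr(word), pda_accepts(word))
```

A finite automaton cannot do this, because it has no way to remember an unbounded number of a's; the stack supplies that memory.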

Page 24:

Context-Free Language: Context-Free Grammars

A context-free grammar is a 4-tuple (V, Σ, R, S), where:

V is a finite set of variables
Σ is a finite set of terminals
R is a finite set of rules, and
S is the start variable

Page 25:

Context-Free Languages: Context-free grammars

Example of CFG

S → aSa
S → bSb
S → ε

S ⇒ aSa ⇒ aaSaa ⇒ aabSbaa ⇒ aabbaa

L = { w w^R : w ∈ {a, b}* }
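The derivation process can be mechanized by repeatedly replacing S with the right-hand side of a rule. The sketch below enumerates every terminal string of bounded length that the example grammar derives; the function name `derive` and the length bound are assumptions added to keep the search finite.

```python
# Right-hand sides of the three rules for S; "" stands for epsilon.
RULES = ["aSa", "bSb", ""]


def derive(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    results = set()
    frontier = ["S"]  # sentential forms still containing S
    while frontier:
        form = frontier.pop()
        if "S" not in form:
            results.add(form)  # only terminals remain
            continue
        for rhs in RULES:
            new_form = form.replace("S", rhs, 1)  # apply one rule to S
            # Prune forms whose terminals already exceed the bound.
            if len(new_form.replace("S", "")) <= max_len:
                frontier.append(new_form)
    return results


print(sorted(derive(4)))
print("aabbaa" in derive(6))
```

Every generated string reads the same forwards and backwards and has even length, as expected for strings of the form w followed by its reverse.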

Page 26:

Turing Machines

Turing machines (TMs) are similar to PDAs, but with memory that is unlimited and unrestricted.

The Turing machine uses an infinite tape.

The tape head can read and write symbols on the tape. It can also move to the left or right over the tape.

Turing machines have accept states and reject states, which take immediate effect.

Page 27:

Turing Machines

[Figure: a tape holding B B x1 x2 ... xi ... xn B B, read by a tape head attached to the finite control.]

Page 28:

Turing Machines

The TM we construct will accept the language {0^n 1^n | n ≥ 1}.

1. It is given a finite sequence of 0's and 1's on its tape, preceded and followed by an infinity of blanks.

2. The TM will change a 0 to an X and then a 1 to a Y, until all 0's and 1's have been matched.

3. Starting at the left end of the input, it repeatedly changes a 0 to an X and moves to the right over whatever 0's and Y's it sees, until it comes to a 1.

Page 29:

4. It changes the 1 to a Y and moves left, over Y's and 0's, until it finds an X. At that point, it looks for a 0 immediately to the right, and if it finds one, changes it to X and repeats the process, changing a matching 1 to a Y.

5. If the nonblank input is not in 0*1*, then the TM will eventually have no next move and will halt without accepting.

6. If it finishes changing all the 0's to X's on the same round it changes the last 1 to a Y, then it has found its input to be of the form 0^n 1^n and accepts.

Page 30:

The formal specification of the TM M is

M = ({q0, q1, q2, q3, q4}, {0, 1}, {0, 1, X, Y, B}, δ, q0, B, {q4})

where δ is

         Symbol
State    0             1             X             Y             B
q0       (q1, X, R)    -             -             (q3, Y, R)    -
q1       (q1, 0, R)    (q2, Y, L)    -             (q1, Y, R)    -
q2       (q2, 0, L)    -             (q0, X, R)    (q2, Y, L)    -
q3       -             -             -             (q3, Y, R)    (q4, B, R)
q4       -             -             -             -             -
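The transition table for M can be run as is. The following is a minimal simulator sketch: the tape is a dictionary from position to symbol so that it can grow in both directions, and a missing table entry (a dash in the table) means the machine has no next move and rejects. The names `tm_accepts` and `max_steps` are assumptions, and the step limit is only a safety net.

```python
BLANK = "B"

# (state, scanned symbol) -> (next state, symbol written, head direction)
DELTA = {
    ("q0", "0"): ("q1", "X", "R"),
    ("q0", "Y"): ("q3", "Y", "R"),
    ("q1", "0"): ("q1", "0", "R"),
    ("q1", "1"): ("q2", "Y", "L"),
    ("q1", "Y"): ("q1", "Y", "R"),
    ("q2", "0"): ("q2", "0", "L"),
    ("q2", "X"): ("q0", "X", "R"),
    ("q2", "Y"): ("q2", "Y", "L"),
    ("q3", "Y"): ("q3", "Y", "R"),
    ("q3", BLANK): ("q4", BLANK, "R"),
}
ACCEPT = "q4"


def tm_accepts(word, max_steps=10_000):
    """Simulate M on word; True if it reaches q4, False if it gets stuck."""
    tape = dict(enumerate(word))  # unwritten cells read as blank
    state, head = "q0", 0
    for _ in range(max_steps):
        if state == ACCEPT:
            return True
        move = DELTA.get((state, tape.get(head, BLANK)))
        if move is None:
            return False  # no next move: halt without accepting
        state, tape[head], direction = move
        head += 1 if direction == "R" else -1
    raise RuntimeError("step limit reached")


for word in ["01", "0011", "001", "011", "10", ""]:
    print(repr(word), tm_accepts(word))
```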

Page 31:

Decidability

A problem is decidable if a program (usually for a Turing machine) can determine the answer and terminate with a yes or a no in a finite number of steps.

It is undecidable otherwise.

The "Halting Problem" is the standard example of an undecidable problem: no program can decide, for every program and input, whether that program eventually halts, no matter how much time is given.

Page 32:

Resource:

Sipser, Michael. Introduction to the Theory of Computation. Boston: PWS Publishing Company, 1997.

CS386 notes

Daniel Firpo from spring CS490

http://www.cs.okstate.edu/~marcin/mp/teach/summer03/5313/3

http://www.csee.umbc.edu/help/theory/lang_def.shtml