

Page 1: Introduction to Automata - POSTECH MLGmlg.postech.ac.kr/~seungjin/courses/automata/handouts/... · 2016-09-13 · Introduction to Automata Seungjin Choi Department of Computer Science

Introduction to Automata

Seungjin Choi

Department of Computer Science and Engineering
Pohang University of Science and Technology

77 Cheongam-ro, Nam-gu, Pohang 37673, Korea
[email protected]


Automata Theory?

- Automata theory = the study of abstract computing machines (or models of computation)

- Computation is a sequence of steps that can be performed by a computer.

- 1930’s: Turing machines

- 1940’s and 50’s: Finite automata, formal grammars (Chomsky)

- 1960’s and 70’s: Extension of Turing’s work (Cook), decidability (what could and what could not be computed) and intractability (P, NP-complete, NP-hard)


Finite Automata

Finite automata are used as a model for

- Software for designing and checking the behavior of digital circuits

- Lexical analyzer of a compiler

- Searching for keywords in a file or on the web

- Software for verifying finite state systems, such as communication protocols.

3 / 20

Page 4: Introduction to Automata - POSTECH MLGmlg.postech.ac.kr/~seungjin/courses/automata/handouts/... · 2016-09-13 · Introduction to Automata Seungjin Choi Department of Computer Science

Examples of Finite Automata

[Figure: (a) a finite automaton modeling an on/off switch, with states off and on and transitions labeled push; (b) a finite automaton modeling recognition of then, with states t, th, the, then reached by reading t, h, e, n in turn.]
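The two automata in the figure can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the dictionary encoding of transitions, and the choice of off as the switch's start state are assumptions, not part of the slides.

```python
# Transition table of the switch: (state, input) -> next state.
SWITCH = {("off", "push"): "on", ("on", "push"): "off"}

def run_switch(inputs, state="off"):
    """Feed a sequence of inputs to the switch; 'off' as start state is assumed."""
    for symbol in inputs:
        state = SWITCH[(state, symbol)]
    return state

def recognizes_then(text):
    """Each state is the prefix of 'then' read so far; accept only in state 'then'."""
    state = ""
    for ch in text:
        if not "then".startswith(state + ch):
            return False  # no transition is defined for this symbol
        state += ch
    return state == "then"

print(run_switch(["push", "push", "push"]))
print(recognizes_then("then"), recognizes_then("the"))
```

Both machines have finitely many states and read their input one symbol at a time, which is the defining feature of a finite automaton.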


Sets

Definition. A set is a collection of elements without any structure other than membership.

- Subset: A ⊆ B, proper subset: A ⊂ B.

- S = {0, 1, 2}, S = {i | i > 0, i is even}.

- S1 ∪ S2 (union), S1 ∩ S2 (intersection), S1 − S2 (difference), S̄ (complementation).

- Disjoint: S1 ∩ S2 = φ (where φ denotes the empty set).

- Power set of S, denoted by 2^S, is the set of all subsets of S. For example, given S = {a, b}, the power set is 2^S = {φ, {a}, {b}, {a, b}}. Note that |2^S| = 2^{|S|}.

- Cartesian product: S = S1 × S2 = {(x, y) | x ∈ S1, y ∈ S2}.
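For small finite sets, the power set and the Cartesian product can be enumerated directly. The sketch below uses Python's standard `itertools` module; the helper name `power_set` and the sample sets are illustrative assumptions.

```python
from itertools import chain, combinations, product

def power_set(s):
    """All subsets of s, built by choosing r elements for r = 0, ..., |s|."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(subset) for subset in subsets}

S = {"a", "b"}
subsets = power_set(S)
print(len(subsets) == 2 ** len(S))  # checks |2^S| = 2^|S| for this S

# Cartesian product S1 x S2 as a set of ordered pairs.
S1, S2 = {0, 1}, {"a", "b"}
pairs = set(product(S1, S2))
print(len(pairs) == len(S1) * len(S2))
```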


Functions

Definition. Let A and B be sets. A function f from A to B is an assignment of exactly one element of B to each element of A.

- f : A → B (f maps A to B), where A is the domain of f, B is the codomain of f, and the range of f is the set of all images of elements of A.

- A function f is said to be one-to-one (injective) iff f(x) = f(y) implies x = y for all x and y in the domain of f.

- A function f is said to be onto (surjective) iff for every b ∈ B there exists a ∈ A with f(a) = b. That is, the range of f is equal to B.

- A function f is a one-to-one correspondence (bijection) if it is both one-to-one and onto.
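On finite sets these three properties can be checked mechanically. In the sketch below a function is represented as a Python dictionary from domain elements to codomain elements; that representation, the helper names, and the sample functions f and g are assumptions made for illustration.

```python
def is_injective(f):
    """One-to-one: no two domain elements share the same image."""
    images = list(f.values())
    return len(images) == len(set(images))

def is_surjective(f, codomain):
    """Onto: the range of f equals the whole codomain."""
    return set(f.values()) == set(codomain)

def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)

B = {"a", "b", "c"}
f = {1: "a", 2: "b", 3: "c"}  # both one-to-one and onto B
g = {1: "a", 2: "a", 3: "b"}  # neither: "a" is hit twice and "c" is never hit

print(is_bijective(f, B), is_injective(g), is_surjective(g, B))
```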


Graphs and Trees

- A graph G(V, E) consists of two finite sets, V (a set of vertices) and E (a set of edges).

- Walk, trail, path, closed, cycle

- Tree: connected acyclic graph

- Probabilistic graphical model: Happy marriage between graph theory and probability theory


Walk, Trail, Path

- Walk: A sequence of edges (v0, vj), (vj, vk), ..., (vm, vn) is said to be a walk from v0 to vn.

- Trail: A walk in which all the edges are distinct.

- Path: A walk in which all the edges are distinct and all the vertices are distinct (except possibly v0 = vn).

- Closed: A path is closed if v0 = vn.

- Cycle: A closed path containing at least one edge is a cycle.


Example

- A walk: v → w → x → y → z → z → y → w

- A trail: v → w → x → y → z → z → x

- A path: v → w → x → y → z
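The three sequences above can be classified with a short Python sketch. It assumes that edges are undirected (so y → z and z → y use the same edge) and that every consecutive pair of vertices really is an edge of the underlying graph, which the transcript does not show. The function name `classify` is hypothetical.

```python
def classify(vertices):
    """Return the most specific of 'walk', 'trail', 'path' for a vertex sequence."""
    # An undirected edge is the set of its endpoints; a loop z-z becomes {z}.
    edges = [frozenset(pair) for pair in zip(vertices, vertices[1:])]
    if len(set(edges)) != len(edges):
        return "walk"  # some edge is repeated
    # A path may repeat a vertex only as its first and last vertex.
    closed = len(vertices) > 1 and vertices[0] == vertices[-1]
    inner = vertices[:-1] if closed else vertices
    if len(set(inner)) != len(inner):
        return "trail"  # edges are distinct, but some vertex is repeated
    return "path"

print(classify(list("vwxyzzyw")))
print(classify(list("vwxyzzx")))
print(classify(list("vwxyz")))
```

The first sequence is only a walk because it traverses the edge between y and z twice. The second repeats the vertices z and x but no edge, so it is a trail but not a path.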


Proof Techniques

- Deductive proof: Consists of a sequence of statements whose truth leads us from the hypothesis to a conclusion statement.

- Proof by induction
  - Basis: Prove that P(1) is true.
  - Induction step: For each i ≥ 1, assume that P(i) is true and use this assumption to show that P(i + 1) is true.

- Proof by contradiction: Assume that the theorem is false and then show that this assumption leads to an obviously false consequence (contradiction).


Example: Proof by Induction

Theorem. A binary tree is a tree in which no parent can have more than two children. Prove that a binary tree of height n has at most 2^n leaves.


Proof

Denote by l(n) the maximum number of leaves of a binary tree of height n. Then we want to show that

l(n) ≤ 2^n.

Basis: One can easily see that l(0) = 1 = 2^0.

IH: l(i) ≤ 2^i for i = 0, 1, ..., n.

To get a binary tree of height n + 1 from one of height n, we can create at most two leaves in place of each previous one. Therefore, by the induction hypothesis,

l(n + 1) ≤ 2 l(n) ≤ 2 · 2^n = 2^{n+1}.
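The bound can also be checked by brute force for small heights. The sketch below is not part of the proof; it computes every leaf count achievable by a binary tree of a given height and compares the maximum with 2^n. The function name `leaf_counts` and the recursive decomposition (a root with one or two subtrees, at least one of which has height exactly n − 1) are assumptions of this sketch.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def leaf_counts(height):
    """Leaf counts achievable by binary trees of exactly the given height."""
    if height == 0:
        return frozenset({1})  # a single node is itself a leaf
    tallest = leaf_counts(height - 1)  # subtree that realizes the height
    shorter = frozenset().union(*(leaf_counts(h) for h in range(height)))
    one_child = set(tallest)
    two_children = {a + b for a in tallest for b in shorter}
    return frozenset(one_child | two_children)

for n in range(6):
    print(n, max(leaf_counts(n)), 2 ** n)
```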


Example: Proof by Contradiction

Theorem. Show that √2 is irrational.


Proof

Assume that √2 is a rational number, so that

√2 = n/m,

where n and m are integers without a common factor. Squaring both sides of this equation yields

2m^2 = n^2.

Thus n^2 is even, so n itself must be even (the square of an odd number is odd), say n = 2k. Then 2m^2 = 4k^2, i.e., m^2 = 2k^2, so m is even as well. But this contradicts our assumption that n and m have no common factor. Thus such n and m cannot exist, and √2 is not a rational number.


Central Concepts of Automata Theory

- Alphabet: A finite nonempty set Σ of symbols.
  - Σ = {0, 1}, the binary alphabet.
  - Σ = {a, b, ..., z}, the set of all lower-case letters.

- String: A finite sequence of symbols chosen from some alphabet.
  - Given Σ = {0, 1}, 0011 is a string from the binary alphabet and 101 is another string from this alphabet.

- Empty string: the string with zero occurrences of symbols, denoted by ε. Note that ε is a string that may be chosen from any alphabet whatsoever.

- Length of string: the number of positions for symbols in the string. For example, for w = 001, |w| = 3 (cardinality). Note that |ε| = 0.


- Powers of an alphabet: Σ^k = the set of strings of length k, each of whose symbols is in Σ. For example, given Σ = {0, 1},
  - Σ^1 = {0, 1}.
  - Σ^2 = {00, 01, 10, 11}.
  - Σ^0 = {ε}.
  - How many strings are there in Σ^3?

- Σ∗: the set of all strings over Σ.
  - Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ ···
  - Σ∗ = Σ+ ∪ {ε} = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ ···

- Concatenation: Let w = a1a2···an and v = b1b2···bm, then wv = a1a2···anb1b2···bm. Note that εw = wε = w.

- w^n = ww···w (n copies of w) and w^0 = ε.
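These operations map directly onto Python strings, with the empty string "" playing the role of ε. The following sketch uses `itertools.product` to enumerate Σ^k. The helper names `alphabet_power` and `string_power` are hypothetical.

```python
from itertools import product

def alphabet_power(sigma, k):
    """All strings of length k over the alphabet sigma."""
    return {"".join(symbols) for symbols in product(sorted(sigma), repeat=k)}

def string_power(w, n):
    """w concatenated with itself n times; n = 0 gives the empty string."""
    return w * n

SIGMA = {"0", "1"}
print(sorted(alphabet_power(SIGMA, 2)))
print(alphabet_power(SIGMA, 0) == {""})  # Σ^0 contains only ε
print(string_power("01", 3))
print("" + "001" == "001" + "" == "001")  # εw = wε = w
```

Calling `len(alphabet_power(SIGMA, 3))` is one way to check an answer to the question about Σ^3.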


Languages

Given an alphabet Σ, a language L is a set of strings all of which are chosen from Σ∗. In other words, if L ⊆ Σ∗, then L is a language over Σ. Examples of languages include:

- The set of legal English words.

- The set of legal C programs.

- The set of strings consisting of n 0’s followed by n 1’s for some n ≥ 0, i.e., {ε, 01, 0011, 000111, ...}.

- The set of strings of 0’s and 1’s with an equal number of each: {ε, 01, 10, 0011, 0101, ...}.
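Since a language is just a set of strings, it can be described by a membership test. The sketch below gives such tests for the last two examples; the function names are hypothetical and strings are ordinary Python strings over the characters 0 and 1.

```python
def in_zeros_then_ones(w):
    """True iff w consists of n 0's followed by n 1's for some n >= 0."""
    n = len(w) // 2
    return w == "0" * n + "1" * n

def in_equal_counts(w):
    """True iff w is a binary string with as many 0's as 1's."""
    return set(w) <= {"0", "1"} and w.count("0") == w.count("1")

for w in ["", "01", "0011", "10", "0101", "011"]:
    print(repr(w), in_zeros_then_ones(w), in_equal_counts(w))
```

Every string accepted by the first test is also accepted by the second, so the first language is a subset of the second.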


More Examples

- The set of binary numbers whose value is a prime: {10, 11, 101, 111, 1011, ...}.

- Σ∗ is a language for any alphabet Σ.

- The empty language φ is a language over any alphabet.

- {ε}, the language consisting of only the empty string, is also a language over any alphabet. Note that φ ≠ {ε}; the former has no strings and the latter has one string.


Problems

If Σ is an alphabet and L is a language over Σ, then the problem is: Given a string w in Σ∗, decide whether or not w is in L.

Let Lp = {w | w is a binary number that is prime}. Is 11101 ∈ Lp? What computational resources are needed to answer the question?
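One simple, and far from the most efficient, way to decide membership in Lp is trial division. The sketch below is an illustration only: the function names are hypothetical, and it assumes that the empty string and strings containing other characters are not in Lp.

```python
def is_prime(n):
    """Trial division: try every divisor d with d * d <= n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def in_Lp(w):
    """Decide whether the binary string w denotes a prime number."""
    if not w or set(w) - {"0", "1"}:
        return False  # assumption: such strings are not in Lp
    return is_prime(int(w, 2))

print(in_Lp("11101"))  # 11101 in binary is 29
```

For a number n, trial division may need about √n steps, which is roughly 2^{|w|/2} in terms of the input length |w|. This is the kind of resource question the slide raises.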

Usually we think of problems not as yes/no decisions, but as something that transforms an input into an output. For instance, the task of the parser in a C compiler does more than decide. Nevertheless, the definition of problems as languages has stood the test of time as the appropriate way to deal with the important questions of complexity theory.


Structural Representations

These are alternative ways of specifying a machine.

- Grammars

- Regular expressions
