theory of computing, part 3. 1introduction 2theoretical background biochemistry/molecular biology...
TRANSCRIPT
1 Introduction
2 Theoretical background Biochemistry/molecular biology
3 Theoretical background computer science
4 History of the field
5 Splicing systems
6 P systems
7 Hairpins
8 Detection techniques
9 Micro technology introduction
10 Microchips and fluidics
11 Self assembly
12 Regulatory networks
13 Molecular motors
14 DNA nanowires
15 Protein computers
16 DNA computing - summery
17 Presentation of essay and discussion
Course outline
Introduction
Deterministic finite automata (DFA’s)
Non-deterministic finite automata (NFA’s)
NFA’s to DFA’s
Simplifying DFA’s
Regular expressions finite automata
Outline
Consider the control system for a
one-way swinging door:
There are two states: Open and
Closed
It has two inputs, person detected
at position A and person detected
at position B
If the door is closed, it should
open only if a person is detected
at A but not B
Door should close only if no one is
detected
A B
Automatic one way door
A finite automaton is usually represented
like this as a directed graph
Two parts of a directed graph:
The states (also called nodes or
vertices)
The edges with arrows which
represent the allowed transitions
One state is usually picked out as the
starting point
For so-called ‘accepting automata,’ some
states are chosen to be final states
Finite automaton
The input data are represented by a string
over some alphabet and it determines how the
machine progresses from state to state.
Beginning in the start state, the characters
of the input string cause the machine to
change from one state to another.
Accepting automata give only yes or no
answers, depending on whether they end up in
a ‘final state.’ Strings which end in a
final state are accepted by the automaton.
Strings and automata
The labeled graph in the figure above represents a FA over the alphabet Σ = {a, b} with start state 0 and final state 3.
Final states are denoted by a double circle.
Example
The previous graph was an example of a
deterministic finite automaton – every node
had two edges (a and b) coming out
A DFA over a finite alphabet Σ is a finite
directed graph with the property that each
node emits one labeled edge for each distinct
element of Σ.
Deterministic finite automata (DFA’s)
A DFA accepts a string w in Σ* if there is a
path from the start state to some final state
such that w is the concatenation of the labels
on the edges of the path.
Otherwise, the DFA rejects w .
The set of all strings accepted by a DFA M is
called the language of M and is denoted by
L(M)
More formally
Construct a DFA to recognize the regular
languages represented by the regular
expression (a + b)* over alphabet Σ = {a, b}.
This is the set {a, b}* of all strings over
{a, b}. This can be recognised by
Example: (a+b)*
Find a DFA to recognize the language
represented by the regular expression
a(a + b)* over the alphabet Σ = {a, b}.
This is the set of all strings in Σ*
which begin with a. One possible DFA
is:
Example: a(a+b)*
Build a DFA to recognize the regular language
represented by the regular expression
(a + b)*abb over the alphabet Σ = {a, b}.
The language is the set of strings that begin
with anything, but must end with the string
abb.
Effectively, we’re looking for strings which
have a particular pattern to them
Example: pattern recognition
The diagram below shows a DFA to recognize this language.
If in state 1: the last character was a
If in state 2 : the last two symbols were ab
If in state 3: the last three were abb
Solution: (a+b)*abb
We can also represent a DFA by a state transition
function, which we'll denote by T, where any state
transition of the form
is represented by: T(i,a) = j
To describe a full DFA we need to know:
what states there are,
which are the start and final ones,
the set of transitions between them.
State transition function
The class of regular languages is
exactly the same as the class of
languages accepted by DFAs!
Kleene (1956)
For any regular language, we can find
a DFA which recognizes it!
Regular languages
DFA’s are very often used for pattern matching,
e.g. searching for words/structures in strings
This is used often in UNIX, particularly by the
grep command, which searches for combinations of
strings and wildcards (*, ?)
grep stands for Global (search for) Regular
Expressions Parser
DFA’s are also used to design and check simple
circuits, verifying protocols, etc.
They are of use whenever significant memory is
not required
Applications of DFA’s
DFA’s are called deterministic because
following any input string, we know exactly
which state its in and the path it took to
get there
For NFA’s, sometimes there is more than one
direction we can go with the same input
character
Non-determinism can occur, because following
a particular string, one could be in many
possible states, or taken different paths to
end at the same state!
Non-deterministic finite automata
A non-deterministic finite automaton (NFA)
over an alphabet Σ is a finite directed graph
with each node having zero or more edges,
Each edge is labelled either with a letter
from Σ or with .
Multiple edges may be emitted from the same
node with the same label.
Some letters may not have an edge associated
with them. Strings following such paths are
not recognised.
NFA’s
If an edge is labelled with the empty string , then we can travel the edge without consuming an
input letter. Effectively we could be in either
state, and so the possible paths could branch.
If there are two edges with the same label, we
can take either path.
NFA’s recognise a string if any one of its many
possible states following it is a final state
Otherwise, it rejects it.
Non-determinism
Draw two NFAs to recognize the language of the
regular expression ab + a*a.
This NFA has a edge, which allows us to travel to state 2 without consuming an input letter.
The upper path corresponds to ab and the lower one
to a*a
Example
This NFA also recognizes the same language. Perhaps
it's easier to see this by considering the equality
ab + a*a = ab + aa*
An equivalent NFA
Since there may be non-determinism, we'll let the
values of this function be sets of states.
For example, if there are no edges from state k
labelled with a, we'll write
T(k, a) =
If there are three edges from state k, all labelled
with a, going to states i, j and k, we'll write
T(k, a) = {i, j, k}
NFA transition functions
All digital computers are deterministic; quantum
computers may be another story!
The usual mechanism for deterministic computers is
to try one particular path and to backtrack to the
last decision point if that path proves poor.
Parallel computers make non-determinism almost
realizable. We can let each process make a random
choice at each branch point, thereby exploring
many possible trees.
Comments on non-determinism
The class of regular languages is exactly the
same as the class of languages accepted by NFAs!
Rabin and Scott (1959)
Just like for DFA’s!
Every NFA has an equivalent DFA which recognises
the same language.
Some facts
We prove the equivalence of NFA’s and DFA’s by showing
how, for any NFA, to construct a DFA which recognises
the same language
Generally the DFA will have more possible states than
the NFA. If the NFA has n states, then the DFA could
have as many as 2n states!
Example: NFA has three states {A}, {B}, {C}
the DFA could have eight: {}, {A}, {B}, {C},
{A, B}, {A, C}, {B, C}, {A, B, C}
These correspond to the possible states the NFA could
be in after any string
From NFA’s to DFA’s
Begin in the NFA start state, which could be a
multiple state if its connected to any by
Determine the set of possible NFA states you could be
in after receiving each character. Each set is a new
DFA state, and is connected to the start by that
character.
Repeat for each new DFA state, exploring the possible
results for each character until the system is closed
DFA final states are any that contain a NFA final
state
DFA construction
The start state is A, but following an a you could be in A
or B; following a b you could only be in state A
A B Ca b
a,b
A A,Ba b
b
A,C
ab
a
NFA
DFA
Example (a+b)*ab
Regular expressions represent the regular languages.
DFA’s recognize the regular languages.
NFA’s also recognize the regular languages.
Summary
So far, we’ve introduced two kinds of automata:
deterministic and non-deterministic.
We’ve shown that we can find a DFA to recognise
anything language that a given NFA recognises.
We’ve asserted that both DFA’s and NFA’s recognise
the regular languages, which themselves are
represented by regular expressions.
We prove this by construction, by showing how any
regular expression can be made into a NFA and vice
versa.
Finite automata
Given a regular expression, we can find an automata
which recognises its language.
Start the algorithm with a machine that has a start
state, a single final state, and an edge labelled
with the given regular expression as follows:
Regular expressions finite automata
1. If an edge is labelled with , then erase the edge.
2. Transform any diagram like
into the diagram
Four step algoritm
Construct a NFA for the regular expression, a* + ab
Start with
Apply rule 2
a* + ab
ab
a*
Example a*+ab
1 Create a new start state s, and draw a
new edge labelled with from s to the original start state.
2 Create a new final state f, and draw new
edges labelled with from all the
original final states to f
Finite automata regular expressions
3 For each pair of states i and j that have
more than one edge from i to j, replace all
the edges from i to j by a single edge
labelled with the regular expression formed
by the sum of the labels on each of the
edges from i to j.
4 Construct a sequence of new machines by
eliminating one state at a time until the
only states remaining are s and the f.
Finite automata regular expressions
As each state is eliminated, a new machine is
constructed from the previous machine as follows:
Let old(i,j) denote the label on edge i,j of the current machine. If no edge exists, label
it .
Assume that we wish to eliminate state k. For
each pair of edges i,k (incoming edge) and
k,j (outgoing edge) we create a new edge label new(i, j)
Eliminating states
The label of this new edge is given by:
new(i,j) = old(i,j) + old(i, k) old(k, k)* old(k,j)
All other edges, not involving state k, remain the
same:
new(i, j) = old(i, j)
After eliminating all states except s and f, we wind up
with a two-state machine with the single edge s, f labelled with the desired regular expression new(s, f)
Eliminate state k
Eliminate state 2
(No path to f)
Eliminate state 0
Eliminate state 1
Final regular expression
Example
Sometimes our constructions lead to more
complicated automata than we need, having more
states than are really necessary
Next, we look for ways of making DFA’s with a
minimum number of states
Myhill-Nerode theorem:
‘Every regular expression has a unique* minimum
state DFA’
* up to a simple renaming of the states
Finding simpler automata
Two steps to minimizing DFA:
1 Discover which, if any, pairs of states are
indistinguishable. Two states, s and t, are
equivalent if for all possible strings w,
T(s,w) and T(t,w) are both either final or
non-final.
2 Combine all equivalent states into a single
state, modifying the transition functions
appropriately.
Finding minimum state DFA
States 1 and 2 are indistinguishable! Starting in either,
b* is rejected and anything with a in it is accepted.
a ab
b a
b
1
2
a,b
aa,b a,b
b
Consider the DFA
1. Remove all inaccessible states, where no path
exists to them from start.
2. Construct a grid of pairs of states.
3. Begin by marking those pairs which are clearly
distinguishable, where one is final and the other
non-final.
4. Next eliminate all pairs, which on the same
input, lead to a distinguishable pair of states.
Repeat until you have considered all pairs.
5. The remaining pairs are indistinguishable.
Part 1, finding indistinguishable pairs
1. Construct a new DFA where any pairs of
indistinguishable states form a single state
in the new DFA.
2. The start state will be the state containing
the original start state.
3. The final states will be those which contain
original final states.
4. The transitions will be the full set of
transitions from the original states (these
should all be consistent.)
Part 2, construct minimum DFA
a
a
a,b
a
a
b b b
0
3
2
1
4
What are the distinguishable pairs of states?
Clearly, {0, 4} {1, 4} {2, 4} {3, 4} are all
distinguishable because 4 is final but none of
the others are.
b
Example
We eliminate these as possible
indistinguishable pairs.
Next consider {0, 1}. With input
a, this becomes {3, 4} which is
distinguishable, so {0, 1} is as
well.
Similarly, we can show {0, 2} and
{0, 3} are also distinguishable,
leading to the modified grid…
1234
? ? ? ? ? ?x x x x
0 1 2 3
1234
xx ? x ? ?x x x x
0 1 2 3
Grid of pairs of state
We are left with
{1, 2} given a {4, 4}
given b {2, 1}
{2, 3} given a {4, 4}
given b {1, 2}
{1, 3} given a {4, 4}
given b {2, 2}
These do not lead to pairs we know to
be distinguishable, and are therefore
indistinguishable!
Remaining pairs
States 1, 2, and 3 are all indistinguishable, thus
the minimal DFA will have three states:
{0} {1, 2, 3} {4}
Since originally T(0, a) = 3 and T(0, b) = 1, the
new transitions are T(0, a) = T(0, b) = {1,2,3}
Similarly,
T({1,2,3}, a) = 4 and T({1,2,3}, b) = {1,2,3}
Finally, as before,
T(4, a) = 4 and T(4, b) = 4
Construct minimal DFA
The resulting DFA is much simpler:
aa,b a,b
b
0 1, 2, 3 4
This recognises regular expressions of the form, (a +
b) b* a (a + b)*
This is the simplest DFA which will recognise this
language!
Resulting minimal DFA
We now have many equivalent ways of representing
regular languages: DFA’s, NFA’s, regular expressions
and regular grammars.
We can also now simply(?!) move between these
various representations.
We’ll see next lecture that the automata
representation leads to a simple way of recognising
some languages which are not regular.
We’ll also begin to consider more powerful language
types and correspondingly more powerful computing
models!
conclusions