
U.C. Berkeley, CS172: Automata, Computability and Complexity
Professor Luca Trevisan
2/1/2007

Solutions to Problem Set 1

1. Prove that the following languages are regular, either by exhibiting a regular expression representing the language, or a DFA/NFA that recognizes the language: [10 x 3 = 30 points]

(a) all strings that do not contain the substring aba, for Σ = {a, b} (for instance, aabaa contains the substring aba, whereas abba does not)

Solution: The following machine recognizes the given language by maintaining a state for “how much” of the string aba it has seen. On seeing aba it goes into a non-accepting state and stays there.

[Figure: state diagram of the DFA, with transitions labeled a and b.]

(7 points for the DFA and 3 for the explanation.)

(b) set of strings such that each block of 4 consecutive symbols contains at least two a’s, for Σ = {a, b}

Solution: The following machine remembers the last four characters it has read from the string. The names of the states indicate the (length four) blocks they represent.

[Figure: state diagram of the DFA. State labels: aaaa, aaab, aaba, aabb, abaa, abab, abba, baaa, baab, baba, bbaa, bbab, and a combined state labeled "abbb,babb,". Transitions are labeled a and b.]

(7 points for the DFA and 3 for the explanation.)

(c) set of binary strings (Σ = {0, 1}) which, when interpreted as a number (with the most significant bit on the left), are divisible by 5.

Solution: We maintain the remainder of the number read so far, when divided by 5. To update the remainder, note that if x is the number read so far, and b is the new bit that is read, then the new number is y = 2x + b and y mod 5 = ((2x mod 5) + b) mod 5. (6 points for the DFA and 4 for the explanation.)

[Figure: state diagram of the DFA with states 0, 1, 2, 3, 4, one per remainder modulo 5, and transitions labeled 0 and 1.]
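The three machines described in parts (a) to (c) can be sketched in code and compared with the definitions by brute force. This is a minimal sketch, not the graded diagrams: the state numbering in (a), the name DEAD in (b), the choice to accept strings shorter than four symbols vacuously in (b), and the choice to treat the empty string as the number 0 in (c) are all assumptions made for illustration.

```python
from itertools import product

# (a) States track the longest suffix read so far that is a prefix of "aba":
# 0 = "", 1 = "a", 2 = "ab", 3 = "aba" has occurred (non-accepting trap).
NO_ABA_DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 3, (2, "b"): 0,
    (3, "a"): 3, (3, "b"): 3,
}

def avoids_aba(w: str) -> bool:
    state = 0
    for ch in w:
        state = NO_ABA_DELTA[(state, ch)]
    return state != 3

# (b) The state is the window of (up to) the last four symbols, or DEAD once
# some block of four has contained fewer than two a's (assumed encoding).
DEAD = "dead"

def every_block_has_two_as(w: str) -> bool:
    state = ""
    for ch in w:
        if state != DEAD:
            state = (state + ch)[-4:]
            if len(state) == 4 and state.count("a") < 2:
                state = DEAD
    return state != DEAD

# (c) The state is the remainder modulo 5 of the number read so far.
def divisible_by_5(bits: str) -> bool:
    remainder = 0
    for ch in bits:
        remainder = (2 * remainder + int(ch)) % 5
    return remainder == 0

def all_strings(alphabet: str, max_len: int):
    """Yield every string over `alphabet` of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)
```

Each function keeps only a bounded amount of state between symbols, which is what makes it a finite automaton rather than a general program.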

2. (Sipser, problem 1.31) For any string w = w_1w_2···w_n, the reverse of w, written w^R, is the string w in reverse order, w_n···w_2w_1. For any language A, let A^R = {w^R | w ∈ A}. Show that if A is regular, so is A^R. [20 points]

Solution: One solution is to recursively (or inductively) define a reversing operation on regular expressions, and apply that operation to the regular expression for A. In particular, given a regular expression R, reverse(R) is:

• a, if R = a for some a ∈ Σ,
• ε, if R = ε,
• ∅, if R = ∅,
• (reverse(R_1) ∪ reverse(R_2)), if R = R_1 ∪ R_2,
• (reverse(R_2) ◦ reverse(R_1)), if R = R_1 ◦ R_2, or
• (reverse(R_1)*), if R = (R_1*).

(8 points for saying reversing the regular expression, and 12 points for explaining how it’s done. It’s important to point out that the operation is performed recursively.)
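The recursive definition of reverse(R) translates almost line for line into code. The sketch below assumes an illustrative encoding of regular expressions as nested tuples (not from the text) with single-character symbols, and a helper `language` that lists all strings up to a length bound so the reversal can be checked on an example.

```python
# Assumed encoding: ("sym", a), ("eps",), ("empty",),
# ("union", R1, R2), ("concat", R1, R2), ("star", R1).

def reverse(r):
    """Recursively build a regular expression for the reversed language."""
    kind = r[0]
    if kind in ("sym", "eps", "empty"):
        return r
    if kind == "union":
        return ("union", reverse(r[1]), reverse(r[2]))
    if kind == "concat":
        # Reversing a concatenation also swaps the order of its two parts.
        return ("concat", reverse(r[2]), reverse(r[1]))
    if kind == "star":
        return ("star", reverse(r[1]))
    raise ValueError(f"unknown node: {kind}")

def language(r, max_len):
    """Return the set of strings of length <= max_len in L(r)."""
    kind = r[0]
    if kind == "sym":
        return {r[1]} if max_len >= 1 else set()
    if kind == "eps":
        return {""}
    if kind == "empty":
        return set()
    if kind == "union":
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "concat":
        left, right = language(r[1], max_len), language(r[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":
        base = language(r[1], max_len) - {""}
        result, frontier = {""}, {""}
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node: {kind}")

# Example: R = (a ∪ bc)* ◦ (a ◦ b)
R = ("concat",
     ("star", ("union", ("sym", "a"), ("concat", ("sym", "b"), ("sym", "c")))),
     ("concat", ("sym", "a"), ("sym", "b")))
```

Comparing `{w[::-1] for w in language(R, n)}` with `language(reverse(R), n)` for a small n is a sanity check on the example, not a substitute for the inductive proof.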

Another solution is to start with a DFA M for A, and build an NFA M′ for A^R as follows: reverse all the arrows of M, and designate the start state of M as the only accept state q′_acc of M′. Add a new start state q′_0 for M′, and from q′_0, add ε-transitions to each state of M′ corresponding to an accept state of M.

It is easy to verify that for any w ∈ Σ*, there is a path following w from the start state to an accept state in M iff there is a path following w^R from q′_0 to q′_acc in M′. It follows that M′ accepts w^R iff w ∈ A, so M′ recognizes A^R.

(7 points for saying reversing the arrows; 3 points for explaining the new accept state, and 5 points for explaining the new start state and the ε-transitions. 5 points for explaining, or at least making the final observation about the paths/connectivity.)
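The arrow-reversal construction can be sketched as follows. The dictionary encoding of the DFA and the example machine (strings over {a, b} ending in ab) are assumptions for illustration. The new start state is represented implicitly: since its only moves are ε-transitions to the old accept states and it is not itself accepting, the simulation starts directly from that set.

```python
from itertools import product

def reverse_dfa(delta, start, accepting):
    """Return (reversed transitions, initial state set, accept state set)."""
    reversed_delta = {}
    for (q, a), r in delta.items():
        # The arrow q --a--> r in M becomes r --a--> q in M'.
        reversed_delta.setdefault((r, a), set()).add(q)
    # ε-moves from the new start state lead to the old accept states.
    return reversed_delta, set(accepting), {start}

def nfa_accepts(nfa, w):
    reversed_delta, initial, accept = nfa
    current = set(initial)
    for a in w:
        current = {q for r in current for q in reversed_delta.get((r, a), ())}
    return bool(current & accept)

def dfa_accepts(delta, start, accepting, w):
    q = start
    for a in w:
        q = delta[(q, a)]
    return q in accepting

# Example DFA (assumed): strings over {a, b} that end in "ab".
ENDS_IN_AB = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
REVERSED = reverse_dfa(ENDS_IN_AB, 0, {2})

def all_strings(alphabet, max_len):
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)
```

For this example the reversed machine should accept exactly the strings that begin with ba.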

3. We say a string w = w_1w_2...w_n is a shuffle of strings u and v if there exists J ⊆ {1, ..., n} such that (w_j)_{j∈J} = u and (w_j)_{j∉J} = v. For example, CSS17PR2ING07 is a shuffle of the strings CS172 and SPRING07 and in fact, there are two sets, J = {1, 2, 4, 5, 8} and J = {1, 3, 4, 5, 8}, which work here.

We then define the shuffle of two languages A and B as

S(A, B) = {w | ∃ u ∈ A, v ∈ B s.t. w is a shuffle of u and v}

Show that if A and B are regular languages over a common alphabet Σ, then so is S(A, B). [20 points]

Solution: Let M_A = (Q_A, Σ, δ_A, q_{0A}, F_A) and M_B = (Q_B, Σ, δ_B, q_{0B}, F_B) be two DFAs accepting the languages A and B respectively. Then we define an NFA M = (Q, Σ, δ, q_0, F) for S(A, B) as follows.

Let Q = Q_A × Q_B, q_0 = (q_{0A}, q_{0B}) and F = F_A × F_B. Define δ((q_A, q_B), s) = {(δ_A(q_A, s), q_B)} ∪ {(q_A, δ_B(q_B, s))}, i.e., at each step, the machine changes q_A according to δ_A or q_B according to δ_B. It reaches a state in F_A × F_B if and only if the moves according to δ_A take it from q_{0A} to a state in F_A, and the ones according to δ_B take it from q_{0B} to a state in F_B. Hence M accepts exactly the language S(A, B).

(12 points for designing the machine and 8 for the argument.)
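The product NFA can be simulated by tracking the set of reachable pairs (q_A, q_B). The sketch below assumes two small example DFAs over {a, b}, one for (ab)* and one for a^+, chosen only for illustration, and compares the NFA against a brute-force search over all index sets J.

```python
from itertools import combinations, product

# Each DFA is (delta, start, accepting); both examples are assumptions.
DFA_A = ({(0, "a"): 1, (0, "b"): 2, (1, "a"): 2, (1, "b"): 0,
          (2, "a"): 2, (2, "b"): 2}, 0, {0})          # (ab)*
DFA_B = ({(0, "a"): 1, (0, "b"): 2, (1, "a"): 1, (1, "b"): 2,
          (2, "a"): 2, (2, "b"): 2}, 0, {1})          # a^+

def run_dfa(dfa, w):
    delta, q, accepting = dfa
    for ch in w:
        q = delta[(q, ch)]
    return q in accepting

def shuffle_accepts(dfa_a, dfa_b, w):
    """Simulate the NFA on Q_A x Q_B: each symbol advances one component."""
    (da, sa, fa), (db, sb, fb) = dfa_a, dfa_b
    current = {(sa, sb)}
    for ch in w:
        current = ({(da[(p, ch)], q) for p, q in current}
                   | {(p, db[(q, ch)]) for p, q in current})
    return any(p in fa and q in fb for p, q in current)

def shuffle_brute_force(dfa_a, dfa_b, w):
    """Try every index set J; u is w restricted to J, v is the rest."""
    n = len(w)
    for size in range(n + 1):
        for chosen in combinations(range(n), size):
            picked = set(chosen)
            u = "".join(w[i] for i in range(n) if i in picked)
            v = "".join(w[i] for i in range(n) if i not in picked)
            if run_dfa(dfa_a, u) and run_dfa(dfa_b, v):
                return True
    return False

def all_strings(alphabet, max_len):
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)
```

The nondeterministic choice at each step corresponds to deciding whether the current symbol's position belongs to J.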




1. Let k be a positive integer. Let Σ = {0, 1}, and L be the language consisting of all strings over {0, 1} containing a 1 in the kth position from the end (in particular, all strings of length less than k are not in L). [8 + 8 + 14 = 30 points]

(a) Construct a DFA with exactly 2^k states that recognizes L.

Solution: Construct a DFA with one state corresponding to every k-bit string. Formally, let Q = {0, 1}^k. We keep track of the last k bits read by the machine. Thus, for a state x_1...x_k, we define the transition on reading the bit b as δ(x_1...x_k, b) = x_2...x_k b. Take q_0 = 0^k and F = {x_1...x_k ∈ Q | x_1 = 1}. Note that this does not accept any strings of length less than k (as we start with the all-zero state) since the kth position from the end does not exist, and is hence not 1.

(b) Construct an NFA with exactly k + 1 states that recognizes L.

Solution: We construct an NFA with Q = {0, 1, ..., k}, with the names of the states corresponding to how many of the last k bits the NFA has seen. Define δ(0, 0) = {0}, δ(0, 1) = {0, 1} and δ(i − 1, 0) = δ(i − 1, 1) = {i} for 2 ≤ i ≤ k. We set q_0 = 0 and F = {k}. The machine starts in state 0; on seeing a 1 it may guess that it is the kth bit from the end and proceed to state 1. It then reaches state k and accepts if and only if there are exactly k − 1 bits following the one on which it moved from 0 to 1.

(c) Prove that any DFA that recognizes L has at least 2^k states.

Solution: Consider any two different k-bit strings x = x_1...x_k and y = y_1...y_k and let i be some position such that x_i ≠ y_i (there must be at least one). Hence, one of the strings contains a 1 in the ith position, while the other contains a 0. Let z = 0^{i−1}. Then z distinguishes x and y, as exactly one of xz and yz has the kth bit from the end as 1. Since there are 2^k binary strings of length k, which are all mutually distinguishable by the above argument, any DFA for the language must have at least 2^k states.
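The constructions in parts (a) and (b) and the distinguishing suffix from part (c) can be sketched together. The function names are illustrative, and the DFA state is stored as a Python string of the last k bits; this is a check on small cases, assuming k ≥ 1, rather than a proof.

```python
from itertools import product

def dfa_accepts(w: str, k: int) -> bool:
    """Part (a): the state is the window of the last k bits, starting at 0^k."""
    state = "0" * k
    for b in w:
        state = state[1:] + b
    return state[0] == "1"

def nfa_accepts(w: str, k: int) -> bool:
    """Part (b): states 0..k; state 0 loops and may guess on reading a 1."""
    current = {0}
    for b in w:
        nxt = set()
        for q in current:
            if q == 0:
                nxt.add(0)
                if b == "1":
                    nxt.add(1)
            elif q < k:
                nxt.add(q + 1)  # state k has no outgoing transitions
        current = nxt
    return k in current

def in_L(w: str, k: int) -> bool:
    return len(w) >= k and w[-k] == "1"

def distinguishing_suffix(x: str, y: str) -> str:
    """Part (c): z = 0^(i-1), where i is the first position at which x and y differ."""
    i = next(pos for pos in range(len(x)) if x[pos] != y[pos])  # 0-indexed
    return "0" * i

def binary_strings(max_len: int):
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)
```

The DFA needs one state per window, which is where the count 2^k comes from, while the NFA only counts how many symbols have passed since its guess.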

2. [10 + 10 + 10 = 30 points]

(a) Let A be the set of strings over {0, 1} that can be written in the form 1^k y where y contains at least k 1s, for some k ≥ 1. Show that A is a regular language.

Solution: It is easy to see that any string in A must start with a 1, and contain at least one other 1 (in the matching y segment). Conversely, any string that starts with a 1 and contains at least one other 1 matches the description for k = 1. Hence, A is described by the regular expression 1 ◦ 0* ◦ 1 ◦ (0 ∪ 1)*, and is therefore regular.

(b) Let B be the set of strings over {0, 1} that can be written in the form 1^k 0 y where y contains at least k 1s, for some k ≥ 1. Show that B is not a regular language.

Solution: Assume to the contrary that B is regular. Let p be the pumping length given by the pumping lemma. Consider the string s = 1^p 0^p 1^p ∈ B. The pumping lemma guarantees that s can be split into 3 pieces s = abc, where |ab| ≤ p and |b| ≥ 1. Hence, b = 1^i for some i ≥ 1. Then, by the pumping lemma, ab^2c = 1^{p+i} 0^p 1^p ∈ B. But this string cannot be written in the form specified: k would have to equal p + i, while only p 1s follow the first 0. This is a contradiction.


(c) Let C be the set of strings over {0, 1} that can be written in the form 1^k z where z contains at most k 1s, for some k ≥ 1. Show that C is not a regular language.

Solution: Assume to the contrary that C is regular. Let p be the pumping length given by the pumping lemma. Consider the string s = 1^p 0^p 1^p ∈ C. The pumping lemma guarantees that s can be split into 3 pieces s = abc, where |ab| ≤ p and |b| ≥ 1. Hence, b = 1^i for some i ≥ 1. Then, by the pumping lemma, ac = 1^{p−i} 0^p 1^p ∈ C. But this string cannot be written in the form specified: for any k ≤ p − i, the remainder z contains at least p > k 1s. This is a contradiction.
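The claims in this problem can be spot-checked by writing the three definitions as brute-force membership tests. In this sketch the names `in_A`, `in_B`, `in_C` and the sample value p = 5 are illustrative assumptions; the check for (b) and (c) only illustrates what pumping does to one string and is not a proof of non-regularity.

```python
import re
from itertools import product

A_REGEX = re.compile(r"10*1[01]*")  # the expression 1 ◦ 0* ◦ 1 ◦ (0 ∪ 1)*

def in_A(w: str) -> bool:
    return any(w.startswith("1" * k) and w[k:].count("1") >= k
               for k in range(1, len(w) + 1))

def in_B(w: str) -> bool:
    return any(w.startswith("1" * k + "0") and w[k + 1:].count("1") >= k
               for k in range(1, len(w) + 1))

def in_C(w: str) -> bool:
    return any(w.startswith("1" * k) and w[k:].count("1") <= k
               for k in range(1, len(w) + 1))

def binary_strings(max_len: int):
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def pumped_up(p: int, i: int) -> str:
    """ab^2c for s = 1^p 0^p 1^p with b = 1^i."""
    return "1" * (p + i) + "0" * p + "1" * p

def pumped_down(p: int, i: int) -> str:
    """ac for s = 1^p 0^p 1^p with b = 1^i."""
    return "1" * (p - i) + "0" * p + "1" * p
```

With p = 5, the string 1^5 0^5 1^5 passes both `in_B` and `in_C`, while every `pumped_up(5, i)` fails `in_B` and every `pumped_down(5, i)` fails `in_C` for 1 ≤ i ≤ 5.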

3. Write regular expressions for the following languages: [12 + 8 = 20 points]

(a) The set of all binary strings such that every pair of adjacent 0’s appears before any pair of adjacent 1’s.

Solution: Using R(L) to denote the regular expression for the given language L, we must have R(L) = R(L_1)R(L_2), where L_1 is the language of all strings that do not contain any pair of adjacent 1’s and L_2 is the language of all strings that do not contain any pair of adjacent 0’s. For a string in L_1, every occurrence of a 1, except possibly the last one, must be followed by a 0. Hence, R(L_1) = (0 + 10)*(1 + ε). Similarly, R(L_2) = (1 + 01)*(0 + ε). Thus, R(L) = (0 + 10)*(1 + ε)(1 + 01)*(0 + ε), which simplifies to (0 + 10)*(1 + 01)*(0 + ε).

(b) The set of all binary strings such that the number of 0’s in the string is divisible by 5.

Solution: Any string in the language must be composed of 0 or more blocks, each having exactly five 0’s and an arbitrary number of 1’s between them. This is given by the regular expression (1*01*01*01*01*01*)*. However, this does not capture the non-empty strings consisting only of 1’s, which can be included separately, giving the expression (1*01*01*01*01*01*)* + 1* for the language.
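Both expressions can be compared with their defining properties on all short strings. The sketch below transcribes them into Python `re` syntax (with `|` for + and `0?` for (0 + ε)); the helper names are illustrative, and for part (a) the property is read as "no occurrence of 11 starts before an occurrence of 00", which is an assumption about the intended meaning.

```python
import re
from itertools import product

# (a) (0 + 10)*(1 + 01)*(0 + ε)
PAIRS_REGEX = re.compile(r"(?:0|10)*(?:1|01)*0?")
# (b) (1*01*01*01*01*01*)* + 1*
ZEROS_REGEX = re.compile(r"(?:1*01*01*01*01*01*)*|1*")

def zero_pairs_before_one_pairs(w: str) -> bool:
    """True iff no '11' occurs at an earlier position than some '00'."""
    return not any(
        w[i:i + 2] == "11" and w[j:j + 2] == "00"
        for i in range(len(w)) for j in range(i + 1, len(w))
    )

def zeros_divisible_by_5(w: str) -> bool:
    return w.count("0") % 5 == 0

def binary_strings(max_len: int):
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)
```

Using `fullmatch` matters here: both expressions describe whole strings, not substrings.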

4. We say a string x is a proper prefix of a string y if there exists a non-empty string z such that xz = y. For a language A, we define the following operation

NOEXTEND(A) = {w ∈ A | w is not a proper prefix of any string in A}

Show that if A is regular, then so is NOEXTEND(A). [20 points]

Solution: Given a DFA for the language A, we want to accept only those strings which reach a final state, but to which no non-empty string can be added to reach a final state again. Hence, we want to accept strings ending in exactly those final states from which there is no (directed) path of length at least one to any final state (not even itself).

For a given state q ∈ F, we can check if there is a path from q to any state in F (or a cycle involving q) by a DFS. Let F′ ⊆ F be the set of all the states from which there is no such path. Then changing the set of final states of the DFA to F′ gives a DFA for NOEXTEND(A).
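The computation of F′ can be sketched as a depth-first search from the successors of each final state. The dictionary encoding and the example DFA for the finite language A = {a, ab, b} are assumptions chosen for illustration; for that language NOEXTEND(A) = {ab, b}.

```python
def noextend_accepting(delta, alphabet, accepting):
    """Return F': the final states with no path of length >= 1 to a final state."""
    def can_reach_accepting(q):
        # Start from the successors of q so that only non-empty paths count.
        stack = [delta[(q, a)] for a in alphabet]
        seen = set()
        while stack:
            r = stack.pop()
            if r in seen:
                continue
            seen.add(r)
            if r in accepting:
                return True
            stack.extend(delta[(r, a)] for a in alphabet)
        return False

    return {q for q in accepting if not can_reach_accepting(q)}

def accepts(delta, start, accepting, w):
    q = start
    for ch in w:
        q = delta[(q, ch)]
    return q in accepting

# Example DFA (assumed) for A = {a, ab, b} over {a, b}.
DELTA = {
    ("s", "a"): "qa", ("s", "b"): "qb",
    ("qa", "a"): "dead", ("qa", "b"): "qab",
    ("qab", "a"): "dead", ("qab", "b"): "dead",
    ("qb", "a"): "dead", ("qb", "b"): "dead",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}
F = {"qa", "qab", "qb"}
F_PRIME = noextend_accepting(DELTA, "ab", F)
```

In this example the state reached on a is dropped from the final states because ab extends a within A.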




1. Define C to be all strings consisting of some positive number of 0’s, followed by some string twice, followed again by some positive number of 0’s. For example, 1100 is not in C, since it does not start with at least one 0. However, 0001011010000000 is in C since it is three 0’s, followed by 101 twice, followed by seven 0’s. Prove that C is not regular. [10 points]

Solution: We will show that there are infinitely many strings, any two of which are distinguishable with respect to C. This will mean there are infinitely many indistinguishability classes. By the Myhill-Nerode Theorem, we can then conclude that C is not regular.

Our strings will be 0 1^k 0 for each natural number k ≥ 1. Let k_1 and k_2 be distinct natural numbers. We claim that the suffix 1^{k_1}00 distinguishes 0 1^{k_1} 0 from 0 1^{k_2} 0. First, 0 1^{k_1} 0 1^{k_1} 0 0 is in C. If 0 1^{k_2} 0 1^{k_1} 0 0 were in C, then it must be 0ss0 or 0ss00 for some string s. In either case ss contains a zero, so s must contain at least one zero, and hence ss contains at least two. Thus 0 1^{k_2} 0 1^{k_1} 0 0 must be 0ss0. So s must end with a 0, and that is the only 0 in s. But then s must be both 1^{k_2}0 and 1^{k_1}0. This is impossible since those strings have different lengths. So each 0 1^k 0 is in a different indistinguishability class and C is not regular.

(4 points for giving the correct strings, 6 points for arguing distinguishability)

2. Let A be the set of all binary strings which, when interpreted as a number with the most significant bit on the left, are divisible by 5. We know the language is regular from a previous homework. Construct an optimal DFA for A and prove its optimality by giving pairwise distinguishable strings, equal in number to the number of states in your DFA. [10 points]

Solution: The following DFA with 5 states recognizes the language (as proved in HW 1). We claim that any two binary strings, which when interpreted as numbers have different remainders modulo 5, are distinguishable. This would imply that there must be at least 5 equivalence classes for the indistinguishability relation and hence the DFA here is optimal.

[Figure: state diagram of the DFA with states 0, 1, 2, 3, 4, one per remainder modulo 5, and transitions labeled 0 and 1.]

To prove the claim, consider any two strings (thinking of them as numbers) x and y. Let x ≡ r_1 (mod 5) and y ≡ r_2 (mod 5), with r_1 ≠ r_2. Let w be the number 5 − r_1, written using four bits. Then

xw ≡ (2^4 x + 5 − r_1) ≡ [(2^4 mod 5)(x mod 5) + 5 − r_1] ≡ 0 (mod 5)

On the other hand,

yw ≡ [(2^4 mod 5)(y mod 5) + 5 − r_1] ≡ r_2 − r_1 ≢ 0 (mod 5)

Hence, the two strings can be distinguished.

(3 points for giving the DFA, 2 points for giving the equivalence classes and 5 points for showing the distinguishability.)
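The distinguishing suffix w = 5 − r_1, written with four bits, can be checked numerically. This is a small sketch with illustrative function names; it relies on the fact used in the argument that appending four bits multiplies the number read so far by 2^4, and 2^4 mod 5 = 1.

```python
def distinguishing_suffix(r1: int) -> str:
    """The number 5 - r1 written using four bits, e.g. r1 = 2 gives '0011'."""
    return format(5 - r1, "04b")

def distinguishes(x: int, y: int) -> bool:
    """Check that the suffix for x's remainder puts xw in A but not yw."""
    w = distinguishing_suffix(x % 5)
    xw = int(format(x, "b") + w, 2)
    yw = int(format(y, "b") + w, 2)
    return xw % 5 == 0 and yw % 5 != 0
```

For example, with x = 7 (binary 111, remainder 2) the suffix is 0011, and 1110011 is 115, a multiple of 5.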


3. Consider the language F = {a^i b^j c^k | i, j, k ≥ 0 and if i = 1 then j = k}. [4 + 4 + 2 = 10 points]

(a) Show that F acts like a regular language in the pumping lemma, i.e., give a pumping length p and show that F satisfies the conditions of the lemma for this p.

Solution: The pumping lemma says that for any string s in the language with length at least the pumping length p, we can write s = xyz with |xy| ≤ p and |y| ≥ 1, such that xy^iz is also in the language for every i ≥ 0.

For the given language, we can take p = 2. Consider any string a^i b^j c^k in the language of length at least 2. If i = 1 or i > 2, we take x = ε and y = a. If i = 1, we must have j = k, and removing the a or adding any number of a’s still preserves the membership in the language. For i > 2, all strings obtained by pumping y as defined above have two or more a’s and hence are always in the language.

For i = 2, we can take x = ε and y = aa. Since the strings obtained by pumping in this case always have an even number of a’s, they are all in the language. Finally, for the case i = 0, we take x = ε, and y = b if j > 0 and y = c otherwise. Since strings of the form b^j c^k are always in the language, we satisfy the conditions of the pumping lemma in this case as well.

(1 point for handling each of the cases.)

(b) Show that F is not regular.

Solution: We claim all strings of the form ab^i must be in distinct equivalence classes for all i ≥ 0. This is because any two strings ab^{i_1} and ab^{i_2} with i_1 ≠ i_2 can be distinguished by c^{i_1}, since ab^{i_1}c^{i_1} ∈ F, while ab^{i_2}c^{i_1} ∉ F. Since there are infinitely many equivalence classes of the indistinguishability relation, we conclude by the Myhill-Nerode theorem that no DFA can recognize F.

(2 points for giving a set of distinguishable strings and 2 points for arguing the distinguishability.)

(c) Why is this not a contradiction?

Solution: The pumping lemma only says that if a language is regular, then it must satisfy the conditions of the lemma. However, this does not necessarily mean that no non-regular language can satisfy these conditions. (2 points)
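The case analysis in part (a) and the distinguishing strings in part (b) can be checked mechanically on small instances. In the sketch below, `in_F`, `decompose`, `pumps`, and `members_of_F` are illustrative names, and the bound on exponents and pumping powers is an arbitrary assumption; the code tests finitely many cases and does not replace the argument.

```python
import re
from itertools import product

def in_F(s: str) -> bool:
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    if m is None:
        return False
    i, j, k = (len(group) for group in m.groups())
    return i != 1 or j == k

def decompose(s: str):
    """Split s = xyz as in part (a), with x = ε and p = 2."""
    i = len(s) - len(s.lstrip("a"))  # number of leading a's
    if i == 2:
        y = "aa"
    elif i >= 1:
        y = "a"
    else:
        y = s[0]                     # b if j > 0, otherwise c
    return "", y, s[len(y):]

def pumps(s: str, max_power: int = 5) -> bool:
    x, y, z = decompose(s)
    return (len(x + y) <= 2 and len(y) >= 1
            and all(in_F(x + y * n + z) for n in range(max_power + 1)))

def members_of_F(max_exp: int):
    """Yield the strings a^i b^j c^k in F of length >= 2 with exponents <= max_exp."""
    for i, j, k in product(range(max_exp + 1), repeat=3):
        s = "a" * i + "b" * j + "c" * k
        if len(s) >= 2 and in_F(s):
            yield s
```

Part (b) shows up as `in_F("abbcc")` being true while `in_F("abbbcc")` is false: the suffix cc separates abb from abbb.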

4. Show that for any positive integer m, there exists a language Am such that:

(a) There is a DFA with m states which recognizes Am.

(b) No DFA with less than m states recognizes Am.

[10 points]

Solution 1: Consider the language A_m = {1^{m−2}} over the alphabet Σ = {0, 1}. A DFA which has states 0, ..., m − 2 for counting the number of 1s seen so far, and an additional “crash state” into which it enters on ever seeing a 0 or more than m − 2 1s, is a DFA with m states which recognizes this language. On the other hand, any two strings of the form 1^i, 1^j for 0 ≤ i < j ≤ m − 1 are distinguished by 1^{m−2−i}, and hence any DFA must have at least m states.

Solution 2: Let A_m = {1^k | m divides k} over Σ = {1}. A DFA with m states which simply stores the number of 1s seen so far, modulo m, recognizes this language. Also, for any two strings 1^{k_1} and 1^{k_2} such that k_1 ≢ k_2 mod m, the string 1^{m−(k_1 mod m)} distinguishes the two. Hence, any two strings in which the number of 1s is different modulo m must be in different equivalence classes, showing that no DFA with fewer than m states can recognize this language.

(5 points for exhibiting the first condition and 5 points for the second.)




1. (Sipser, Problem 3.13) A Turing machine with stay put instead of left is similar to an ordinary Turing machine, but the transition function has the form

δ : Q × Γ → Q × Γ × {R, S}

At each point the machine can move its head right or let it stay in the same position. Show that this Turing machine variant is not equivalent to the usual version. (Hint: Show that these machines only recognize regular languages). [20 points]

Solution: It is easy to see that we can simulate any DFA on a Turing machine with stay put instead of left. The only non-trivial modification is to add transitions from states in F to q_accept upon reading a blank, and from states outside F to q_reject upon reading a blank.

Next, we start with a Turing machine M = (Q, Σ, Γ, δ, q_0, q_accept, q_reject) with stay put instead of left, and show how we can construct a DFA (Q′, Σ′, δ′, q′_0, F) that recognizes the same language. The intuition here is that M cannot move left and cannot read anything it has written on the tape as soon as it moves right, and therefore it has essentially only one-way access to its input, much like a DFA.

First, we modify M as follows; note that these changes do not affect the language it recognizes.

• Add a new symbol so that M never writes blanks on the tape; instead, M writes the new symbol when it’s going to write blanks, and we extend the transition function so that upon reading this new symbol, it behaves as though it read a blank.

• When M transitions into q_reject or q_accept, the reading head moves right (and never stays put).

Set Q′ = Q, Σ′ = Σ, q′_0 = q_0, and consider the transition function:

δ′(q, σ) =
    q, if q ∈ {q_accept, q_reject};
    q_reject, if M, starting at state q and reading σ, keeps staying put forever;
    q′, otherwise, where q′ is the state M enters when it first moves right upon starting at state q and reading σ

(for q ∈ Q and σ ∈ Σ). Observe that there are finitely many state-symbol pairs, so M either ends up staying put and looping, or eventually moves right, and thus δ′ is well-defined. Finally, we define F to be the set containing q_accept and all states q ∈ Q, q ≠ q_accept, q_reject, such that M starting at q and reading blanks eventually enters q_accept.

2. (Sipser, Problem 3.18) Show that a language is decidable iff some enumerator enumerates the language in lexicographic order. [15 points]

Solution: If A is decidable by some TM M, the enumerator operates by generating the strings in lexicographic order, testing each in turn for membership in A using M, and printing the string if it is in A.


If A is enumerable by some enumerator E in lexicographic order, we consider two cases. If A is finite, it is decidable because all finite languages are decidable (just hardwire each of the strings into the TM). If A is infinite, a TM M that decides A operates as follows. On receiving input w, M runs E to enumerate all strings in A in lexicographic order until some string lexicographically after w appears. This must occur eventually because A is infinite. If w has appeared in the enumeration already, then accept; else reject.

Note: It is necessary to consider the case where A is finite separately because the enumerator may loop without producing additional output when it is enumerating a finite language. As a result, we end up showing that the language is decidable without using the enumerator for the language to construct a decider. This is a subtle, but essential point.
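For the infinite case, the decider built from an enumerator can be sketched with a Python generator standing in for E. The example language {0^n 1^n | n ≥ 0} and the function names are assumptions for illustration, and "lexicographic order" is taken to mean shorter strings first, then dictionary order within a length.

```python
from itertools import count

def enumerate_0n1n():
    """Hypothetical enumerator E for the infinite language {0^n 1^n : n >= 0}."""
    for n in count():
        yield "0" * n + "1" * n

def order_key(s: str):
    # Shorter strings come first; ties are broken by dictionary order.
    return (len(s), s)

def decide(w: str, enumerator) -> bool:
    """Run the enumerator until w appears or some later string appears."""
    for s in enumerator():
        if s == w:
            return True
        if order_key(s) > order_key(w):
            return False
    # Reached only if the generator stops, which cannot happen when the
    # language is infinite; a real enumerator for a finite language might
    # instead loop forever, which is why that case is handled separately.
    return False
```

The loop always terminates on an infinite language because only finitely many strings precede w in this order.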

3. Say that string x is a prefix of string y if a string z exists where xz = y, and say that x is a proper prefix of y if in addition x ≠ y. A language is prefix-free if it doesn’t contain a proper prefix of any of its members. Let

PrefixFreeREX = {R|R is a regular expression where L(R) is prefix-free}

Show that PrefixFreeREX is decidable. [15 points]

Solution: We construct a TM that decides PrefixFreeREX as follows [1]. On input R, reject if R is not a valid regular expression. Otherwise, construct a DFA D for the language L(R) (refer to chapter 1 of Sipser for the algorithm that constructs an equivalent NFA for L(R) from R, and for the algorithm that converts an NFA to a DFA). By running a DFS starting from q_0, we can remove all states that are not reachable from q_0 from the automaton.

Finally, for each accept state q, we run a DFS starting from q and check if another accept state (not equal to q) is reachable from q, or if there is a loop from q to itself. If any such paths or loops are found, reject. Otherwise, accept. Note that it is first required to remove all the states (actually, just accepting states) not reachable from q_0 as these states cannot lead to any string being in the language.

4. Let Non-Empty be the following language

Non-Empty = {⟨M⟩ | M accepts some string}.

Show that Non-Empty is Turing recognizable. [10 points]

Solution: We simply proceed as in the construction of an enumerator from a Turing machine: simulate M on all strings of length at most i for i steps, and keep increasing i. We accept if the computation of M accepts some string. If L(M) is non-empty, we are certain that for some i our machine will halt and accept.
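The dovetailing loop can be sketched as follows. Here `accepts_within(w, steps)` is a hypothetical bounded simulator that reports whether M accepts w within the given number of steps, and `toy_machine` is a stand-in invented for the example, not a real Turing machine simulator. As in the solution, the search never returns when the machine accepts nothing.

```python
from itertools import count, product

def strings_up_to(alphabet: str, max_len: int):
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def find_accepted_string(accepts_within, alphabet: str = "01") -> str:
    """Recognizer for Non-Empty: halts (returning a witness) iff M accepts some string."""
    for i in count(1):
        # Stage i: run M for i steps on every string of length at most i.
        for w in strings_up_to(alphabet, i):
            if accepts_within(w, i):
                return w

def toy_machine(w: str, steps: int) -> bool:
    # Hypothetical stand-in: "accepts" only 1101, and only after 7 steps;
    # on every other input it is treated as running forever.
    return w == "1101" and steps >= 7
```

Bounding both the string length and the step count by i is what keeps each stage finite, so a machine that loops on some inputs cannot block the search.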

[1] Note that PrefixFreeREX can contain regular expressions for infinite languages. For instance, take R = 0*1.

