Page 1: Notes on generating functions in automata theorymathstat.carleton.ca/~bsteinbg/classes/automata2009... · December 5, 2009 Contents 1 Introduction: Calculus can count 1 2 Formal power

Notes on generating functions in automata theory

Benjamin Steinberg

December 5, 2009

Contents

1 Introduction: Calculus can count

2 Formal power series

3 Rational power series
3.1 Rational power series and linear recurrences
3.2 Newton’s identities

4 Regular languages and generating functions
4.1 Unambiguous regular expressions
4.1.1 Unambiguous regular expressions and rationality
4.2 A linear algebraic approach

1 Introduction: Calculus can count

Let $L = \{0,1\}^* \setminus \{0,1\}^* 11 \{0,1\}^*$. This is a regular language. Suppose you would like to know how many words of length $n$ belong to this language. It turns out that Taylor series from calculus can help us. Let’s first try to count this with bare-hands methods. Let $a_n$ be the number of words of length $n$ in $L$. Evidently $a_0 = 1$, since the empty word belongs to $L$. Also $0, 1 \in L$, so $a_1 = 2$. How about length 2? Well, $00, 01, 10 \in L$ but $11 \notin L$, so $a_2 = 3$. Next look at length 3. We have $000, 010, 100, 001, 101$, so $a_3 = 5$. We can’t go on like this forever, so let’s try to be smart. Any word $w \in L$ of length at least 2 must end either in 0 or in 01. More precisely, $w$ must be of the form $u0$ or $v01$ with


$u, v \in L$. Now there are $a_{n-1}$ words of length $n$ in $L$ ending in 0 and $a_{n-2}$ words of length $n$ in $L$ ending in 01. Therefore, we have

\[
a_n = a_{n-1} + a_{n-2} \quad (n \ge 2), \qquad a_0 = 1,\ a_1 = 2. \tag{1.1}
\]
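The recurrence (1.1) can be sanity-checked against a direct enumeration for small $n$. The following is only a sketch; the function names `count_no_11` and `a` are chosen here for illustration and do not come from the notes.

```python
from itertools import product

def count_no_11(n):
    """Count binary words of length n with no factor 11, by enumeration."""
    return sum("11" not in "".join(w) for w in product("01", repeat=n))

def a(n):
    """n-th term of the recurrence a_n = a_{n-1} + a_{n-2}, a_0 = 1, a_1 = 2."""
    prev, cur = 1, 2
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, prev + cur
    return cur

# Compare the two counts for the first few lengths.
print([count_no_11(n) for n in range(8)])
print([a(n) for n in range(8)])
```

The enumeration takes time exponential in $n$, which is exactly why a recurrence (and later a closed formula) is worth having.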

This is essentially the Fibonacci sequence, except that the Fibonacci sequence is given by 1, 1, 2, 3, 5, 8, ..., so our sequence starts from the second element of the Fibonacci sequence. It turns out to be more convenient to calculate a formula for the Fibonacci sequence. We define the Fibonacci sequence $\{f_n\}$ formally by

\[
f_n = f_{n-1} + f_{n-2} \quad (n \ge 2), \qquad f_0 = 1,\ f_1 = 1. \tag{1.2}
\]

So $a_n = f_{n+1}$. Therefore, to obtain a formula for $a_n$, we just need to get a formula for $f_n$.

How can we get an explicit formula for $f_n$? The extremely clever idea (essentially going back to the 1700s or 1800s) is to encode the sequence via a Taylor series (or, as mathematicians prefer to call it, a power series). So let $g(x) = \sum_{n=0}^{\infty} f_n x^n$. This is called the generating function of the sequence $\{f_n\}$. Elementary calculus says that

\[
f_n = \frac{g^{(n)}(0)}{n!}
\]

so if we can identify $g$, we may be able to use derivatives to calculate the $f_n$. Actually, in most cases we can identify $g$ with a function whose power series we know well. The most typical example is the geometric series

\[
\frac{1}{1 - ax} = 1 + ax + (ax)^2 + \cdots = \sum_{n=0}^{\infty} (ax)^n. \tag{1.3}
\]

We can get more examples by differentiating or integrating.

OK, back to our “Fibonacci” sequence $\{f_n\}$. Consider its generating function

\[
g(x) = \sum_{n=0}^{\infty} f_n x^n = 1 + x + 2x^2 + 3x^3 + 5x^4 + 8x^5 + \cdots.
\]

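For the heck of it, let’s compute $g(x)(1 - x - x^2)$. Since

\begin{align}
x\,g(x) &= f_0 x + f_1 x^2 + f_2 x^3 + \cdots = \sum_{n=0}^{\infty} f_n x^{n+1} = \sum_{n=1}^{\infty} f_{n-1} x^n \tag{1.4} \\
x^2 g(x) &= f_0 x^2 + f_1 x^3 + \cdots = \sum_{n=0}^{\infty} f_n x^{n+2} = \sum_{n=2}^{\infty} f_{n-2} x^n \tag{1.5}
\end{align}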


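we get the equality

\begin{align}
g(x)(1 - x - x^2) &= \sum_{n=0}^{\infty} f_n x^n - \sum_{n=1}^{\infty} f_{n-1} x^n - \sum_{n=2}^{\infty} f_{n-2} x^n \notag \\
&= f_0 + f_1 x - f_0 x + \sum_{n=2}^{\infty} (f_n - f_{n-1} - f_{n-2})\, x^n. \tag{1.6}
\end{align}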

Using the recursive formula (1.2) and $f_0 = 1 = f_1$, (1.6) becomes

\[
g(x)(1 - x - x^2) = 1,
\]

yielding the formula

\[
g(x) = \frac{1}{1 - x - x^2}. \tag{1.7}
\]

Now you can see that the choice $1 - x - x^2$ was not at all random. According to (1.4), multiplying $g(x)$ by $x$ has the effect of lowering the indices by 1, while (1.5) shows that multiplying by $x^2$ lowers the indices by 2. Since our recursion expresses the coefficients of $g(x)$ in terms of the previous two indices, our polynomial $1 - x - x^2$ does exactly the job of killing off all but the initial terms.

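Now, let us find a partial fraction decomposition of $\frac{1}{1 - x - x^2}$. The roots of $1 - x - x^2$ are $\frac{-1 \pm \sqrt{5}}{2}$. Let $\alpha = \frac{-1 + \sqrt{5}}{2}$ and $\beta = \frac{-1 - \sqrt{5}}{2}$. Then

\[
\frac{1}{1 - x - x^2} = \frac{1}{(x - \alpha)(\beta - x)} = \frac{-1}{(x - \alpha)(x - \beta)} = \frac{A}{x - \alpha} + \frac{B}{x - \beta}.
\]

This gives us the equations

\[
A + B = 0, \qquad A\beta + B\alpha = 1.
\]

So $A = -B$ and $B(-\beta + \alpha) = 1$. But $-\beta + \alpha = \sqrt{5}$, so $B = \frac{1}{\sqrt{5}}$ and $A = \frac{-1}{\sqrt{5}}$.

Therefore, we obtain

\begin{align*}
\frac{1}{1 - x - x^2} &= \frac{1}{\sqrt{5}} \left( \frac{-1}{x - \alpha} + \frac{1}{x - \beta} \right) = \frac{1}{\sqrt{5}} \left( \frac{\alpha^{-1}}{1 - \alpha^{-1} x} - \frac{\beta^{-1}}{1 - \beta^{-1} x} \right) \\
&= \frac{1}{\sqrt{5}} \left( \sum_{n=0}^{\infty} (\alpha^{-1})^{n+1} x^n - \sum_{n=0}^{\infty} (\beta^{-1})^{n+1} x^n \right) \\
&= \frac{1}{\sqrt{5}} \sum_{n=0}^{\infty} \left[ (\alpha^{-1})^{n+1} - (\beta^{-1})^{n+1} \right] x^n.
\end{align*}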


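To obtain our final answer, we first need to compute $\alpha^{-1}$ and $\beta^{-1}$. In fact,

\begin{align*}
\alpha^{-1} &= \frac{2}{-1 + \sqrt{5}} = \frac{2}{-1 + \sqrt{5}} \left( \frac{-1 - \sqrt{5}}{-1 - \sqrt{5}} \right) = \frac{1 + \sqrt{5}}{2}, \\
\beta^{-1} &= \frac{2}{-1 - \sqrt{5}} = \frac{2}{-1 - \sqrt{5}} \left( \frac{-1 + \sqrt{5}}{-1 + \sqrt{5}} \right) = \frac{1 - \sqrt{5}}{2}.
\end{align*}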

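Putting it all together, we obtain

\[
g(x) = \frac{1}{\sqrt{5}} \sum_{n=0}^{\infty} \left[ \left( \frac{1 + \sqrt{5}}{2} \right)^{n+1} - \left( \frac{1 - \sqrt{5}}{2} \right)^{n+1} \right] x^n.
\]

The formula for the $n$th Fibonacci number is then given by

\[
f_n = \frac{1}{\sqrt{5}} \left[ \left( \frac{1 + \sqrt{5}}{2} \right)^{n+1} - \left( \frac{1 - \sqrt{5}}{2} \right)^{n+1} \right].
\]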

The amazing thing about this formula is that despite all the $\sqrt{5}$’s, the answer is always an integer! The number $\varphi = \frac{1 + \sqrt{5}}{2}$ is called the Golden Mean (look it up on Google!). This number fascinated the ancient Greeks, as well as Leonardo da Vinci (it even appears in The Da Vinci Code!). Our formula says

\[
f_n = \frac{\varphi^{n+1} - (1 - \varphi)^{n+1}}{\sqrt{5}}. \tag{1.8}
\]
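Formula (1.8) is easy to test numerically. The sketch below evaluates it in floating point and rounds to the nearest integer, which is an assumption that is only safe for moderate $n$ (double precision eventually loses the last digits); the names `fib_closed` and `fib_rec` are illustrative.

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2  # the Golden Mean

def fib_closed(n):
    """Formula (1.8), rounded to the nearest integer to absorb float error."""
    return round((PHI ** (n + 1) - (1 - PHI) ** (n + 1)) / sqrt(5))

def fib_rec(n):
    """f_n from the recurrence (1.2), with f_0 = f_1 = 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print([fib_closed(n) for n in range(10)])
print([fib_rec(n) for n in range(10)])
print(fib_rec(31) / fib_rec(30), PHI)  # ratio of consecutive terms vs. the Golden Mean
```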

In fact, the ratio of consecutive Fibonacci numbers converges to the Golden Mean.

Theorem 1.1. The ratio of consecutive Fibonacci numbers converges to the Golden Mean. That is,

\[
\lim_{n \to \infty} \frac{f_{n+1}}{f_n} = \varphi.
\]

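Proof. First observe that $|1 - \varphi| < 1$. Therefore, by (1.8), we have

\begin{align*}
\lim_{n \to \infty} \frac{f_{n+1}}{f_n} &= \lim_{n \to \infty} \frac{\varphi^{n+2} - (1 - \varphi)^{n+2}}{\varphi^{n+1} - (1 - \varphi)^{n+1}} \\
&= \lim_{n \to \infty} \frac{\varphi^{n+2}}{\varphi^{n+1}} \qquad (\text{since } |1 - \varphi| < 1) \\
&= \varphi
\end{align*}

as required.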


The plan for the rest of these notes is as follows. First we develop the general theory of power series and generating functions. In particular, we focus on the class of rational generating functions. Then we show that the generating function of a regular language is a rational generating function.

2 Formal power series

We begin by properly defining a power series. In these notes we won’t be concerned about the convergence of these series, although the radius of convergence does give you important information about the growth of the coefficients.

Definition 2.1 (Formal power series). A formal power series is a formal sum $f(x) = \sum_{n=0}^{\infty} a_n x^n$ where the $a_n$ are real numbers.

Two power series $f(x) = \sum_{n=0}^{\infty} a_n x^n$ and $g(x) = \sum_{n=0}^{\infty} b_n x^n$ are said to be equal if their coefficients agree, that is, $a_n = b_n$ for all $n \ge 0$. Since we don’t consider convergence, it doesn’t make sense to evaluate $f$ at a real number, with the exception of the point $x = 0$. The number $f(0) = a_0$ is clearly well defined.

A polynomial is a formal power series with only finitely many non-zero terms. We often identify constant polynomials with real numbers. In particular, the 0 power series is the power series with all coefficients 0, whereas the power series 1 is the power series with constant term 1 and all other terms 0.

One can define the derivative of a formal power series in the obvious way: $f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$. Of course $f^{(n)}(x)$, the $n$th derivative of $f$, is defined by taking $n$ derivatives.

It is then a formal calculation to verify that Taylor’s formula holds.

Theorem 2.2 (Taylor’s Formula). If $f(x) = \sum_{n=0}^{\infty} a_n x^n$, then

\[
a_n = \frac{f^{(n)}(0)}{n!}.
\]

This formula should not be confused with Taylor’s theorem from calculus, which gives a good bound on the error of approximating a function by a Taylor polynomial.

One can add power series in the usual way. If $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots$ and $g(x) = b_0 + b_1 x + b_2 x^2 + \cdots$, then

\[
f(x) + g(x) = a_0 + b_0 + (a_1 + b_1) x + (a_2 + b_2) x^2 + \cdots.
\]


In formulas, we have

\[
f(x) + g(x) = \sum_{n=0}^{\infty} (a_n + b_n) x^n.
\]

The negative of a power series is obtained by negating all the terms: $-f(x) = -a_0 - a_1 x - a_2 x^2 - \cdots$.

Multiplication of power series is a bit more complicated. If $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots$ and $g(x) = b_0 + b_1 x + b_2 x^2 + \cdots$, then

\[
f(x) g(x) = a_0 b_0 + (a_0 b_1 + a_1 b_0) x + (a_0 b_2 + a_1 b_1 + a_2 b_0) x^2 + \cdots
\]

This boils down to the formula

\[
f(x) g(x) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} a_m b_{n-m} x^n. \tag{2.1}
\]

What this formula says is that to get the coefficient of $x^n$ you look at all pairs of numbers $k, \ell$ with $k + \ell = n$ and add up the corresponding products $a_k b_\ell$.

As an example, consider $(1 - x)(1 + x + x^2 + \cdots)$. Playing with the product symbolically, we obtain $1 + x - x + x^2 - x^2 + \cdots = 1$. Let’s try to do this rigorously using (2.1), with $f(x) = 1 - x$ and $g(x) = 1 + x + x^2 + \cdots$. Here we have $a_0 = 1$, $a_1 = -1$, $a_n = 0$ for $n \ge 2$, and all $b_n = 1$. The coefficient of $x^0$ is just $a_0 b_0 = 1$. For $n \ge 1$, the coefficient of $x^n$ reduces to $a_0 b_n + a_1 b_{n-1} = b_n - b_{n-1} = 1 - 1 = 0$. Therefore, $f(x) g(x) = 1$. This shows the power series $1 - x$ is invertible, or more precisely

\[
\frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n.
\]
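Formula (2.1) translates directly into code on truncated coefficient lists. This is a minimal sketch: a series is represented by the list of its first coefficients (an assumption of this example, with missing entries read as 0), and `mul` is a name chosen here.

```python
def mul(a, b, N):
    """Coefficients of x^0..x^N of the product of two series, via formula (2.1).

    a and b are coefficient lists (constant term first); entries beyond the
    end of a list are treated as 0.
    """
    coef = lambda s, i: s[i] if i < len(s) else 0
    return [sum(coef(a, m) * coef(b, n - m) for m in range(n + 1))
            for n in range(N + 1)]

N = 10
one_minus_x = [1, -1]
geometric = [1] * (N + 1)        # 1 + x + x^2 + ... truncated after x^N
print(mul(one_minus_x, geometric, N))
```

Truncating at degree $N$ loses nothing here, since the coefficient of $x^n$ in a product only involves coefficients of degree at most $n$ in each factor.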

Definition 2.3 (Invertible power series). We say that a power series $f(x)$ is invertible if there is a power series $g(x)$ such that $f(x) g(x) = 1$.

Suppose that $f(0) = 0$. Then $f(x) g(x)$ evaluated at 0 is $f(0) g(0) = 0$. Therefore, $f(x) g(x) \ne 1$. The upshot is that we have just shown that the constant term of an invertible power series must be non-zero. It turns out that a power series $f(x)$ is invertible precisely when $f(0) \ne 0$.

To prove this we would like to show that if $f$ is a power series, then

\[
\sum_{n=0}^{\infty} f^n = \frac{1}{1 - f}.
\]


But let’s not be too hasty. For instance, if $f(x) = 1 - x$ then

\[
\sum_{n=0}^{\infty} f^n = 1 + (1 - x) + (1 - 2x + x^2) + (1 - 3x + 3x^2 - x^3) + \cdots
\]

and so the constant term is a sum of infinitely many 1’s, an impossibility.

The problem here is that $f(x)$ has a non-zero constant term. Suppose that $f(0) = 0$, so $f(x) = a_1 x + a_2 x^2 + \cdots$. Then $f(x)^n = a_1^n x^n + \cdots$, where all the other terms have higher order than $n$. So if you try to compute $1 + f + f^2 + \cdots$ you will never have to add up infinitely many real numbers, and so the power series $\sum_{n=0}^{\infty} f^n$ makes sense. In fact, the coefficient of $x^n$ in $1 + f + f^2 + \cdots$ agrees with the coefficient of $x^n$ in $1 + f + \cdots + f^n$ because $f^{n+1}, f^{n+2}$, etc., only contribute terms of higher order than $n$.

So assume that $f(0) = 0$ and let us compute $(1 - f)(1 + f + f^2 + \cdots)$. The constant term is clearly 1 (since $f(0) = 0$). Formally, we have

\[
1 - f + f - f^2 + f^2 - \cdots = 1.
\]

More rigorously, if we want to show that the coefficient of $x^n$ is 0 in this product for $n \ge 1$, it suffices to compute the coefficient of $x^n$ in $(1 - f)(1 + f + \cdots + f^n)$, since $f^{n+1}$, etc., only contribute terms of higher order. But a telescoping argument yields

\[
(1 - f)(1 + f + \cdots + f^n) = 1 - f + f - f^2 + \cdots - f^n + f^n - f^{n+1} = 1 - f^{n+1}
\]

and since all terms of $f^{n+1}$ have order at least $n + 1$, we see that

\[
(1 - f)(1 + f + \cdots + f^n)
\]

has 0 as the coefficient of $x^n$. This allows us to rigorously conclude that $1/(1 - f) = 1 + f + f^2 + \cdots$. We record this as a proposition.

Proposition 2.4. Suppose $f$ is a power series with $f(0) = 0$. Then

\[
\frac{1}{1 - f} = \sum_{n=0}^{\infty} f^n.
\]

Now we are ready to complete our characterization of invertible power series.

Theorem 2.5. A power series $f(x) = \sum_{n=0}^{\infty} a_n x^n$ is invertible if and only if $a_0 \ne 0$, i.e., $f(0) \ne 0$.


Proof. We already saw that if $f(0) = 0$, then $f$ is not invertible. Conversely, suppose $a_0 = f(0) \ne 0$. Clearly $f$ is invertible if and only if $f/a_0$ is invertible, so we may assume without loss of generality that $f(0) = 1$. Let $g(x) = 1 - f(x)$. Notice that $1 - g = f$. Since $g(0) = 0$, Proposition 2.4 shows

\[
\sum_{n=0}^{\infty} g^n = \frac{1}{1 - g} = \frac{1}{f}.
\]

This completes the proof.
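In practice the inverse can be computed one coefficient at a time. The sketch below does not follow the geometric-series construction of the proof; instead it solves $f(x) h(x) = 1$ coefficient by coefficient using (2.1), which gives $h_0 = 1/a_0$ and $h_n = -\frac{1}{a_0} \sum_{m=1}^{n} a_m h_{n-m}$. The function name `inverse` and the list representation of a series are assumptions of this example.

```python
from fractions import Fraction

def inverse(a, N):
    """First N+1 coefficients of 1/f, where f has coefficient list a, a[0] != 0."""
    if a[0] == 0:
        raise ValueError("constant term is zero: not invertible (Theorem 2.5)")
    coef = lambda i: a[i] if i < len(a) else 0
    h = [Fraction(1, 1) / a[0]]
    for n in range(1, N + 1):
        # Coefficient of x^n in f*h must vanish: a_0 h_n + sum_{m>=1} a_m h_{n-m} = 0
        s = sum(coef(m) * h[n - m] for m in range(1, n + 1))
        h.append(-s / a[0])
    return h

# 1/(1 - x - x^2): the coefficients should be the Fibonacci numbers of (1.7)
print([int(c) for c in inverse([1, -1, -1], 8)])
```

Exact rational arithmetic (`Fraction`) is used so that a leading coefficient other than 1 does not introduce rounding error.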

Now we can formally define a generating function.

Definition 2.6 (Generating function). If $\{a_n\}_{n=0}^{\infty}$ is a sequence of numbers, the generating function for the sequence is the power series

\[
f(x) = \sum_{n=0}^{\infty} a_n x^n.
\]

Exercise 1. Verify the following properties of power series.

1. f + g = g + f

2. (f + g) + h = f + (g + h)

3. f + 0 = f

4. f − f = 0

5. 1f = f

6. (fg)h = f(gh)

7. f(g + h) = fg + fh

Exercise 2. Prove Taylor’s formula.

Exercise 3. Show that every formal power series is a generating function.


3 Rational power series

The simplest type of power series is a polynomial. Just as quotients of integers are called rational numbers, quotients of polynomials are called rational functions.

Definition 3.1 (Rational power series). A power series $f(x)$ is rational if there are polynomials $p(x), q(x)$ with $q(0) \ne 0$ such that

\[
f(x) = \frac{p(x)}{q(x)}.
\]

The condition $q(0) \ne 0$ is to guarantee that we can divide by $q(x)$. For example, the geometric series $\sum_{n=0}^{\infty} x^n$ is rational. So is the generating function of the Fibonacci sequence. In the exercises, you will be asked to verify that sums, products and inverses of rational power series are again rational.

Given a rational power series $f(x) = \frac{p(x)}{q(x)}$, you can use the method of long division and partial fractions to find the associated power series.

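Example 3.2. Let’s find the power series for $f(x) = \frac{x + 8}{x^2 + x - 6}$. Well,

\[
f(x) = \frac{x + 8}{(x - 2)(x + 3)} = \frac{A}{x - 2} + \frac{B}{x + 3}.
\]

So $x + 8 = A(x + 3) + B(x - 2)$. Here’s a neat trick: subbing in $x = 2$ gives $10 = 5A$ so $A = 2$; subbing in $x = -3$ gives $5 = -5B$ so $B = -1$. Therefore,

\[
f(x) = \frac{2}{x - 2} - \frac{1}{x + 3}.
\]

We now do some algebraic rearrangement to make things look like a geometric sum; in the first term multiply top and bottom by $-\frac{1}{2}$ and in the second multiply top and bottom by $\frac{1}{3}$. We obtain

\begin{align*}
f(x) = -\frac{1}{1 - \frac{x}{2}} - \frac{1}{3} \left( \frac{1}{1 - (-\frac{1}{3} x)} \right) &= -\sum_{n=0}^{\infty} \frac{1}{2^n} x^n - \frac{1}{3} \sum_{n=0}^{\infty} \left( -\frac{1}{3} \right)^n x^n \\
&= \sum_{n=0}^{\infty} \left[ \left( -\frac{1}{3} \right)^{n+1} - \frac{1}{2^n} \right] x^n.
\end{align*}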
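The closed form of Example 3.2 can be cross-checked by dividing the two polynomials as power series. The sketch below solves $f(x) q(x) = p(x)$ coefficient by coefficient with (2.1); `series_div` is an illustrative name, and polynomials are written as coefficient lists with the constant term first.

```python
from fractions import Fraction

def series_div(p, q, N):
    """Coefficients of x^0..x^N of p(x)/q(x), assuming q[0] != 0."""
    coef = lambda s, i: s[i] if i < len(s) else 0
    f = []
    for n in range(N + 1):
        # From f*q = p:  q_0 f_n = p_n - sum_{m=1}^{n} q_m f_{n-m}
        s = sum(coef(q, m) * f[n - m] for m in range(1, n + 1))
        f.append(Fraction(coef(p, n) - s) / q[0])
    return f

# Example 3.2: p(x) = 8 + x and q(x) = -6 + x + x^2
series = series_div([8, 1], [-6, 1, 1], 8)
closed = [Fraction(-1, 3) ** (n + 1) - Fraction(1, 2 ** n) for n in range(9)]
print(series[:3])
print(closed[:3])
```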


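Example 3.3. Let’s write $f(x) = \frac{1}{x^2 + 2x + 1}$ as a power series. Notice

\begin{align*}
f(x) = \frac{1}{(x + 1)^2} = \frac{d}{dx} \left( \frac{-1}{1 - (-x)} \right) &= \frac{d}{dx} \sum_{n=0}^{\infty} (-1)^{n+1} x^n \\
&= \sum_{n=1}^{\infty} (-1)^{n+1} n x^{n-1} \\
&= \sum_{n=0}^{\infty} (-1)^{n+2} (n + 1) x^n.
\end{align*}

Exercise 4. Prove that if $f(x), g(x)$ are rational power series, then $f(x) + g(x)$ and $f(x) g(x)$ are rational power series. If $g(0) \ne 0$, show that $\frac{f(x)}{g(x)}$ is a rational power series.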

Exercise 5. Write the following rational functions as power series.

1. $\frac{1}{x^2 - 1}$

2. $\frac{1}{(1 - x)^3}$

3. $\frac{x^2 + 2x + 3}{(1 - x)(1 - 3x)}$

3.1 Rational power series and linear recurrences

Rational power series are closely related to linear recurrences (also called linear difference equations). The rule defining the Fibonacci sequence is a linear recurrence. More formally:

Definition 3.4 (Linear recurrence). A sequence $\{a_n\}_{n=0}^{\infty}$ satisfies a linear recurrence of order $r > 0$ if there exists an integer $k \ge 0$ so that for $n \ge k$

\[
a_{n+r} = c_{r-1} a_{n+r-1} + c_{r-2} a_{n+r-2} + \cdots + c_0 a_n \tag{3.1}
\]

where $c_0, \ldots, c_{r-1}$ are real numbers.

Notice that if a sequence satisfies the recurrence (3.1), then it is uniquely determined by the terms $a_0, \ldots, a_{k+r-1}$. For instance, the Fibonacci sequence satisfies the second order recurrence $f_{n+2} = f_{n+1} + f_n$ for $n \ge 0$.

Our goal is to imitate what we did for the Fibonacci numbers to show that the generating function of a sequence with a linear recurrence is rational.


So let $\{a_n\}_{n=0}^{\infty}$ be a sequence satisfying the linear recurrence (3.1) for $n \ge k$ and let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ be the generating function. We consider the polynomial

\[
q(x) = 1 - c_{r-1} x - c_{r-2} x^2 - \cdots - c_0 x^r.
\]

Notice that $q(x)$ has degree $r$, the order of the linear recurrence. For the Fibonacci sequence, this boils down to the polynomial $1 - x - x^2$ we considered earlier. If $n \ge k$, then the coefficient of $x^{n+r}$ in

\[
f(x) q(x) = (a_0 + a_1 x + \cdots + a_{n+r-2} x^{n+r-2} + a_{n+r-1} x^{n+r-1} + a_{n+r} x^{n+r} + \cdots)(1 - c_{r-1} x - c_{r-2} x^2 - \cdots - c_0 x^r)
\]

is given by

\[
a_{n+r} - c_{r-1} a_{n+r-1} - c_{r-2} a_{n+r-2} - \cdots - c_0 a_n = 0
\]

where the last equality uses (3.1). Therefore, $f(x) q(x)$ is a polynomial $p(x)$ of degree at most $k + r - 1$ and so $f(x) = \frac{p(x)}{q(x)}$.

Suppose on the other hand that $f(x) = \sum_{n=0}^{\infty} a_n x^n$ is a rational power series and $f(x) = \frac{p(x)}{q(x)}$ with $q(x)$ a polynomial of degree $r$. By multiplying top and bottom by a scalar, we may assume $q(x) = 1 - c_{r-1} x - \cdots - c_0 x^r$ for certain constants $c_0, \ldots, c_{r-1}$. Then $f(x) q(x) = p(x)$. If $n + r$ is greater than the degree of $p(x)$, then the coefficient of $x^{n+r}$ in $f(x) q(x)$ is 0. This coefficient is $a_{n+r} - c_{r-1} a_{n+r-1} - \cdots - c_0 a_n$ by the same computations as above. Therefore, the sequence $\{a_n\}_{n=0}^{\infty}$ satisfies the order $r$ recurrence (3.1) for $n \ge \deg(p(x)) - r + 1$.

We summarize this discussion in a theorem.

Theorem 3.5. A sequence satisfies a linear recurrence if and only if its generating function is rational. More precisely, a sequence $\{a_n\}_{n=0}^{\infty}$ with generating function $f(x)$ satisfies a linear recurrence (3.1) of order $r$ if and only if

\[
f(x) = \frac{p(x)}{q(x)}
\]

where $q(x)$ has degree $r$. Moreover, the recurrence (3.1) holds for all $n \ge \deg(p(x)) - r + 1$.
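The passage from a recurrence to the rational function $p(x)/q(x)$ is mechanical, as the following sketch illustrates. It builds $q(x)$ from the constants $c_0, \ldots, c_{r-1}$ and reads off $p(x) = f(x) q(x)$ from the first $k + r$ terms of the sequence. The function name `numerator` and its argument conventions are assumptions made for this example.

```python
def numerator(initial, c, k=0):
    """Coefficients of p(x) = f(x) q(x), constant term first.

    The sequence satisfies a_{n+r} = c[r-1] a_{n+r-1} + ... + c[0] a_n for n >= k,
    and `initial` lists the terms a_0, ..., a_{k+r-1}.
    """
    r = len(c)
    q = [1] + [-c[r - j] for j in range(1, r + 1)]  # 1 - c_{r-1} x - ... - c_0 x^r
    deg = k + r - 1                                  # p has degree at most k + r - 1
    return [sum(q[j] * initial[n - j] for j in range(min(n, r) + 1))
            for n in range(deg + 1)]

# Fibonacci: f_{n+2} = f_{n+1} + f_n with f_0 = f_1 = 1, so q(x) = 1 - x - x^2
print(numerator([1, 1], [1, 1]))   # coefficients of p(x); compare with (1.7)
```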

Example 3.6. Let’s count the number $a_n$ of words of length at most $n$ over the two-letter alphabet $\{0, 1\}$ using a second order linear recurrence. Clearly


$a_0 = 1$, $a_1 = 3$. Now there are $a_{n+1} - a_n$ words of length $n + 1$. Since a word of length $n + 2$ is obtained from a word of length $n + 1$ by appending either a 0 or a 1 to the end, we have

\[
a_{n+2} = 2(a_{n+1} - a_n) + a_{n+1} = 3 a_{n+1} - 2 a_n.
\]

This is a linear recurrence of order 2 starting from $k = 0$. Then $q(x) = 1 - 3x + 2x^2$ and

\[
f(x) q(x) = (1 + 3x + a_2 x^2 + \cdots)(1 - 3x + 2x^2) = 1 + 3x - 3x = 1
\]

since the above discussion shows that the coefficient of $x^{n+2}$ in $f(x) q(x)$ is zero for $n \ge 0$, as the recurrence has order 2 and starts from $k = 0$. So

\[
f(x) = \frac{1}{1 - 3x + 2x^2} = \frac{1}{(1 - x)(1 - 2x)} = \frac{-1}{1 - x} + \frac{2}{1 - 2x}.
\]

Therefore,

\[
f(x) = \sum_{n=0}^{\infty} (2^{n+1} - 1) x^n
\]

and so $a_n = 2^{n+1} - 1$.
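As a quick check of Example 3.6, the direct count, the recurrence and the closed form can be compared for small $n$. This is only an illustrative sketch; the function names are chosen here.

```python
def words_up_to(n):
    """Number of binary words of length at most n (2^i words of each length i)."""
    return sum(2 ** i for i in range(n + 1))

def by_recurrence(n):
    """a_n from a_{n+2} = 3 a_{n+1} - 2 a_n with a_0 = 1, a_1 = 3."""
    a, b = 1, 3
    for _ in range(n):
        a, b = b, 3 * b - 2 * a
    return a

for n in range(6):
    print(n, words_up_to(n), by_recurrence(n), 2 ** (n + 1) - 1)
```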

Exercise 6. Suppose that the sequence $\{a_n\}_{n=0}^{\infty}$ is given by $a_0 = 1$, $a_1 = 5$ and the second order linear recurrence $a_{n+2} = 4 a_{n+1} - 3 a_n$ for $n \ge 0$. Use generating functions to find an explicit formula for $a_n$.

Exercise 7. Give a formula for the number of words of length at most $n$ over a $k$-letter alphabet using a second order linear recurrence.

Exercise 8. Use a simple geometric sum to count the number of words of length at most $n$ over a $k$-letter alphabet.

3.2 Newton’s identities

Let $f(x) = x^m + a_{m-1} x^{m-1} + \cdots + a_0$ be a polynomial with complex roots $r_1, \ldots, r_m$ (with multiplicities). Define a sequence $p_n$ of complex numbers, for $n \ge 1$, by $p_n = r_1^n + r_2^n + \cdots + r_m^n$. Newton gave a linear recursion for $\{p_n\}_{n=1}^{\infty}$ in terms of the coefficients of $f$. Let’s derive it.

Let $p(x) = \sum_{n=1}^{\infty} p_n x^n$ be the generating function. Consider the polynomial

\[
g(x) = x^m f\!\left(\frac{1}{x}\right) = 1 + a_{m-1} x + \cdots + a_0 x^m.
\]

Since

\[
f(x) = \prod_{i=1}^{m} (x - r_i) \tag{3.2}
\]


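we have

\[
g(x) = x^m \prod_{i=1}^{m} \left( \frac{1}{x} - r_i \right) = \prod_{i=1}^{m} (1 - r_i x).
\]

Taking logarithms gives $\log g(x) = \sum_{i=1}^{m} \log(1 - r_i x)$. So taking derivatives:

\begin{align*}
\frac{g'(x)}{g(x)} = \frac{d}{dx} \log g(x) = \sum_{i=1}^{m} \frac{-r_i}{1 - r_i x} &= -\sum_{i=1}^{m} \left( \sum_{n=0}^{\infty} r_i^{n+1} x^n \right) \\
&= -\sum_{n=1}^{\infty} (r_1^n + \cdots + r_m^n) x^{n-1} \\
&= -\frac{p(x)}{x}.
\end{align*}

Therefore, $p(x) = -\frac{x g'(x)}{g(x)}$ is a rational function. Since

\[
g'(x) = a_{m-1} + 2 a_{m-2} x + \cdots + m a_0 x^{m-1}
\]

we obtain: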

Theorem 3.7 (Newton). Let $f(x) = x^m + a_{m-1} x^{m-1} + \cdots + a_0$ be a polynomial and let $p(x)$ be the generating function for the sequence $\{p_n\}_{n=1}^{\infty}$ where $p_n$ is the sum of the $n$th powers of the roots of $f(x)$ (with multiplicity). Then

\[
p(x) = -\frac{a_{m-1} x + 2 a_{m-2} x^2 + \cdots + m a_0 x^m}{1 + a_{m-1} x + \cdots + a_0 x^m}.
\]

Consequently, $\{p_n\}_{n=1}^{\infty}$ satisfies the linear recurrence of order $m$:

\[
p_{n+m} = -a_{m-1} p_{n+m-1} - a_{m-2} p_{n+m-2} - \cdots - a_0 p_n
\]

for $n \ge 1$.

One can in fact use Theorem 3.7 to compute recursively all the $p_n$ from the coefficients of $f(x)$.
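One way to carry out this recursive computation is to compare coefficients of $x^n$ in the identity $p(x) g(x) = -x g'(x)$ from the derivation above. Writing $g(x) = \sum_j g_j x^j$, this gives $p_n = -n g_n - \sum_{j=1}^{n-1} g_j p_{n-j}$, with $g_j = 0$ for $j > m$. The sketch below assumes this reading; the name `power_sums` is illustrative.

```python
def power_sums(a, N):
    """p_1, ..., p_N for f(x) = x^m + a[m-1] x^(m-1) + ... + a[0]."""
    m = len(a)
    g = [1] + [a[m - j] for j in range(1, m + 1)]  # g(x) = x^m f(1/x)
    coef = lambda j: g[j] if j <= m else 0
    p = [0]  # p[0] is a placeholder so that p[n] holds p_n
    for n in range(1, N + 1):
        # Coefficient of x^n on both sides of p(x) g(x) = -x g'(x)
        p.append(-n * coef(n) - sum(coef(j) * p[n - j] for j in range(1, n)))
    return p[1:]

# f(x) = (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6, so p_n = 1 + 2^n + 3^n
print(power_sums([-6, 11, -6], 5))
print([1 + 2 ** n + 3 ** n for n in range(1, 6)])
```

No roots are computed at any point: the power sums come out of integer arithmetic on the coefficients alone.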

Exercise 9. Use the formula from Theorem 3.7 to determine formulas for $p_2$ and $p_3$ in terms of the coefficients of $f$.

Exercise 10. Show that if $p_n = 0$ for $n \ge 1$, then $f(x) = x^m$.

Exercise 11. Show that if $A$ is an $m \times m$ matrix such that $\mathrm{Trace}(A^n) = 0$ for all $n \ge 1$, then $A^m = 0$.
Hint: Use the previous exercise and the fact that if $f(x)$ is the characteristic polynomial of $A$, then $f(A) = 0$.


4 Regular languages and generating functions

Often it is interesting to count the number of words of each length in a language $L$. For instance, $C = \{0, 10\}$ is a prefix code. How many words of length $n$ are there in $C^*$? We shall compute this with generating functions.

Definition 4.1 (Generating function of a language). Let $L \subseteq A^*$ be a language. Then the generating function for $L$ is the power series

\[
f_L(x) = \sum_{n=0}^{\infty} a_n x^n
\]

where $a_n = |L \cap A^n|$, i.e., the number of words of length $n$ in $L$.

For instance, if $|A| = m$, then there are $m^n$ words of length $n$ and so

\[
f_{A^*} = \sum_{n=0}^{\infty} (mx)^n = \frac{1}{1 - mx}.
\]

In particular, the generating function is rational. This will always be the case for regular languages. We give two approaches.

4.1 Unambiguous regular expressions

Our first approach is via unambiguous regular expressions.

Definition 4.2 (Unambiguous regular expression). Let $L_1, L_2 \subseteq A^*$.

1. The union $L_1 + L_2$ is called unambiguous if $L_1$ and $L_2$ are disjoint.

2. The product $L_1 L_2$ is called unambiguous if each $w \in L_1 L_2$ can be uniquely written as a product $w = w_1 w_2$ with $w_i \in L_i$, $i = 1, 2$.

3. The Kleene star $L_1^*$ is called unambiguous if $L_1$ is a code. One says $L_1$ is a code if each product $L_1^n$ is unambiguous ($n \ge 0$) and the union $L_1^0 + L_1^1 + \cdots$ is a disjoint (that is, unambiguous) union.

4. A language is called unambiguously regular if it can be built from the base regular languages by finitely many applications of unambiguous union, unambiguous product and unambiguous star.

The advantage of unambiguous regular operations is that the effect of the operation on generating functions is easy to determine.


Proposition 4.3. Let $L_1, L_2 \subseteq A^*$ have respective generating functions $f_{L_1}(x)$ and $f_{L_2}(x)$. Then:

1. If $L_1 + L_2$ is an unambiguous union, then

\[
f_{L_1 + L_2}(x) = f_{L_1}(x) + f_{L_2}(x).
\]

2. If $L_1 L_2$ is an unambiguous product, then

\[
f_{L_1 L_2}(x) = f_{L_1}(x) f_{L_2}(x).
\]

3. If $L_1$ is a code, then

\[
f_{L_1^*}(x) = \frac{1}{1 - f_{L_1}(x)}.
\]

Proof. Let $a_n = |L_1 \cap A^n|$ and $b_n = |L_2 \cap A^n|$.

1. A word of length $n$ in $L_1 + L_2$ comes from either $L_1$ or $L_2$, but not both. So $|(L_1 + L_2) \cap A^n| = a_n + b_n$. Therefore, $f_{L_1 + L_2}(x) = f_{L_1}(x) + f_{L_2}(x)$.

2. A word of length $n$ in $L_1 L_2$ can be uniquely written as a product of a word of length $m$ from $L_1$ with a word of length $n - m$ from $L_2$, for some $0 \le m \le n$. So $|L_1 L_2 \cap A^n| = \sum_{m=0}^{n} a_m b_{n-m}$. Thus (2.1) implies

\[
f_{L_1 L_2}(x) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} a_m b_{n-m} x^n = f_{L_1}(x) f_{L_2}(x),
\]

as required.

3. First note that $L_1$ being a code implies $\varepsilon \notin L_1$. Therefore the constant term of $f_{L_1}(x)$ is 0 and so $1/(1 - f_{L_1}(x))$ makes sense. If $w \in L_1^*$ has length $m$, then $w \notin L_1^n$ for $n > m$. Also the smallest degree term of $f_{L_1}^n$ has degree at least $n$. So we just need to make sure that $f_{L_1^*}(x)$ agrees with $1 + f_{L_1}(x) + \cdots + f_{L_1}(x)^m$ for all terms of degree up to $m$, for each $m \ge 0$. But this follows from the previous two parts since $L_1^0 + L_1 + \cdots + L_1^m$ is an unambiguous union of unambiguous products.

Example 4.4. Let $C = \{0, 10\}$. Then $C$ is a prefix code. Clearly $f_C(x) = x + x^2$, so

\[
f_{C^*} = \frac{1}{1 - x - x^2}.
\]


We recognize this from (1.7) as the generating function for the Fibonacci numbers and so we know that the number of words of length $n$ in $C^*$ is the $n$th Fibonacci number $f_n$. In particular, (1.8) gives an explicit formula for the number of words of length $n$. Notice that $C^*(\varepsilon + 1)$ is the language of all words that do not contain a factor 11. Indeed, $C^*$ consists of the empty word together with all words ending in 0 with no factor 11, and the product then breaks things up into those words in $C^*$ and those words ending in 1. This is an unambiguous regular expression and so

\[
f_{C^*(\varepsilon + 1)} = \frac{1 + x}{1 - x - x^2}.
\]
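The count in Example 4.4 can be confirmed by brute force for small $n$. The sketch below decides membership in $C^*$ with a simple dynamic program over prefixes and compares the counts with the Fibonacci numbers; the helper names `in_star`, `count_star` and `fib` are chosen for this illustration.

```python
from itertools import product

def in_star(w, code):
    """True if the word w factors as a product of words from `code`."""
    ok = [True] + [False] * len(w)   # ok[i]: the prefix w[:i] lies in code*
    for i in range(1, len(w) + 1):
        ok[i] = any(ok[i - len(c)] for c in code
                    if len(c) <= i and w[i - len(c):i] == c)
    return ok[len(w)]

def count_star(code, n, alphabet="01"):
    """Number of words of length n over `alphabet` that lie in code*."""
    return sum(in_star("".join(w), code) for w in product(alphabet, repeat=n))

def fib(n):
    """f_n with f_0 = f_1 = 1, as in (1.2)."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print([count_star(["0", "10"], n) for n in range(10)])
print([fib(n) for n in range(10)])
```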

Example 4.5. A composition of a natural number $n > 0$ is an ordered sequence of positive integers $(m_1, \ldots, m_k)$ such that $m_1 + \cdots + m_k = n$. Let’s compute a formula for the number of compositions of $n$. Consider the infinite prefix code $C = \{a^k b \mid k \ge 0\} = a^* b$. For $n > 0$ there is a bijection between words of length $n$ in $C^*$ and compositions of $n$ that sends the composition $(m_1, \ldots, m_k)$ of $n$ to the word $(a^{m_1 - 1} b)(a^{m_2 - 1} b) \cdots (a^{m_k - 1} b)$ of length $n$ (what is the inverse?). The regular expression $a^* b$ is unambiguous so the generating function for $C$ is

\[
f_C = \frac{x}{1 - x}.
\]

Thus we have

\begin{align*}
f_{C^*} = \frac{1}{1 - f_C} = \frac{1}{\frac{1 - 2x}{1 - x}} = \frac{1 - x}{1 - 2x} = \frac{1 - 2x + x}{1 - 2x} &= 1 + \frac{x}{1 - 2x} \\
&= 1 + \sum_{n=0}^{\infty} 2^n x^{n+1} = 1 + \sum_{n=1}^{\infty} 2^{n-1} x^n.
\end{align*}

Therefore, there are 2n−1 compositions of n.
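For small $n$ the compositions can simply be listed, which also lets one see the encoding of Example 4.5 in action. This is a sketch; `compositions` and `to_word` are names introduced here.

```python
def compositions(n):
    """Yield every composition of n as a tuple of positive integers."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

def to_word(comp):
    """The word (a^{m_1 - 1} b) ... (a^{m_k - 1} b) attached to a composition."""
    return "".join("a" * (m - 1) + "b" for m in comp)

for comp in compositions(3):
    print(comp, to_word(comp))
print([len(list(compositions(n))) for n in range(1, 8)])
```

Each word produced has length $n$ and ends in $b$, and distinct compositions give distinct words, in line with the bijection described above.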

Exercise 12. Find the generating function $f_L(x)$ and a formula for the number of words of length $n$ in $L^*$ for $L = \{0, 10, 11\}$.

Exercise 13. Find a formula for the number of words of length $n$ in the regular language $1^* 0 1^*$. Make sure to justify that you are only using unambiguous products and stars.

4.1.1 Unambiguous regular expressions and rationality

Let us observe that the generating functions for the base regular languages are polynomials.

• f∅(x) = 0.


• f{ε} = 1.

• f{a} = x, a ∈ A.

It now follows from Proposition 4.3 and Exercise 4 that any regular language that is unambiguously regular has a rational generating function. Our next theorem, which is an improvement on Kleene’s theorem, says that each regular language is in fact unambiguously regular. The argument is an alternative proof of Kleene’s theorem.

Theorem 4.6. Any regular language is unambiguously regular.

Proof. Let $\mathcal{A} = (S, A, \iota, \delta, T)$ be a deterministic finite state automaton recognizing $L$. For $p, q \in S$, let $L_{p,q}$ be the set of non-empty words recognized by the automaton $\mathcal{A}_{p,q} = (S, A, p, \delta, \{q\})$. Then

\[
L = \bigcup_{t \in T} L_{\iota,t} + \Delta_{\iota,T}
\]

where

\[
\Delta_{\iota,T} = \begin{cases} \{\varepsilon\} & \text{if } \iota \in T \\ \emptyset & \text{else.} \end{cases}
\]

Moreover, this union is unambiguous since $\mathcal{A}$ is deterministic and so a word can bring the initial state to at most one terminal state. So it suffices to show that each $L_{p,q}$ with $p, q \in S$ is unambiguously regular.

For $Q \subseteq S$ and $p, q \in S$, define $L_{p,Q,q}$ to be the set of all non-empty words that label paths from $p$ to $q$ which only pass through states in $Q$, except perhaps the $p$ at the beginning and the $q$ at the end. Then $L_{p,q} = L_{p,S,q}$. We prove that $L_{p,Q,q}$ is unambiguously regular for each $Q \subseteq S$, $p, q \in S$, by induction on $|Q|$. If $Q = \emptyset$, then $L_{p,Q,q}$ is just the set of labels of edges from $p$ to $q$, and so is a subset of $A$ and hence unambiguously regular. Assume the result is true for $|Q| = n$ and now suppose $|Q| = n + 1$. Then $Q = P + \{r\}$ for some state $r \notin P$. The idea is now similar to our old proof of Kleene’s theorem. We break paths up according to whether they go through $r$ or not. Then

\[
L_{p,Q,q} = L_{p,P,q} + L_{p,P,r} L_{r,P,r}^* L_{r,P,q}. \tag{4.1}
\]

By induction, $L_{p,P,q}$, $L_{p,P,r}$, $L_{r,P,r}$ and $L_{r,P,q}$ are unambiguously regular. The union in (4.1) is unambiguous since words in $L_{p,P,q}$ do not pass through $r$ when going from $p$ to $q$, while all words in $L_{p,P,r} L_{r,P,r}^* L_{r,P,q}$ do. The language $L_{r,P,r}$ is a prefix code since it does not contain the empty word and $r \notin P$ implies no proper prefix of an element of $L_{r,P,r}$ belongs to the language. So $L_{r,P,r}^*$ is an


unambiguous star. Finally, the product $L_{p,P,r} L_{r,P,r}^* L_{r,P,q}$ is unambiguous since if $w$ goes from $p$ to $q$ through $r$, it has a unique prefix $x$ that visits $r$ for the first time, a unique suffix $z$ that visits $r$ for the last time, and $w = xyz$ where $y$ reads from $r$ to $r$ going through $Q = P + \{r\}$. It follows that $x \in L_{p,P,r}$, $y \in L_{r,P,r}^*$ and $z \in L_{r,P,q}$, and this is the unique factorization of this sort. This completes the induction and the proof of the theorem.

Corollary 4.7. The generating function of any regular language is a rational function. In particular, the number of words of length $n$ in a regular language must satisfy a linear recurrence.

Because of the close relationship between regular languages and rational generating functions, some books call regular languages rational languages. However, there are languages with rational generating function that are not regular. For instance, $L = \{0^n 1^n \mid n \ge 0\}$ is not regular. This language has exactly one word of every even length and no words of odd length. So its generating function

\[
f_L(x) = 1 + x^2 + x^4 + \cdots = \frac{1}{1 - x^2}
\]

is rational. In fact this language has the same generating function as $(0^2)^*$. To obtain the proper relationship between regular languages and rational functions, one has to consider generating functions in several non-commuting variables, which is beyond the scope of this course.

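Example 4.8. Let’s compute a formula for the number of words of length $n$ in the language $L = \{0, 1\}^* 01 \{0, 1\}^*$. First we need an unambiguous regular expression. A deterministic automaton accepting this language is

[Figure: a three-state deterministic automaton. The initial state has a loop labeled 1 and an edge labeled 0 to the second state; the second state has a loop labeled 0 and an edge labeled 1 to the third state; the third state is accepting and has a loop labeled 0, 1.]

from which we obtain the unambiguous regular expression $1^* 0 0^* 1 (0 + 1)^*$. Therefore, the generating function for $L$ is given by

\[
f_L = \frac{1}{1 - x} \cdot x \cdot \frac{1}{1 - x} \cdot x \cdot \frac{1}{1 - 2x} = \frac{x^2}{(1 - x)^2 (1 - 2x)}.
\]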


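Using the method of partial fractions, one computes

\[
\begin{aligned}
f_L(x) &= \frac{-1}{(1 - x)^2} + \frac{1}{1 - 2x} \\
&= -\frac{d}{dx}\left(\frac{1}{1 - x}\right) + \frac{1}{1 - 2x} \\
&= -\sum_{n=1}^{\infty} n x^{n-1} + \sum_{n=0}^{\infty} 2^n x^n \\
&= \sum_{n=0}^{\infty} \bigl(-(n + 1) + 2^n\bigr) x^n.
\end{aligned}
\]

Thus $|L \cap A^n| = 2^n - n - 1$.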
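The closed form can be sanity-checked numerically. The following Python sketch is not part of the original notes, and the helper names `series_div` and `brute_force_count` are hypothetical. It expands $x^2/((1-x)^2(1-2x))$ as a power series by long division and compares the coefficients with $2^n - n - 1$ and with a direct enumeration of the words over $\{0,1\}$ that contain the factor 01.

```python
from itertools import product


def series_div(num, den, n_terms):
    """Return the first n_terms coefficients of num/den as a power series.

    num and den are coefficient lists (index = exponent). We assume
    den[0] == 1, so that den is an invertible power series.
    """
    assert den[0] == 1
    coeffs = []
    for n in range(n_terms):
        c = num[n] if n < len(num) else 0
        # Solve sum_k den[k] * coeffs[n - k] = num[n] for coeffs[n].
        for k in range(1, min(n, len(den) - 1) + 1):
            c -= den[k] * coeffs[n - k]
        coeffs.append(c)
    return coeffs


def brute_force_count(n):
    """Count the words of length n over {0, 1} that contain the factor 01."""
    return sum("01" in "".join(w) for w in product("01", repeat=n))


# x^2 / ((1 - x)^2 (1 - 2x)); the denominator expands to 1 - 4x + 5x^2 - 2x^3.
numerator = [0, 0, 1]
denominator = [1, -4, 5, -2]

coeffs = series_div(numerator, denominator, 12)
closed_form = [2**n - n - 1 for n in range(12)]
enumerated = [brute_force_count(n) for n in range(12)]
print(coeffs == closed_form == enumerated)
```

The long division only needs the constant term of the denominator to be invertible, which is the same condition the notes use for inverting power series.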

Exercise 14. Let $G$ be a finite group with identity $e$. Let $f : A^* \to G$ be an onto homomorphism. Show that the generating function for the word problem $L = f^{-1}(\{e\})$ is rational. This result is used in probability theory: from it they deduce that the Green's function of a random walk on a finite group is rational.

4.2 A linear algebraic approach

An alternate approach, which works quite well for automata with small numbers of states, is via linear algebra. Let $\mathcal{A} = (S, A, q_1, \delta, T)$ be a deterministic finite state automaton accepting a language $L$. Let $S = \{q_1, \dots, q_m\}$ where $q_1$ is the initial state. Then, for each $i$, we define $L_{q_i}$ to be the language of the automaton $(S, A, q_i, \delta, T)$; so $L_{q_i}$ consists of all words $w \in A^*$ such that $q_i w \in T$. In particular, $L_{q_1} = L$. Let $f_i = f_{L_{q_i}}$ be the generating function of $L_{q_i}$; so $f_1 = f_L$. The generating functions $f_1, \dots, f_m$ are closely related, as we shall see momentarily. Let us first observe that if $q_i$ is a fail state, then no word labels a path from $q_i$ to a terminal state and so $L_{q_i} = \emptyset$, whence $f_i = 0$. Thus we can omit the fail states in what follows (i.e., work with partial deterministic automata).

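If $f$ is a generating function, let us write $\langle f, x^n \rangle$ to denote the coefficient of $x^n$ in $f(x)$. So if $f(x) = \sum_{n=0}^{\infty} a_n x^n$, then $\langle f, x^n \rangle = a_n$. Then notice that

\[
\langle f_i, x^0 \rangle =
\begin{cases}
1 & q_i \in T \\
0 & \text{else}
\end{cases}
\tag{4.2}
\]

since $\varepsilon \in L_{q_i}$ if and only if $q_i \in T$.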


On the other hand, if $n \ge 0$ and $w \in L_{q_i}$ is a word of length $n + 1$, then $w = au$ with $a \in A$ and $|u| = n$. Since we are dealing with a deterministic automaton, this means that $u \in L_{q_i a}$. Conversely, if $w = au$ with $a \in A$, $|u| = n$ and $u \in L_{q_i a}$, then $q_i w = q_i a u \in T$ and so $w \in L_{q_i}$. Again $q_i a$ is uniquely determined because $\mathcal{A}$ is deterministic. From this, we conclude

\[
|L_{q_i} \cap A^{n+1}| = \sum_{a \in A} |L_{q_i a} \cap A^n| = \sum_{j=1}^{m} a_{ij} \, |L_{q_j} \cap A^n|
\]

where

\[
a_{ij} = |\{a \in A \mid q_i a = q_j\}|.
\]

In other words, for $n \ge 0$,

\[
\langle f_i, x^{n+1} \rangle = \sum_{j=1}^{m} a_{ij} \langle f_j, x^n \rangle. \tag{4.3}
\]
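Recurrence (4.3), together with the initial values (4.2), translates directly into an iteration. As a rough sketch in Python (not from the original notes), the code below encodes a three-state deterministic automaton for the language of Example 4.8, builds the numbers $a_{ij}$, and iterates (4.3). The state numbering 0, 1, 2 and the names `delta`, `terminal` and `count_words` are assumptions made for illustration.

```python
# Transition table of a DFA for {0,1}*01{0,1}* (state 0 is initial).
# This encoding is an assumption chosen to match the expression
# 1*00*1(0+1)* of Example 4.8; it is not notation from the notes.
delta = {
    (0, "1"): 0, (0, "0"): 1,
    (1, "0"): 1, (1, "1"): 2,
    (2, "0"): 2, (2, "1"): 2,
}
terminal = {2}
num_states = 3

# a[i][j] = number of letters that take state i to state j.
a = [[0] * num_states for _ in range(num_states)]
for (i, _letter), j in delta.items():
    a[i][j] += 1


def count_words(max_len):
    """Return the number of accepted words of each length 0..max_len."""
    # counts[i] = number of words of the current length accepted from state i.
    counts = [1 if i in terminal else 0 for i in range(num_states)]  # (4.2)
    result = [counts[0]]
    for _ in range(max_len):
        counts = [
            sum(a[i][j] * counts[j] for j in range(num_states))
            for i in range(num_states)
        ]  # (4.3)
        result.append(counts[0])
    return result


print(a)
print(count_words(6))
```

Each step costs on the order of $m^2$ integer operations, so this is a practical way to tabulate the counts even when a closed form is awkward to derive.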

The matrix $A = (a_{ij})$ is called the adjacency matrix of $\mathcal{A}$. For example, if we consider the automaton $\mathcal{A}$ from Example 4.8, then the adjacency matrix of $\mathcal{A}$ is given by

\[
A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix} \tag{4.4}
\]

where we order the states from left to right.

Let us define

\[
\delta_i =
\begin{cases}
1 & q_i \in T \\
0 & q_i \notin T
\end{cases}
\]

i.e., $\delta_i = \langle f_i, x^0 \rangle$ by (4.2). Then equations (4.2) and (4.3) can be translated into the following linear system of equations in the unknowns $f_1, \dots, f_m$, whose coefficients are polynomials over $\mathbb{R}$:

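\[
\begin{aligned}
f_1 &= \delta_1 + x(a_{11} f_1 + a_{12} f_2 + \cdots + a_{1m} f_m) \\
&\;\;\vdots \\
f_m &= \delta_m + x(a_{m1} f_1 + a_{m2} f_2 + \cdots + a_{mm} f_m)
\end{aligned}
\]

or in matrix form $F = \Delta + xAF$ where $F = (f_1, \dots, f_m)$ and $\Delta = (\delta_1, \dots, \delta_m)$. Equivalently, we have the system of equations

\[
(I - xA) F = \Delta \tag{4.5}
\]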


where $I$ is the $m \times m$ identity matrix.

Notice that $\det(I - xA)$ is a polynomial in $x$ of degree at most $m$ with constant term $\det(I - 0A) = \det I = 1$. Thus $\det(I - xA)$ is an invertible power series and we can now apply Cramer's rule (which works over any ring provided the determinant is invertible over the ring) to conclude that

\[
f_1 = \frac{\det((I - xA)_1)}{\det(I - xA)} \tag{4.6}
\]

where $(I - xA)_1$ is the matrix obtained from $I - xA$ by replacing the first column with $\Delta$. Notice that the numerator of (4.6) is also a polynomial in $x$ of degree at most $m$, so this gives another proof that the generating function of a regular language is a rational function.

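Example 4.9. Let us revisit Example 4.8. Here $\Delta = (0, 0, 1)$, since only the last state is terminal. Using (4.4) we obtain

\[
I - xA = \begin{bmatrix} 1 - x & -x & 0 \\ 0 & 1 - x & -x \\ 0 & 0 & 1 - 2x \end{bmatrix},
\qquad
(I - xA)_1 = \begin{bmatrix} 0 & -x & 0 \\ 0 & 1 - x & -x \\ 1 & 0 & 1 - 2x \end{bmatrix}
\]

so that $\det((I - xA)_1) = x^2$ and $\det(I - xA) = (1 - x)^2 (1 - 2x)$, and we recover

\[
f_L(x) = \frac{x^2}{(1 - x)^2 (1 - 2x)}.
\]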
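The two determinants in Example 4.9 can also be computed mechanically. The sketch below (Python, not part of the original notes) represents polynomials in $x$ as coefficient lists and evaluates determinants by Laplace expansion along the first row, which is adequate for small matrices only. All function names (`poly_add`, `poly_mul`, `poly_trim`, `poly_det`, `cramer`) are hypothetical helpers introduced for illustration.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (index = exponent)."""
    n = max(len(p), len(q))
    return [(p[k] if k < len(p) else 0) + (q[k] if k < len(q) else 0)
            for k in range(n)]


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_trim(p):
    """Drop trailing zero coefficients, keeping at least the constant term."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p


def poly_det(m):
    """Determinant of a square matrix of polynomials (Laplace expansion)."""
    if len(m) == 1:
        return poly_trim(m[0][0])
    total = [0]
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        term = poly_mul(entry, poly_det(minor))
        if col % 2 == 1:
            term = [-c for c in term]
        total = poly_add(total, term)
    return poly_trim(total)


def cramer(adj, delta_vec):
    """Return (numerator, denominator) of f_1 as in (4.6)."""
    m = len(adj)
    # Entry (i, j) of I - xA is the polynomial [i == j] - a_ij * x.
    i_minus_xa = [[[1 if i == j else 0, -adj[i][j]] for j in range(m)]
                  for i in range(m)]
    # Replace the first column by the constant polynomials delta_i.
    replaced = [[[delta_vec[i]]] + row[1:] for i, row in enumerate(i_minus_xa)]
    return poly_det(replaced), poly_det(i_minus_xa)


adj = [[1, 1, 0], [0, 1, 1], [0, 0, 2]]  # matrix (4.4)
numerator, denominator = cramer(adj, [0, 0, 1])
print(numerator, denominator)
```

Here `[0, 0, 1]` stands for $x^2$ and `[1, -4, 5, -2]` for $1 - 4x + 5x^2 - 2x^3 = (1 - x)^2 (1 - 2x)$.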

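Example 4.10. This time we return to the example

\[
L = \{0,1\}^* \setminus [\{0,1\}^* 11 \{0,1\}^*]
\]

which is recognized by the automaton

$\mathcal{A}$ = [Diagram: a three-state automaton with the states drawn from left to right. The first state is initial and terminal, with a loop labeled 0 and an edge labeled 1 to the second state; the second state is terminal, with an edge labeled 0 back to the first state and an edge labeled 1 to the third state; the third state is not terminal and has a loop labeled 0,1.]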

The last state is a fail state and so does not contribute to the generating function computation. Thus we may remove it and work with the partial


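deterministic automaton

$\mathcal{B}$ = [Diagram: a two-state automaton with the states drawn from left to right. The first state is initial and terminal, with a loop labeled 0 and an edge labeled 1 to the second state; the second state is terminal, with an edge labeled 0 back to the first state.]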

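Ordering the states from left to right, we obtain the adjacency matrix

\[
A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}
\]

and $\Delta = (1, 1)$. Thus

\[
I - xA = \begin{bmatrix} 1 - x & -x \\ -x & 1 \end{bmatrix},
\qquad
(I - xA)_1 = \begin{bmatrix} 1 & -x \\ 1 & 1 \end{bmatrix}
\]

and so $\det((I - xA)_1) = 1 + x$ and $\det(I - xA) = 1 - x - x^2$. Thus we recover the generating function

\[
f_L(x) = \frac{1 + x}{1 - x - x^2}.
\]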
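Multiplying both sides by $1 - x - x^2$ and comparing coefficients gives $a_0 = 1$, $a_1 = 2$ and $a_n = a_{n-1} + a_{n-2}$ for $n \ge 2$, where $a_n$ is the number of words of length $n$ in $L$. A short Python sketch (not from the original notes; both function names are hypothetical) compares this recurrence with a direct enumeration of the binary words that avoid the factor 11.

```python
from itertools import product


def recurrence_counts(n_terms):
    """First n_terms values of a_0 = 1, a_1 = 2, a_n = a_{n-1} + a_{n-2}."""
    counts = [1, 2]
    while len(counts) < n_terms:
        counts.append(counts[-1] + counts[-2])
    return counts[:n_terms]


def brute_force_counts(n_terms):
    """Count words of each length 0..n_terms-1 over {0, 1} that avoid 11."""
    return [
        sum("11" not in "".join(w) for w in product("01", repeat=n))
        for n in range(n_terms)
    ]


print(recurrence_counts(10))
print(recurrence_counts(10) == brute_force_counts(10))
```

The first values 1, 2, 3, 5 agree with the counts found by hand in the introduction.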

Unfortunately, this linear algebra method becomes increasingly difficult to apply as the number of states grows. An alternative to Cramer's rule is to observe that (4.5) can be solved using Gaussian elimination, but when performing row reductions you can only divide by power series (in particular, polynomials) with non-zero constant term.
