analytic combinatorics: functional equations, rational and ... · functional equations—rational...

0123456789

ISS

N 0

249-

6399

appor t de r ech er ch e

INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUEAnalytic Combinatorics: Functional Equations,

Rational and Algebraic Functions

Philippe Flajolet, Robert Sedgewick

N ˚ 4103January 2001TH�EME 2

Analytic Combinatorics: Functional Equations,Rational and Algebraic Functions

Philippe Flajolet, Robert Sedgewick

Theme 2 — Genie logicielet calcul symbolique

Projet Algo

Rapport de recherche n ˚ 4103 — January 2001 — 98 pages

Abstract: This report is part of a series whose aim is to present in a synthetic way the major methodsand models in analytic combinatorics. Here, we detail the case of rational and algebraic functionsand discuss systematically closure properties, the location of singularities, and consequences regard-ing combinatorial enumeration. The theory is applied to regular and context-free languages, finitestate models, paths in graphs, locally constrained permutations, lattice paths and walks, trees, andplanar maps.

Key-words: Analytic combinatorics, functional equation, combinatorial enumeration, generatingfunction, asymptotic methods

(Resume : tsvp)

Unit�e de re her he INRIA Ro quen ourtDomaine de Volu eau, Ro quen ourt, BP 105, 78153 LE CHESNAY Cedex (Fran e)T�el�ephone : (33) 01 39 63 55 11 { T�el�e opie : (33) 01 39 63 53 30

Combinatoire analytique:Equations fonctionnelles, fonctions rationnelles et algebriques

Resume : Ce rappport fait partie d’une serie dediee a la presentation synthetique des principalesmethodes et des principaux modeles de la combinatoire analytique. On y discute en detail les fonc-tions rationnelles et algebriques, leurs proprietes decloture, la localisation des singularites et lesconsequences qui en decoulent pour les denombrements combinatoires. La theorie est appliqueeaux langages reguliers et context-free, aux modeles d’etats finis, ainsi qu’aux cheminements dansles graphes, aux permutations localement contraintes, auxmarches aleatoires, et aux cartes planaires.

Mots-cle : Combinatoire analytique, equation fonctionnelle, denombrement combinatoire, fonctiongeneratrice, methodes asymptotiques

i

ANALYTIC COMBINATORICS

Foreword

This report is part of a series whose aim is to present in a synthetic way the majormethods and models in analytic combinatorics. The whole series, after suitable editing, isdestined to be transformed into a book with the title

“Analytic Combinatorics”

The present report is

— Chapter 8,Functional Equations—Rational and Algebraic Functions.

It is part of the following collection of Research Reports from INRIA:

— Chapters 1–3, “Counting and Generating Functions”, RR 1888, 116 pages, 1993;— Chapters 4–5, “Complex Asymptotics and Generating Functions”, RR 2026, 100

pages, 1993;— Chapter 6, “Saddle Point Asymptotics”, RR 2376, 55 pages, 1994;— Chapter 7, “Mellin Transform Asymptotics”, RR 2956, 93 pages, 1996.— Chapter 9, “Multivariate Asymptotics and Limit Distributions”, RR 3162, 123

pages, May 1997.

For historical reasons, the headings of previous chapters in the series were “The AverageCase Analysis of Algorithms”.

Acknowledgements.This work was supported in part by the IST Programme of the European Unionunder contract number IST-1999-14186 (ALCOM-FT). Special thanks are due to Bruno Salvy forsharing his thoughts on the subject of algebraic asymptotics and to Pierre Nicodeme for a thoroughscan of an early version of the manuscript.

CHAPTER 8

Functional Equations—Rational and Algebraic Functions

Mathematics is infinitely wide, while the language that describes it is finite. It follows from thepigeonhole principle that there exist distinct concepts that are referred to by the same name.Mathematics is also infinitely deep and sometimes entirely different concepts turn out to be

intimately and profoundly related.

— Doron Zeilberger[101]

I wish to God these calculations had been executed by steam.

— Charles Babbage (1792-1871)

This part of the book deals with classes of generating functions implicitly defined bylinear, polynomial, or differential relations, globally referred to asfunctional equations.Functional equations arise in well defined combinatorial contexts and they lead system-atically to well-defined classes of functions. One then has available a whole arsenal ofmethods and algorithmic procedures in order to simplify equations, locate singularities,and eventually determine asymptotics of coefficients. The corresponding classes are asso-ciated with algebraic closure properties together with a strong form of analytic “regular-ity” that constrains the location and nature of singularities. We shall detail here the caseof rational functions (that arise from functional equations that are linear and from associ-ated finite-state models) and algebraic functions (that arise from polynomial systems and“context-free” decompositions). A companion chapter willtreat holonomic functions (de-fined by linear differential equations with coefficients themselves polynomial or rationalfunctions that arise from a diversity of contexts includingorder statistics).

Rational functionscome first in order of simplicity. Linear systems of equations occursystematically in all combinatorial problems that are associated with finite state models(themselves closely related to Markov chains), like paths in graphs, regular languages andfinite automata, patterns in strings, and transfer matrix models of statistical physics. Singu-larities are by nature alwayspoles. Consequently, the asymptotic analysis of coefficients ofrational functions is normally achieved via localization of poles. Although difficulties mayarise either in noncombinatorial contexts (nonpositive problems) or when dealing simulta-neously with an infinite collection of functions (for instance, the generating functions oftrees of bounded height or width), the situation is however often tractable: for positive sys-tems, general theorems derived from the classical Perron-Frobenius theory of nonnegativematrices guarantee simplicity of the dominant positive pole.

Next in order of difficulty, there come thealgebraic functionsdefined as solutionsto polynomial equations. Such functions occur in connection with the simplest nonlin-ear combinatorial models, namely the class of context-freemodels. Such models covermany types of combinatorial trees (that are related to the theory of branching processes inprobability theory) or walks (that are close to the probabilistic theory of random walks).

1

2 8. FUNCTIONAL EQUATIONS—RATIONAL AND ALGEBRAIC FUNCTIONS

Algebraic properties include the possibility of performing elimination by devices like re-sultants or Groebner basis: in this way, a system of equations can always be reduced to asingle equation. Singularities are in all cases of a simple form—they arebranch pointscor-responding locally tofractional power series, also known as Newton–Puiseux expansions.Accordingly, the asymptoticshapeof the coefficients of any given algebraic function isthen entirely predictable by singularity analysis. However, nonlinear algebraic equationsadmit several solutions and a so-called “connection problem” has to be solved by consid-eration of the global geometry of the algebraic curve definedby a polynomial equation.Again, fortunately, the situation somewhat simplifies for positive algebraic systems thatcapture most of combinatorial applications.

Last but not least there come theholonomic functionsthat encompass rational andalgebraic functions. They are defined as solutions of lineardifferential equations with ra-tional function coefficients and they arise in a number of contexts, from order statistics toregular graphs. It has been realized over the last two decades, by Stanley, Lipshitz, andZeilberger most notably, that these functions enjoy an extremely rich set of nontrivial clo-sure properties. In particular, they encapsulate most of what is known to admit of closedform in combinatorial analysis. Algebraically, holonomicfunctions are objects that canbe specified by a finite amount of information so that the identities they satisfy isdecid-able. For instance, even a restrictive consequence of this theory leads to an interestingfact summarized by Zeilberger’s aphorism “All binomial identities are trivial”. Analyt-ically, the nature of singularities is again governed by simple laws, a fact that derivesfrom turn-of-the-twentieth-century studies of singularities of linear differential equations(by Schwartz, Fuchs, Birkhoff, Poincare, and others). Theclassification involves a funda-mental dichotomy between what is known asregular singularityandirregular singularity.From there, like in the algebraic case, the asymptoticshapeof the coefficients of any givenholonomic function is predictable by singularity analysis(regular singularity) or saddlepoint analysis (irregular singularity). However, unlike in the algebraic case, theconnectionproblemis not known to be decidable and a priori quantitative bounds(based on combi-natorial reasoning) must sometimes be resorted to in order to prune singularities and com-pletely solve an asymptotic question that arises from a combinatorial problem expressedby a holonomic generating function.

Finally, we make a brief mention here of equations of the composition type. There,strong closure properties tend to fade away. However, a niceset of techniques that havebeen put to use successfully in the analysis of some important combinatorial problems.For instance, the counting of balanced trees [72] and the distributional analysis of heightin simple families of trees [41, 42] relate to analytic iteration theory. On another register,metric properties of general number representation systems and information sources [97]together with the corresponding analyses of digital trees [24] can be approached success-fully by means of functional analytic methods, especially transfer operators.

The rational, algebraic, and holonomic classes

Exact combinatorial enumeration is best expressed in termsof generating func-tions, the recurrent theme of this book. In this chapter and its companion (“FunctionalEquations—Holonomic Functions”), three major classes areidentified: the rational class,the algebraic class, and the holonomic class. Each class is aworld of its own with specificalgebraic properties and specific analytic properties.

At the level of algebra, everything is expressed in term of formal power series, sinceno consideration of convergence enters the discussiona priori. Given a domainK , what is

RATIONAL, ALGEBRAIC, HOLONOMIC CLASSES 3

denoted byK [[z℄℄ is the set of formal power series in the indeterminatez, which means thecollection of formal sums, f(z) = 1Xn=0 fnzn; fn 2 K :We shall normally takeK to be a field like the fieldC of complex numbers or the subfieldQof rational numbers. (We occasionally speak ofN[[z℄℄ orZ[[z℄℄ but these will be regarded assimply denoting particular elements ofQ[[z℄℄ or C [[z℄℄.) We then have, in order of increasingstructural complexity and richness, three major classes ofobjects,K rat [[z℄℄ � K alg [[z℄℄ � K hol [[z℄℄ � K [[z℄℄;corresponding to the rational, algebraic, and holonomic subsets ofK [[z℄℄.

— Rational seriesdenoted byK rat [[z℄℄ are defined as solution of linear equations,

(1) y 2 K rat [[z℄℄ iff y 2 K [[z℄℄ anda1(z)y + a0(z) = 0,

for some polynomialsa0; a1 2 K [z℄.— Algebraic seriesare defined as solutions of polynomial equations,

(2) y 2 K alg [[z℄℄ iff y 2 K [[z℄℄ andeXj=0 aj(z)yj = 0,

for a family of polynomialsaj 2 K [z℄.— Holonomic seriesare defined as solutions of linear differential equations,

(3) y 2 Khol [[z℄℄ iff y 2 K [[z℄℄ andeXj=0 aj(z) djdzj y = 0,

for a family of polynomialsaj 2 K [z℄.It often proves convenient to extend rings into fields. The domainK (z) is the quotient fieldof the ringK [z℄, that is the set of fractionsa=b, with a; b 2 K [z℄ and it is a simple exerciseto check that the definitions of (1), (2), (3) could have been phrased in an equivalent wayby imposing that coefficients lie inK(z) instead ofK [z℄. The quotient field ofK [[z℄℄ isdenoted byK ((z)) and is called the field of formal Laurent series. From its definition, aLaurent series contains a finite number of negative powers ofz, and a formalist might enjoyan “identity” likeK ((z)) = K [[z℄℄[1=z℄. Then, in analogy to (1), (2), (3), one can define threesubsets ofK ((z)), namely,K rat ((z));K alg ((z));K hol ((z)) by looking at solutions inK ((z))instead ofK [[z℄℄.

The definitions we have adopted are by means of a single equation. As it turn out,every class can be alternatively defined in terms ofsystems of equations. Theory grantsus the fact that systems are reducible to single equations, in each of the three cases underconsideration. The reduction can be seen as aneliminationproperty: given a system�that defines simultaneously a vector(y1; : : : ; ym) of solutions, each component,y1 say,is definable by a single equation. For linear systems, this fact is granted by Gaussianelimination and by the theory of determinants (‘Cramer’s rule’). For polynomial systems,one may appeal either to Groebner basis elimination—an algorithmic process reminiscentof Gaussian elimination— or to resultants that are related to determinants. For differentialsystems, one may either appeal to an extension of Groebner bases to differential operatorsor to a method known under the name of “cyclic vectors”; see [23].

Each class carries with it a set of closure properties, some more obvious than others.Apart from the usual arithmetic operations, there is also interest in closure under Hadamard


1. Basic objects.

Domain Notation Typical element (K is a field)

Polynomials K [z℄ eXj=0 jzj , with j 2 K K [z℄ is a ring

Rational fractions K(z) AB with A;B 2 K [z℄ K(z) is a field

Formal power series K [[z℄℄ 1Xj=0 jzj K [[z℄℄ is a ring

Formal Laurent series K((z)) 1Xj=�e jzj K((z)) is a field

2. Special classes.(Coefficientsaj may be taken in eitherK [z℄ or in K(z).)Rational series K rat [[z℄℄ solution ofa1(z)y + a0(z) = 0Algebraic series Kalg [[z℄℄ solution ofae(z)ye + � � �+ a0(z) = 0Holonomic series Khol [[z℄℄ solution ofae(z)�ey + � � �+ a0(z) = 0 (� � ddz )

3. Closure properties.

Class Elimination (System! Eq.) � � � � � RRat Gaussian elim.; determinants Y Y Y Y Y N

Alg Groebner bases; resultants Y Y Y N Y N

Hol diff. Groebner bases; cyclic vector Y Y N Y Y Y

4. Singularities and coefficient asymptotics (simplified forms).

Singularity Coefficient form

Rat (z � �)�m (m integer) nm�1��nAlg (z � �)�� (� = pq 2 Q) n��1��nHol [regular sing.] (z � �)��(log(z � �))k (� algebraic) n��1(log n)k��n

FIGURE 1. A summary of the definitions and major properties of thethree classes of functions: Rational (Rat), Algebraic (Alg), and Holo-nomic (Hol).

products, that is, termwise product of series. Differentiation and integration of formalpower series are defined in the usual way,� Xn fnzn! =Xn nfnzn�1; Z Xn fnzn! =Xn fnn+ 1zn+1:The main closure properties of each class are summarized in Figure 1.

Next comes analysis. A major theme of this book is that asymptotic forms of co-efficients are dictated by the expansions of functions at singularities. In fact functionsassociated to series that belong to any of the major three classes under consideration havea finite number of singularities and the nature of singularities is predictable.

Rational functions can only have poles for which the theory of Chapter 4 applies di-rectly. Algebraic functions enjoy a richer singular structure, where fractional exponents

1. ALGEBRA OF RATIONAL FUNCTIONS 5

occur: the corresponding expansions known since Newton arecalled Newton–Puiseux ex-pansions and the conditions of singularity analysis are automatically satisfied. However,the inherently multivalued character of algebraic functions poses a specific “connectionproblem” as one needs to select the particular branch that isassociated to any given combi-natorial problem. Holonomic functions have well-classified singularities that fall into twocategories called regular and irregular (also diversely known as first kind and second kind,or Fuchsian and non-Fuchsian). The simplest type, the regular type, introduces elementswith exponents that may be arbitrary algebraic numbers together with integral powers of alogarithm. Again, locally the conditions of singularity analysis are automatically satisfied,but again a connection problem arises—how do coefficients inthe expansion at the singu-larity relate to the initial conditions given at the origin?Typical elements of coefficientsasymptotics that are encountered in this book are(1 +p52 )n; 4np�n3 ; n(p17�3)=2;and they reflect the nature of singularities present in each case. In effect, the first element istypical of rational asymptotics (here, a simple pole) and itarises in monomer-dimer tilingsof the interval. The second one corresponds to the asymptotic form of Catalan numbers andits origin is a singular element of an algebraic function with exponent1=2. The third onecorresponds to search cost in quadtrees and the algebraic number present in the exponentis indicative of a holonomic element.

Finally, positive functions, that is, solutions of positive systems, are the ones mostlikely to show up in elementary combinatorial applications. They are structurally moreconstrained and this fact is reflected to some extent by specific singularity and coefficientasymptotics. For instance, Perron-Frobenius theory says that, under certain natural condi-tions of “irreducibility”, a unique dominant pole (that is simple) occurs in rational asymp-totics. Similar conditions on positive algebraic systems constrain the singular exponent tobe equal to12 , hence the characteristic factorn�3=2 present in the asymptotic form of somany enumerative problems.

In this report, we consider the rational and algebraic classes in turn. First, at an alge-braic level, we state an elimination theorem and establish major closure properties. Nextcomes analysis, with its batch of singular asymptotics and matching coefficient forms. Ap-plications (including regular and context-free specifications) illustrate the general theoryin some important combinatorial situations.

1. Algebra of rational functions

Rational functions are the simplest of all objects considered in this chapter. They arenaturally defined as quotients of polynomials (Definition 8.1) or alternatively as compo-nents of solutions of linear systems (Theorem 8.1) a form that is especially convenientfor combinatorial enumeration. Coefficients of rational functions satisfy linear recurrenceswith constant coefficients and they also admit an explicit form, called “exponential poly-nomial” (Theorem 8.1) that is directly related to the location and multiplicity of poles andto the corresponding asymptotic behaviour of coefficients (Theorem 8.4). Closure proper-ties are stated as Theorem 8.2 while easy combinatorial forms for coefficients of rationalfunctions are given in Theorem 8.3 The asymptotic side of rational functions is the subjectof Section 2. An important class of positive systems has poles whose location and naturecan be predicted using an extension of the Perron-Frobeniustheory of positive matrices(Theorems 8.5 and 8.6).


DEFINITION 8.1. A power seriesf(z) 2 C [[z℄℄ is said to be rational if there exist twopolynomialsP (z); Q(z), withQ(0) 6= 0, such thatQ(z)f(z)� P (z) = 0, i.e.,f(z) = P (z)Q(z) :

Clearly, we may always assume thatP andQ are relatively prime: iff = P1=Q1 andÆ(z) = g d(P1; Q1), then the representationf = P=Q with P = P1=Æ, Q = Q1=Æ isirreducible. In addition, it is always possible to normalizeQ in such a way that its constantterm is 1, that is,Q(0) = 1. Also, Euclidean division, provides the formf(z) = R(z) + bP (z)Q(z) ;whereR is a polynomial and deg( bP ) < deg(Q). This transformation fromP=Q to bP=Qonly modifies finitely many initial values of the coefficientsof f .

Observe that, with the normalizationQ(0) = 1, we havef(z) = 1Xk=0P (z)(1�Q(z))k;where the sum is well-defined in the sense of formal power series. By simple majorizationarguments (Q(0) = 1 implies that1 � Q(z) is small forz near 0), a rational series asdefined in the formal sense of Definition 8.1 always determines a function analytic at theorigin.

1.1. Characterizations and elimination. A rational series in one variable can becharacterized in a number of equivalent ways, by refinement of the definition, as a so-lution to a linear system, by the linear recurrence satisfiedby its coefficient, or by the“exponential-polynomial” form of its coefficients.

THEOREM 8.1 (Rational function characterizations).For a power seriesf(z) =Pn fnzn in C [[z℄℄, the following conditions are equivalent to rationality:(i) Normal form:there exist polynomialsP (z); Q(z) 2 C [z℄ such thatf(z) = P (z)Q(z) ;withQ(0) = 1, andP;Q are relatively prime.(ii) Elimination: there exist a vector of formal power seriesv, and a matrixT withpolynomial entries and withdet(I �T(0)) 6= 0, such that the solutiong to the systemg = v +Tg;hasg1 = f .(iii) Coefficient recurrence:there exist constants 1; 2; : : : ; r such thatfn+r = 1fn+r�1 + 2fn+r�2 + � � �+ rfn = 0;for all n greater than a fixed numberN0.(iv) Coefficients as exponential polynomials:there exist a finite set of constants,f!jg j=1,and a finite set of polynomials,fRj(n)g j=1, such thatfn = kXj=1Rj(n)!nj ;for all n greater than a fixed numberN1.

1. ALGEBRA OF RATIONAL FUNCTIONS 7

The matrixT that defines a rational function via a linear system is often called atransitionmatrix or a transfer matrix. The form (iv) for coefficients is called anexponential-polynomial. Naturally, one may always assume that the!j are “sorted”,j!1j � j!2j � � � � ,which is the key to asymptotic approximations.Proof. (i) is equivalent to the definition of a rational power series, bythe comments above.(i) =) (ii) results from the fact thatf satisfies the 1-dimensional systemf = P +(1 � Q)f . The converse property(ii) =) (i) results from Cramer’s solution of linearsystems in terms of determinants. This provides a rational form for f with denominatordet(I � T(z)) that is locally nonzero since, by assumption,det(I � T(0)) 6= 0. (Notethat this condition is in particular automatically satisfied inn the frequent case whereT (0)is nilpotent.)(i) =) (iii) arises from the identity0 = Q(z)f(z)� P (z);upon extracting the coefficient ofzn. The converse implication(iii) =) (i) results fromtranslating the recurrence into an OGF equation in the standard way (multiply byzn andsum).(i) =) (iv) is obtained by partial fraction expansion followed by coefficient extrac-tion by means of standard identity1(1� !z)r =Xn�0�n+ r � 1r � 1 �!nzn;where the binomial coefficient is a polynomial inn of exact degreer � 1. The converseimplication (iv) =) (i) results from the same identity used in the opposite direction tosynthesize the function from its coefficients,Xn�0�n+ r � 1r � 1 �!nzn = 1(1� !z)r ;where the binomial coefficients

�n+r�1r�1 � (r � 1) form a basis of the set of polynomialsC [n℄. �We have opted for the use of determinants as an approach to elimination in linear sys-

tems. An alternative is Gaussian elimination. The principle is well known: the algorithmtakes a system of linear forms and combines them linearly, soas to eliminate all variablesin succession, until a normal form,xi = �i is attained. When we discuss elimination inpolynomial systems later in this chapter (Section 4), we shall encounter similarly two ap-proaches: one, based on resultants, makes extensive use of determinants while the other,relying on Groebner bases, is somewhat reminiscent of Gaussian elimination.

1.2. Closure properties and coefficients.Rational functions satisfy several closureproperties that derive rather directly from their definition or from the characterizationsgranted by Theorem 8.1.

THEOREM 8.2 (Rational series closure).The setC rat [[z℄℄ of rational series is closedunder the operations of sum(f + g), product (f � g), quasi-inverse (defined byf 7!(1� f)�1, conditioned uponf0 = 0), differentiation (�z), composition (f Æ g, conditionedupong0 = 0), and Hadamard (termwise) product,f(z)� g(z) = Xn fnzn!� Xn gnzn! =Xn (fn � gn)zn:


Proof. Only the Hadamard closure deserves a comment: it results from the characterizationof coefficients of rational generating functions as exponential polynomials whose class isobviously closed under product. �

EXERCISE1. LetfFng = (0; 1; 1; 2; 3; 5; : : :) be the Fibonacci sequence. Ex-amine properties of the generating functionsF (m)(z) = 1Xn=0 (Fn)m zn:EXERCISE2. Develop a proof of closure under Hadamard products that isbasedon the Hadamard integral formula of p. 55

Closed form for coefficients.A combinatorial sum expression for coefficients of rationalfunctions results from the expansion of1=Q(z) as

Pk(1�Q(z))k.

THEOREM 8.3 (Rational function coefficients).Let f(z) be a rational function withf(z) = P (z)=Q(z) andQ normalized byQ(0) = 1. SetP (z) = sXj=0 pjzj ; 1�Q(z) = r1z + r2z2 + � � �+ rmzm:and define the combinatorial sums:Sn := X1k1+2k2+��+mkm=n�k1 + � � �+ kmk1; : : : ; km �(rk11 rk22 � � � rkmm ):The coefficients off are expressible in finite terms from theSn:[zn℄P (z)Q(z) = sXj=0 pjSn�j :Proof. The multinomial expansion gives1Q(z) = 11� (1�Q(z))= Xk1;k2;:::;km�k1 + � � �+ kmk1; : : : ; km �(rk11 rk22 � � � rkmm ) zk1+2k2+��+mkm ;and it suffices to multiply this expansion by the numeratorP (z). �

As a consequence, the coefficients ofP=Q are expressible as multinomial sums ofmultiplicity at mostm � 1 (in fact only the number of nonzero monomials inP matters).(Multiplicity is also called “index” in Comtet’s book [26] that provides many interestingexamples of nontrivial expansions.) Note that, by “Fatou’sLemma”, if all the coefficients[zn℄f(z) are integers, then thepj andrj are themselves integers. (See the discussion in [87,p. 264].)

EXAMPLE 1. Fibonacci numbers and binomial coefficients.The generating function ofFibonacci numbers, when expanded according to Theorem 8.3 leads toFn+1 = [zn℄ 11� z � z2 =Xj�0�n� jj �;

2. ANALYSIS OF RATIONAL FUNCTIONS 9

so that Fibonacci numbers are sums of ascending diagonals ofPascal’s triangle. Naturally,infinitely many variants exist when expanding a rational function. For instance, one hasalso 11� (z + z2) = 11� z 11� z21�z = 11� z2 11� z1�z2 = 11� z 1�z21�z ;leading to a trivial variety of binomial forms. More generally, compositions with largestsummand� m have an OGF that is11� (z + z2 + � � � zm) = 11� z 1�zm1�z = 1� z1� 2z + zm+1 ;and corresponding expansions lead to sums that are of index(m�1), 2, and 2, respectively.�

EXERCISE 3. The OGF of an ascending line in Pascal’s triangle,Gn =Pj �n�djj �, is a rational function.

It is a common but often fruitless exercise in combinatorialanalysis to show “directly”equivalence between such combinatorial sums. In fact, as the example above illustrates,elementary combinatorial identities are often nothing butthe image (in a world with littletransparent algebraic structure) of simple algebraic identities between generating functions(that live in a world with a strong structure).

2. Analysis of rational functions

In principle, the asymptotic analysis of coefficients of a rational function is “easy”given the exponential–polynomial form. However, in most applications of interest, rationalfunctions are only given implicitly as solutions to linear systems. This confers a great valueto criteria that ensure unicity and/or simplicity of the dominant pole. Accordingly, thebulk of this section is devoted to a brief exposition of Perron-Frobenius theory that coversadequately the case of positive linear systems.

2.1. General rational functions. The fact that coefficients of rational series are ex-pressible as exponential polynomials yields an asymptoticequivalent. The!j satisfy!j = (�j)�1, where�1; �2; : : : are the poles off(z), that is to say, the zeros of thedenominator off . The formula simplifies as soon as there is a unique number inf!jg, say!1, that dominates the other ones in absolute value.

THEOREM 8.4 (Rational function asymptotics).Let f = P=Q be a rational functionwhereg d(P;Q) = 1. Assume thatf has a unique dominant pole, that is, the zerosf�jgof the denominator polynomialQ(z) satisfyj�1j < j�2j � j�3j � � � � , then (� > 0 beingarbitrary) [zn℄f(z) = R1(n)��n1 +O((j�2j � �)�n;whereR1(n) is a polynomial.

If f(z) has several dominant poles,j�1j = j�2j = � � � = j�rj < j�r+1j � � � � , then(� > 0 being arbitrary)fn = rXj=1Rj(n)��nj +O((j�r+1j � �)�n);where theRj(n) are polynomials. The degree of eachRj equals the order of the poleof f(z) at �i minus one.


Proof. Immediate from Theorem 8.1 �Observe that, iff(z) has nonnegative coefficients,f(z) 2 R�0 [[z℄℄, then�1 must at

least be real and positive by Pringsheim’s theorem. Unicityof a zero of smallest modulusfor Q(z) is equivalent to unicity of dominant singularity forf(z) and this is the simplestcase for analysis. The prototypical application is provided by the Fibonacci numbersF0 =0, F1 = 1, Fn+2 = Fn+1 + Fn with OGFf(z) = z1� z � z2 :The poles are at1=� and1=�, where� = 1 +p52 ; � = 1�p52 ;and one hasFn � �n=p5.

EXAMPLE 2. Compositions into finite summands.The OGF of integer compositionshaving summands in the finite setS = fs1; : : : ; smg with g d(fsjg) = 1 isC(z) = mYj=1 11�Pj zsj :The characteristic polynomialq(z) = P zsj has positive coefficients, so that there existsa unique positive� satisfyingq(�) = 1 and additionally one hasq0(�) > 0. Also all otherroots ofq are of modulus strictly larger than� (by positivity of q or Pringsheim’s theoremcombined with the gcd assumption). There results that[zn℄C(z) = [zn℄ 1q(�)� q(z) � 1q0(�)(�� z) � 1�q0(�)��n;since the dominant pole at� is simple. �

The case where several dominant singularities are present can also be treated easily assoon as one of them has a higher multiplicity, as this singlesout a larger contribution in thecoefficients’ asymptotics.

EXAMPLE 3. Denumerants.The OGF of integer partitions with summands in the setS = fs1; : : : ; smg whereg d(fsjg) = 1 isD(z) = mYj=1 11� zsj :This GF has poles atz = 1 and at roots of unity, but only the pole atz = 1 attainsmultiplicity m, with D(z) �z!1 1� 1(1� z)m ; � = mYj=1 sj :There results the estimate of the number of denumerants[zn℄D(z) � ��1�n+m�1m�1 �, thatis, [zn℄D(z) � 0� mYj=1 sj1A�1 nm�1(m� 1)! ;which is due to Schur. �


In the case when there exist several dominant singularitiespossessing the same multi-plicity, then fluctuations appear; refer to the discussion of Chapter 4. For a general linearrecurrence overQ, equivalently for a GF inQ(z), it is for instance only known that the setf = fn �� fn = 0g is a finite union of (finite or infinite) arithmetic progressions—thisconstitutes the Skolem-Mahler-Lech theorem. However,f is not known to be computableand it is not even known whether the propertyf = ; is decidable. The function11� 65z + z2 = 1 + 1:20z + 0:44z2 � 0:67z3 � 1:24z4 � 0:82z5 + 0:25z6 + � � � ;that is built on the Pythagorean tripleh3; 4; 5i illustrates some of the hardships. We havefn = sin(n+ 1)'sin' ; ' = ar os 35 ;and, for instance, the wayfn approaches its extremes� 54 = � 1sin' depends on how wellmultiples of' approximate multiples of�, that is, on deep arithmetic properties of'=�.Such pathological situations, though possibly present amongst general rational functions,hardly ever occur in analytic combinatorics.

EXERCISE4. Examine empirically the sign pattern of coefficients of the rationalfunctionsgr(z) = 11� z ��z ddz�r � 54(1� z) � f(z)� ; i.e.,gr;n = 1� nr(54 � fn):

2.2. Positive rational functions and Perron-Frobenius theory. For rational func-tions, positivity coupled with some ancillary conditions entails a host of important proper-ties, like unicity of the dominant singularity. Such facts result from the classical Perron-Frobenius theory of nonnegative matrices that we now summarize.

Consider first rational functions of the special formf(z) = 11� S(z) ;for S(z) a polynomial. It is assumed thatS has no constant term (S(0) = 0), so thatf(z)is properly defined as a formal power series. Assume next nonnegativity of the coefficientsof S; this entails existence of a dominant real pole that is simple. If additionallyS(z) isnot a polynomial inzd for somed � 2, then, as is easy to see, there is uniqueness (andsimplicity) of dominant singularity. As we show now, similar properties hold for systems:the conditions are those of properness and positivity coupled with a new combinatorialnotion of “irreducibility”. Granted these conditions, theanalysis suitably extends what wejust saw of the scalar case.

2.2.1. Perron-Frobenius theory of nonnegative matrices.The properties of positiveand of nonnegative matrices have been superbly elicited by Perron [77] in 1907 and byFrobenius [43] in 1908–1912. The corresponding theory has far-reaching implications: itlies at the basis of the theory of finite Markov chains and it extends to positive operators ininfinite-dimensional spaces [60].

For a square matrixM 2 Rm�m , thespectrumis the set of itseigenvalues, that is,the set of� such that�I �M is not invertible (i.e., not of full rank), whereI is the unitmatrix with the appropriate dimension. Adominant eigenvalueis one of largest modulus.

For M a scalar matrix of dimensionm � m with nonnegative entries, a crucialrole is played by thedependency graph; this is the (directed) graph that has vertex setV = f1 : :mg and edge set containing the directed edge(a ! b) iff Ma;b 6= 0. The


FIGURE 2. The irreducibility conditions of Perron-Frobenius theory.Left: a strongly connected digraph. Right: a weakly connected digraphthat is not strongly connected is a collection of strongly connected com-ponents related by a directed acyclic graph.

reason for this terminology is the following: LetM represent the linear transformationny?i =Pj Mi;jyjoi; then, a nonzero entryMi;j means thaty?i depends effectively onyj ,a fact translated by the directed edge(i! j).

Two notions are essential, irreducibility and aperiodicity (the terms are borrowed fromMarkov chain theory and matrix theory).

Irreducibility. The matrixM is calledirreducible if its dependency graph is stronglyconnected (i.e., any two points are connected by a path). By considering only simple paths,it is then seen that irreducibility is equivalent to the condition that (I +M)m has all itsentries that are strictly positive. See Figure 2 for a graphical rendering of irreducibility andfor the general structure of a (weakly connected) digraph.

Periodicity.A strongly connected digraphG is periodicwith parameterd iff all its cy-cles have a length that is a multiple ofd. In that case, the graph decomposes into cyclicallyarranged layers: the vertex setV can be partitioned intod classes,V = V0 [ � � � [ Vd�1,in such a way that the edge setE satisfies

(4) E � d�1[i=0(Vi � V(i+1) mod d):The maximal possibled is called theperiod. (For instance a directed10-cycle is periodicwith parameterd = 1; 2; 5; 10 and the period is 10.) If no decomposition exists withd � 2,so that the period has the trivial value 1, then the graph and all the matrices that admit it astheir dependency graph are calledaperiodic. See Figure 3.

THEOREM 8.5 (Perron-Frobenius theory).LetM be a matrix that is assumed to beirreducible, i.e., its dependency graph is strongly connected.(i) If M has (strictly)positive elements, then its eigenvalues can be ordered in such away that �1 > j�2j � j�3j � � � � ;andM has a unique dominant eigenvalue; this eigenvalue is positive and simple.


M = 2666666666666666640 0 0 0 0 1 0 0 0 0 0 0 0 00 0 0 0 0 0 1 0 0 0 0 0 0 00 0 0 0 0 0 1 0 0 0 0 0 0 00 0 0 0 0 1 0 0 0 0 0 0 0 00 0 0 0 0 0 1 0 0 0 0 0 0 00 0 0 0 0 0 0 1 0 0 0 0 0 00 0 0 0 0 0 0 0 1 1 1 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 1 0 00 0 0 0 0 0 0 0 0 0 0 0 1 00 0 0 0 0 0 0 0 0 0 0 1 1 01 0 0 0 0 0 0 0 0 0 0 0 0 00 1 0 0 0 0 0 0 0 0 0 0 0 01 0 0 1 1 0 0 0 0 0 0 0 0 0

377777777777777775FIGURE 3. The aperiodicity conditions of Perron-Frobenius theory.Top: an aperiodic digraph (left) and a periodic digraph (right). Bottom:A digraph of periodd = 4 (left) corresponding to a matrixM (right).For irreducible matrices, the (combinatorial) property ofthe graph tohave periodd is equivalent to the (analytic) property of the existenceof d dominant eigenvalues and it implies a rotational invariance of thespectrum. Here, the characteristic polynomial isz6(z8� 4z4+2). (Thespectrum consists of(2 � 21=2)1=4 and their four conjugates, as wellas 0.)(ii) If M hasnonnegative elements, then its eigenvalues can be ordered in such a way

that �1 = j�2j = � � � = j�dj > j�d+1j � j�d+2j � � � � ;and each of the dominant eigenvalues is simple with�1 positive.

Furthermore, the quantityd is precisely equal to the period of the dependency graph.If d = 1, in particular, then there is unicity of the dominant eigenvalue. Ifd � 2, the wholespectrum is invariant under the set of transformations� 7! �e2ij�=d; j = 0; 1; : : : ; d� 1:

Periodicity also means that the existence of paths of lengthn between any given pairof nodeshi; ji is constrained by the congruence classn mod d. A contrario, aperiodicityentails the existence, for alln sufficiently large, of paths of lengthn connectinghi; ji.From the definition, a matrixM with periodd has, up to simultaneous permutation of its


rows and columns, a cyclic block structure0BBBBBBBBBB�0 M0;1 0 � � � 00 0 M1;2 � � � 0...

......

. . ....0 0 0 � � � Md�2;d�1Md�1;0 0 0 � � � 0

1CCCCCCCCCCAwhere the blocksMi;i+1 are reflexes of the connectivity betweenVi andVi+1 in (4).

For short, one says that a matrix is positive (resp. nonnegative) if all its elements arepositive (resp. nonnegative). Here are two useful turnkey results, Corollaries 8.1 and 8.2.

COROLLARY 8.1. Any one of the following conditions suffices to guarantee theexis-tence of a unique dominant eigenvalue of a nonnegative matrix T :(i) T has (strictly) positive entries;(ii) T is such that, some powerT s is (strictly) positive;(iii) T is irreducible and at least one diagonal element ofT is nonzero;(iv) T is irreducible and the dependency graph ofT is such that there exist at least

two paths from the same source to the same destination that are of relativelyprime lengths.

Proof. The proof makes an implicit use of the correspondence between terms in coeffi-cients of matrix products and paths in graphs (see below, Section 3.3 for more).

Sufficiency of condition(i) results directly from Case(i) of Theorem 8.5.Condition(ii) immediately implies irreducibility. Unicity of the dominant eigenvalue

(hence aperiodicity) results from Perron-Frobenius properties ofMs, by which�s1 > j�2js.(Also, by elementary graph combinatorics, one can always take the exponents to be at mostthe dimensionm.)

By basic combinatorics of paths in graphs, Conditions(iii) and (iv) imply Condi-tion (ii). �

2.2.2. Positive rational functions.The importance of Perron-Frobenius theory and ofits immediate consequence, Corollary 8.1, stems from the fact that uniqueness of the dom-inant eigenvalue is usually related to a host of analytic properties of generating functionsas well as probabilistic properties of structures. In particular, as we shall see in the nextsection, several combinatorial problems (like automata orpaths in graphs) can be reducedto the following case.

COROLLARY 8.2. Consider the matrixF (z) = (I � zT )�1;whereT , called the “transition matrix”, is a scalar nonnegative matrix. It is assumedthatT is irreducible. Then each entryFi;j(z) ofM(z) has a radius of convergence� thatcoincides with the smallest positive root of the determinantal equation�(z) := det(I � zT ) = 0:Furthermore, the point� is a simple pole of anyFi;j(z).

In addition, ifT is aperiodic or if it satisfies any of the conditions of Corollary 8.1,then all singularities other than� are strictly dominated in modulus by�.


Proof. Define first (as in the statement)� = 1=�1, where�1 is the eigenvalue ofT oflargest modulus that is guaranteed to be simple by assumption of irreducibility and byPerron-Frobenius properties. Next, the relations inducedbyF = I + zTF , namely,Fi;j(z) = Æi;j + zXk Ti;kFk;j(z);together with positivity and irreducibility entail that the Fi;j(z) must all have the sameradius of convergencer. Indeed, eachFij depends positively on all the other ones (byirreducibility) so that any infinite value of an entry in the system must propagate to all theother ones.

The characteristic polynomial�(z) = det(I � zT );has roots that are inverses of the eigenvalues ofT and� = 1=�1 is smallest in modulus.Thus, since� is the common denominator to all theFi;j(z), poles of anyFi;j(z) can onlybe included in the set of zeros of this determinant, so that the inequalityr � � holds.

It remains to exclude the possibilityr > �, which means that no “cancellations” withthe numerator can occur atz = �. The argument relies on finding a positive combinationof some of theFi;j thatmustbe singular at�. We offer two proofs, each of interest in itsown right: one(a) is conveniently based on the Jacobi trace formula, the other(b) is basedon supplementary Perron–Frobenius properties.(a) Jacobi’s trace formula for matrices [48, p. 11],

(5) det Æ exp = exp ÆTr or log Æ det = Tr Æ loggeneralizes the scalar identities1 eaeb = ea+b andlog ab = log a + log b. Here we have(for z small enough)

Tr log(I � zT )�1 = Xi Xn�1Mi;i;n znn= log(det(I � zT )�1;where the first line results from expansion of the logarithm and the second line is an in-stance of the trace formula. Thus, by differentiation, the sum

PiMi;i(z) is seen to besingular at� = 1=�1 and we have established thatr = �.(b) Alternatively, letv1 be the eigenvector ofT corresponding to�1. Perron-Frobeniustheory also teaches us that, under the irreducibility and aperiodicity conditions, the vec-tor v1 has all its coordinates that are nonzero. Then the quantity(1� zT )�1v1 = 11� z�1 v1is certainly singular at1=�1. But it is also a linear combination of theFi;j ’s. Thus at leastone of the entries ofF (hence all of them by the discussion above) must be singular at� = 1=�1. Therefore, we have againr = �.

Finally, under the additional assumption thatT is aperiodic, there follows from Perron-Frobenius theory that� = 1=�1 is well-separated in modulus from all other singularitiesF . �

1The Jacobi trace formula is readily verified when the matrix is diagonizable, and from there, it can beextended to all matrices by an algebraic “density” argument.


It is interesting to note that several of these arguments will be recycled when we dis-cuss the harder problem of analysing coefficients of positive algebraic functions in Sec-tion 5.2.

EXERCISE5. The de Bruijn matrixT 2 R2`�2` is essential in problems relatedto occurrences of patterns in random strings [39]. Its entries are given byTi;j = 12 [[(j = 2i mod 2`) or (j = 2i+ 1 mod 2`)℄℄:Prove that it has a unique dominant eigenvalue. [Hint: consider a suitable graphwith vertices labelled by binary strings of length`.]

We next proceed to show that properties of the Perron-Frobenius type even extend toa large class of linear systems of equations that have nonnegative polynomial coefficients.Such a case is important because of its applicability to transfer matrices; see Section 3.3below.

Some definitions extending the ones of scalar matrices must first be set. A polynomialp(z) =Xj jzej ; every j 6= 0;is said to be primitive if the quantityÆ = g d(fejg) is equal to 1; it is imprimitiveotherwise. Equivalently,p(z) is imprimitive iff p(z) = q(zÆ) for somebona fidepoly-nomial q and someÆ > 1. Thus,z; 1 + z; z2 + z3; z + z4 + 2z8 are primitive while1; 1 + z2; z3 + z6; 1 + 2z8 + 5z12 are not.

DEFINITION 8.2. A linear system with polynomial entries,

(6) f(z) = v(z) + T (z)f(z)whereT 2 R[z℄r�r , v 2 R[z℄r , andf 2 R[z℄r the vector of unknowns is said to be:(a) rationally proper (r–proper)if T (0) is nilpotent, meaning thatT (0)r is the null

matrix;(b) rationally nonnegative (r–nonnegative)if each componentvj(z) and each matrixentryTi;j(z) lies inR�0 [z℄;( ) rationally irreducible (r–irreducible)if (I + T (z))r has all its entries that arenonzero polynomials.(d) rationally aperiodic (r–periodic)if at least one diagonal entry of some powerT (z)e is a primitive polynomial.

It is again possible to visualize these properties of matrices by drawing a directed graphwhose vertices are labelled1; 2; : : : ; r, with the edge connectingi to j that is weightedby the entryTi;j(z) of matrix T (z). Properness means that all sufficiently long paths(and all cycles) must involve some positive power ofz— it is a condition satisfied inwell-founded combinatorial problems; irreducibility means that the dependency graph isstrongly connected by paths involving edges with nonzero polynomials. Periodicity meansthat all closed paths involve weights that are polynomials in someze for somee > 1.

For instance, ifW is a matrix with positive entries, thenzW is r–irreducible

and r–aperiodic, whilez3W is r–periodic. The matrixT = 0BB� z z31 0 1CCA is r–

proper, r–irreducible, and r–aperiodic, sinceT 2 =0BB� z2 + z3 z4z z3 1CCA. The matrixT =

2. ANALYSIS OF RATIONAL FUNCTIONS 170BBBBBB� z3 1 00 0 zz2 0 0 1CCCCCCA is r–proper, but it fails to be r–aperiodic since, for instance, all cycles only

involve powers ofz3, as is visible on the associated graph:

2z

3

1 z

z

By abuse of language, we say thatf(z) is a solution of a linear system if it coincideswith the first component of a solution vector,f � f1. The following theorem generalizesCorollary 8.2.

THEOREM 8.6 (Positive rational systems).(i) Assume that a rational functionf(z)is a solution of a system (6) that is r–positive, r–proper, r–irreducible, and r–aperiodic.Then,f(z) has a unique dominant singularity� that is positive, and is a simple pole;� isthe smallest positive solution of

(7) det(I � T (z)) = 0:(ii) Assume thatf(z) is a solution of a system that is r–positive, r–proper, and r–irreducible (but not necessarily r–aperiodic). Then, the set of dominant singularities off(z) is of the formf�jgd�1j=0 , where�0 2 R�0 , �j=�0 = � is a root of unity, and�j�` is adominant singularity for all = 0; 1; 2; : : : . In addition, each�j is a simple pole.

Proof. Consider first Case(i). For any fixedx > 0, the matrixT (x) satisfies the PerronFrobenius conditions, so that it has a maximal positive eigenvalue�1(x) that is simple.More information derives from the introduction of matrix norms2. The spectral radius ofan arbitrary matrixA is defined as

(8) �(A) = maxj fj�j jg;where the setf�jg is the set of eigenvalues ofA (also called spectrum). Spectral radiusand matrix norms are intimately related since�(A) = limn!+1 (jjAnjj)1=n :In particular, this relation entails that the spectral radius is an increasing function of matrixentries: for nonnegative matrices, ifA � B in the sense thatAi;j � Bi;j (for all i; j), then�(A) � �(B); if A < B in the sense thatAi;j < Bi;j (for all i; j), then�(A) < �(B).(To see the last inequality, note the existence of� > 0 such thatA � (1� �)B.)

Returning to the case at hand, equation (8) and the surrounding remarks imply that thespectral radius�(T (x)), which also equals�1(x) for positivex, satisfies�1(0) = 0; �1(x) strictly increasing; �1(+1) = +1:(The first condition reflects properness, the second one is a consequence of irreducibility,and the last one derives from simple majorizations.) In particular, the equation�1(x) = 1admits a unique root� on (0;+1). (Notice that�1(x) is a real branch of the algebraic

2A matrix normjj:jj satisfies:jjAjj = 0 impliesA = 0; jj Ajj = j j � jjAjj; jjA + Bjj � jjAj+ jjBjj;jjA�Bjj � jjAjj � jjBjj.


curvedet(�I � T (x)) = 0 that dominates all other branches in absolute value forx > 0.There results from the general theory of algebraic functions, see Section 5, that�1(x) isanalytic at every pointx > 0.)

There remains to prove that:(a) � is at most a simple pole off(z); (b) � is actually apole;( ) there are no other singularities of modulus equal to�.

Fact (a) amounts to the property that� is a simple root of the equation�(�) = 1,that is, �0(�) 6= 0. (To prove�0(�) 6= 0, we can argue a contrario. First derivatives�0(�); �00(�), etc, cannot be zero till someoddorder inclusively since this would contradictthe increasing character of�(x) around� along the real line. Next, if derivatives till someevenorder� 2 inclusively were zero, then we would have by the local analytic geometryof �(z) near� some complex valuez1 satisfying:j�(z1)j = 1 andjz1j < �; but for sucha valuez1, by irreducibility and aperiodicity, for some exponente, the entries ofT (z1)ewould be all strictly dominated in absolute value by those ofT (�)e, hence a contradiction.)Then,�0(�) 6= 0 holds and by virtue ofdet(I � T (z)) = (1� �1(z)) Yj 6=1(1� �j(z)) = (1� �1(z)) det(I � T (z))1� �1(z) ;the quantity� is only a simple root ofdet(I � T (z)).

Fact(b) means that no “cancellation” may occur atz = � between the numerator andthe denominator given by Cramer’s rule. It derives from an argument similar to the oneemployed for Corollary 8.2. Fact( ) derives from aperiodicity and the Perron-Frobeniusproperties. �

3. Combinatorial applications of rational functions

Rational functions occur as generating functions of well-recognized classes of enu-merative problems. We examine below the case of regular specifications and regular lan-guages (Subsections 3.1 and 3.2) that are closely related totransfer matrix methods andfinite state models or automata (Subsection 3.3). Local constraints in permutations repre-sent a direct application of transfer matrix methods (Subsection 3.4). Lattice paths lead toan extension of the regular framework to infinite alphabets and infinite grammars, reveal-ing interesting connections with the theory of continued fractions (Subsection 3.5). Forinstance, the classic continued fractions identity (originally due to Gauß),1Xs=0 (1 � 3 � � � (2n� 1)) z2n 11� 1 � z21� 2 � z2

. . .

;expresses combinatorially a very “regular” decompositionof involutive permutations andhas implications on the physics of random interconnection networks.

3.1. Regular specifications.A combinatorial specification is said to beregular if itis nonrecursive (“iterative”, see Chapter 1) and it involves only the constructions of Atom,Union, Product, and Sequence. We consider here unlabelled structures. Since the operatorsassociated to these constructions are all rational, it follows that the corresponding OGFis rational. The OGF can be then systematically obtained from the specification by thesymbolic methods of Chapter 1.

3. APPLICATIONS OF RATIONAL FUNCTIONS 19

For instance, the OGF of the classGh of general Catalan trees of height at mosth isdefined by the recurrence G0 = Z; Gh+1 = Z �SfGhg:Accordingly, the OGF’s areG0(z) = z; G1(z) = z1� z ; G2(z) = z1� z1� z ; : : : ;andGh(z) is none other than thehth convergent in the continued fraction expansion of theCatalan GF: 12 �1�p1� 4z� = z1� z1� z1� z

.. .

:The interesting connections with Chebyshev polynomials and Mellin transform techniques—the expected height of a tree of sizen turns out to be asymptotic to

p�n— are detailedin Chapter 7.

In a similar vein, the classC[h℄ of integer compositions whose summands are at mosth isC[h℄ = SfZ �SfZ; card< hgg so that C [h℄(z) = 11� z � z2 � � � � � zh :The interesting asymptotics is again discussed in Chapter 7in connection with Mellintransform asymptotics. The largest summand in a random composition ofn turns out tohave expectation aboutlog2 n; see [49] for a general discussion.

EXERCISE 6. The OGF of integer compositions and of integer partitionswithsummands constrained to be either in number at mostm or of size at mosts isrational.

3.2. Regular languages.The name “regular specification” has been chosen in orderto be in agreement with the notions of regular expression andregular language from formallanguage theory. These two concepts are now defined formally.

A languageis a set of words over some fixed alphabetA. The structurally simplest (yetnontrivial) languages are theregular languagesthat can be defined in a variety of ways: byregular expressions and by finite automata, either deterministic or nondeterministic.

DEFINITION 8.3. The categoryRegExp of regular expressionsis defined as the cat-egory of expressions that contains all the letters of the alphabet(a 2 A) as well as theempty symbol�, and is such that, ifR1; R2 2 RegExp, then the formal expressionsR1[R2,R1 �R2 andR?1 are regular expressions.

Regular expressions are meant to specifylanguages. The languageL(R) denoted bya regular expressionR is defined inductively by the rules:(i) L(R) = fag if R is the lettera 2 A andL(R) = f�g (with � the empty word) ifR is the symbol�; (ii) L(R1 [ R2) =L(R1)[ L(R2) (with [ the set-theoretic union);(iii) L(R1 �R2) = L(R1) � L(R2) (with� the concatenation of words extended to sets);(iv) L(R?1) = f�g + L(R1) + L(R1) �L(R1) + � � � . A language is said to be aregular languageif it is specified by a regularexpression.

A language is asetof words, but a wordw 2 L(R) may be parsable in several waysaccording toR. More precisely, one defines theambiguity coefficient(or multiplicity) of w


with respect to the regular expressionR as the number of parsings, written�(w) = �R(w).In symbols, we have�R1[R2(w) = �R1(w) + �R2(w); �R1�R2(w) = Xu�v=w �R1(u)�R2(v);with natural initial conditions (�a(b) = Æa;b, ��(w) = Æ�;w), and with the definition of�R?(w) taken as induced by the definition ofR? via unions and products, namely,�R?(w) = Æ�;w + 1Xj=1 �Rj (w):As such,�(w) lies in the completed setN [ f+1g. We shall only consider here regularexpressionsR that areproper, in the sense that�R(w) < +1. It can be checked thatthis condition is equivalent to requiring that noS? with � 2 L(S) enters in the inductivedefinition of the regular expressionR. (This condition is substantially equivalent to thenotion of well-founded specification in Chapter 1.) A regular expressionR is said to beunambiguousiff for all w, we have�R(w) 2 f0; 1g; it is said to beambiguous, otherwise.

Given a languageL = L(R), we are interested in two enumerating sequencesLR;n = Xjwj=n�R(w); Ln = Xjwj=n1w2L;corresponding to the counting of words in the language, respectively, with and withoutmultiplicities. The corresponding OGF’s will be denoted byLR(z) andL(z). (Note that,for a given language, the definition ofL is intrinsic, while that ofLR is dependent on theparticular expressionR that describes the language.) We have the following.

PROPOSITION8.1 (Regular expression counting).Given a regular expressionR as-sumed to be of finite ambiguity, the ordinary generating function LR(z) of the languageL(R), countingwith multiplicity, is given by the inductive rules:� 7! 1; a 7! z; [ 7! +; � 7! �; ? 7! (1� (:))�1:In particular, ifR is unambiguous, then the ordinary generating function satisfiesLR(z) =L(z) and is given directly by the rules above.

Proof. Formal rules associate to any proper regular expressionR a specificationR:� 7! 1 (the empty object); a 7! Za (Za an atom);[ 7! +; � 7! �; ? 7! Sf:gIt is easily recognized that this mapping is such thatR generates exactly the collection ofall parsings of words according toR. The translation rules of Chapter 1 then yield thefirst part of the statement. The second part follows sinceL(z) = LR(z) wheneverR isunambiguous. �

The technique implied by Proposition 8.1 has already been employed silently in earlierchapters. For instance, the regular expressionW (3) = (�+ a+ aa) � (b � (�+ a+ aa))?generates unambiguously all binary words overfa; bg without 3-runs of the lettera; seeChapter 1. The generating function is thenW (3)(z) = (1 + z + z2) 11� z(1 + z + z2) = 1 + z + z21� z � z2 � z3 = 1� z31� 2z + z4 :

3. APPLICATIONS OF RATIONAL FUNCTIONS 21( ) ( ( ( ) ) ( ( ) ( ) ) ( ) ) ) ( ( ) ( ) ( ) )FIGURE 4. A tree of height 3, the corresponding parenthesis encodingof nesting depth 3, and the lattice path representation of height 3.

More generally, the OGF of binary words withoutk runs (of the lettera) is found by similardevices to be W (k)(z) = 1� zk1� 2z + zk+1 :

Consider next parenthesis systems wherea plays the role of the open parenthesis ‘(’andb is the closing parenthesis ‘)’. The collection of words withparenthesis nesting atmost 3 is described unambiguously byP (3) = �a (a(ab)?b)? b�? ;so that the OGF is P (3)(z) = 11� z21� z21� z2 = 1� 2z21� 3z2 + z4 :The general version with parenthesis nesting of depth at most k corresponds bijectively tolattice paths inN�N of height� k, and to general trees of height� k. Indeed, the classicalcorrespondence builds a lattice path that is the trace (up ordown) of edge traversals; seeFigure 4.

EXAMPLE 4. Pattern occurrences in strings.This is a simple example where countingwith ambiguity is used. Fix a binary alphabetA = fa; bg and consider the random variableu;n that represents the number of occurrences of a fixed patternu 2 A? in a randomstring ofAn. (See Chapter 7 of [84] for related material.) The regular expression (withu = u1 � � �um) Ou = A?(u1 � u2 � � �um)A?describes ambiguously the language of all the words that contain the patternu. It is easyto see that a wordw 2 A? has an ambiguity coefficient (with respect toOu) that is exactlyequal to the number of occurrences ofu in w. Clearly, the OGF associated toOu (withmultiplicity counted) isOu(z) = 11� 2z zm 11� 2z = zm(1� 2z)2 :This gives immediately the mean number of occurrences ofu assuming a uniform distri-bution overAn, namelyE(u;n) = 12n [zn℄ zm(1� 2z)2 = (n�m+ 1)2�m:


The mean is asymptotic ton2�m. In other words, there is on average a fraction about2�mof positions at which a pattern of lengthm is to be found in a random text of lengthn. Aultimate consequence is that naıve string-matching has expected complexity that is� 2non random texts of sizen; see [84]. �

EXERCISE7. Analyse the moment of order 2 of the number of occurrences byrelating it to the “correlations” between the pattern and its shifted versions. [Hint:relate the problem to the Guibas-Odlyzko correlation polynomial as describedin [84].]

EXAMPLE 5. Order statistics. Consider an ordered alphabetA = fa1; a2; : : : ; amg,where it is assumed thata1 < a2 < � � � < am. Given a wordw = w1 � � �wn, thejth letterwj is a record inw (respectively a weak record) if it is strictly larger (resp.not smaller) thanall the previous lettersw1; : : : ; wj�1. The study of records is a classical topic in statisticaltheory [28], and we are examining records in permutations of a multisetsince repeatedletters are allowed to composew.

Regular expressions are well-suited to the problem. Consider first (strong) records.The collection of words such that the values of their recordsare aj1 < � � � < ajr isdescribed by the regular expressionRj1;:::;jr = aj1A?�j1aj2A?�j2 � � �ajrA?�jr where A�j = fa1; : : : ; ajg:On the other hand, the product

Qj (�+ aj) generates all the words formed by an increasingsequence of distinct letters. These two observations combine: the regular expressionR = [j1<��<jr Rj1;:::;jr = mYj=1 �� [ ajA?�j�generates all the words inA?, where the number of records appears as the number (r) offactors different from� in the full expansion ofR.

Consequently, by the principles of Chapter 3, the multivariate OGF of wordsR(u;x),with u marking the number of records,x an abbreviation forx1; : : : ; xm, andxj the vari-able that marks the number of occurrences of letteraj , is given byR(u;x) = �1 + ux11� x1��1 + ux21� x1 � x2� � � ��1 + uxm1� x1 � � � � � xm� :One checks thatR(1;x1; : : : ; xm) = (1 � x1 � � � � � xm)�1 as should be. The generat-ing functionR is a huge multivariate extension of the Stirling cycle polynomials and, forinstance, one has [ur℄ [x1x2 � � �xm℄R(u; x) = hmr i:

Assume that each letteraj of a word inAn has probabilitypj and letters occur inde-pendently in words. This model is treated by the substitutionxj 7! pjz. The mean numberof records is then found as the coefficient ofzn in �=�uR(u;x) taken atxj = zpj ; u = 1.The asymptotic estimate results from straight singularityanalysis of the pole atz = 1: Themean number of records in a random word of lengthn with letterj chosen independentlywith probabilitypj is asymptotic top11 + p21� p1 + � � �+ pm1� p1 � � � � � pm�1 :


The analysis asz tends to 1, keepingu as a parameter, shows also that the limit of[zn℄R(u; z) exists. By the continuity theorem for probability generating functions de-scribed in Chapter 9, this implies:The distribution of the number of records whenn tendsto1 converges to the (finite) law with probability generating function�1 + up11� p1��1 + up21� p1 � p2� � � ��1 + upm�11� p1 � � � � � pm�1�upm:

Similarly, the multivariate GFbR(u;x) = �1 + ux11� ux1��1 + ux21� x1 � ux2� � � ��1 + uxm1� x1 � � � � � uxm� :counts words by their number of weak records. This time, there is a double pole atz = 1:The number of (weak) records has a mean asymptotic to � n ( a computable constant)and a limit Gaussian law. �

Permutations and combinations of multisets are a classicaltopic in combinatorial anal-ysis; see for instance MacMahon’s book [68]. The corresponding statistics are of interest tocomputer scientists since they relate to searching in the context of sets and multisets obey-ing nonuniform data distributions. This last topic is for instance considered by Knuth [59,1.2.10.18] and developed by Burge [17] in a pre-symbolic context. Such techniques havebeen successfully employed to analyse data structures likethe ternary search tries of Bent-ley and Sedgewick [9]; see [24]. Prodinger has also developed a collection of studiesconcerning words whose letter probabilities are geometrically distributed,pj = (1� q)qj :see for instance [55]. In the latter case, one may legitimately let the cardinality of the al-phabet become infinite. This approach then provides interesting q-analogues of classicalcombinatorial quantities since they reduce to permutationstatistics whenq ! 1.

EXERCISE8. Analyse records in random permutations of a multiset, which cor-responds to extracting coefficients[xn11 � � �xnmm ℄ in R0u(1; z) and bR0u(1; z). De-rive in this way the fact that the mean number of records in a random permutationof n elements is the harmonic numberHn. [Hint. See [59, 1.2.10.18] and [17].]

3.3. Paths in graphs, automata, and transfer matrices.A closely related set ofapplications of regular functions is to problems that are naturally described as paths indigraphs, or equivalently as finite automata. In physics, the corresponding treatment isalso called the “transfer matrix method”. We start our exposition with the enumeration ofpaths in graphs that constitutes the most direct introduction to the subject.

3.3.1. Paths in graphs.LetG be a directed graph with vertex setf1; : : : ;mg, whereself-loops are allowed and label each edge(a; b) by the (formal or numeric) variablegi;j .Consider the matrixG such thatGa;b = ga;b if the edge(a; b) 2 G; Ga;b = 0 otherwise.

Then, from the standard definition of matrix products, the powersGr have elements thatare path polynomials. More precisely, one has the simple butessential relation,(G)ri;j = Xw2P(i;j;r)w;whereP(i; j; r) is the set of paths inG that connecti to j and have lengthr, and a pathwis assimilated to the monomial in indeterminatesfgi;jg that represents multiplicatively the


succession of its edges; for instance:(G)3i;j = Xm1=i;m2;m3;m4=j gm1;m2gm2;m3gm3;m4 ;In other words, powers of the matrix associated to a graph “generate” all paths in a graph.One may then treat simultaneously all lengths of paths (and all powers of matrices) byintroducing the variablez to record length.

PROPOSITION8.2. (i) LetG be a digraph and letG be the matrix associated toG.The OGFF hi;ji(z) of the set of all paths fromi to j in a digraphG with z marking lengthandga;b marking the occurrence of edge(a; b) is the entryi; j of the matrix(I � zG)�1,namely F hi;ji(z) = (I � zG)�1��i;j = �hi;ji(z)�(z) ;where�(z) = det(I � zG) and�hi;ji(z) is the determinant of the minor of indexi; j ofI � zG.(ii) The generating function of nonempty closed paths is given byXi (F hi;ii(z)� 1) = �z�0(z)�(z) :Proof. Part(i) results from the discussion above which impliesF hi;ji(z) = 1Xn=0 zn (Gn)i;j = �(I � zG)�1�i;j ;and from the cofactor formula of matrix inversion. Part(ii) results from Jacobi’s traceformula. Introduce the quantity known as thezeta function,�(z) := exp Xi 1Xn=1F hi;iin znn ! = exp 1Xn=1 znn TrGn!= exp �Tr log(I � zG)�1� = det(I � zG)�1;where the last line results from the Jacobi trace formula. Thus,�(z) = �(z)�1. On theother hand, differentiation combined with the definition of�(z) yieldsz � 0(z)�(z) = �z�0(z)�(z)= Xi 1Xn=1F hi;iin zn;and Part(ii) follows. �

EXERCISE 9. Can the coefficients of�(z) be related to the polynomialsrepresenting self-loops, 2–loops, triangles, quadrangles, etc, in the graphG?[See [19]]

EXERCISE10. Observe thatz �0(z)�(z) =Xn�1 znTrGn =X� �z1� �z ;(the sum is over eigenvalues). Deduce an algorithm that determines the charac-teristic polynomial of a matrix of dimensionm inO(m4) arithmetic operations.


[Hint: computing the quantities TrGj for j = 1; : : : ; m requires preciselymmatrix multiplications.]

In particular, the number of paths of lengthn is obtained by applying a substitution� : ga;b 7! 0; 1 to (I � zG)�1 upon coefficient extraction by the[zn℄ operation. In asimilar vein, it is possible to consider weighted graphs, where thega;b are assigned realweights; with the weight of a path being defined by the productof its edges weights, onefinds that[zn℄(I � zG)�1 equals the total weight of all paths of lengthn. If furthermorethe assignment is made in such a way that

Pb ga;b = 1, then the matrixG, which is calleda stochastic matrix, can be interpreted as the transition matrix of a Markov chain.

Let us assume that nonnegative weights are assigned to the edges ofG. If the resultingmatrix is irreducible and aperiodic, then Perron-Frobenius theory applies. There exists� = 1=�1, with �1 > 0 the dominant eigenvalue ofG, and the OGF of weighted pathsfrom i to j has a simple pole at�. In that case, it turns out that a random (weighted) pathof lengthn has, asymptotically asn!1,

— an average number of edges of type(i:j) that is� i;jn, for some nonzeroconstant i;j 2 R;

— an average number of encounters with vertexi that is� �in, for some nonzeroconstant�i 2 R.

In other words, a long random path tends to spend asymptotically a fixed (nonzero) frac-tion of its time at any given vertex or along any given edge. These observations are thecombinatorial counterpart of the elementary theory of finite Markov chains. The treatmentof such questions depends on the following lemma.

LEMMA 8.1 (Iteration of Perron-Frobenius matrices).SetM(z) = (I�zG)�1 whereG has nonnegative entries, is irreducible, and is aperiodic.Let�1 be the dominant eigen-value ofG. Then the “residue” matrixR such that

(9) (I � zG)�1 = R1� z=�1 +O(1) (z ! �1)has entries given by (hx; yi represents a scalar product)Rij = ri`jhr; `i ;wherer and` are right and left eigenvectors ofG corresponding to the eigenvalue�1.

Proof. Scalingz as z=�1 reduces the situation to the case of a matrix with dominanteigenvalue equal to 1, so that we assume now�1 = 1. First observe thatR = limn!1Gn:The limit exists for the following reason: geometrically,G decomposes asG = P + SwhereP is the projector on the eigenspace generated by the eigenvector r; one hasSP =PS = 0 andP 2 = P , so thatGn = Pn + Sn; on the other hand,S has spectral radius< 1; thuslimGn exists and it equalsP (so thatR � P is a projector).

Now, for any vectorw, by properties of projections, one hasRw = (w)r;for some coefficient (w). Application of this to each of the base vectorsej (i.e.,ej = (Æj1; : : : ; Æjd)) shows that the matrixR has each of itscolumnsproportional to theeigenvectorr. A similar reasoning with the transposeGt of G and the associated residue


matrixRt shows that the matrixR has each of itsrowsproportional to the eigenvector`.In other words, for some constant , one hasRi;j = `jri:The normalization constant is itself finally determined by applyingR to r and one findsthat = 1=h`; ri. �

EXERCISE11. Relate explicitly the edge traversal and node encounterfrequen-cies to the dominant eigenvectors of a graph assumed to be strongly connectedand not cyclically layered. Discuss the stationary probabilities of a Markov chainin this context.

EXERCISE12. What happens when the matrixG is symmetric (i.e., the graphGis undirected)? Discuss formulæ for reversible Markov chains.

EXAMPLE 6. Locally constrained words.Consider a fixed alphabetA = fa1; : : : ; amgand a setF � A2 of forbidden transitions between consecutive letters. Theset of wordsoverA with no forbidden transitions is denoted byL and is called a locally constrainedlanguage. Clearly, the words ofL are in bijective correspondence with paths in a graphthat is constructed as follows. Set up the complete graphKm�m, where thej vertex of thegraph represents the production of the letteraj . Delete from the complete graph all edgesthat correspond to forbidden transitions. Then, a word of lengthn + 1 has no forbiddentransition iff it corresponds to a path of lengthn in the modified graph. Consequently, theOGF of any locally constrained language is a rational function. Its OGF is given by1 + z(1; 1; : : : ; 1)(I � zT )�1(1; 1; : : : ; 1)t;whereTij is 0 if (ai; aj) 2 F and 1 otherwise. Various specializations, including multi-variate GF’s and nonuniform letter models are easily treated by this method.

The particular case whereF consists of pairs of equal letters defines Smirnovwords [48, p. 69] and is amenable to a direct and explicit treatment. Let W (x1; : : : ; xm)be the multivariate GF of words with the variablexj marking the number of occurrencesof letter aj andS(x1; : : : ; xm) be the corresponding GF for Smirnov words. A simplesubstitution (already discussed in Chapter 3) shows thatW andS are related byW (x1; : : : ; xm) = S� x11� x1 ; : : : ; xm1� xm� ;whileW = (1� x1 � � � � � xm)�1. There results that

(10) S(x1; : : : ; xm) =W � x11 + x1 ; : : : ; xm1 + xm� = 1� mXi=1 xj1 + xj!�1 :In particular, settingxj = z, one gets the univariate OGFS(z) = 1 + z1� (m� 1)z ;implying that the number of words of lengthn ism(m� 1)n�1 (as it should be). �

Locally constrained languages thus have rational generating functions. The processof proof can then be adapted to the enumeration of many types of objects provided theyhave a sufficiently “sequential” structure. Carlitz compositions defined below illustrate a


recycling of the notion of regular expression to objects that are in fact defined over infinitealphabets.

EXAMPLE 7. Carlitz compositions.A Carlitz composition of the integern is a composi-tion of n such that no two adjacent summands have equal values. Consider first composi-tions with a boundm on the largest allowable summand. The OGF of Carlitz compositionsis directly derived from the GF of Smirnov words by substituting xj 7! zj . In this way,the OGF of Carlitz compositions with maximum summand at mostm is found to beC [m℄(z) = 0�1� mXj=1 zj1 + zj1A�1 ;and the OGF of all Carlitz compositions is obtained by letting m tend to infinity: thepreceding GF’s converge to

(11) C [1℄(z) = 0�1� 1Xj=1 zj1 + zj1A�1 :In particular, we get Sequence A003242 ofEIS3:C [1℄(z) = 1 + z + z2 + 3z3 + 4z4 + 7z5 + 14z6 + 23z7 + 39z8 + 71z9 + � � � :

The asymptotic form of the number of Carlitz compositions isthen easily found bysingularity analysis of meromorphic functions (see Chapter 4). We findC [1℄n � C � �n; C := 0:45638; � := 1:75024 :There,1=� is determined as the smallest positive root of the denominator in (11); see [57]for details. In this infinite alphabet example, the OGF is no longer rational but the essentialfeature of having a dominant polar singularity is preserved. �

3.3.2. Finite automata.Finite automata are closely related to languages of paths ingraphs and to regular expressions.

DEFINITION 8.4 (Finite state automaton).A finite automatonA over a finite alphabetA and with vertex setQ, called the set of states, is a digraph whose edges are labelled byletters of the alphabet, that is given together with a designated initial stateq0 2 Q and adesignated set of final statesQf � Q.

A wordw is said to be accepted by the automaton if there exists a path� in the graphconnecting the initial stateq0 to one of the final statesq 2 Qf , so that the succession oflabels of the path� corresponds to the sequence of letters composingw. (The path� isthen called an accepting path forw.) The set of accepted words is denoted byL(A).

In all generality, a finite automaton is, by its definition, anondeterministicdevice: if awordw is accepted, one may not “know”a priori which choices of edges to select in orderto accept it. A finite automaton is said to bedeterministicif given any stateq 2 Q and anyletterx 2 A, there is at most one edge from vertexq that bears labelx. In that case, onedecides easily (in linear time) whether a word is accepted byjust following edges dictatedby the sequence of letters inw.

3The EIS designates Sloane’s On-Line Encyclopedia of Integer Sequences [85]; see [86] for an earlierprinted version.


The main structural result of the theory is that there is complete equivalence betweenthree descriptive models: regular expressions, deterministic finite automata, and nondeter-ministic finite automata. The corresponding theorems are due to Kleene (the equivalencebetween regular expression and nondeterministic finite automata) and to Rabin and Scott(the equivalence between nondeterministic and deterministic automata). Thus, finite au-tomata whether deterministic or not accept (“recognize”) the class of all regular languages.

EXERCISE13. Any language definable by a regular expression is accepted bya(non-deterministic) finite automaton.[Hint: Perform an inductive constructionof an automaton based on separate constructions for unions,products, stars.]

EXERCISE14. Kleene’s theorem:Any language accepted by a finite automatonis definable by a regular expression.[Hint: Construct (inductively on increasingvalues ofk) matrices of regular expressions whereR(k)i;j is a regular expressionthat describes the set of paths from statei to statej constrained never to passthrough a state of index� k.]

EXERCISE15. Rabin and Scott’s theorem:Given any finite automatonA, thereexists an equivalent deterministic automationB, i.e.,L(A) = L(B). [Hint: thestates ofB are all the subsets of the states ofA and the transitions are determinedso thatB emulates all possible computations ofA.]

EXERCISE16. Regular languages are closed under intersection and complemen-tation.

PROPOSITION8.3 (Finite state automata counting).Any language accepted by a finiteautomaton or described by a regular expression has a rational generating function. If thelanguage is specified by an automatonA = hQ;Qf ; q0i, that is deterministic, then thecorresponding ordinary generating functionL0(z) is defined as the componentL0(z) ofthe linear system of equationsLj(z) = �j + zXa2AL�(qj;a)(z);where�j equals 1 ifqj 2 Qf and 0 otherwise, and where�(qj ; a) is the state reachablefrom stateqj when the lettera is read.

Proof. By the fundamental equivalence of models, one may freely assume the automatonto be deterministic. The quantityLj is nothing but the OGF of the language obtained bychanging the initial state of the automaton toqj . Each equation expresses the fact thata word accepted starting fromqj may be the empty word (ifqj is final) or, else, it mustconsist of a lettera followed by a continuation that is itself accepted when the automatonis started from the “next” state, that is, the state of index�(qj ; a).

Equivalently, one may reduce the proof to the enumeration ofpaths in graphs as de-tailed above. �

Given a problem that is described by a regular language, it isthen a matter of choiceto enumerate it using a specification by a regular expression(provided it is unambiguous)or by a finite automaton (provided it is deterministic). Eachproblem usually has a morenatural presentation. From the structural standpoint, it should however be kept in mind thatthe conversion between a regular expression and an equivalent nondeterministic automatainvolves a moderate (at worst polynomial) blow-up in size, while the transformation of anondeterministic automaton to an equivalent deterministic one may involve an exponential


increase in the number of states. (The construction underlying the equivalence theoremof Rabin and Scott involves the power-set of the set of statesQ of the nondeterministicautomaton.) Note that the direct construction of the generating functions associated tofinite automata has been already encountered in Chapter 1 when discussing particular casesof languages containing or excluding a fixed pattern.

As the proof of the proposition shows, the OGF of the languagedefined by a deter-ministic finite automaton involves a quasi-inverse(1 � zT )�1 where the transition ma-trix T is a direct encoding of the automaton’s transitions. Corollary 8.2 and Lemma 8.1were precisely custom-tailored for this situation. Consequently, quantifying probabilisticphenomena associated to finite automata is invariably reducible to finding eigenvectors ofmatrices.

EXAMPLE 8. Words with excluded patterns.Fix a patternu = u1u2 � � �um. Let Eube the set of words that do not contain the wordu as a factor, andFu the set of wordsthat do not containu as a subsequence. The languageEu is defined by the (ambiguous)regular expressionA? n (A?uA?). A deterministic finite automaton forA?uA? is readilyconstructed fromu, from which a deterministic automaton results for the complementEu.(It suffices to exchange final and nonfinal states.) Thus, the generating function ofEu isrational.

A similar argument, starting from the (ambiguous) regular expression that describesFu, namely A?u1A? u2A? � � � A?umA?;also shows (via the construction of the finite automaton) that the OGF ofFu is rational.�

EXERCISE17. Extend the previous argument to prove the rationality ofthe OGFof words that do not contain any pattern from a finite setS, where pattern istaken either in the sense of a factor or a subsequence.

EXERCISE 18. Calculate explicitly the generating function ofFu for an arbi-traryu in the case of a binary and of a ternary alphabet.

Determine the mean number of occurrences ofu as subsequence in a ran-dom text of sizen. Discuss the asymptotic form of the variance of the numberof such occurrences.

EXAMPLE 9. Variable-length codes.A finite set of wordsS � A? is a (“variable-length”)code [11] if each word inA? admit at most one decomposition as a concatenation ofwords, each of which belonging toS. Obviously, such a code can be used for encodinginformations, adjusting formats of data to a particular transmission channel, etc.

It is not cleara priori how to decide whether any given setS is a code. LetS(z) =Pw2S zjwj be the OGF ofS (a polynomial). ThenS is a code iff11� S(z) = L(z);whereL(z) is the GF of all words that admit a decomposition as concatenations of wordsin S. Now, Aho and Corasick have established in [3] that there exists a deterministicautomaton of size linear in the total size ofS (the total number of letters) that recognizesS?(furthermore this automaton can be constructed in linear time). Thus, a system determiningL(z) is found in linear time.


From the combinatorial considerations above, there results a decision procedure forcodicity (based on linear algebra) that is of polynomial time complexity. �

EXERCISE19. A finite state automaton is said to be unambiguous if the set ofaccepting paths of any given words comprises at most one element. Prove thatthe translation into generating function as described above also applies to suchautomata, even if they are nondeterministic.

3.3.3. Transfer matrix methods.The transfer matrix method constitutes a variant ofthe modelling by deterministic automata and the encoding ofcombinatorial problems byregular languages. The very general statement of Theorem 8.6 applies here with fullstrength. Here, we shall simply illustrate the situation byan example inspired by theinsightful exposition of domino tilings and generating functions in the book of Graham,Knuth, and Patashnik [50].

EXAMPLE 10. Monomer-dimer tilings of a rectangle.Suppose one is given pieces thatmay be one of the three forms: monomers (m) that are1� 1 squares, and dimers that aredominoes, either vertically(v) oriented1� 2, or horizontally (h) oriented2 � 1. In howmany ways can ann� 3 rectangle be covered completely and without overlap (‘tiled’) bysuch pieces?

The pieces are thus of the following types,m = ; h = ; v = ;and here is a particular tiling of a5� 3 rectangle:

In order to approach this counting problem, one defines a classC of combinatorial ob-jects called configurations. A configuration relative to ann� k rectangle is a partial tiling,such that all the firstn� 1 columns are entirely covered by dominoes while between zeroand three unit cells of the last column are covered. Here are for instance, configurationscorresponding to the example above.

These diagrams suggest the way configurations can be built bysuccessive addition ofdominoes. Starting with the empty rectangle0 � 3, one adds at each stage a collection ofat most three dominoes in such a way that there is no overlap. This creates a configurationwhere, like in the example above, the dominoes may not be aligned in a flush-right manner.Continue to add successively dominoes whose left border is at abscissa1; 2; 3; etc, in a waythat creates no internal “holes”.

Depending on the state of filling of their last column, configuration can thus be clas-sified into 8 classes that we may index in binary asC000; : : : ; C111. For instanceC001


represent configurations such that the first two cells (from top to bottom, by convention)are free, while the third one is occupied. Then, a set of rulesdescribes the new type of con-figuration obtained, when the sweep line is moved one position to the right and dominoesare added. For instance, we haveC010 � =) C101:

In this way, one can set up a grammar (resembling a deterministic finite automaton)that expresses all the possible constructions of longer rectangles from shorter ones accord-ing to the last layer added. The grammar comprises productions likeC000 = �+mmmC000 +mvC000 + vmC000+ �mmC100 +m�mC010 +mm�C001 + v�C001 + �vC100+m��C011 + �m�C101 + ��mC110 + ��C111 :In this grammar, a “letter” likemv represent the addition of dominoes, in top to bottomorder, of typesm; v respectively; the letterm�m means adding twom-dominoes on the topand on the bottom, etc.

The grammar transforms into a linear system of equations with polynomial coeffi-cients. The substitutionm 7! z, h; v 7! z2 then gives the generating functions of configu-rations withz marking the area covered:C000(z) = (1� 2z3 � z6)(1 + z3 � z6)(1 + z3)(1� 5z3 � 9z6 + 9z9 + z12 � z15) ::In particular, the coefficient[z3n℄C000(z) is the number of tilings of ann� 3 rectangle:C000(z) = 1 + 3z3 + 22z6 + 131z9 + 823z12 + 5096z15 + � � � :The sequence grows like �n (for n � 0 (mod 3)) where� := 1:83828 (� is the cube rootof an algebraic number of degree 5). (See [20] for a computer algebra session.) On average,for largen, there is a fixed proportion of monomers and the distributionof monomers ina random tiling of a large rectangle is asymptotically normally distributed, as results fromthe developments of Chapter 9. �

As is typical of the tiling example, one seeks to enumerate a “special” set of configu-rationsCf . (In the example above, this isC000 representing complete rectangle coverings.)One determines an extended set of configurationsC (the partial coverings, in the example)such that:(i) C is partitioned into finitely many classes;(ii) there is a finite set of “ac-tions” that operate on the classes;(iii) size is affected in a well-defined additive way bythe actions. The similarity with finite automata is apparent: classes play the role of statesand actions the role of letters.

EXERCISE20. For any fixed widthw, the OGF of monomer-dimer coverings ofann� w rectangle is rational.

The OGF of Hamiltonian tours on ann � w rectangle is rational (one isallowed to move from any cell to any other vertically or horizontally adjacentcell). The same holds for king’s tours and knight’s tours.

EXERCISE21. The OGF of trees (binary, general) of bounded width is a rationalfunction.


EXERCISE22. Find combinatorial interpretations ofF [w℄(z) := 1Xn=0 (Fn)w zn;for any fixed integerw, where the numbersFn are the Fibonacci numbers. [Hint:see [50].]

EXERCISE23. Given a fixed digraphG assumed to be strongly connected, anda designated start vertex, one travels at random, moving at each time to anyneighbour of the current vertex chosen with equal likelihood. Show that the ex-pectation of the time to visit all the vertices is a rational number that is effectively(though perhaps not efficiently!) computable.

Often, the method of transfer matrices is used to approximate a hard combinatorialproblem that is not known to decompose, the approximation being by means of a familyof models of increasing “widths”. For instance, the enumeration of the numberTn oftilings of ann�n square by monomers and dimers remains a famous unsolved problem ofstatistical physics. Here, transfer matrix methods may be used to solve then�w version ofthe monomer–dimer coverings, in principle at least, for anyfixed widthw. (The “diagonal”sequence of then � w rectangular models corresponds to the square model.) It hasbeenat least determined by computer search that the diagonal sequenceTn starts as (this issequence A028420 in Sloane’sEIS[86]):1; 7; 131; 10012; 2810694; 2989126727; 11945257052321; : : : :From this and other numerical data, one estimates numerically that (Tn)1=n2 !1:94021 : : :, but no expression for the constant is known to exist4. The difficulty of copingwith the finite-width models is that their complexity (as measured , e.g., by the numberof states) blows up exponentially withw—such models are best treated by computer al-gebra; see [102]—and no law allowing to take a diagonal is visible. However,the finitewidth models have the merit of providing at least provable upper and lower bounds on theexponential growth rate of the hard “diagonal problem”.

3.4. Permutations and local constraints.In this subsection, we examine problemswhose origin lies in nineteenth century recreational mathematics. For instance, themenageproblem solved and popularized byEdouard Lucas in 1891, see [26], has the followingquaint formulation:What is the number of possible ways one can arrangen married cou-ples (‘menages’) around a table in such a way that men and women alternate, butnowoman sits next to her husband?

The menage problem is equivalent to a permutation enumeration problem. Sit firstconventionally the men at places numbered0; : : : ; n � 1, and let�i be the position at theright of which theith wife is placed. Then, a menage placement imposes the condition�i 6= i and�i 6= i+ 1 for eachi. We consider here a linearly arranged table (see remarksat the end for the other classical formulation that considers a round table), so that the

4In contrast, for coverings by dimers only, a strong algebraic structure is available and the number of coversbTn satisfies lim2n!+1�bT2n�1=(2n)2 = exp 1� 1Xn=0 (�1)n(2n + 1)2! = eG=� := 1:33851 : : : ;whereG is Catalan’s constant; see [76] for context on this famous result.


condition�i 6= i+1 becomes vacuous wheni = n. Here is a menage placement forn = 6corresponding to the permutation� = 24 1 2 3 4 5 64 5 6 2 1 3 35

61 2 3 4 5

Clearly, this is a generalization of the derangement problem (for which the weakercondition�i 6= i is imposed), where the cycle decomposition of permutationssuffices toprovide a direct solution (see Chapter 2).

Given a permutation� = �1 � � ��n, any quantity�i � i is called anexceedanceof �.Let be a finite set of integers that we assume to be nonnegative. Then a permutation issaid to be-avoiding if none of its exceedances lies in. The counting problem, as wenow demonstrate, provides an interesting case of application of the transfer matrix method.

The set being fixed, consider first for allj the class of augmented permutationsPn;j that are permutations of sizen such thatj of the positions are distinguished and thecorresponding exceedances lie in, the remaining positions having arbitrary values (butwith the permutation property being satisfied!). Loosely speaking, the objects inPn;j canbe regarded as permutations with “at least”j exceedances in. For instance, with = f1gand � = 0� 1 2 3 4 5 6 7 8 92 3 4 8 6 7 1 5 9 1A ;there are 5 exceedances that lie in (at positions1; 2; 3; 5; 6) and with3 of these distin-guished (say by enclosing them in a box), one obtains an element counted byP9;3 like2 3 4 8 6 7 1 5 9:Let Pn;j be the cardinality ofPn;j . We claim that the numberQn = Qn of -avoidingpermutations of sizen satisfies

(12) Qn = nXj=0(�1)jPn;j :Equation (12) is typically aninclusion-exclusionrelation. To prove it formally, define thenumberRn;k of permutations that have exactlyk exceedances in and the generatingpolynomials Pn(w) =Xj Pn;jwj ; Rn(w) =Xk Rn;kwk :The GF’s are related byPn(w) = Rn(w + 1) or Rn(w) = Pn(w � 1)::(The relationPn(w) = Rn(w + 1) simply expresses symbolically the fact that each-exceedance inR may or may not be taken in when composing an element ofP .) Inparticular, we havePn(�1) = Rn(0) = Rn;0 = Qn as was to be proved.


FIGURE 5. A graphical rendering of the legal template20?02?11?relative to = f0; 1; 2g.

The preceding discussion shows that everything relies on the enumerationPn;j of per-mutations with distinguished exceedances in. Introduce the alphabetA = [ f‘?’g,where the symbol ‘?’ is called the ‘don’t-care symbol’. A word onA, an instance with = f0; 1; 2g being 20?02?11?, is called atemplate. To an augmented permutation, oneassociates a template as follows: each exceedance that is not distinguished is representedby a don’t care symbol; each distinguished exceedance (thereby an exceedance with valuein) is represented by its value. A template is said to be legal ifit arises from an augmentedpermutation. For instance a template2 1 � � � cannot be legal since the corresponding con-straints, namely�1 � 1 = 2, �2 � 2 = 1, are incompatible with the permutation structure(one should have�1 = �2 = 3). In contrast, the template 20?02?11? is seen to be legal.Figure 5 is a graphical rendering; there, letters of templates are represented by dominoes,with a cross at the position of a numeric value in, and with the domino being blank inthe case of a don’t-care symbol.

Let Tn;j be the set of legal templates relative to that have lengthn and comprisejdon’t care symbols. Any such legal template is associated toexactlyj! permutations, sincen� j position-value pairs are fixed in the permutation, while thej remaining positions andvalues can be taken arbitrarily. There results that

(13) Pn;n�j = j!Tn;j and Qn = nXj=0(�1)n�jj!Tn;j ;by (12). Thus, the enumeration of avoiding permutations rests entirely on the enumerationof legal templates.

The enumeration of legal templates is finally effected by means of a transfer matrixmethod, or equivalently, by a finite automaton. If a template� = �1 � � � �n is legal, then thefollowing condition is met,

(14) �j + j 6= �i + i;for all pairs(i; j) such thati < j and neither of�i; �j is the don’t-care symbol. (There areadditional conditions to characterize templates fully, but these only concern a few letters atthe end of templates and we may ignore them in this discussion.) In other words, a�i witha numerical value preempts the value�i + i. Figure 5 exemplifies the situation in the case = f0; 1; 2g. The dominoes are shifted one position each time (since it isthe value of�� i that is represented) and the compatibility constraint (14)is that no two crosses shouldbe vertically aligned. More precisely the constraints (14)are recognized by a deterministicfinite automaton whose states are indexed by subsets off0; : : : ; b� 1g where the “span”b


is defined asb = max!2 !. The initial state is the one associated with the empty set (noconstraint is present initially), the transitions are of the form8<: (qS ; j) 7! qS0 whereS0 = ((S � 1) [ fj � 1g) \ f0; : : : ; b� 1g; j 6= ‘?’(qS ; ?) 7! qS0 whereS0 = (S � 1) \ f0; : : : ; b� 1g;the final state is equal to the initial state (this translatesthe fact that no domino can pro-trude from the right, and is implied by the linear character of the menage problem underconsideration). In essence, the automaton only needs a finite memory since the dominoesslide along the diagonal and, accordingly, constraints older than the span can be forgotten.Notice that the complexity of the automaton, as measured by its number of states, is2b.

Here are the automata corresponding to = f0g (derangements) and to = f0; 1g(menages).

{0} { } { }

For the menage problem, there are two states depending on whether or not the currentlyexamined value has been preempted at the preceding step.

From the automaton construction, the bivariate GFT(z; u) of legal templates, withu marking the position of don’t care symbols, is a rational function that can be determinedin an automatic fashion from. For the derangement and menage problems, one findsT f0g(z; u) = 11� z(1 + u) ; T f0;1g(z; u) = 1� z1� z(2 + u) + z2 :In general, this gives access to the OGF of the correspondingpermutations. Consider thepartial expansion ofT(z; u) with respect tou, taken under the form

(15) T(z; u) =Xr r(z)1� uur(z) ;assuming for convenience only simple poles. There the sum isfinite and it involves alge-braic functions j anduj of the variablez. Finally, the OGF of-avoiding permutationsis obtained fromT by the transformationznuk 7! (�z)nk!;which is the transcription of (13). Define the (divergent) OGF of all permutations,F (y) = 1Xn=0n! yn = 2F0[1; 1; y℄;in the terminology of hypergeometric functions. Then, by the remarks above and (15), wefind Q(z) =Xr r(�z)F (�uj(�z)):In other words,the OGF of-avoiding permutations is a combination of compositions ofthe OGF of the factorial series with algebraic functions.


The expressions simplify much in the case of menages and derangements where thedenominators ofT are of degree 1 inu. One hasQf0g(z) = 11 + zF ( z1 + z ) = 1 + z2 + 2z3 + 9z4 + 44z5 + 265z6 + 1854z7 + � � � ;for derangements, whence a new derivation of the known formula,Qf0gn = nXk=0(�1)k�nk�(n� k)!:Similarly, for (linear) menage placements, one findsQf0;1g(z) = 11 + zF ( z(1 + z)2 ) = 1 + z3 + 3z4 + 16z5 + 96z6 + 675z7 + � � � ;which isEIS–A000271 and corresponds to the formulaQf0;1gn = nXk=0(�1)k�2n� kk �(n� k)!:

Finally, the same techniques adapts to constraints that “wrap around”, that is, con-straints taken modulon. (This corresponds to a round table in the menage problem.)Inthat case, what should be considered is the loops in the automaton recognizing templates(see also the previous discussion of the zeta function of graphs). One finds in this way theOGF of the circular (i.e., classical) menage problem to be (EIS–A000179)bQf0;1g(z) = 1� z1 + zF ( z(1 + z)2 ) + 2z = 1+ z+ z3+2z4+13z5+80z6+579z7+ � � � ;which yields the classical solution of the (circular) menage problem,bQf0;1gn = nXk=0(�1)k 2n2n� k�2n� kk �(n� k)!;a formula that is due to Touchard; see [26, p. 185] for pointers to the vast classical literatureon the subject. The algebraic part of the treatment above is close to the inspiring discussionoffered in Stanley’s book [87]. An application to robustness of interconnections in randomgraphs is presented in [38].

For asymptotic analysis purposes, the following general property proves useful:LetFbe the OGF of factorial numbers and assume thaty(z) is analytic at the origin where itsatisfiesy(z) = z � �z2 +O(z3); then it is true that

(16) [zn℄F (y(z)) � [zn℄F (z(1� �z)) � n!e��:(The proof results from simple manipulations of divergent series in the style of [8].) Thisgives at sight the estimatesQf0gn � ne�1; Qf0;1gn � ne�2:More generally, for any set containing� elements, one hasQfgn � ne��:Furthermore, the numberRn;k of permutations having exactlyk occurrences (k fixed) ofan exceedance in is asymptotic toQfgn � ne��kk! :


In other words, the rare event that an exceedance belongs to obeys of Poisson distributionwith � = jj. These last two results are established by means of probabilistic techniquesin the book [7, Sec. 4.3]. The relation (16) points to a way of arriving at such estimates bypurely analytic-combinatorial techniques.

EXERCISE 24. Given a permutation� = �1 � � ��n, a succession gapis de-fined as any difference�i+1 � �i. Discuss the counting of permutations whosesuccession gaps are constrained to lie outside of a finite set.

In how many ways can a kangaroo pass through all points of the integerinterval [1; n℄ starting at1 and ending atn while making hops that belong tof�2;�1; 1; 2g?

3.5. Lattice paths and walks on the line.In this section, we considerlattice pathsthat are fundamental objects of combinatorics. Indeed, they relate to trees, permutations,and set partitions, to name a few. They also correspond to walks on the integer half-lineand as such they relate to classical random walks and to birth-and-death processes of prob-ability theory. The lattice paths discussed here have stepsthat correspond to movementseither immediately to the left or to the right. Combinatorially, such paths are the limit ofpaths of bounded height, themselves definable as nested sequences. As a consequence, theOGF’s obtained are of the continued fraction type.

DEFINITION 8.5 (Lattice path).A (lattice)path� = (U0; U1; : : : ; Un) is a sequence ofpoints in the latticeN�N such that ifUj = (xj ; yj), thenxj = j andjyj+1� yj j � 1. AnedgehUj ; Uj+1i is called anascent(a) if yj+1�yj = +1, adescent(b) if yj+1�yj = �1,and alevel step( ) if yj+1 � yj = 0.

The quantityn is thelengthof the path,o(�) := y0 is theinitial altitude, h(�) := ynis thefinal altitude. The extremal quantitiessupf�g := maxj yj and inff�g := minj yjare called theheightanddepthof the path.

It is assumed that paths are normalized by the conditionx0 = 0. With this normaliza-tion, a path of lengthn is encoded by a word witha; b; representing ascents, descents, andlevel steps, respectively. What we call thestandard encodingis such a word in which eachstepa; b; is (redundantly) subscripted by the value of they-coordinate of its associatedpoint. For instance,

w = 0 a0 a1 a2 b3 2 2 a2 b3 b2 b1 a0 1encodes a path that connects the initial point(0; 0) to the point(13; 1).

LetH be the set of all lattice paths. Given a geometric condition (Q), it is then possibleto associate to it a “language”H[Q℄ that comprises the collection of all path encodingssatisfying the conditionQ. This language can be viewed either as a set or as a formal sum,H [Q℄ = Xfw j Qgw;in which case it becomes the generating function in infinitely many indeterminates of thecorresponding condition.

The general subclass of paths of interest in this subsectionis defined by arbitrarycombinations of flooring (m), ceiling (h), as well as fixing initial(k) and final (l) altitudes:H [�m;<h℄k;l = fw 2 H : o(w) = k; h(w) = l; inffwg � m; supfwg < hg:


FIGURE 6. The three major decompositions of lattice paths: the archdecomposition (top), the last passages decomposition (bottom left), andthe first passage decomposition (bottom right).

We also need the specializations,H [<h℄k;l = H [�0;<h℄k;l , H [�m℄k;l = H [�m;<1℄k;l , Hk;l =H [�0;<1℄k;l . Three simple combinatorial decompositions of paths then suffice to deriveall the basic formulæ.

Arch decomposition: An excursion from and to level 0 consists of a sequence of“arches”, each made of either a 0 or aa0H[�1℄1;1 b1, so thatH0;0 = � 0 [ a0H[�1℄1;1 b1�? ;which relativizes to height< h.

Last passages decomposition.Recording the times at which each level0; : : : ; k is lasttraversed gives

(17) H0;k = H[�0℄0;0 a0H[�1℄1;1 a1 � � � ak�1H[�k℄k;kFirst passage decomposition.The quantitiesHk;l with k � l are implicitly determined

by the first passage throughk in a path connecting level 0 tol, so that

(18) H0;l = H[<k℄0;k�1ak�1Hk;l (k � l);A dual decomposition holds whenk � l.

The basic results express the generating functions in termsof a fundamental continuedfraction and its associated convergent polynomials. They involve the “numerator” and“denominator” polynomials, denoted byPh andQh that are defined as solutions to thesecond order (or “three-term”) recurrence equation

(19) Yh+1 = (1� h)Yh � ah�1bhYh�1; h � 1;together with the initial conditions(P�1; Q�1) = (1; 0), (P0; Q0) = (0; 1), and with theconventiona�1b0 = 1.

PROPOSITION8.4 (Path continued fractions [35]). (i) The generating functionH0;0of all basic excursions is represented by the fundamental continued fraction:H0;0 = 11� 0 � a0b11� 1 � a1b21� 2 � a2b3

. . .

:(20)

3. APPLICATIONS OF RATIONAL FUNCTIONS 39(ii) The generating function of ceiled excursionsH [<h℄0;0 is given by a convergent of thefundamental fraction:H [<h℄0;0 = 11� 0 � a0b11� 1 � a1b21� 2 � a2b3

. . .1� h�1(21)

= PhQh :(22)(iii) The generating function of floored excursions is given by thetruncation of the funda-mental fraction:H [�h℄h;h = 11� h � ahbh+11� h+1 � ah+1bh+21� h+2 � ah+2bh+3

. . .

(23)

= 1ah�1bh QhH0;0 � PhQh�1H0;0 � Ph�1 ;(24)

Proof. Repeated use of the arch decomposition provides a form ofH [<h℄0;0 with nestedquasi-inverses(1 � f)�1 that is the finite fraction representation (21). The continuedfraction representation for basic paths (namelyH0;0) is then obtained by lettingh ! 1in (21). Finally, the continued fraction form (23) for ceiled excursions is nothing butthe fundamental form (20), when the indices are shifted. Thethree continued fractionexpressions (20), (21), (23) are thence established.

Finding explicit expressions for the fractionsH [<h℄0;0 andH [�h℄h;h next requires determin-ing the polynomials that appear in the convergents of the basic fraction (20). By definition,the convergent polynomialsPh andQh are the numerator and denominator of the fractionH [<h℄0;0 . For the computation ofH [<h℄0;0 andPh; Qh, one classically introduces the linearfractional transformations gj(y) = 11� j � ajbj+1y ;so that

(25) H [<h℄0;0 = g0 Æ g1 Æ g2 Æ � � � Æ gh�1(0) andH0;0 = g0 Æ g1 Æ g2 Æ � � � ; :Now, linear fractional transformations are representableby 2� 2-matrices

(26)ay + b y + d 7! 0� a b d 1A ;


Type Spec. Formula

1. Excursion H0;0 11� 0 + a0b11� 1 + � � �2. Ceiled excursions H [<h℄0;0 PhQh3. Floored excursions H [�h℄h;h 1ah�1bh QhH0;0 � PhQh�1H0;0 � Ph�14. Transitions from 0 H0;l 1Bl (QlH0;0 � Pl)5. Transitions to 0 Hk;0 1Ak (QkH0;0 � Pk)6. Upcrossings from 0 H [<h℄0;h�1 Ah�1Qh7. Downcrossings to 0 H [<h℄h�1;0 Bh�1Qh8. Transitions (k � l) Hk;l 1AkBlQk (QlH0;0 � Pl)9. Transitions (k � l) Hk;l 1AkBlQl (QkH0;0 � Pk)10. Upward excursions H [�m;<h℄m;m 1am�1bm Dm;hDm�1;h11. Downward excursions H [<l+1℄l;l QlQl+112. Transitions in strip (k � l) H [�m;<h℄k;l 1AkBl Dm�1;kDl;hDm�1;h13. Transitions in strip (l � k) H [�m;<h℄k;l 1AkBl Dm�1;lDk;hDm�1;h

TABLE 1. Generating functions associated to some major path con-ditions. The basic continued fraction isH0;0 in Entry 1, with conver-gent polynomialsPh; Qh. Abbreviations used are:Am = a0 � � � am�1,Bm = b1 � � � bm, andDi;j = QiPj � PiQj .

in such a way that composition corresponds to matrix product. By induction on the com-positions that build upH [<h℄0;0 , there follows the equality

(27) g0 Æ g1 Æ g2 Æ � � � Æ gh�1(y) = Ph � Ph�1ah�1bhyQh �Qh�1ah�1bhy ;wherePh andQh are seen to satisfy the recurrence (19). Settingy = 0 in (27) proves (22).

Finally,H [�h℄h;h is determined implicitly as the rooty of the equationg0Æ� � �Ægh�1(y) =H0;0, an equation that, when solved using (27), yields the form (24). �A large number of generating functions can be derived by similar techniques. See

Figure 1 for a summary of what is known. We refer to the paper [35], where this theorywas first systematically developed and to the exposition given in [48, Chapter 5]. Ourpresentation here draws upon the paper [37] where the theory was put to use in order todevelop a formal algebraic theory of the general birth-and-death process.

We examine next a few specializations of the very general formulæ provided by Propo-sition 8.4 and Table 1.


EXAMPLE 11. Standard lattice paths.In order to count lattice paths, it suffices to effectone of the substitutions,�M : aj 7! z; bj 7! z; j 7! z; �D : aj 7! z; bj 7! z; j 7! 0:In the latter case level steps are disallowed, and one obtains so-called “Dyck paths”; in theformer case, all three step types are taken into account, giving rise to so-called “Motzkinpaths”. We restrict attention below to the case of Dyck paths.

The continued fraction expressingH0;0 is then purely periodic and hence a quadraticfunction: H0;0(z) = 11� z21� z21� . . .

= 12z2 �1�p1� 4z2� ;sinceH0;0 satisfiesy = (1 � z2y)�1. The families of polynomialsPh; Qh are in thiscase defined by a recurrence with constant coefficients and they coincide, up to a shift ofindices. Define classically the Fibonacci polynomials by the recurrenceFh+2(z) = Fh+1(z)� zFh(z); F0(z) = 0; F1(z) = 1:One findsQh = Fh(z2) andPh = Fh�1(z2). (The Fibonacci polynomials are essentiallyreciprocals of Chebyshev polynomials.) The GF of paths of height� h is then given byProposition 8.4 as H [<h℄00 (z) = Fh(z2)Fh+1(z2) :In other words, the general results specialize to provide the algebraic part of the analysis ofCatalan tree height, a topic dealt with in Chapter 7. We get more: for instance the numberof ways of crossing a strip of widthh� 1 isH [<h℄0;h�1(z) = zhFh(z2) :In the case of Dyck paths, explicit expressions result from the explicit generating functionsand from the Lagrange-Burmann inversion theorem; see Chapter 7 for details. �EXAMPLE 12. Area under Dyck path and coin fountains.Consider the case of Dyckpath and the parameter equal to the area below the path. Area under a lattice path can bedefined as the sum of the indices (i.e., the starting altitudes) of all the variables that enterthe standard encoding of the path. Thus, the BGFD(z; q) of Dyck path withz markinghalf-length andq marking area is obtained by the substitutionaj 7! qjz; bj 7! qj ; j 7! 0:It proves convenient to operate with the continued fraction

(28) F (z; q) = 11� zq1� zq2. . .

;


so thatD(z; q) = F (q�1z; q2). SinceF andD satisfy difference equations, for instance,

(29) F (z; q) = 11� zqF (qz; q) ;moments of area can be determined by differentiating and setting q = 1 (see Chapter 3 fora direct approach).

A general trick is effective to derive an alternative expression ofF . Attempt to ex-press the continued fractionF of (28) as a quotientF (z; q) = A(z)=B(z). Then, therelation (29) impliesA(z)B(z) = 11� qz A(qz)B(qz) ; henceA(z) = B(qz); B(z) = B(qz)� qzB(q2z);whereq is treated as a parameter. The difference equation satisfiedby B(z) is readilysolved by indeterminate coefficients: this classical technique was introduced in the theoryof integer partitions by Euler. WithB(z) =P bnzn, the coefficient satisfy the recurrenceb0 = 1; bn = qnbn � q2n�1bn�1:This is a first order recurrence onbn that unwinds to givebn = (�1)n qn2(1� q)(1� q2) � � � (1� qn) :In other words, introducing the “q-exponential function”,

(30) E(z; q) = 1Xn=0 (�z)nqn2(q)n ; where (q)n = (1� q)(1� q2) � � � (1� qn);one finds

(31) F (z; q) = E(qz; q)E(z; q) :Given the importance of the functions under discussion in various branches of mathe-

matics, we cannot resist a quick digression. The name of theq-exponential comes form theobvious property thatE(z(q�1); q) reduces toe�z asq ! 1�. The explicit form (30) con-stitutes in fact the “easy half” of the proof of the celebrated Rogers-Ramanujan identities,namely,

(32)

E(�1; q) = 1Xn=0 qn2(q)n = 1Yn=0(1� q5n+1)�1(1� q5n+4)�1E(�q; q) = 1Xn=0 qn(n+1)(q)n = 1Yn=0(1� q5n+2)�1(1� q5n+3)�1;that relate theq-exponential to modular forms. See Andrews’ book [4, Ch. 7] for context.

Here is finally a cute application of these ideas to asymptotic enumeration. Odlyzkoand Wilf define in [74, 73] an (n;m) coin fountain as an arrangement ofn coins in rowsin such a way that there arem coins in the bottom row, and that each coin in a higherrow touches exactly two coins in the next lower row. LetCn;m be the number of(n;m)fountains andC(q; z) be the corresponding BGF withq markingn andz markingm. SetC(q) = C(q; 1). The question is to determine the total number of coin fountains of arean,[qn℄C(q). The series starts as (this is Sequence A005169 of theEIS)C(q) = 1 + q + q2 + 2q3 + 3q4 + 5q5 + 9q6 + 15q7 + 26q8 + � � � ;


as results from inspection of the first few cases.

The functionC(q) is a priori meromorphic injqj < 1. From the bijection with Dyckpaths and area, one finds C(q) = 11� q1� q21� q3

. . .

:The identity (31) implies C(q) = E(q; q)E(1; q) :An exponential lower bound of the form1:6n holds on[qn℄C(q), since(1�q)=(1�q�q2)is dominated byC(q) for q > 0. At the same time, the number[qn℄C(q) is majorized bythe number of compositions, which is2n�1. Thus, the radius of convergence ofC(q) has tolie somewhere between0:5 and0:61803 : : : . It is then easy to check by numerical analysisthe existence of a simple zero of the denominator,E(�1; q), near� := 0:57614. Routinecomputations based on Rouche’s theorem (see Chapter 4) then makes it possible to verifyformally that� is the only simple pole injqj < 3=5, and the process is detailed in [73].Thus, singularity analysis of meromorphic functions applies:The number of coin fountainsmade ofn coins satisfies asymptotically[qn℄C(q) = An +O((5=3)n); := 0:31236; A = ��1 := 1:73566:This example illustrates the power of modelling by continued fractions as well as thesmooth articulation with meromorphic function asymptotics. �

The systematic theory of lattice path enumerations and continued fractions was devel-oped initially because of the need to count weighted latticepaths, notably in the contextof the analysis of dynamic data structures in computer science [36]. In this framework,a system of multiplicative weights�j ; �j ; j is associated with the stepsaj ; bj ; j , eachweight being an integer that represents a number of “possibilities” for the correspondingstep type. A system of weighted lattice paths has counting generating functions given by aneasy specialization of the corresponding multivariate expressions we have just developed,namely,

(33) aj 7! �jz; bj 7! �jz; j 7! jz;wherez marks the length of paths. One can then sometimes solve an enumeration problemexpressible in this way by reverse-engineering the known collection of continued fractionsas found in a reference book like Wall’s treatise [99]. Next, for general reasons, the poly-nomialsP;Q are always elementary variants of a family of orthogonal polynomials thatis determined by the weights [21, 35, 90]. When the multiplicities have enough structuralregularity, the weighted lattice paths are likely to correspond to classical combinatorialobjects and to classical families of orthogonal polynomials; see [35, 36, 46, 48] and Ta-ble 7 for an outline. We illustrate this by a simple example due to Lagarias, Odlyzko, andZagier [62].


Objects Weights(�j ; �j j) Counting Orth. pol.

Simple paths 1; 1; 0 Catalan # Chebyshev

Permutations j + 1; j; 2j + 1 Factorial # Laguerre

Alternating perm. j + 1; j; 0 Secant # Meixner

Involutions 1; j; 0 Odd factorial # Hermite

Set partition 1; j; j + 1 Bell # Poisson-Charlier

Nonoverlap. set part. 1; 1; j + 1 Bessel # Lommel

FIGURE 7. Some special families of combinatorial objects togetherwith corresponding weights, moments, and orthogonal polynomials.

FIGURE 8. An interconnection network on2n = 12 points.

EXAMPLE 13. Interconnection networks and involutions.The problem considered herewas introduced by Lagarias, Odlyzko, and Zagier in [62]: There are2n points on a line,with n point-to-point connections between pairs of points. What is the probable behaviourof thewidth of such an interconnection network?Imagine the points to be1; : : : ; 2n, theconnections as circular arcs between points, and let a vertical line sweep from left to right;width is defined as the maximum number of edges encountered bysuch a line. One mayfreely imagine a tunnel of fixed capacity (this corresponds to the width) inside which wirescan be placed to connect points pairwise. See Figure 8.

Let I2n be the class of all interconnection networks on2n points, which is preciselythe collection of ways of grouping2n elements inton pairs, or, equivalently, the class ofall involutions (i.e., permutations with cycles of length2 only). The numberI2n equals the“odd factorial”, I2n = 1 � 3 � 5 � � � (2n� 1);whose EGF isez2=2 (see Chapter 2). The problem calls for determining the quantity I [h℄2nthat is the number of networks corresponding to a width� h.

The relation to lattice paths is as follows. First, when sweeping a vertical line across anetwork, define an active arc at an abscissa as one that straddles that abscissa. Then buildthe sequence of active arcs counts at half-integer positions 12 ; 32 ; : : : ; 2n� 12 ; 2n+ 12 . Thisconstitutes a sequence of integers where each member is�1 the previous one, that is, alattice path without level steps. In other words, there is anascent in the lattice path foreach element that is smaller in its cycle and a descent otherwise. One may view ascents asassociated to situations where a node “opens” a new cycle, while descents correspond to“closing” a cycle.

Involutions are much more numerous than lattice paths, so that the correspondencefrom involutions to lattice paths is many-to-one. However,one can easily enrich lattice


paths, so that the enriched objects are in one-to-one correspondence with involutions. Con-sider again a scanning position at a half-integer where the vertical line crosses (active)arcs. If the next node is of the closing type, there are` possibilities to choose from. Ifthe next node is of the opening type, then there is only one possibility, namely, to starta new cycle. A complete encoding of a network is obtained by recording additionallythe sequence of then possible choices corresponding to descents in the lattice path (somecanonical order is fixed, for instance, oldest first). If we write these choices as superscripts,this means that the set of all enriched encodings of networksis obtained from the set ofstandard lattice path encodings by effecting the substitutionsbj 7! jXk=1 b(k)j :

The OGF of all involutions is obtained from the generic continued fraction of Propo-sition 8.4 by the substitution aj 7! z; bj 7! j z;wherez records the number of steps in the enriched lattice path, or equivalently, the num-ber of nodes in the network. In other words, we have obtained combinatorially a formalcontinued fraction representation,1Xn=0(1 � 3 � � � (2n� 1))z2n = 11� 1 � z21� 2 � z21� 3 � z2

. . .

;which was originally discovered by Gauß [99]. Proposition 8.4 then gives immediately theOGF of involutions of width at mosth as a quotient of polynomials. DefineI [h℄(z) :=Xn�0 I [h℄2n z2n:One has I [h℄(z) = 11� 1 � z21� 2 � z2

. . .1� h � z2 =Ph(z)Qh(z)

wherePh andQh satisfy the recurrenceYh+1 = Yh � hz2Yh�1:The polynomials are readily determined by their generatingfunctions that satisfies a first-order linear differential equation reflecting the recurrence. For instance, the denominatorpolynomials are identified to reciprocals of the Hermite polynomials,Qh(z) = (z=2)hHh( 12z );


0

20

40

60

80

100

120

140

160

180

200

220

240

260

200 400 600 800 1000

x

0

20

40

60

80

100

120

140

160

180

200

220

240

260

200 400 600 800 1000

x

0

20

40

60

80

100

120

140

160

180

200

220

240

200 400 600 800 1000

x

FIGURE 9. Three simulations of random networks with2n = 1000illustrate the tendency of the profile to conform to a parabola with heightclose ton=2 = 250.

themselves defined classically [2, Ch. 22] as orthogonal with respect to the measuree�x2 dx on (�1;1) and expressible asHm(x) = m! bm=2 Xm=0 (�1)jj!(m� 2j)! (2x)m�2j :In particular, one findsI [0℄ = 1; I [1℄ = 11� z2 ; I [2℄ = 1� 2z21� 3z2 ; I [3℄ = 1� 5z21� 6z2 + 3z4 ; & :

The interesting analysis of the dominant poles of the rational GF’s, for any fixedh,is discussed in the paper [62]. Simulations suggest strongly that the width of a randominterconnection network on2n nodes is tightly concentrated aroundn=2; see Figure 9.Louchard [67] succeeded in proving this fact and a good deal more: With high probability,the profile (the profile is defined here as the number of active arcs as time evolves) of a ran-dom network conforms asymptotically to a deterministic parabolax(1�x=n) (in the scaleof n) to which are superimposed well-characterized random fluctuations of amplitude onlyO(pn). In particular,the width of a random network of2n nodes converges in probabilityto n2 . �

4. Algebra of algebraic functions

Algebraic series and algebraic functions are simply definedas solutions of a poly-nomial equation (Definition 8.6). It is a nontrivial fact established by elimination theory(which can itself be implemented by way of resultants or Groebner bases) that they areequivalently defined as components of solutions of polynomial systems (Theorem 8.7).Like rational functions, they form a differential field but are not closed under integration;unlike rational functions, they are not closed under Hadamard products (Theorem 8.8).

The starting point is the following definition of an algebraic series.

DEFINITION 8.6. A power seriesf 2 C [[z℄℄ is said to bealgebraicif there exists a(nonzero) polynomialP (z; y) 2 C [z; y℄, such thatP (z; f) = 0:The set of all algebraic power series is denoted byC alg [[z℄℄.

Rational functions correspond to the particular case wheredegyP (z; y) = 1; conse-quently, we haveC rat[[z℄℄ � C alg [[z℄℄. Thedegreeof an algebraic seriesf is by definitionthe minimal value of degyP (z; y) = 1 over all polynomials that are cancelled byf (sothat rational series are algebraic of degree 1). Note that one can always assumeP to beirreducible (that isP = QR implies that one ofQ orR is a scalar) and of minimal degree.In effect, assume thatP (z; f(z)) = 0 whereP factorizes:P (z; y) = P1(z; y)P2(z; y).We must have at least one of the equalitiesP1(z; f) = 0 orP2(z; f) = 0, since otherwise,

4. ALGEBRA OF ALGEBRAIC FUNCTIONS 47

one would have two nonzero formal power seriesg1 = P1(z; f), g2 = P2(z; f) such thatg1 � g2 = 0, which is absurd.We do not address at this stage the question of deciding whether a bivariate polynomial

defines one, or several algebraic series, or none. The question is completely settled bymeans of the Newton polygon algorithm discussed in the next section.

EXERCISE25. (i) Prove that there is, up to constant multiples, a unique polyno-mial of minimal degree that is cancelled by an algebraic seriesf 2 C alg [[z℄℄.(ii) Prove furthermore that iff lies inQ[[z℄℄, then the minimal polynomialmay be chosen inQ[z; y℄. (Thus, for enumerative problems, factorization overthe fieldQ is all that is ever required; there is no need for “absolute” factoriza-tion, that is, factorization over the complex numbers.)

[Hint. Point(i) is related to gcd’s and principal ideal domains. Point(ii) isa classical lemma of Eisenstein; see [13].]

4.1. Characterization and elimination. A polynomial system is by definition a setof equations

(34)

8>>><>>>: P1(z; y1; : : : ; ym) = 0...Pm(z; y1; : : : ; ym) = 0;

where eachPj is a polynomial. A solution overC [[z℄℄ of (34) is anm-tuple(f1; : : : ; fm) 2C [[z℄℄m that cancels eachPj , that is,Pj(z; f1; : : : ; fm) = 0. Any of thefj is called a com-ponent solution. A basic result is that any component solution of a (nontrivial) polynomialsystem is an algebraic series. In other words, one can eliminate the auxiliary variablesy2; : : : ; ym and construct a single bivariate polynomialQ such thatQ(z; y1) = 0. Thisresult is a famous one in the theory of polynomial systems.

The study of polynomial systems and algebraic varieties (i.e., the point set of all so-lutions of a polynomial system) involves some subtle mathematics. Although genericinstances (including well-posed combinatorial problems)are normally well-behaved, allsorts of degeneracies might occur in principle, and these have to be accounted for or dis-posed of in statements of theorems. For instance, the number� is known to be a tran-scendental number, i.e., it does not satisfies any polynomial equation with coefficientsin Q. However, the triple(1; 1; �) is a solution of the following polynomial system in(y1; y2; y3):(35) y1 � 1 = 0; y2 + y21 � 2 = 0; (y1 � y2)y3 = 0:The intuitive reason is that the system is pathological: it does not specify afinite numberof values since the equation bindingy3 is algebraically “equivalent” to0y3 = 0.

As this example (35) suggests, in order to avoid degeneracies, we shall restrict atten-tion to systems of polynomials that have a finite set of solutions (in the algebraic closure ofthe coefficient field). Technically, such systems are calledzero-dimensional5. It is a nonob-vious fact that the property for a system to be zero-dimensional is algorithmically check-able (by means of Groebner bases). In the statement that follows, zero-dimensionality ismeant for the field of coefficientsC (z) and for algebraic varieties defined in the algebraicclosure ofC ((z)).

5For the general notion of dimensionality based on Hilbert polynomials, consult for instance [27, x9.3].


THEOREM 8.7 (Algebraic function characterization and elimination). For a powerseriesf(z) =Pn fnzn in C[[z℄℄, the following conditions are equivalent to algebraicity:(i) Normal form: there exists an irreducible polynomialP (z; y) 2 C [z; y℄ such thatP (z; f(z)) = 0:(ii) Elimination: there exists a zero-dimensional system of polynomialsfPj(z;y)gmj=1 such that a solutiony(z) 2 C [[z℄℄m to the system

(36) fPj(z; y1; y2; : : : ; ym) = 0g; j = 1 : :m;hasy1 = f .

Proof. That (i) =) (ii) is immediate, since a single equation is a particular systemofdimensionm = 1 and a single polynomial of degreed has at mostd roots in any extensionfield so that the variety it defines is0-dimensional. The converse property(ii) =) (i)expresses the fact that auxiliary variables can always be eliminated from a system of poly-nomial equations. We sketch below an abstract algebraic argument excerpted from [27].

A basic principle in the theory of equations consists in examining the collection of allthe “consequences” of a given system of equations. This is formalized by the notion ofideal. Given a fieldK , an ideal of the polynomial ringR = K [y1 ; : : : ; ym℄ is a setI thatis closed under sums (a; b 2 I impliesa + b 2 I) and under multiplication by arbitraryelements ofR (a 2 I and 2 R imply a 2 I). The idealhh1; : : : ; hsi generated by thehj 2 R is by definition the set of all combinations

P jhj for arbitrary j 2 R.Consider a system ofs polynomial equations(�) fhj(y1; : : : ; ym) = 0gsj=1 :

Set~y = (y1; : : : ; ym). Clearly if it happens that some particular value of~y satisfies thesystem�,, then this value also cancels any polynomialg in the idealhh1; : : : ; hsi. Nowthe intersection I1 = K [y1 ℄ \ hh1; : : : ; hsi;is an ideal (by elementary algebra, the intersection of two ideals is an ideal). The idealI1consists of a collection of univariate polynomials which vanish at anyy1 that is a compo-nent of a solution of�. InK [y1 ℄, all ideals are principal, meaning that for some polynomialg, one hasI1 = hgi. (The last fact relies simply on the existence of a Euclideandivision,hence of gcd’s inK [y1 ℄.) Thus, there exists a polynomialg such thatg(y1) = 0. (Inpassing, one needs to argue thatg is not the null polynomial. However, if this was thecase, it would contradict the assumption of zero-dimensionality.) Finally, specializing thediscussion to the coefficient fieldK = C (z), i.e., treatingz as a parameter, then yields theelimination statement. �

Note that by the famous Hilbert basis theorem [27, p. 74], every polynomial ideal hasa finite generating set (“basis”). As a consequence, the intersection construction used in theproof of Theorem 8.7 can be extended to obtain a triangular systemfgj = 0g, where eachgj depends ony1; : : : ; yj alone. This makes it possible to obtain eventually all componentsof all solution vectors by successively solving fory1, theny2, etc.

Techniques to perform eliminations can be made “effective”(i.e., algorithmic), twomajor methods being resultants and Groebner bases. A complete expose of eliminationtheory is however beyond the scope of the book. We shall thus limit ourselves to sketch-ing a strategy based on resultants that is conceptually simple. We shall next illustrate


informally the fundamental principles of Groebner bases byway of example. For a com-prehensive discussion of these issues, we refer to the excellent introduction given by Cox,Little, and O’Shea [27].

Resultants. Consider a field of coefficientsK which may be specialized asQ; C ; C (z); C ((z)), as the need arises. A polynomial of degreed in K [x℄ has at mostdroots inK and exactlyd roots in the algebraic closure�K of K . First, we set:

DEFINITION 8.7. Given two polynomials,P (x) = Xj=0 ajx`�j ; Q(x) = mXk=0 bm�kxk ;their resultant(with respect to the variablex) is the determinant of order(`+m):(37) R(P;Q; x) = det

��a0 a1 a2 � � � 0 00 a0 a1 � � � 0 0...

......

. . ....

...0 0 0 � � � a`�1 a`b0 b1 b2 � � � 0 00 b0 b1 � � � 0 0...

......

. . ....

...0 0 0 � � � bm�1 bm��:

The matrix itself is often called the Sylvester matrix and the resultant is also knownas the Sylvester determinant. By its definition, the resultant is a polynomial form in thecoefficients ofP andQ. The main property of resultants is the following.

LEMMA 8.2. (i) If P (x); Q(x) 2 K [x℄ have a common root in the algebraic closure�K of K , then

R(P (x); Q(x); x) = 0:(ii) Conversely, ifR(P (x); Q(x); x) = 0 holds, then eithera0 = 0 or b0 = 0 or elseP (x); Q(x) have a common root in�K .

Proof. We refer globally to [64, Vx10] for a presentation of resultants.(i) If P andQ have0 as a common root, then by construction, the resultant is equalto 0 since the determinant has its last column equal to the 0-vector. LetS be the Sylvestermatrix ofP;Q. If P andQ have a common root� with � 6= 0 then, the linear system

(38) S 0BBBBBB� w0w1...w`+m�1

1CCCCCCA = 0BBBBBB� 00...01CCCCCCA ;

admits a non trivial solution,(w0; : : : ; w`+m�1) with wj = �l+m�1�j . This can be seensince the Sylvester matrix is by construction the matrix of coefficients of the collection of


polynomials xm�1P (x); : : : ; xP (x); P (x); x`�1Q(x); : : : ; xQ(x); Q(x):(In fact, the argument above provides the rationale for the definition (37).) Now, byCramer’s rule6, the existence of a nontrivial solution to the linear system(38) implies thatthe Sylvester determinant has to be 0.(ii) Conversely, let�1; : : : ; �` and�1; : : : ; �m be respectively the roots ofP andQin the algebraic closure�K . It is a known fact [64] that the resultantR of P;Q satisfies

(39) R = a0bm0 Yi;j (�i � �j) = am0 Yi Q(�i) = (�1)`mb0Yj P (�j):Thus, the fact thatP;Q have a common root impliesR = 0 . �

EXERCISE26. Letf(x) be a polynomial of degreewith leading coefficienta0andm distinct roots�1; : : : ; �`. Then

R(f(x); f 0(x); x) = a0Yi6=j(�i � �j):Let g(x) have possibly multiple roots. Develop an algorithm based ongcd com-putations and resultants to produce a lower bound on the distance between anytwo distinct roots ofg (a “separation distance”).

The resultant thus provides asufficientcondition for the existence of common roots,but not always a necessary one. This has implications in situations where the coefficientsaj ; bk depend on one or several parameters. In that case, the condition R(P;Q; x) = 0 willcertainly capture all the situations whereP andQ have a common root, but it may alsoinclude some situations where there is a reduction in degree, although the polynomialshave no common root. For instance, takingP = tx� 1; Q = tx2� 1 (with t a parameter),the resultant with respect tox is found to beR = t(1� t):Indeed, the conditionR = 0 corresponds to either a common root (t = 1 implyingP (1) =Q(1) = 0) or to some degeneracy in degree (t = 0). (Note for this discussion that theresultant applies in particular when the base field isQ(t) or C (t) and the algebraicallyclosed field containsQ((t)) or C ((t)), which is the case of interest what follows.)

Given a system (36), we can then proceed as follows in order toextract a single equa-tion satisfied by one of the indeterminates. By taking resultants withPm, eliminate alloccurrences of the variableym from the firstm � 1 equations, thereby obtaining a newsystem ofm� 1 equations inm� 1 variables (andz kept as a parameter, so that the basefield is C (z)). Repeat the process and successively eliminateym�1; : : : ; y2. The strategy(in the simpler case where variables are eliminated in succession exactly one at a time) issummarized in the procedureeliminateR of Figure 10.

The algorithmeliminateR of Figure 10 will, by virtue of the sufficiency of resultantconditions, always provide a correct polynomial conditionsatisfied byy1, although the re-sulting polynomial that cancelsy1 need not be minimal. In short, elimination by resultantsproduces valid but not necessarily minimal results.

Alternatively, one can appeal to the theory of Groebner bases in order to developa complete algorithm. The Groebner based algorithm is summarized as the procedureeliminateG in Figure 10 and we refer to the already cited book [27] for precise definitionsand complete detail. See also Example 16 below for a presentation by way of example.

6This reasoning is valid in any field of characteristic 0 sinceonly rational operations are used in the proof.


procedureeliminateR (P1; : : : ; Pm, y1; y2; : : : ym);fElimination ofy2; : : : ; ym by resultantsg(A1; : : : ; Am) := (P1; : : : ; Pm);for j from m by �1 to 2dofor k from j � 1 by �1 to 1 doAk := R(Ak; Aj ; yk+1);

return (A1).procedureeliminateG (P1; : : : ; Pm; y1; y2; : : : ym);fElimination ofy2; : : : ; ym by Groebner basesgFix thevariable orderingym > ym�1 > y2 > y1;

Choose thepure lexicographic orderingon monomials;Operate with the coefficient fieldC (z);

Construct aGroebner basisG of the idealhP1; : : : ; Pmi.return (G \ C (z)[y1 ℄)

FIGURE 10. Resultant elimination (top) and Groebner basis elimina-tion (bottom).

Computer algebra systems usually provide implementationsof both resultants andGroebner bases. The complexity of elimination is however exponential in the worst-case.Degrees essentially multiply; this is somewhat intrinsic sincey0 in the systemy0 � z � yk = 0; yk � y2k�1 = 0; : : : ; y1 � y20 = 0;defines withk equations the GF of regular trees of degree2k, while it represents an alge-braic function of degree2k and no less.

The example of coloured trees. We illustrate the previous discussion by means of acombinatorial problem that is treated in succession by a “pencil-and-paper” approach (thatmakes use of a simple combinatorial property and high-school algebra), then by resultantsand finally by Groebner bases of which the last case serves to demonstrate themodusoperandiof Groebner bases.

EXAMPLE 14. Three-coloured trees—pencil-and-paper approach.Consider (plane,rooted) binary trees with nodes coloured by any of three colours,a; b; . We impose thatthe external nodes are coloured by thea-colour. The colouring has to be perfect in thesense that no two adjacent nodes are assigned the same colour. We letA;B; C denote theset of perfectly coloured trees with root of thea; b; type respectively. LetA;B;C be thecorresponding OGF’s where size is taken to be the number of external nodes. The definingequations are thus:

(40)

8>>><>>>: A = a+ (B + C)2B = (C +A)2C = (A+ B)2 8>>><>>>: A� z + (B + C)2 = 0B � (C + A)2 = 0C � (A+B)2 = 0:The pencil-and-paper approach may start by observing thatB andC are isomorphic as

combinatorial classes, as a result of the bijection that exchanges theb and colours. Thus

52 8. FUNCTIONAL EQUATIONS—RATIONAL AND ALGEBRAIC FUNCTIONSB = C. Then equation (40) simplifies into a system of 2 equations in2 unknowns withza parameter:

(41)

8<: A� z � 4B2 = 0B �B2 � 2AB �A2 = 0:SinceB is to be eliminated, we may combine linearly the equations of(41) in order to getrid of B2 (the highest power inB),

(42)

8<: A� z � 4B2 = 0A� z � 4B + 4A2 + 8AB = 0The second equation is linear inB and the value can then be plugged back into the firstequation, resulting in a fourth degree polynomial that cancelsA,

(43) R(z; A) = 16A4 � 8A3 + (17 + 8 z)A2 � (4 + 18 z)A+ z2 + 4 z :and the polynomial is easily checked to be irreducible. �EXAMPLE 15. Three-coloured trees—resultant approach.Start again with the system (40)and take mechanically resultants with respect toC of the third equation together with thefirst and second equations. This eliminatesC and provides two polynomials cancelledbyA andB:

(44)B �A2 � 2A3 �A4 � 4A3B � 6A2B2 � 4A2B � 4AB3 � 2B2A�B4A� z �B2 � 2A2B �A4 � 4A3B � 6A2B2 � 4B2A� 4AB3 � 2B3 �B4:

A further resultant with respect toB will then yield a polynomial involvingA alone andcancelled byA:256A8 + 256A7 + (256 z + 352)A6 � (64 z � 304)A5+ �96 z2 � 112 z + 161�A4 � �80 z2 + 108 z � 26�A3 + �16 z3 � 26 z2 � 22 z � 7�A2� �12 z3 + 10 z2 + 2 z + 4�A+ 9 z2 + 6 z3 + 4 z + z4This time, the polynomial of degree 8 inA. (From (43), we know it cannot be minimal.)In fact it factors as

(45) (4A2 + 3A+ 1 + z)2 (16A4 + � � �+ 4z) = (4A2 + 3A+ 1 + z)2R(z; A);with R(z; A) as in (43).

The series expansion for the combinatorial generating functionA starts withA(z) = z + 4z4 + 16z5 + 56z6 + 256z7 + 1236z8 + 5808z9 + � � � ;so that the first quadratic factor is disqualified as a candidate for minimal polynomial. Wehave thus found again the result of (43). Note that the complete factorization of (45) overQ guarantees at the same time thatR(z; A) is the minimal polynomial so thatA is of exactdegree 4. (See the exercise preceding Theorem 8.7.) �EXAMPLE 16. Three-coloured trees—Groebner basis approach.In accordance with theprocedureeliminateG we first fix an ordering on variables. For the case at hand, we chooseC > B > A;with the intent that variables should be eliminated in that order, firstC, nextB, thenA.Several orderings may be defined on monomials, but for our purposes, we chose thepure


lexicographic orderingthat is the usual alphabetic ordering of words in a dictionary (withC precedingB precedingA, though!). For instance, with this ordering, we have (thinkofC = u, B = v, A = w and the lexicographic order in the Latin script)C > C2 > C3 > CB > CB2 > B > B5 > BA > BA4:The leading term of a polynomial with the chosen ordering is denoted byLT(p). Forinstance:LT(2A2 + 3B2 + 4BA+ 5CB2) = 5CB2.

The basic operation of Groebner basis theory is the “division” operation of a polyno-mial f by a sequence of polynomialsF = (f1; : : : ; fs) producing a remainder denoted

by fF . The idea is to “reduce” the polynomialf by an operation that is reminiscent ofGaussian elimination (for multivariate linear polynomials) and of the Euclidean algorithm(for univariate nonlinear polynomials).

The principle of division is as follows. First transform each polynomial infi of F intoa rewrite rule: (Wi) : LT(fi) 7! �fi + LT(fi):(a rule thus replaces a monomial by a combination of smaller monomials under the mono-mial order.) Use repeatedlyW1; : : : ;Ws (in that order, say, for determinacy) to reducethe leading term off (that changes at each stage). When the leading term no longerre-duces, then proceed similarly with the second leading term,the third one, and so on. Thealgorithm is specified in full in [27, p. 62]. On our example, the three rules are(W1) C2 7! �z +A�B2 � 2BC(W2) C2 7! 2CA+A2(W3) C 7! A2 + 2AB +B2An instance of a reduction will be:C3 (W1)7! CA� CB2 � 2BC2 � zC(W1)7! CA+ 3CB2 � Cz � 2AB + 2B3 + 2Bz (W3)7! � � � :(In such a simple case, the effect of the rules is easy to describe: they eventually clear outall occurrences ofC, with W1 used a number of times andW3 used once at the end todispose of the last linear term inC.)

Next comes the notion ofGroebner basis. The principle is to build a “good” set ofgenerators for theidealdefined byF , that is, the set of all polynomialsI =Xj ajfj ;where the coefficientsaj are polynomials. (On the example, we would haveaj 2C (z)[A;B;C℄.)

DEFINITION 8.8. A Groebner basis for an idealI is a setG of polynomials such thatfor all f 2 I , the division byG yields a complete reduction,fG = 0:

In particular, such a basis should “self-reduce” in the sense that for allg; h 2 G:

(46) x�g � x�hG = 0:(Herex� andx� represent arbitrary monomials, in accordance with standard multi-indexnotation.)


Buchbergers’ algorithm constructs Groebner bases by enriching a given baseF tillthe self-reduction property (46) is ensured. Define theS-polynomial of two monic poly-nomialsp; q asx�p�x�q, with � and� chosen to be the smallest monomials that achievecancellation of the leading terms ofp andq. (S-polynomials are in a sense the simplest

nontrivial polynomials that should reduce.) One must haveS(g; h)G = 0, for all g; h 2 G.Then, the construction of a Groebner basis from a sequenceF proceeds incrementally:we start withH = F and successively add toH all pairsp; q with p; q 2 H such thatS(p; q)H 6= 0. The process is stopped once saturation is reached. (The algorithm is speci-fied in full in [27, p. 97].)

For instance consider the system (40), that is,F = (f1; f2; f3). Set initiallyH = F .TheS-polynomial off1; f2 is obtained by taking1f1�1f2 (the monomialsC2 disappear),giving �A+B2 + 2BC + z +B � 2CA�A2;which reduces moduloH (by way of(W3)) intof4 = B �A2 �A+B2 + z � 2A2B + 2AB2 + 2B3 �A3:Then, the two other pairs providef5 = f2f3H 6= 0, andf6 = f3f1H 6= 0. At this stage,the new-comersf4; f5; f6 are adjoined toH, giving the new valueH = (f1; : : : ; f6). Thecompletion process continues tillH eventually stabilizes. We then take the Groebner basisto beG := H, after removing redundant polynomials [27, Sec. 2.9]. Here, a set of 4polynomials is obtained. In accordance with theory, the Groebner basis contains a singlepolynomial that depends on the variableA alone; this polynomial that is of degree 6 factorsas (4A2 + 3A+ z + 1)R(z; A):The minimal polynomial is once more recovered: Groebner elimination has faithfully ac-complished its task. �

4.2. Closure properties and coefficients.By definition, algebraic functions satisfystrong algebraic closure properties. The proofs now fall asripe fruits as a benefit of ourinvestment on elimination. From this point on,eliminate will designate either of theelim-inateR or eliminateG procedure.

THEOREM 8.8 (Algebraic function closure (1)).The setC alg [[z℄℄ of algebraic seriesis closed under the operations of sum(f + g), product (f � g), quasi-inverse (definedby f 7! (1 � f)�1, conditioned uponf0 = 0), differentiation (�z), composition (f Æ g,conditioned upong0 = 0).

Proof. Let v andw be defined byP (z; u) = 0 andQ(z; v) = 0. For sum, product,quasi-inverse, and composition, just appeal toy = u+ v : eliminate([y � u� v; P (z; u); Q(z; v)℄; y; u; v);y = uv : eliminate([y � uv; P (z; u); Q(z; v)℄; y; u; v);y = (1� u)�1 : eliminate([y � uy � 1; P (z; u)℄; y; u);y = u Æ v(z) : eliminate([P (v; y); Q(z; v)℄; y; v):An immediate consequence is that ifU(z; y) is a rational function and� 2 C alg [[z℄℄ isalgebraic, thenU(z; �) is algebraic. Next, if� satisfiesP (z; �) = 0, then the derivative

4. ALGEBRA OF ALGEBRAIC FUNCTIONS 55�0 with respect toz satisfiesP 0z(z; �) + �0P 0y(z; �) = 0 or �0 = �P 0z(z; �)P 0y(z; �) ;and is therefore a rational fraction in�. Thus, by the previous observation,�0 is algebraic.�

These results show thatC alg [[z℄℄ is a ring. They extend transparently toC alg ((z)) whichis a field.

Algebraic series are not closed under Hadamard product. To see it, considerh(z) := 1p1� z � 1p1� z = 1Xn=0 116n�2nn �2zn:The coefficient[zn℄h(z) is asymptotic to1=(�n). Consequently, by an Abelian theorem,h(z) � 1� log 11� z ; z ! 1�:Such a singularity is incompatible with algebraicity, where only fractional power seriesmay occur; see the next section. On the positive side, we havethe following.

THEOREM8.9 (Algebraic function closure (2)).(i) If f is rational andg is algebraic,then the Hadamard productf � g is algebraic.(ii) Define the diagonal of a bivariate seriesF 2 C [x; y℄,�x;yF (x; y) =Xn Fn;nzn;Then, the diagonal of a rational bivariate series is an algebraic series.

Proof. (i) Takef 2 C rat [[z℄℄ andg 2 C alg [[z℄℄ and consider the Hadamard producth = f�g. The Hadamard product distributes over sums, so that is is enough to show algebraicitywhenf is a simple fraction element,f = (1 � az)�m. This is settled by simple algebra,since (1� az)�m � g(z) = (1� z)�m � g(az) = dm�1dzm�1 �zm�1g(az)� :

An alternative, more analytic, derivation can be based on Hadamard’s formula forHadamard products. This important formula valid for all functions analytic at 0 is

(47) f � g(z) = 12i� Z f(t)g(zt ) dtt :If f andg are analytic injzj < R andjzj < S respectively, then can be any loop aroundzero whose elementst satisfyjtj < R andjz=tj < S, that is,jzjS < jtj < R;and the conditions are compatible providedz is of small enough modulus. (The verificationby direct series expansions is immediate.) For the originalproblem ofC rat [[z℄℄� C alg [[z℄℄,an evaluation of (47) by residues then provides the result.(ii) Our proof will start from another of Hadamard’s formulæ,

(48) �x;yF (x; y) = 12i� Z F (t; zt ) dtt :that is proved by the same devices as (47). IfF (x; y) is analytic injxj < R andjyj < S,thent should be taken such thatjzjS < jtj < R, and the domain of possible values is


nonempty as long asjzj is small enough. (The verification is again by straight expansion.)This formula generalizes (47) sincef � g(z) = �x;y(f(x)g(y)).

Now, for jzj small, we can evaluate the integral by residues insidez=t < S. See [88]for details. Assumez small enough and letF (x; y) = A(x; y)=B(x; y). Consider the roots�1(z); : : : ; �r(z) of the characteristic equation,B(�; z=�) = 0 that tend to 0 asz ! 0.Then, forz small enough and encircling the “small” roots�1(z); : : : ; �r(z), a residuecomputation gives �x;y = rXj=1 Res

�F (t; zt )�t=�j(z) :Thus, the diagonal is a rational function of (some) branchesof an algebraic function and itis therefore rational. �Closed form for coefficients. Coefficients of algebraic functions satisfy interesting re-lations, starting with recurrences of finite order. In addition, they can be presented ascombinatorial sums.

First, we show that algebraic functions satisfy differential equations. Take� 2C alg [[z℄℄ an algebraic series of degreed, with minimal polynomialP (z; y); and letU(z; y)be a bivariate rational function. We have seen before thatU(z; �) is also algebraic. In fact,we are going to show thatU can be rewritten asU(z; �) = d�1Xj=0 qj(z)�j ; qj(z) 2 C (z):In other words, a quantityC(z; �) with � algebraic is representable by a polynomial ofC (z)[�℄. The situation generalizes what is known about algebraic numbers and reductionto polynomial form, for instance,13p2 +p3 = 13 3p22 � 16 3p22p3 + 16 3p2� 16 3p2p3 + 13 :

Write U(z; y) = V (z; y)=W (z; y). Clearly, it suffices to prove that1=W (z; �) re-duces to polynomial form. We may freely assume thatW is not a multiple ofP , and sinceP is irreducible, we haveg d(P;W ) = 1, where the gcd is taken inC (z)[y℄. Then, byBezout’s theorem, there existsa(z; y); b(z; y) 2 C (z)[y℄ such thataW � bP = 1. In otherwords,a is the inverse ofW moduloP . We thus have, at the expense of a single gcd andsimple polynomial reduction:U(z; �) � [a(z; y)V (z; y) mod P (z; y)℄y=� :This gives the following basic result:Every rational fraction inz and� lives in the setC [z℄(�)<d of polynomials with�-degree bounded from above byd.

Consider nowC(z)[y℄<d as aC(z)–vector space of dimensiond. The (d + 1) firstderivatives �(z); �0(z); : : : ; �(d)(z)lie in that space and thus they must be bound. As a consequence, there exist coefficients j(in C (z)) such that dXj=0 j(z) djdzj �(z) = 0:


This relation can be obtained constructively by cancellinga minor of maximal rank ofthe determinant expressing the linear dependency7 Extracting coefficients then produces arecurrence of the form eXj=0 dj(n)�n+j = 0;for a sequence of coefficientsdj(n) 2 C (n). (Denominators can always be eliminated sothat the coefficients can be taken to be polynomials ofC[n℄.)

This discussion brings us to a general statement about coefficients of algebraic func-tions.

THEOREM 8.10 (Algebraic coefficients).(i) The coefficients of any algebraic powerseriesy(z) satisfy a linear recurrence with polynomial coefficients, namely,eXj=0 dj(n)yn+j = 0;whereyn = [zn℄y(z) and j(n) 2 C [n℄.(ii)Let �(z; y) be a bivariate polynomial such that�(0; 0) = 0, �0y(0; 0) = 0 and�(z; 0) 6= 0. Consider the algebraic function implicitly defined byf(z) = �(z; f(z)).Then, the Taylor coefficients off(z) admit expressions as combinatorial sums,

(49) [zn℄ f(z) = Xm�1 1m [znym�1℄�m(z; y):Proof. Part(i) is established by the discussion above. Part(ii) is based on a yet unpub-lished memo of Flajolet and Soria, from which the discussionthat follows is extracted.

First, a variant form of the coefficients is available, and Eq. (49) is elementarily equiv-alent to

(50) [zn℄ f(z) =Xm [znym�1℄��m(z; y)�1��0y(z; y)�;as results from the observation that(m+ 1)g0(y)gm(y) = ddygm+1(y).

The starting point of the analytic part of the proof is an integral formula giving theroot of an equation�(y) = 0 inside a simple domain. Let�(y) be analytic and assume thatinside the domain defined by a closed curve , the equation�(y) = 0 has a unique root.Then, we have

(51) RootOf(�(y) = 0; ) = 1� Z y�0(y)�(y) dy:That formula can be seen as a modified form of the “principle ofthe argument” ([51]), butit can also be checked directly via a residue computation.

The algebraic functionf(z) under consideration is a root ofy ��(z; y) = 0. There-fore, if we proceed formally and apply formula (51) to�(y) � y ��(z; y), we getf(z) = 12i� Z y 1��0y(z; y)y ��(z; y) dy= 1� Z 1��0y(z; y)1� 1y�(z; y) dy:(52)

7This result is often referred to in the combinatorics literature as “Comtet’s theorem”; see [25].


Still proceeding formally, we expand the integrand using

(53)11� u = 1 + u+ u2 + u3 + � � � ;

which leads to

(54) f(z) = Xm�0Z (1��0y(z; y))�m(z; y) dyjym :Now Cauchy’s coefficient formula,[ym�1℄ g(y) = 1� Z0+ g(y) dyym ;when applied to the integral in Eq. (54) provides

(55) f(z) = Xm�0[ym�1℄ (1��0y(z; y))�m(z; y);which is the form given in (50).

There only remains to complete the analytic part of the argument. First, by the as-sumptions made regarding�, the point(0; 0) is an ordinary point of the algebraic curvey � �(z; y) = 0. Thus all branches of the curve are “well separated” from thebranchcorresponding tof(z). We can thus find�1 > 0 andr1 > 0 so that, whenz lies in thedomainjzj � �1, we havejf(z)j � r1 while jfj(z)j > r1 for all the conjugate branchesfj(z) 6= f(z). In other words, the use of the integral formula (52) is justified providedwe takejzj < �1 together with the contour = fy = jyj = rg for anyr � r1. We shallhenceforth assume that such a choice has been made.

The next condition to be satisfied is the validity of expansion (53) used to derive (54).This requires the inequalityjuj < 1 whenu = 1y�(z; y). should havejuj < 1. Butthe conditions on�(z; y) at (0; 0) imply that in a sufficiently small neighbourhood of theorigin, i.e. jzj < �2 andjyj < r2, the bound,j�(z; y)j � K(jzj+ jzyj+ jy2j);holds for some positive constantK. Thus, for jzj < �2 and jyj < r2, the conditionj�(z; y)j < jyj is granted if

(56) jzj < jyj1� jyj1 + jyj :We can now conclude the argument. Choose as contour a circle centered at the origin

with radiusr = min(r1; r2). To guarantee condition (56), impose thatz be such thatjzj < � where � = min��1; �2; r1� r1 + r� :Therefore, Equation (55) holds true. �

Part(ii) of the theorem generalizes the Lagrange inversion formula that correspondsto the “separable” case�(z; y) = z�(y). As an example, the coefficients off(z) = z + z2f2(z) + z3f3(z);admit the “nice” formfn = Xm�1� m� 1n�m+ 1; 5m� 3n� 2; 2n� 3m+ 1�:In general, if� comprisesp monomials, the formula obtained is a(p�2)-fold summation.

5. ANALYSIS OF ALGEBRAIC FUNCTIONS 59

5. Analysis of algebraic functions

Algebraic functions can only have a type of singularity constrained to be a branchpoint. The local expansion at such a singularity is a fractional power series known as aNewton–Puiseux expansion. Singularity analysis is systematically applicable to algebraicfunctions—hence the characteristic form of asymptotic expansions that involve terms ofthe form!nnp=q (for some algebraic number! and some rational exponentp=q). In thissection, we develop such basic structural results (Subsection 5.1). However, coming upwith effective solutions (i.e., decision procedures) is not obvious in the algebraic case.Hence, a number of nontrivial algorithms are also described(built on top of elimination byresultants or Groebner bases) in order to locate and analysesingularities (Newton’s poly-gon method), and eventually determine the asymptotic form of coefficients. In particular,the multivalued character of algebraic functions creates aneed to solve “connection prob-lems”. Finally, like in the rational case, positive systems(Subsection 5.2) enjoy specialproperties that further constrain what can be observed as regards asymptotic behavioursand properties of random structures. Our presentation of positive systems is based on anessential result of the theory, the Drmota–Lalley–Woods theorem, that plays for algebraicfunctions a role quite similar to that of Perron-Frobeniustheory for rational functions.

5.1. General algebraic functions.Let P (z; y) be an irreducible polynomial ofC [z; y℄, P (z; y) = p0(z)yd + p1(z)yd�1 + � � �+ pd(z):The solutions of the polynomial equationP (z; y) = 0 define a locus of points(z; y) inC � C that is known as a complex algebraic curve. Letd be they-degree ofP . Then, foreachz there are at mostd possible values ofy. In fact, there existd values ofy “almostalways”, that is except for a finite number of cases:

— If z0 is such thatp0(z0) = 0, then there is a reduction in the degree iny andhence a reduction in the number ofy-solutions for the particular value ofz = z0.One can conveniently regard the points that disappear as “points at infinity”.

— If z0 is such thatP (z0; y) has a multiple root, then some of the values ofy willcoalesce.

Define theexceptional setof P as the set (R is the resultant):�[P ℄ := fz �� R(z) = 0g; R(z) := R(P (z; y); �yP (z; y); y):(The quantityR(z) is also known as the discriminant ofP (z; y) taken as a function ofy.)If z 62 �[p℄, then we have a guarantee that there existd distinct solutions toP (z; y) = 0,sincep0(z) 6= 0 and�yP (z; y) 6= 0. Then, by the implicit function theorem, each ofthe solutionsyj lifts into a locally analytic functionyj(z). What we call abranchof thealgebraic curveP (z; y) = 0 is the choice of such anyj(z) together with a connected

region of the complex plane throughout which this particular yj(z) is analytic.EXERCISE27. Verify that, whenz approachesz0 with p(z0) = 0, then a numberat least 1 of the values ofy satisfyingP (z; y) = 0 tend to infinity.

Singularities of an algebraic function can thus only occur if z lies in the exceptionalset�[P ℄. At a pointz0 such thatp0(z0) = 0, some of the branches escape to infinity,thereby ceasing to be analytic. At a pointz0 where the resultant polynomialR(z) vanishesbut p0(z) 6= 0, then two or more branches collide. This can be either a multiple point(two or more branches happen to assume the same value, but each one exists as an analyticfunction aroundz0) or a branch point (some of the branches actually cease to be analytic).


An example of an exceptional point that is not a branch point is provided by the classicallemniscate of Bernoulli: at the origin, two branches meet while each one is analytic there(see Figure 11).

10-1

FIGURE 11. The lemniscate of Bernoulli defined byP (z; y) = (z2 +y2)2 � (z2 � y2) = 0: the origin is a double point where two analyticbranches meet.

A partial impression of the topology of a complex algebraic curve may be gotten byfirst looking at its restriction to the reals. Consider the polynomial equationP (z; y) = 0,where P (z; y) = y � 1� zy2;which defines the OGF of the Catalan numbers. A rendering of the real part of the curveis given in Figure 12. The complex aspect of the curve as givenby=(y) as a function ofz is also displayed there. In accordance with earlier observations, there are normally twosheets (branches) above each each point. The exceptional set is given by the roots of thediscriminant, R = z(1� 4z):For z = 0, one of the branches escapes at infinity, while forz = 1=4, the two branchesmeet and there is a branch point; see Figure 12.

In summary the exceptional set provides a set ofpossible candidatesfor the singular-ities of an algebraic function. This discussion is summarized by the slightly more generallemma that follows.

LEMMA 8.3 (Location of algebraic singularities).LetY (z) 2 C alg [[z℄℄ satisfy a poly-nomial equationP (z; Y ) = 0. Then,Y (z) is analytic at the origin, i.e., it has a non zeroradius of convergence. In addition, it can be analytically continued along any half-lineemanating from the origin that does not cross any point of theexceptional set.

(The fact that an algebraicseriescannot have radius of convergence equal to 0, i.e.,be purely divergent, follows from the method of majorizing series [52, II, p 94] as well asfrom the detailed discussion given below.)

Nature of singularities. We start the discussion with an exceptional point that is placedat the origin (by a translationz 7! z + z0) and assume that the equationP (0; y) = 0 hask equal rootsy1; : : : ; yk wherey = 0 is this common value (by a translationy 7! y + y0or an inversiony 7! 1=y, if points at infinity are considered). Consider a punctureddiskjzj < r that does not include any other exceptional point relative to P . In the argumentthat follows, we lety1; (z); : : : ; yk(z) be analytic determinations of the root that tend to 0asz ! 0.


–10

–8

–6

–4

–2

0

2

4

6

8

10

y

–1 –0.8 –0.6 –0.4 –0.2 0.2z

–1

0

1

Re(z)

–1 –0.8 –0.6 –0.4 –0.2 0 0.2 0.4 0.6 0.8 1

Im(z)

–2

0

2

4

y

0.15

0.2

0.25

0.3

0.35

Re(z)

–0.1

–0.05

0

0.05

0.1

Im(z)

–3

–2

–1

0

1

2

3

y

FIGURE 12. The real section of the Catalan curve (top). The complexCatalan curve with a plot of=(y) as a function ofz = (<(z);=(z))(bottom left); a blowup of=(y) near the branch point atz = 14 (bottomright).

Start at at some arbitrary value interior to the real interval (0; r), where the quantityy1(z) is locally an analytic function ofz. By the implicit function theorem,y1(z) canbe continued analytically along a circuit that starts fromz and returns toz while simplyencircling the origin (and staying within the punctured disk). Then, by permanence ofanalytic relations,y1(z) will be taken into another root, say,y(1)1 (z). By repeating theprocess, we see that after a certain number of times� with 1 � � � k, we will haveobtained a collection of rootsy1(z) = y(0)1 (z); : : : ; y(�)1 (z) = y1(z) that form a set of� distinct values. Such roots are said to form acycle. In this case,y1(t�) is an analyticfunction oft except possibly at 0 where it is continuous and has value 0. Thus, by generalprinciples (regarding removable singularities), it is in fact analytic at 0. This in turn implies


the existence of a convergent expansion near 0:

(57) y1(t�) = 1Xn=1 ntn:The parametert is often called thelocal uniformizing parameter, as it reduces a mul-tivalued function to a single value one. This translates back into the world ofz: eachdetermination ofz1=� yields one of the branches of the multivalued analytic function as

(58) y1(z) = 1Xn=1 nzn=�:Alternatively, with! = e2i�=� a root of unity, the� determinations are obtained asy(j)1 = 1Xn=1 n!nzn=�;each being valid in a sector of opening< 2�. (The case� = 1 corresponds to an analyticbranch.)

If r = k, then the cycle accounts for all the roots which tend to 0. Otherwise, werepeat the process with another root and, in this fashion, eventually exhaust all roots. Thus,all thek roots that have value0 atz = 0 are grouped into cycles of size�1; : : : ; �`. Finally,values ofy at infinity are brought to zero by means of the change of variablesy = 1=u,then leading to negative exponents in the expansion ofy.

THEOREM 8.11 (Newton–Puiseux expansions at a singularity).Letf(z) be a branchof an algebraic functionP (z; f(z)) = 0. In a circular neighbourhood of a singularity� slit along a ray emanating from�, f(z) admits a fractional series expansion (Puiseuxexpansion) that is locally convergent and of the formf(z) = Xk�k0 k(z � �)k=�;for a fixed determination of(z � �)1=�, wherek0 2 Z and� is an integer� 2, called the“branching type”.

Newton (1643-1727) discovered the algebraic form of Theorem 8.11, published it inhis famous treatiseDe Methodis Serierum et Fluxionum(completed in 1671). This methodwas subsequently developed by Victor Puiseux (1820–1883) so that the name of Puiseuxseries is customarily attached to fractional series expansions. The argument given aboveis taken from the neat exposition offered by Hille in [52, Ch. 12, vol. II]. It is known asa “monodromy argument”, meaning that it consists in following the course (–dromy) ofvalues of a multivalued analytic function along paths in thecomplex plane till it returns toits original (mono–) value.

Newton polygon.Newton also described a constructive approach to the determination ofbranching types near a point(z0; y0), that by means of the previous discussion can alwaysbe taken to be(0; 0). In order to introduce the discussion, let us examine the Catalangenerating function nearz0 = 1=4. Elementary algebra gives the explicit form of the twobranches y1(z) = 12z �1�p1� 4z� ; y2(z) = 12z �1 +p1� 4z� ;whose forms are consistent with what Theorem 8.11 predicts.If however one starts directlywith the equation, P (z; y) � y � 1� zy2 = 0


then, the translationz = 1=4 � Z (the minus sign is a mere notational convenience),y = 2 + Y yields

(59) Q(Z; Y ) � �14Y 2 + 4Z + 4ZY + ZY 2:Look for solutions of the formY = Z�(1 + o(1)) with 6= 0 (the existence is granteda priori by the Newton-Puiseux Theorem). Each of the monomials in (59) gives rise to aterm of a well determined asymptotic order. respectivelyZ2�; Z1; Z�+1; Z2�+1. If theequation is to be identically satisfied, then the main asymptotic order ofQ(Z; Y ) shouldbe 0. Since 6= 0, this can only happen if two or more of the exponents in the sequence(2�; 1; � + 1; 2� + 1) coincideand the coefficients of the corresponding monomial inP (Z; Y ) is zero, a condition that is an algebraic constraint on the constant . Furthermore,exponents of all the remaining monomials have to be larger since by assumption theyrepresent terms of lower asymptotic order.

Examination of all the possible combinations of exponents leads one to discover thatthe only possible combination arises from the cancellationof the first two terms ofQ,namely� 14Y 2 + 4Z, which corresponds to the set of constraints2� = 1; �14 2 + 4 = 0;with the supplementary conditions�+1 > 1 and2�+1 > 1 being satisfied by this choice� = 12 . We have thus discovered thatQ(Z; Y ) = 0 is consistent asymptotically withY � 4Z1=2; Y � �4Z1=2:

The process can be iterated upon subtracting dominant terms. It invariably gives riseto complete formal asymptotic expansions that satisfyQ(Z; Y ) = 0 (in the Catalan ex-ample, these are series in�Z1=2). Furthermore, elementary majorizations establish thatsuch formal asymptotic solutions represent indeed convergent series. Thus, local expan-sions of branches have indeed been determined. This is Newton’s algorithm for expandingalgebraic functions near a branch point.

An algorithmic refinement (also due to Newton) can be superimposed on the previousdiscussion and is known as the method ofNewton polygons. Consider a general polynomialQ(Z; Y ) =Xj2J ZajY bj ;and associate to it the finite set of points(aj ; bj) in N � N, which is called the Newtondiagram. It is easily verified that the only asymptotic solutions of the formY / Ztcorrespond to values oft that are inverse slopes (i.e.,�x=�y) of lines connecting two ormore points of the Newton diagram (this expresses the cancellation condition between twomonomials ofQ) and such that all other points of the diagram are on this line or totheright of it. In other words:

Newton’s polygon method.The possible exponentst such thatY � Zt is asolution to a polynomial equation correspond to the inverseslopes of the leftmostconvex envelope of the Newton diagram. For each viablet, a polynomial equa-tion constrains the possible values of the corresponding coefficient . Completeexpansions are obtained by repeating the process, which means deflatingY fromits main term by way of the substitutionY 7! Y � Zt.


–0.4

–0.2

0

0.2

0.4

y

–0.4 –0.2 0.2 0.4x

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

5

��

��

��

��

��

��

��

��

��

0 1 2 3 4 5 6

1

2

3

4

FIGURE 13. The real curve defined by the equationP = (y�x2)(y2�x)(y2 � x3) � x3y3 near(0; 0) (left) and the corresponding Newtondiagram (right).

Figure 13 illustrates what goes on in the case of the curvep = 0 whereP (z; y) = (y � z2)(y2 � z)(y2 � z3)� z3y3= y5 � y3z � y4z2 + y2z3 � 2 z3y3 + z4y + z5y2 � z6;considered near the origin. As the partly factored form suggests, we expect the curve toresemble the union of two orthogonal parabolas and of a curvey = �z3=2 having a cusp,i.e., the union of y = z2; y = �pz; y = �z3=2;respectively. It is visible on the Newton diagram of the expanded form that the possibleexponentsy / zt at the origin are the inverse slopes of the segments composing theenvelope, that is, t = 2; 12 ; 32 :

For computational purposes, once determined the branchingtype�, the value ofk0that dictates where the expansion starts, and the first coefficient, the full expansion can berecovered by deflating the function from its first term and repeating the Newton diagramconstruction. In fact, after a few initial stages of iteration, the method of indeterminatecoefficients can always be eventually applied8. Computer algebra systems usually havethis routine included as one of the standard packages; see [82].

EXERCISE 28. Discuss the degree of the algebraic number in the expansionY � Zt in relation to Newton’s diagram.

8Bruno Salvy, private communication, August 2000


Asymptotic form of coefficients. The Newton–Puiseux theorem describes precisely thelocal singular structure of an algebraic function. The expansions are valid around a singu-larity except for one direction. In particular they hold in indented disks of the type requiredin order to apply the formal translation mechanisms of singularity analysis (Chapter 5).

THEOREM 8.12 (Algebraic asymptotics).Let f(z) = Pn fnzn be an algebraic se-ries. Assume that the branch defined by the series at the origin has a unique dominantsingularity atz = �1 on its circle of convergence. Then, the coefficientfn satisfies theasymptotic expansion, fn � ��n1 0�Xk�k0 dkn�1�k=�1A ;wherek0 2 Z and� is an integer� 2.

If f(z) has several dominant singularitiesj�1j = j�2j = � � � = j�rj, then there existsan asymptotic decomposition (where� is some small fixed number,� > 0)fn = rXj=1 �(j)(n) +O((j�1j+ �))�n;where �(j)(n) � ��nj 0B� Xk�k(j)0 d(j)k n�1�k=�j1CA ;eachk(j)0 is inZ, and each�j is an integer� 2.

Proof. The directional expansions granted by Theorem 8.11 are of the exact type requiredby singularity analysis; see Chapter 5. Composite contoursshould be used in the case ofmultiple singularities, where each�(j)(n) is the contribution obtained by translation of thelocal singular element. �

In the case of multiple singularities, arithmetic cancellations may occur: consider forinstance the case of1q1� 65z + z2 = 1 + 0:60z + 0:04z2 � 0:36z3 � 0:408z4 � � � � ;and refer to the corresponding discussion of rational coefficients, page 11. Fortunately,such delicate situations tend not to arise in combinatorialsituations.

EXAMPLE 17. Unary-binary trees. the generating function of unary binary trees isdefined byP (z; f) = 0 whereP (z; y) = y � z � zy � zy2;so that f(z) = 1� z �p1� 2z � 3z22z = 1� z �p(1 + z)(1� 3z)2z :There exist only two branches:f and its conjugate�f that form a cycle of size 2 at13 .The singularities of all branches are at0;�1; 13 as is apparent from the explicit form off or from the defining equation. The branch representingf(z) at the origin is analyticthere (by a general argument or by the combinatorial origin of the problem). Thus, thedominant singularity off(z) is at 13 and it is unique in its modulus class. The “easy” case ofTheorem 8.13 then applies oncef(z) has been expanded ear13 . As a rule, the organization


0.5

0.6

0.7

0.8

0.9

1

1.1

1.2

y

–0.4 –0.2 0 0.2z

FIGURE 14. Non-crossing graphs:(a) a random connected graph ofsize 50; (b) the real algebraic curve corresponding to non-crossingforests.

of computations is simpler if one makes use of the local uniformizing parameter with achoice of sign in accordance to the direction along which thesingularity is approached. Inthis case, we setz = 13 � Æ2 and findf(z) = 1� 3 Æ + 92Æ2 � 638 Æ3 + 272 Æ4 � 2997128 Æ5 + � � � ; Æ = (13 � z)1=2:This translates immediately intofn � [zn℄f(z) � 3n+1=22p�n3 �1� 1516n + 505512n2 � 80858192n3 + � � �� :The approximation provided by the first three terms is quite good: forn = 10 already, itestimatesf10 = 835. with an error less than1. �

EXERCISE29. Estimate the growth of the coefficients in the asymptoticexpan-sion of the number of unary-binary trees.

EXAMPLE 18. Non-crossing forests.Consider the regularn-gon whose vertices are num-bered1; : : : ; n. A non-crossing graph (connected graph, tree, forest, etc)is defined asa graph having the property that no two edges edges cross. LetFn be the number ofnon-crossing graphs of sizen that are forests, i.e., acyclic graphs. It is shown below (Sec-tion 6.1; see also [40]) that the OGFF (z) satisfies the equationP (z; F ) = 0, whereP (z; y) = y3 + (z2 � z � 3)y2 + (z + 3)y � 1;and that the combinatorial GF starts asF (z) = 1 + 2z + 7z2 + 33z3 + 181z4 + 1083z5 + � � � :(This is sequence A054727 ofEIS.) The exceptional set is mechanically computed as rootsof the discriminant R = �z3(5z3 � 8z2 � 32z + 4):Newton’s algorithm shows that two of them, sayy0 andy2, form a cycle of length 2 withy0 = 1�pz+O(z), y2 = 1+pz+O(z)while it is the “middle branch”y1 = 1+z+O(z2)that corresponds to the combinatorial GFF (z).


The other exceptional points are the roots of the cubic factor of R, namely = f�1:93028; 0:12158; 3:40869g:Let � := 0:1258 be the root in(0; 1). By Pringsheim’s theorem and the fact that the OGF ofan infinite combinatorial class must have a positive dominant singularity in[0; 1℄, the onlypossibility for the dominant singularity ofy1(z) is �. (For a more general argument, seebelow.)

For z near�, the three branches of the cubic give rise to one branch that is ana-lytic with value approximately0:67816 and a cycle of two conjugate branches with valuenear1:21429 at z = �. The expansion of the two conjugate branches is of the singulartype, �� p1� z=�;where� = 4337 + 1837� � 3574�2 := 1:21429; � = 137p228� 981� � 5290�2 := 0:14931:The determination with a minus sign must be adopted for representing the combinatorialGF whenz ! �� since otherwise one would get negative asymptotic estimates for thenonnegative coefficients. Alternatively, one may examine the way the three real branchesalong(0; �) match with one another at 0 and at��, then conclude accordingly.

Collecting partial results, we finally get by singularity analysis the estimateFn = �2p�n3!n�1 +O( 1n )� ; ! = 1� := 8:22469where the cubic algebraic number� and the sextic� are as above. �

The example above illustrates several important points in the analysis of coefficientsof algebraic functions when there are no simple explicit radical forms. First of all a givencombinatorial problem determines a unique branch of an algebraic curve at the origin.Next, the dominant singularity has to be identified by “connecting” the combinatorialbranch with the branches at every possible singularity of the curve. Finally, computationstend to take place over algebraic numbers and not simply rational numbers.

So far, examples have illustrated the common situation where the exponent at thedominant singularity is12 , which is reflected by a factor ofn�3=2 in the asymptotic form ofcoefficients. Our last example shows a case where the exponent assumes a different value,namely14 .

EXAMPLE 19. “Supertrees”.Consider the equationP (z; S) = 0 whereP (z; y) = zy4 � y3 + (2 z + 1) y2 � y + z:The general aspect of the curve is given in Figure 15 and the asymptotic expansion of

the branch with positive coefficients,y(z) = z + z2 + 3z3 + 8z4 + � � � , is sought.The discriminant is found to beR = z(4z + 3)(4z � 1)3;

so that the dominant singularity of the branch of combinatorial interest isz = 14 where itsvalue equals 1. The translationz = 12 � Z; y = 1 + Y transformsP into~P (Z; Y ) = (14 � Z)Y 4 � 4ZY 3 � 8ZY 2 � 8ZY � 4Z:


–3

–2

–1

0

1

2

3

y

–0.6 –0.4 –0.2 0.2z

FIGURE 15. Supertrees (left) and their quartic generating function (right).

The main cancellation arises from14Y 4�4Z = 0: this corresponds to a segment of inverseslope1=4 in the Newton diagram and accordingly to a cycle formed with 4conjugatebranches. Thus, one has, asz ! ( 14 )�,y(z) = 1� 2 4r14 � z + � � � so that [zn℄y(z) � 1p8�( 34 )4n n�5=4;asn ! 1. This exhibits a nonstandard occurrence of�( 34 ), of the singular exponent14and of the coefficient exponent� 54 .

Here is the combinatorial origin of this GF. Consider the GF of complete binary trees,b(z) = 1�p1� 4z22z :(The function is singular at12 with valueb(1=2) = 1.) The compositionb(zb(z)) repre-sents the GF of “supertrees” (or “trees or trees”) obtained by grafting planted binary trees(zb(z)) at each node of a complete tree(b(z)). The original functionS(z) satisfies infact S(z2) = b(zb(z)). (Naturally, the construction was purposely designed to create aninteresting confluence of singularities.) �Computable coefficient asymptotics.The previous discussion contains the germ of acomplete algorithm for deriving an asymptotic expansion ofcoefficients of any algebraicfunction. We sketch here the main principles leaving some ofthe details to the reader.Observe that the problem is aconnection problems: the “shapes” of the various sheetsaround each point (including the exceptional points) are known, but it remains to connectthem together and see which ones are encountered first when starting from a given branchat the origin.

Algorithm ACA: Algebraic Coefficient Asymptotics.Input: A polynomialP (z; y) with d = degyP (z; y); a seriesY (z) such thatP (z; Y ) =0 and assumed to be specified by sufficiently many initial termsso as to be distinguishedfrom all other branches.Output: The asymptotic expansion of[zn℄Y (z) whose existence is granted by Theo-rem 8.12.


The algorithm consists of three main steps:Preparation, Dominant singularities, andTransla-tion.

I. Preparation:Define the discriminantR(z) = R(P; P 0y; y).(P1) Compute the exceptional set� = fz �� R(z) = 0g and the points of infinity�0 =fz �� p0(z) = 0g, wherep0(z) is the leading coefficient ofP (z; y) considered as afunction ofy.(P2) Determine the Puiseux expansions of all thed branches at each of the points of� [ f0g(by Newton diagrams and/or indeterminate coefficients). This includes the expansion ofanalytic branches as well. Letfy�;j(z)gdj=1 be the collection of all such expansions atsome� 2 � [ f0g.(P3) Identify the branch at 0 that corresponds toY (z).

II. Dominant singularities(Controlled approximate matching of branches). Let�1;�2; : : : be apartition of the elements of� [ f0g sorted according to the increasing values of their modulus:it isassumed that the numbering is such that if� 2 �i and� 2 �j , thenj�j < j�j is equivalent toi < j.Geometrically, the elements of� have been grouped in concentric circles. First, a preparation step isneeded.(D1) Determine a nonzero lower boundÆ on the radius of convergence of any local Puiseux

expansion of any branch at any point of�. Such a bound can be constructed from theminimal distance between elements of� and from the degreed of the equation.

The sets�j are to be examined in sequence until it is detected that one ofthem contains a singu-larity. At stepj, let �1; �2; : : : ; �s be an arbitrary listing of the elements of�j . The problem is todetermine whether any�k is a singularity and, in that event, to find the right branch towhich it isassociated. This part of the algorithm proceeds by controlled numerical approximations of branchesand constructive bounds on the minimum separation distancebetween distinct branches.(D2) For each candidate singularity�k, with k � 2, set�k = �k(1 � Æ=2). By assumption,

each�k is in the domain of convergence ofY (z) and of anyy�k;j .(D3) Compute a nonzero lower bound�k on the minimum distance between two roots ofP (�k; y) = 0. This separation bound can be obtained from resultant computations.(D4) EstimateY (�k) and eachy�k;j(�k) to an accuracy better than�k=4. If two elements,Y (z) andy�k;j(z) are (numerically) found to be at a distance less than�k for z = �k, thenthey are matched:�k is a singularity and the correspondingy�k;j is the correspondingsingular element. Otherwise,�k is declared to be a regular point forY (z) and discardedas candidate singularity.

The main loop onj is repeated until a singularity has been detected., whenj = j0, say. The radius ofconvergence� is then equal to the common modulus of elements of�j0 ; the corresponding singularelements are retained.

III. Coefficient expansion. Collect the singular elements at all the points� determined to be adominant singularity at Phase III. Translate termwise using the singularity analysis rule,(� � z)p=� 7! �p=��n �(�p=�+ n)�(�p=�)�(n+ 1) ;and reorganize into descending powers ofn, if needed.

This algorithm vindicates the following assertion:

PROPOSITION8.5 (Decidability of algebraic connections.).The dominant singular-ities of a branch of an algebraic function can be determined by the algorithmACA in afinite number of operations in the algebraic closure of the base field,C or Q .


5.2. Positive functions and positive systems.The discussion of algebraic singulari-ties specializes nicely to the case of positive functions. We first indicate a procedure thatdetermines the radius of convergence of any algebraic series withpositivecoefficients. Theprocedure takes advantage of Pringsheim’s theorem that allows us to restrict attention tocandidate singularities on the positive half-line. It represents a shortcut that is often suit-able for human calculation and, in fact, the it systematizessome of the techniques alreadyused implicitly in earlier examples.

Algorithm ROCPAF: Radius of Convergence of Positive Algebraic Functions.Input: A polynomialP (z; y) with d = degyP (z; y); a seriesY (z) such thatP (z; Y ) =0 that is known to have only nonnegative coefficients ([zn℄Y (z) � 0) and is assumed tobe specified by sufficiently many initial terms.Output: The radius of convergence� of Y (z).

Plane-sweep.Let�+ be the subset of those elements of the exceptional set� which are positivereal.(R1) Sort the subset of those branchesfy0;jg at 0+ that have totally real coefficients. This is

essentially a lexicographic sort that only needs the initial parts of each expansion. Setinitially �0 = 0 andU(z) = Y (z).(R2) Sweep over all� 2 �+ in increasing order. To detect whether a candidate� is the domi-nant positive singularity, proceed as follows:

— Sort the branchesfy�;jg at�� that have totally real coefficients.— using the orders at�+0 and��, match the branchU(z)with its corresponding branch

at ��, sayV (z); this makes use of the total ordering between real branches at �+0and��. If the branchV (z) is singular, then return� = � as the radius of conver-gence ofY (z) and useV (z) as the singular element ofY (z) at z = �; otherwisecontinue with the next value of� 2 �+ while replacingU(z) by V (z) and�0 by �.

This algorithm is a plane-sweep that takes advantage of the fact that the real branchesnear a point can be totally ordered; finding the ordering onlyrequires inspection of a finitenumber of coefficients. The plane-sweep algorithm enables us to trace at each stage theoriginal branch and keep a record of its order amongst all branches. The method workssince no two real branches can cross at a point other than a multiple point, such a pointbeing covered as an element of�+.

We now turn to positive systems. Most of the combinatorial classes known to admitalgebraic generating functions involve singular exponents that are multiples of12 . Thisempirical observation is supported by the fact, to be provedbelow, that a wide class ofpositive systems have solutions with a square-root singularity. Interestingly enough, thecorresponding theorem is due to independent research by several authors: Drmota [31]developed a version of the theorem in the course of studies relative to limit laws in variousfamilies of trees defined by context-free grammars; Woods [100], motivated by questionsof Boolean complexity and finite model theory, gave a form expressed in terms of colouringrules for trees; finally, Lalley [63] came across a similarly general result when quantifyingreturn probabilities for random walks on groups. The statement that follows is a fundamen-tal result in the analysis of algebraic systems arising fromcombinatorics and is (rightly)called the “Drmota-Lalley-Woods” theorem. Notice that theauthors of [31, 63, 100] provemore: Drmota and Lalley show how to pull out limit Gaussian laws for simple parameters(e.g., as in [31] by a perturbative analysis; see Chapter 9); Woods shows howto deduceestimates of coefficients even in some periodic or non-irreducible cases (see definitionsbelow).


In the treatment that follows we start from a polynomial system of equations,fyj = �j(z; y1; : : : ; ym)g ; j = 1; : : : ;m:We shall discuss in the next section a class of combinatorialspecifications, the “context-free” specifications, that leads systematically to such fixed-point systems. The case oflinear systems has been already dealt with, so that we limit ourselves here tononlinearsystemsdefined by the fact that at least one polynomial�j is nonlinear in some of theindeterminatesy1; : : : ; ym.

First, for combinatorial reasons, we define several possible attributes of a polynomialsystem.

— Algebraic positivity(or a-positivity). A polynomial system is said to bea-positiveif all the component polynomials�j have nonnegative coefficients.

Next, we want to restrict consideration to systems that determine a unique solution vector(y1; : : : ; ym) 2 (C [[z℄℄)m. (This discussion is related to 0-dimensionality in the sensealluded to earlier.) Define thez-valuationval(~y) of a vector~y 2 C [[z℄℄m as the minimumover all j’s of the individual valuations9 val(yj). The distance between two vectors isdefined as usual byd(~y; ~y0) = 2�val(~y�~y0). Then, one has:

— Algebraic properness(or a-properness). A polynomial system is said to bea-proper if it satisfies a Lipschitz conditiond(�(~y);�(~y 0)) < Kd(~y; ~y 0) for someK < 1.

In that case, the transformation� is a contraction on the complete metric space of formalpower series and, by the general fixed point theorem, the equation y = �(y) admits aunique solution. In passing, this solution may be obtained by the iterative scheme,~y(0) = (0; : : : ; 0)t; ~y(h+1) = �(y(h)); y = limh!1 y(h):

The key notion is irreducibility. To a polynomial system,~y = �(~y), associate itsdependency graphdefined as a graph whose vertices are the numbers1; : : : ;m and theedges ending at a vertexj arek ! j, if yj figures in a monomial of�k(j). (This notion isreminiscent of the one already introduced for linear systemon page 8.5.)

— Algebraic irreducibility(or a-irreducibility). A polynomial system is said to bea-irreducibleif its dependency graph is strongly connected.

Finally, one needs a technical notion of periodicity to dispose of cases likey(z) = 12z �1�p1� 4z� = z + z3 + 2z5 + � � � ;(the OGF of complete binary trees) where coefficients are only nonzero for certain residueclasses of their index.

— Algebraic aperiodicity(or a-aperiodicity). A power series is said to beaperiodicif it contains three monomials (with nonzero coefficients),ze1 ; ze2 ; ze3 , such thate2 � e1 ande3 � e1 are relatively prime. A proper polynomial system is said tobe aperiodic if each of its component solutionsyj is aperiodic.

THEOREM8.13 (Positive polynomial systems).Consider a nonlinear polynomial sys-tem~y = �(~y) that is a-proper, a-positive, and a-irreducible. In that case, all component

9Let f =P1n=� fnzn with f� 6= 0; the valuation off is by definition val(f) = �.


solutionsyj have the same radius of convergence� < 1. Then, there exist functionshjanalytic at the origin such that

(60) yj = hj �p1� z=�� (z ! ��):In addition, all other dominant singularities are of the form�! with ! a root of unity.

If furthermore the system is a-aperiodic, allyj have� as unique dominant singularity. Inthat case, the coefficients admit a complete asymptotic expansion of the form

(61) [zn℄yj(z) � ��n0�Xk�1 dkn�1�k=21A :Proof. The proof consists in gathering by stages consequences of the assumptions. It isessentially based on close examination of “failures” of theimplicit function theorem andthe way these lead to singularities.(a) As a preliminary observation, we note that each component solution yj is analgebraic function that has a nonzero radius of convergence. In particular, singularitiesare constrained to be of the algebraic type with local expansions in accordance with theNewton-Puiseux theorem (Theorem 8.11).(b) Properness together with the positivity of the system implies that eachyj(z) hasnonnegative coefficients in its expansion at 0, since it is a formal limit of approximants thathave nonnegative coefficients. In particular, each power series yj has a certain nonzeroradius of convergence�j . Also, by positivity,�j is a singularity ofyj (by virtue of Pring-sheim’s theorem). From the nature of singularities of algebraic functions, there exists someorderR � 0 such that eachRth derivative�Rz yj(z) becomes infinite asz ! ��j .

We establish now that�1 = � � � = �m. In effect, differentiation of the equations com-posing the system implies that a derivative of arbitrary orderr, �rzyj(z), is a linear form inother derivatives�rzyj(z) of the same order (and a polynomial form in lower order deriva-tives); also the linear combination and the polynomial formhave nonnegative coefficients.Assume a contrario that the radii were not all equal, say�1 = � � � = �s, with the other radii�s+1; : : : being strictly greater. Consider the system differentiated a sufficiently large num-ber of times,R. Then, asz ! �1, we must have�Rz yj tending to infinity forj � s. On theother hand, the quantitiesys+1, etc., being analytic, theirRth derivatives that are analyticas well must tend to finite limits. In other words, because of the irreducibility assumption(and again positivity), infinityhas topropagate and we have reached a contradiction. Thus,all theyj have the same radius of convergence and we let� denote this common value.( 1) The key step consists in establishing the existence of a square-root singularity atthe common singularity�. Consider first the scalar case, that is

(62) y � �(z; y) = 0;where� is assumed to depend nonlinearly ony and have nonnegative coefficients. Therequirement of properness means thatz is a factor of all monomials, except the constantterm�(0; 0).

Let y(z) be the unique branch of the algebraic function that is analytic at 0. Compar-ison of the asymptotic orders iny inside the equalityy = �(z; y) shows that (by nonlin-earity) we cannot havey ! 1 whenz tends to a finite limit. Let now� be the radius ofconvergence ofy(z). This argument shows thaty(z) is necessarily finite at its singularity�.We set� = y(�) and note that, by continuity� � �(�; �) = 0.


By the implicit function theorem, a solution(z0; y0) of (62) can be continued analyti-cally as(z; y0(z)) in the vicinity ofz0 as long as the derivative with respect toy,J(z0; y0) := 1� �0y(z0; y0)remains nonzero. The quantity� being a singularity, we must haveJ(�; �) = 0. (Inpassing, the system � � �(�; �) = 0; J(�; �) = 0;determines only finitely many candidates for�.) On the other hand, the second derivative��00yy is nonzero at(�; �) (by positivity, since no cancellation can occur); there results bythe classical argument on local failures of the implicit function theorem thaty(z) has asingularity of the square-root type (see also Chapters 4 and5). More precisely, the localexpansion of the defining equation (62) at(�; �) binds(z; y) locally by�(z � �)�0z(�; �)� 12(y � �)2�00yy(�; �) + � � � = 0;where the subsequent terms are negligible by Newton’s polygon method. Thus, we havey � � = �s �z(�; �)�00yy(�; �) (�� z)1=2 + � � � ;the negative determination of the square-root being chosento comply with the fact thaty(z) increases asz ! ��. This proves the first part of the assertion in the scalar case.( 2) In the multivariate case, we graft an ingenious argument [63] that is based ona linearized version of the system to which Perron-Frobenius theory is applicable. First,irreducibility implies that any component solutionyj depends nonlinearly on itself (bypossibly iterating�), so that a discrepancy in asymptotic behaviours would result for theimplicitly definedyj in the event that someyj tends to infinity.

Now, the multivariate version of the implicit function theorem grants locally the ana-lytic continuation of any solutiony1; y2; : : : ; ym atz0 provided there is no vanishing of theJacobian determinantJ(z0; y1; : : : ; ym) := det�Æi;j � ��yj�i(z0; y1; : : : ; ym)� ;whereÆi;j is Kronecker’s symbol. Thus, we must haveJ(�; �1; : : : ; �m) = 0 where �j := yj(�):

The next argument (we follow Lalley [63]) uses Perron-Frobenius theory and linearalgebra. Consider the modified Jacobian matrixK(z0; y1; : : : ; ym) := � ��yj�i(z0; y1; : : : ; ym)� ;which represents the “linear part” of�. For z; y1; : : : ; ym all nonnegative, the matrixKhas positive entries (by positivity of�) so that it is amenable to Perron-Frobenius theory.In particular it has a positive eigenvalue�(z; y1; : : : ; ym) that dominates all the other inmodulus. The quantity b�(z) = �(y1(z); : : : ; ym(z))is increasing as it is an increasing function of the matrix entries that themselves increasewith z for z � 0.

We propose to prove thatb�(�) = 1, In effect,b�(�) < 1 is excluded since otherwise(I �K) would be invertible atz = � and this would implyJ 6= 0, thereby contradictingthe singular character of theyj(z) at�. Assumea contrariob�(�) > 1 in order to exclude


the other case. Then, by the increasing property, there would exists�1 < � such thatb�(�1) = 1. Let v1 be a left eigenvector ofK(�1; y1(�1); : : : ; ym(�1)) corresponding tothe eigenvalueb�(�1). Perron-Frobenius theory grants that such a vectorv1 has all itscoefficients that are positive. Then, upon multiplying on the left byv1 the column vectorscorresponding toy and�(y) (which are equal), one gets an identity; this derived identityupon expanding near�1 gives

(63) A(z � �1) = �Xi;j Bi;j(yi(z)� yi(�1))(yj(z)� yj(�1)) + � � � ;where� � � hides lower order terms and the coefficientsA;Bi;j are nonnegative withA > 0.There is a contradiction in the orders of growth if eachyi is assumed to be analytic at�1since the left side of (63) is of exact order(z � �1) while the right side is at least as smallas to(z � �1)2. Thus, we must haveb�(�) = 1 andb�(x) < 1 for x 2 (0; �).

A calculation similar to (63) but with�1 replaced by� shows finally that, ifyi(z)� yi(�) � i(�� z)�;then consistency of asymptotic expansions implies2� = 1, that is� = 12 . (The argu-ment here is similar to the first stage of a Newton polygon construction.) We have thusproved that the component solutionsyj(z) have a square-root singularity. (The existenceof a complete expansion in powers of(� � z)1=2 results from examination of the Newtondiagram.) The proof of the general case (60) is at last completed.(d) In the aperiodic case, we first observe that eachyj(z) cannot assume an infinitevalue on its circle of convergencejzj = �, since this would contradict the boundedness ofjyj(z)j in the open diskjzj < � (whereyj(�) serves as an upperbound). Consequently, bysingularity analysis, the Taylor coefficients of anyyj(z) areO(n�1��) for some� > 1and the series representingyj at the origin converges onjzj = �.

For the rest of the argument, we observe that ify = �(z; ~y), theny = �hmi(z; ~y)where the superscript denotes iteration of the transformation � in the variables~y =(y1; : : : ; ym). By irreducibility,�hmi is such thateachof its component polynomials in-volvesall the variables.

Assume that there would exists a singularity�� of someyj(z) on jzj = �. The triangu-lar inequality yieldsjyj(��)j < yj(�) where strictness is related to the general aperiodicityargument encountered at several other places in this book. But then, the modified Jaco-bian matrixKhmi of �hmi taken at theyj(��) has entries dominated strictly by the entriesof Khmi taken at theyj(�). There results (see page 17) that the dominant eigenvalue ofKhmi(z; ~yj(��))must be strictly less than 1. But this would imply thatI�Khmi(z; ~yj(��))is intervertible so that theyj(z) would be analytic at��. A contradiction has been reached:� is the sole dominant singularity of eachyj and this concludes the argument. �

We observe that the dominant singularity is obtained amongst the positive solutions ofthe system ~� = �(�; ~� ); J(�; ~� ) = 0:For the Catalan GF, this yields� � 1� ��2 = 0; 1� 2�� = 0;giving back (as expected)� = 14 , � = 12 . For three coloured trees (withy1; y2; y3 repre-sentingA;B;C), the system is formed of the specialization of the defining equations (40),namely,�1 � �� (�2 + �3)2 = 0; �2 � (�3 + �1)2 = 0; �3 � (�1 + �2)2 = 0:

6. APPLICATIONS OF ALGEBRAIC FUNCTIONS 75

together with the Jacobian conditiondet0BBB� 1 �2�2 � 2�3 �2�3 � 2�2�2�1 � 2�3 1 �2�3 � 2�1�2�1 � 2�2 �2�2 � 2�1 1 1CCCA = 0:It is found (by elimination) that� = 14�(3� + 1); where 4�3 + 4�2 + �� 1 = 0;with � := 0:34681 and� := 0:177681 being the only feasible solution to the constraints.Thus, the number of 3-coloured trees grows roughly like5:62nn�3=2.

EXERCISE30. Match the computation of the dominant singularity of 3-colouredtrees against the determination of the minimal polynomialR(z; A) of the previ-ous section.

6. Combinatorial applications of algebraic functions

In this section, we first present context-free specifications that admit a direct transla-tion into polynomial systems (Section 6.1). When particularized to formal languages, thisgives rise to context-free languages (Section 6.2) that, provided an unambiguity conditionis met, lead to algebraic generating functions. An important subclass, especially as regardscomputer science applications, is that of simple families of trees succinctly presented inSection 6.3.

The next two subsections introduce objects whose constructions still lead to algebraicfunctions, but in a non-obvious way. This includes: walks with a finite number of allowedbasic jumps (Section 6.4) and planar maps (Section 6.5). In that case, bivariate functionalequations are induced by the combinatorial decompositions. The common form is

(64) �(z; u; F (z; u); h1(z); : : : ; hr(z)) = 0;where� is a known polynomial and the unknowns areF andh1; : : : ; hr. Specific methodsare to be appealed to in order to attain solutions to such functional equations that wouldseem at first glance to be grossly underdetermined. Random walks lead to a linear versionof (64) that is treated by the so-called “kernel method”. Maps lead to nonlinear versionsthat are solved by means of Tutte’s “quadratic method”. In both cases, the strategy consistsin bindingz andu by forcing them to lie on an algebraic curve (suitably chosenin order toeliminate the dependency onF (z; u)), and then pulling out the algebraic consequences ofsuch a specialization.

6.1. Context-free specifications.A context-free systemis a collection of combinato-rial equations,

(65)

8>>><>>>: C1 = �1(~a; C1; : : : ; Cm)...

...Cm = �m(~a; C1; : : : ; Cm);where~a = (a1; : : :) is a vector of atoms and each of the�j only involves the combinatorialconstructions of disjoint union and cartesian product. A combinatorial classC is said to


be context-free if it is definable as the first component (C = C1) of a well-founded context-free system. The terminology comes from linguistics and it stresses the fact that objectscan be “freely” generated by the rules in (65), this without any constraints imposed by anoutside context10.

For instance the class of plane binary trees defined byB = e+ (i�B � B) (e; i atoms)is a context-free class. The class of general plane trees defined byG = o� sequen e(G) (o an atom)

is definable by the systemG = o�F ; F = 1+ (F � G);with F defining forests, and so it is also context-free. (This example shows more generallythat sequences can always be reduced to polynomial form.)

Context-free specifications may be used to describe all sorts of combinatorial objects.For instance, the classT of triangulations of convex polygons is specified symbolically by

(66) T = r+ (r� T ) + (T �r) + (T �r� T );wherer represents a generic triangle.

The general symbolic rules given in Chapter 1 apply in all such cases. Thereforethe Drmota-Lalley-Woods theorem (Theorem 8.13) provides the asymptotic solution to animportant category of problems.

PROPOSITION 8.6 (Context-free specifications).A context-free classC admits anOGF that satisfies a polynomial system obtained from the specification by the translationrules: A+ B 7! A+B; A�B 7! A � B:The OGFC(z) is an algebraic function to which algebraic asymptotics applies. In partic-ular, a context-free classC that gives rise to an algebraically aperiodic irreducible systemhas an enumeration sequence satisfyingCn � p�n3!n;where ; ! are computable algebraic numbers.

This last result explains the frequently encountered estimates involving a factor ofn�3=2 (corresponding to a square-root singularity of the OGF) that can be found through-out analytic combinatorics.

EXERCISE31. IfA;B are context-free specifications then:(i) the sequence classC = sequen e(A) is context-free;(ii) the substitution classD = A[b 7! B℄ iscontext-free.

We detail below an example from combinatorial geometry.

10Formal language theory also defines context-sensitive grammars where each rule (called a production)is applied only if it is enabled by some external context. Context-sensitive grammars have greater expressivepower than context-free ones, but they depart significantlyfrom decomposability since they are surrounded bystrong undecidability properties; accordingly context-sensitive grammars cannot be associated with any globalgenerating function formalism.


EXAMPLE 20. Planar non-crossing configurations.The enumeration of non-crossingplanar configurations is discussed here at some level of generality. (An analytic problemin this orbit has been already treated in Example 18.) The purpose is to illustrate the factthat context-free descriptions can model naturally very diverse sorts of objects includingparticular topological-geometric configurations. The problems considered have their originin combinatorial musings of the Rev. T.P. Kirkman in 1857 andwere revisited in 1974 byDomb and Barett [30] for the purpose of investigating certain perturbative expansions ofstatistical physics. Our presentation follows closely thesynthesis offered in [40].

Consider for each value ofn the regularn-gon built from vertices taken for conve-nience to be then complex roots of unity and numbered0; : : : ; n � 1. A non-crossinggraph is a graph on this set of vertices such that no two of its edges cross. From there,one defines non-crossing connected graphs, non-crossing forests (that are acyclic), andnon-crossing trees (that are acyclic and connected); see Figure 16. Note that there is awell-defined orientation of the complex plane and also that the various graphs consideredcan always be rooted in some canonical way (e.g., on the vertex of smallest index) sincethe placement of vertices is rigidly fixed.

Trees. A non-crossing tree is rooted at 0. To the root vertex, is attached an orderedcollection of vertices, each of which has an end-node� that is the common root of twonon-crossing trees, one on the left of the edge(0; �) the other on the right of(0; �). LetTdenote the class of trees andU denote the class of trees whose root has been severed. Witho denoting a generic node, we then haveT = o� U ; U = sequen e(U � o� U);which corresponds graphically to the “butterfly decomposition”:

UUU

U U

U = T =

In terms of OGF, this gives the system

(67) fT = zU; U = (1� zU2)�1g () fT = zU; U = 1 + UV; V = zU2g;where the latter form corresponds to the expansion of the sequence operator. Consequently,T satisfiesT = T 3 � zT + z2, which by Lagrange inversion givesTn = 12n�1�3n�3n�1 �.

Forests.A (non-crossing) forest is a non-crossing graph that is acyclic. In the presentcontext, it is not possible to express forests simply as sequences as trees, because of thegeometry of the problem.

Starting conventionally from the root vertex 0 and following all connected edges de-fines a “backbone” tree. To the left of every vertex of the tree, a forest may be placed.There results the decomposition (expressed directly in terms of OGF’s),

(68) F = 1 + T [z 7! zF ℄;whereT is the OGF of trees andF is the OGF of forests. In (68), the termT [z 7! zF ℄denotes a functional composition. A context-free specification in standard form resultsmechanically from (67) upon replacingz by zF , namely

(69) F = 1 + T; T = zFU; U = 1 + UV; V = zFU2:


(connected graph)

(tree) (forest)

(graph)

Configuration / OGF Coefficients (exact / asymptotic)

Trees (EIS: A001764) z + z2 + 3z3 + 12z4 + 55z5 + � � �T 3 � zT + z2 = 0 12n� 1 3n� 3n� 1 !� p327p�n3 (274 )nForests (EIS: A054727) 1 + z + 2z2 + 7z3 + 33z4 + 181z5 � � �F 3 + (z2 � z � 3)F 2 + (z + 3)F � 1 = 0 nXj=1 12n� j nj � 1! 3n� 2j � 1n� j !� 0:07465p�n3 (8:22469)nConnected graphs (EIS: A007297) z + z2 + 4z3 + 23z4 + 156z5 + � � �C3 + C2 � 3zC + 2z2 = 0 1n� 1 2n�3Xj=n�1 3n� 3n+ j ! j � 1j � n+ 1!� 2p6� 3p218p�n3 �6p3�nGraphs (EIS: A054726) 1 + z + 2z2 + 8z3 + 48z4 + 352z5 + � � �G2 + (2z2 � 3z � 2)G+ 3z + 1 = 0 1n n�1Xj=0(�1)j nj! 2n� 2� jn� 1� j !2n�1�j� p140� 99p24p�n3 �6 + 4p2�n

FIGURE 16. (Top) Non-crossing graphs: a tree, a forest, a connectedgraph, and a general graph. (Bottom) The enumeration of non-crossingconfigurations by algebraic functions.


This system is irreducible and aperiodic, so that the asymptotic shape ofFn is of theform !nn�3=2, as predicted by Proposition 8.6. This agrees with the precise formuladetermined in Example 18.

Graphs. Similar constructions (see [40]) give the OGF’s of connected graphs andgeneral graphs. The results are summarized in Figure 16. Note the common shape of theasymptotic estimates and also the fact that binomial expressions are always available inaccordance with Theorem 8.10. �

Note on “tree-like” structures.A context-free specification can always be regarded asdefining a class of trees. Indeed, if thejth term in the construction�j is “coloured” withthe pair(i; j), it is seen that a context-free system yields a class of treeswhose nodes aretagged by pairs(i; j) in a way that is consistent with the system’s rules (65). However,despite this correspondence, it is often convenient to preserve the possibility of operatingdirectly with objects11 when the tree aspect is unnatural. By a terminology borrowedfromthe theory of syntax analysis in computer science, such trees are referred to as “parse trees”or “syntax trees”.

EXERCISE32. The parse trees associated mechanically with the specification oftriangulations above are in bijective correspondence withbinary (rooted plane)trees.

6.2. Context-free languages.Let A be a fixed finite alphabet whose elements arecalled letters. AgrammarG is a collection of equations

(70) G : 8>>><>>>: L1 = 1(~a;L1; : : : ;Lm)...

...Lm = m(~a;L1; : : : ;Lm);where eachj involves only the operations of union ([) and catenation product( � ) with~a the vector of letters inA. For instance,1(~a;L1;L2;L3) = a2 � L2 � L3 [ a3 [ L3 � a2 � L1:A solution to (70) is anm-tuple of languages over the alphabetA that satisfies the system.By convention, one declares that the grammarG defines the first component,L1.

To each grammar (70), one can associate a context-free specification (65) by trans-forming unions into disjoint union, ‘[’ 7! ‘+’, and catenation into cartesian products,‘ �’ 7! ‘�’. Let bG be the specification associated in this way to the grammarG. Theobjects described bybG appear in this perspective to be trees (see the discussion above re-garding parse trees). Leth be the transformation from trees ofbG to languages ofG thatlists letters in infix (i.e., left-to-right) order: we call such anh the erasing transforma-tion since it “forgets” all the structural information contained in the parse tree and onlypreserves the succession of letters. Clearly, applicationof h to the combinatorial specifica-tions determined bybG yields languages that obey the grammarG. For a grammarG and awordw 2 A?, the number of parse treest 2 bG such thath(t) = w is called theambiguitycoefficientof w with respect to the grammarG; this quantity is denoted by�G(w).

A grammarG is unambiguous if all the corresponding ambiguity coefficients are ei-ther 0 or 1. This means that there is a bijection between parsetrees of bG and words of

11Some authors have even developed a notion of “object grammars”; see for instance [32] itself inspired bytechniques of polyomino surgery in [29].


the language described byG: each word generated is uniquely “parsable” according to thegrammar. From Proposition 8.6, we have immediately:

PROPOSITION8.7. Given a context-free grammarG, the ordinary generating functionof the languageLG(z), counting wordswith multiplicity, is an algebraic function. Inparticular, a context-free language that admits an unambiguous grammar specificationhas an ordinary generating functionL(z) that is an algebraic function.

This theorem originates from early works of Chomsky and Sch¨utzenberger [22] whichhave exerted a strong influence on the philosophy of the present book.

For example consider the Łukasiewicz languageL = (a � L � L � L) [ b:This can be interpreted as the set of functional terms built from the ternary symbola andthe nullary symbolb:L = fb; abbb; aabbbbb; ababbbb; : : :g' fb; a(b; b; b); a(a(b; b; b); b; b); a(b; a(b; b; b); b); : : :g;where' denotes combinatorial isomorphism. It is easily seen that the terms are in bijectivecorrespondence with their parse trees, themselves isomorphic to ternary trees. Thus thegrammar is unambiguous, so that the OGF equation translatesdirectly from the grammar,

(71) L(z) = zL(z)3 + z:As another example, we revisit Dyck paths that are definable by the grammar,

(72) D = 1 [ (a �D � b �D) ;wherea denotes ascents andb denotes descents. Each word in the language must start witha lettera that has a unique matching letterb and thus it is uniquely parsable according tothe grammar (72). Since the grammar is unambiguous, the OGF reads off:D(z) = z + z2D(z)2:

EXERCISE 33. Investigate the relations between parse trees of Łukasiewiczwords (71) and of non-crossing trees.

EXERCISE34. Extend the discussion to trees where node degrees are constrainedto be multiples of somed � 2. Relate combinatorially this problem tod-arytrees.

6.3. Simple families of trees.Meir and Moon in a classic paper [69] were the first todiscover the possibility of general asymptotic results concerning simple families of trees.Recall that asimple family of treesis defined as the class of all rooted unlabelled planetrees such that the outdegrees of nodes are constrained to belong to a finite set 2 N. Thedegree polynomial is �(y) := X!2 !u!;where ! is a positive “multiplicity coefficient” (in the case of puresets, one has ! = 1;the situation ! 2 N covers trees with ! allowed colours for a node of degree!). Then,the OGF of all trees in the family satisfiesT (z) = z�(T (z)):This is simply a scalar system to which the preceding theory applies.


Assume to be aperiodic, in the sense that the common gcd of the elements of equals 1. Then the pair(�; �) (where� is the radius of convergence ofT (z) and� = T (�))is a solution to � � ��(�) = 0; 1� ��0(�) = 0implying the “characteristic equation”,

(73) �(�) � ��0(�) = 0:It can be seen that there is a unique positive solution to (73), which determines� = ��(�) = 1�0(�) :Then, one finds in agreement with Theorems 8.12 and and 8.13,[zn℄T (z) � ��np�n3 ; = (2�(�)�00(�))�1=2:The result extends to finite weighted multisets of allowablenode degrees (hence to thenonplanar labelled case). Periodicities are also easily reduced to the general case. See thelast chapter of this book for a summary.

EXERCISE35. Examine the enumeration of “semi-simple” families of trees thatare binary trees defined by the fact that edges have size between two bounds < d, binary nodes have size 0, and end-nodes have size between two boundsa < b. Such trees are relevant to the enumeration of secondary structures ofnucleic sequences in biology [89].

EXERCISE 36. Examine the enumeration of plane trees that are colouredin rdifferent ways, where the colours at each node are constrained to satisfy a finiteset of compatibility rules [100].

EXERCISE37. A branching process conditioned by fixing the sizen of its totalprogeny leads to a simple family of trees.

EXERCISE 38. Show that non-plane trees with node degrees restricted tosome finite yield generating functions that are never algebraic but that thetype of the dominant singularity is still a square-root. [See Polya and Otter’sworks [75, 79].]

6.4. Walks and the kernel method.Start with a set that is a finite subset ofZ andis called the set ofjumps. A walk (relative to) is a sequencew = (w0; w1; : : : ; wn)such thatw0 = 0 andwi+1 � wi 2 , for all i, 0 � i < n. A nonnegative walksatisfieswi � 0 and anexcursionis a nonnegative walk such that, additionally,wn = 0. Thequantityn is called the length of the walk or the excursion. For instance, Dyck paths andMotzkin paths analysed in Section 3.5 are excursions that correspond to = f�1;+1gand = f�1; 0;+1g respectively. (Walks and excursions can be viewed as particularcases of paths in a graph in the sense of Section 3.3, with the graph taken to be the infinitesetZ>0 of integers.)

We propose to determinefn, the number of excursions of lengthn and type, via thecorresponding OGF F (z) = 1Xn=0 fnzn:


In fact, we shall determine the more general BGFF (z; u) :=Xn;k fn;kukzn;wherefn;k is the number of walks of lengthn and final altitudek (i.e., the value ofwn inthe definition of a walk is constrained to equalk). In particular, one hasF (z) = F (z; 0).

We let� denote the smallest (negative) value of a jump, andd denote the largest(positive) jump. A fundamental role is played in this discussion by the “characteristicpolynomial” of the walk, S(y) := X!2 y! = dXj=� Sjyjthat is a Laurent polynomial12. Observe that the bivariate generating function of general-ized walks where intermediate values are allowed to be negative, withz marking the lengthandu marking the final altitude, is rational:

(74) G(z; u) = 11� zS(u) :Returning to nonnegative walks, the main result to be provedbelow is the follow-

ing: For each finite set 2 Z, the generating function of excursions is an algebraicfunction that is explicitly computable from. There are many ways to view this result.The problem is usually treated within probability theory bymeans of Wiener-Hopf factor-izations [81]. In contrast, Labelle and Yeh [61] show that an unambiguous context-freespecification can be systematically constructed, a fact that is sufficient to ensure the al-gebraicity of the GFF (z). (Their approach is based implicitly on the construction ofafinite pushdown automaton itself equivalent, by general principles, to a context-free gram-mar.) The Labelle-Yeh construction reduces the problem to alarge, but somewhat “blind”,combinatorial preprocessing, and, for analysts it has the disadvantage of not extracting asimpler (and noncombinatorial) structure inherent in the problem. The method describedbelow is often known as the “kernel” method. It takes its inspiration from exercises in the1968 edition of Knuth’s book [58] (Ex. 2.2.1.4 and 2.2.1.11) where a new approach wasproposed to the enumeration of Catalan and Schroeder objects. The technique has sincebeen extended and systematized by several authors; see for instance [5, 6, 15, 33, 34].

Let fn(u) = [zn℄F (z; u) be the generating function of walks of lengthn with urecording the final altitude. There is a simple recurrence relatingfn+1(u) tofn(u), namely,

(75) fn+1(u) = S(u) � fn(u)� rn(u);where rn(u) is a Laurent polynomial consisting of the sum of all the monomials ofS(u)fn(u) that involve negative powers13 of u:

(76) rn(u) := �1Xj=� uj ([uj ℄S(u)fn(u)) = fu<0gS(u)fn(u):12If is a set, then the coefficients ofS lie in f0; 1g. The treatment above applies in all generality to cases

where the coefficients are arbitrary positive real numbers.This accounts for probabilistic situations as well asmultisets of jump values.

13The convenient notationfu<0g denotes the singular part of a Laurent expansion:fu<0gf(z) :=Pj<0 �[uj ℄f(u)� � uj .


The idea behind the formula is to subtract the effect of thosesteps that would take the walkbelow the horizontal axis. For instance, one hasS(u) = S�1u +O(1) : rn(u) = S�1u fn(0)S(u) = S�2u2 + S�1u +O(1) : rn(u) = �S�2u2 + S�1u � fn(0) + S�2u f 0n(0)and generally:

(77) �j(u) = 1j!fu<0gujS(u):Thus, from (75) and (76) (multiply byzn+1 and sum), the generating functionF (z; u)

satisfies the fundamental functional equation

(78) F (z; u) = 1 + zS(u)F (z; u)� zfu<0g (S(u)F (z; u)) :Explicitly, one has

(79) F (z; u) = 1 + zS(u)F (z; u)� z �1Xj=0 �j(u) � �j�uj F (z; u)�u=0 ;for Laurent polynomials�j(u) that depend onS(u) in an effective way by (77).

The main equations (78) and (79) involve one unknown bivariate GF,F (z; u) and univariate GF’s, the partial derivatives ofF specialized atu = 0. It is true, but not atall obvious, that the single functional equation (79) fullydetermines the + 1 unknowns.The basic technique is known as “cancelling the kernel” and it relies on strong analyticityproperties; see the book by Fayolleet al. [34] for deep ramifications. The form of (79) tobe employed for this purpose starts by grouping on one side the terms involvingF (z; u),(80) F (z; u)(1� zS(u)) = 1� z �1Xj=0 �j(u)Gj(z); Gj(z) := � �j�uj F (z; u)� :If the right side was not present, then the solution would reduce to (74). In the case at hand,from the combinatorial origin of the problem and implied bounds, the quantityF (z; u)is bivariate analytic at(z; u) = (0; 0) (by elementary exponential majorizations on thecoefficients). The main principle of the kernel method consists incouplingthe values ofzandu in such a way that1 � zS(u) = 0, so thatF (z; u) disappears from the picture. Acondition is that bothz andu should remain small (so thatF remains analytic). Relationsbetween the partial derivatives are then obtained from sucha specializations,(z; u) 7!(z; u(z)), which happen to be just in the right quantity.

Consequently, we consider the “kernel equation”,

(81) 1� zS(u) = 0;which is rewritten as u = z � (u S(u)):Under this form, it is clear that the kernel equation (81) defines + d branches of analgebraic function. A local analysis (Newton’s polygon method) shows that, amongst these +d branches, there are branches that tend to 0 asz ! 0while the otherd tend to infinityasz ! 0. Let u0(z); : : : ; u �1(z) be the branches that tend to 0, that we call “small”branches. In addition, we single outu0(z), the “principal” solution, by the reality conditionu0(z) � z1= ; := (S )1= 2 R>0 (z ! 0+):


By local uniformization (57), the conjugate branches are given locally byu`(z) = u0(e2i`�z) (z ! 0+):Couplingz andu by u = u`(z) produces interesting specializations of Equation (80).

In that case,(z; u) is close to(0; 0) whereF is bivariate analytic so that the substitution isadmissible. By substitution, we get

(82) 1� z �1Xj=0 �j(u`(z)) � �j�uj F (z; u)�u=0 ; ` = 0 : : � 1:This is now a linear system of equation in unknowns (the partial derivatives) withalgebraic coefficients that, in principle, determinesF (z; 0).

A convenient approach to the solution of (82) is due to Mireille Bousquet-Melou. Theargument goes as follows. The quantity

(83) M(u) := u � zu �1Xj=0 �j(u) �j�uj F (z; 0)can be regarded as a polynomial inu. It is monic while it vanishes by construction at the small branchesu0; : : : ; u �1. Consequently, one has the factorization,

(84) M(u) = �1Y=0(u� u`(z)):Now, the constant term ofM(u) is otherwise known to equal�zS� F (z; 0), by the def-inition (83) of M(u) and by Equation (77) specialized to�0(u). Thus, the comparisonof constant terms between (83) and (84) provides us with an explicit form of the OGF ofexcursions: F (z; 0) = (�1) �1S� z �1Y=0 u`(z):One can then finally return to the original functional equation and pull the BGFF (z; u).We can thus state:

PROPOSITION 8.8 (Kernel method for walks).Let be a finite step of jumps andlet S(u) be the characteristic polynomial of. In terms of the small branches of the“kernel” equation, 1� zS(u) = 0;denoted byu0(z); : : : ; u �1(z), the generating function of excursions is expressible asF (z) = (�1) �1zS� �1Y=0 u`(z) whereS� = [u� ℄S(u)is the multiplicity (or weight) of the smallest element� 2 .

More generally the bivariate generating function of nonnegative walks is bivariatealgebraic and given byF (z; u) = 1u � z (u S(u)) �1Y=0 (u� u`(z)) :


We give next a few examples illustrating this kernel technique.

Trees and Łukasiewicz codes.A particular class of walks is of special interest; it corre-sponds to cases where = 1, that is, the largest jump in the negative direction has ampli-tude 1. Consequently, + 1 = f0; s1; s2; : : : ; sdg. In that situation, combinatorial theoryteaches us the existence of fundamental isomorphisms between walks defined by stepsand trees whose degrees are constrained to lie in1 + . The correspondence is by way ofŁukasiewicz codes14 (‘also known as ‘Polish” prefix codes, “Polish” prefix notation), andfrom it we expect to find tree GF’s in such cases.

As regards generating functions, there now exists onlyonesmall branch, namely thesolutionu0(z) to u0(z) = z�(u0(z)) (where�(u) = uS(u)) that is analytic at the origin.One then hasF (z) = F (z; 0) = 1zu0(z), so that the walk GF is determined byF (z; 0) = 1zu0(z); u0(z) = z�(u0(z)); �(u) := uS(u):This form is consistent with what is already known regardingthe enumeration of simplefamilies of trees. In addition, one findsF (z; u) = 1� u�1u0(z)1� zS(u) = u� u0(z)u� z�(u) :

Classical specializations are rederived in this way:

— the Catalan walk (Dyck path), defined by = f�1;+1g and�(u) = 1 + u2,has u0(z) = 12z �1�p1� 4z2� ;

— the Motzkin walk, defined by = f�1; 0;+1g and�(u) = 1 + u+ u2 hasu0(z) = 12z �1� z �p1� 2z � 3z2� ;— the modified Catalan walk, defined by = f�1; 0; 0 + 1g (with two steps of

type0) and�(u) = 1 + 2u+ u2, hasu0(z) = 12z �1� 2z �p1� 4z� ;— thed-ary tree walk (the excursions encoded-ary trees) defined by = f�1; d�1g, hasu0(z) that is defined implicitly byu0(z) = z(1 + u0(z)d):

Examples of the general case.Take now = f�2;�1; 1; 2g so thatS(u) = u�2 + u�1 + u+ u2:Then,u0(z); u1(z) are the two branches that vanish asz ! 0 of the curvey2 = z(1 + y + y3 + y4):The linear system that determinesF (z; 0) andF 0(z; 0) is8>><>>: 1�� zu0(z)2 + zu0(z)�F (z; 0)� zu0(z)F 0(z; 0) = 01�� zu1(z)2 + zu1(z)�F (z; 0)� zu1(z)F 0(z; 0) = 0

14Such a code [66] is obtained by a preorder traversal of the tree, recording ajump ofr� 1 when a node ofoutdegreer is encountered. The sequence of jumps gives rise to an excursion followed by an extra�1 jump.


(derivatives are taken with respect to the second argument)and one findsF (z; 0) = �1zu0(z)u1(z); F 0(z; 0) = 1z (u0(z) + u1(z) + u0(z)u1(z)):This gives the number of walks, through a combination of series expansions,F (z) = 1 + 2z2 + 2z3 + 11z4 + 24z5 + 93z6 + 272z7 + 971z8 + 3194z9 + � � � :A single algebraic equation forF (z) = F (z; 0) is then obtained by elimination (e.g., viaGroebner bases) from the system:8>>><>>>: u20 � z(1 + u0 + u30 + u40) = 0u21 � z(1 + u1 + u31 + u41) = 0zF + u0u1 = 0Elimination shows thatF (z) is a root of the equationz4y4 � z2(1 + 2z)y3 + z(2 + 3z)y2 � (1 + 2z)y + 1 = 0:

For walks corresponding to = f�2;�1; 0; 1; 2g, we find similarly F (z) =� 1zu0(z)u1(z); whereu0; u1 are the small branches ofy2 = z(1+ y+ y2+ y3+ y4), theexpansion starts asF (z) = 1 + z + 3z2 + 9z3 + 32z4 + 120z5 + 473z6 + 1925z7 + 8034z8 + � � � ;andF (z) is a root of the equationz4y4 � z2(1 + z)y3 + z(2 + z)y2 � (1 + z)y + 1 = 0:

6.5. Maps and the quadratic method.A (planar) map is a connected planar graphtogether with an embedding into the plane. In all, generality, loops and multiple edgesare allowed. A planar map therefore separates the plane intoregions called faces (Fig-ure 17). The maps considered here are in addition rooted, meaning that a face, an incidentedge, and an incident vertex are distinguished. In this section, only rooted maps are con-sidered15. When representing rooted maps, we shall agree to draw the root edge with anarrow pointing away from the root node, and to take the root face as that face lying to theleft of the directed edge (represented in grey on Figure 17).

Tutte launched in the 1960’s a large census of planar maps, with the intention of attack-ing the four-colour problem by enumerative techniques16; see [16, 93, 94, 95, 96]. Thereexists in fact an entire zoo of maps defined by various degree or connectivity constraints.In this chapter, we shall limit ourselves to conveying a flavour of this vast theory, with thegoal of showing how algebraic functions arise. The presentation takes its inspiration fromthe book of Goulden and Jackson [48, Sec. 2.9]

Let M be the class of all maps where size is taken to be the number of edges. LetM(z; u) be the BGF of maps withu marking the number of edges on the outside face.The basic surgery performed on maps distinguishes two casesbased upon the nature of the

15Nothing is lost regarding asymptotic properties of random structures when a rooting is imposed. Thereason is that a map has, with probability exponentially close to 1, a trivial automorphism group; consequently,almost all maps ofm edges can be rooted in2m ways (by choosing an edge, and an orientation of this edge), andthere is an almost uniform2m-to-1 correspondence between unrooted maps and rooted ones.

16The four-colour theorem to the effect that every planar graph can be coloured using only four colours waseventually proved by Appel and Haken in 1976, using structural graph theory methods supplemented by extensivecomputer search.


FIGURE 17. A planar map.

root edge. A rooted map will be declared to be isthmic if the root edger of map� is an“isthmus” whose deletion would disconnect the graph. Clearly, one has,

(85) M = o+M(i) +M(n);whereM(i) (resp.M(n)) represent the class of isthmic (resp. non-isthmic) maps and ‘o’is the graph consisting of a single vertex and no edge. There are accordingly two ways tobuild maps from smaller ones by adding a new edge.(i) The class of all isthmic maps is constructed by taking two arbitrary maps andjoining them together by a new root edge, as shown below:

The effect is to increase the number of edges by 1 (the new rootedge) and have the rootface degree become 2 (the two sides of the new root edge) plus the sum of the root facedegrees of the component maps. The construction is clearly revertible. In other words, theBGF ofM(i) is

(86) M (i)(z; u) = zu2M(z; u)2:(ii) The class of non-isthmic maps is obtained by taking an already existing mapand adding an edge that preserves its root node and “cuts across” its root face in someunambiguous fashion (so that the construction should be revertible). This operation willtherefore result in a new map with an essentially smaller root-face degree. For instance,there are 5 ways to cut across a root face of degree 4, namely,

��

��

��

��

��

��

��

��

��

��

��

��


This corresponds to the linear transformationu4 7! zu5 + zu4 + zu3 + zu2 + zu1:In general the effect on a map with root face of degreek is described by the transformationuk 7! z(1� uk+1)=(1 � u); equivalently, each monomialg(u) = uk is transformed intou(g(1)�ug(u))=(1�u). Thus, the OGF ofM(n) involves a discrete difference operator:

(87) M (n)(z; u) = zuM(z; 1)� uM(z; u)1� u :Collecting the contributions from (86) and (87) in (85) thenyields the basic functional

equation,

(88) M(z; u) = 1 + u2zM(z; u)2 + uzM(z; 1)� uM(z; u)1� z :The functional equation (88) binds two unknown functions,M(z; u) andM(z; 1).

Much like in the case of walks, it would seem to be underdetermined. Now, a method dueto Tutte and known as the quadratic method provides solutions. Following Tutte and theaccount in [48, p. 138], we consider momentarily the more general equation

(89) (g1F (z; u) + g2)2 = g3;wheregj = Gj(z; u; h(z)) and theGj are explicit functions—here the unknown functionsareF (z; u) andh(z) (cf. M(z; u) andM(z; 1) in (88)). Bindu andz in such a way thatthe left side of (89) vanishes, that is, substituteu = u(z) (a yet unknown function) so thatg1F + g2 = 0. Since the left-hand side of (89) now has a double root inu, so must theright-hand side, which implies

(90) g3 = 0; �g3�u ��u=u(z) :The original equation has become a system of two equations intwo unknowns that de-termines implicitlyh(z) andu(z). From there, elimination provides individual equationsfor u(z) and forh(z). (If needed,F (z; u) can then be recovered by solving a quadraticequation.) It will be recognized that, if the quantitiesq1; g2; g3 are polynomials, then theprocess invariably yields solutions that are algebraic functions.

We now carry out this programme in the case of maps and Equation (88). First, isolateM(z; u) by completing the square, giving

(91)

�M(z; u)� 12 1� u+ u2zu2z(1� u) �2 = Q(z; u) + M(z; 1)u(1� u) ;where Q(z; u) = z2u4 � 2zu2(u� 1)(2u� 1) + (1� u2)4u4z2(1� u)2 :Next, the condition expressing the existence of a double root isQ(z; u) + 1u(1� u)M(z; 1) = 0; Q0u(z; u) + 2u� 1u2(1� u)2M(z; 1) = 0:It is now easy to eliminateM(z; 1), since the dependency inM is linear, and a straightfor-ward calculation shows thatu = u(z) should satisfy�u2z + (u� 1)� �u2z + (u� 1)(2u� 3)� = 0:

7. NOTES 89

The first parametrization would lead toM(z; 1) = 1=z which is not admissible. Thus,u(z)is to be taken as the root of the second factor, withM(z; 1) being defined parametricallyby z = (1� u)(2u� 3)u2 ; M(z; 1) = �u 3u� 4(2u� 3)2 :The change of parameteru = 1� 1=w reduces this further to the “Lagrangean form”,

(92) z = w1� 3w ; M(z; 1) = 1� 4w(1� 3w)2 :To this the Lagrange inversion theorem can be applied. The number of maps withn edges,Mn = [zn℄M(z; 1) is then determined asMn = 2 (2n)!3nn!(n+ 2)! ;and one obtains SequenceA000168of theEIS:M(z; 1) = 1 + 2z + 9z2 + 54z3 + 378z4 + 2916z5 + 24057z6+ 208494z7+ � � � :We refer to [48, Sec. 2.9] for detailed calculations (that are nowadays routinely performedwith assistance of a computer algebra system). Currently, there exist many applications ofthe method to maps satisfying all sorts of combinatorial constraints (e.g., multiconnectiv-ity); see [83] for a recent panorama.

The derivation above has purposely stressed a parametrizedapproach as this consti-tutes a widely applicable approach in many situations. In a simple case like this, we mayalso eliminateu and solve explicitly forM(z; 1), to wit,M(z) �M(z; 1) = � 154 z2 �1� 18z � (1� 12z)3=2� :It is interesting to note that the singular exponent here is32 , a fact further reflected by thesomewhat atypical factor ofn�5=2 in the asymptotic form of coefficients:Mn � 2p�n5 12n (n!1):Accordingly, randomness properties of maps are appreciably different from what is ob-served in trees and many commonly encountered context-freeobjects.

7. Notes

A very detailed discussion of rational functions in combinatorial analysis is given inStanley’s book [87] that focusses on algebraic aspects. The main characterizations and clo-sure properties are developed there on a firm algebraic basis. The Perron-Frobenius theoryis covered extensively in Gantmacher’s reference book on matrix theory [45, Ch. 13] aswell as in most courses on stochastic processes because of its relevance to finite Markovchains; see for instance the excellent appendix of Karlin and Taylor’s course [54]. Theconnections between rational series and formal languages form the subject of the book byBerstel and Reutenauer [12] that also discusses rational series in noncommutative indeter-minates. Connections between graphs, matrices, and rational functions appear in referencebooks in algebraic combinatorics, like those by Biggs [14] or Godsil [46].

The intimate relations between finite automata, regular expressions, and languagesbelong to the protohistory of computer science in the 1950’s. The connection with gen-erating functions evolved largely from joint research [22] of the mathematician MarcoSchutzenberger and the linguist Noam Chomsky in the secondhalf of the 1950’s. The


method of transfer matrices is closely related to finite automata and it belongs to the folk-lore of statistical physics; see Temperley’s monograph [91].

Lattice paths are fundamental objects of combinatorics andseveral books have beendevoted already to their enumerative aspects; see [71] for a treatment that offers insights onmotivations coming from neighbouring areas of classical statistics and probability theory.Historically, the connection between lattice paths and continued fractions has surfacedindependently in works of Touchard (topological configurations [92]), I.J. Good (randomwalks [47]), Lenard (one-dimensional statistical mechanics [65]), Szekeres (combinatoricsof some of Ramanujan’s identities), Flajolet (histories and dynamic data structures [35,36]), Jackson (combinatorics of Ising models [53]), and Read (chord diagrams [80]). Thebasic results are still rediscovered and published periodically. General syntheses appearin [35, 37, 48] and the relation to orthogonal polynomials is well developed in Godsil’sbook [46]. The presentation given here bases itself on reference [37], a study motivatedmore specifically by the formal theory of birth-and-death processes.

Algebraic functions are the modern counterpart of the studyof curves by classicalGreek mathematicians. They are either approached by algebraic methods (this is the coreof algebraic geometry) or by transcendental methods. For our purposes, however, onlyrudiments of the theory of curves are needed. For this, thereexist several excellent in-troductory books, of which we recommend the ones by Abhyankar [1], Fulton [44], andKirwan [56]. On the algebraic side, we have striven to provide an introduction to alge-braic functions that requires minimal apparatus. At the same time the emphasis has beenput somewhat on algorithmic aspects, since most algebraic models are nowadays likelyto be treated with the help of computer algebra. As regards symbolic computational as-pects, we recommend the treatise by von zur Gathen and Jurgen[98] for background, whilepolynomial systems are excellently reviewed in the book by Cox, Little, and O’Shea [27].

In the combinatorial domain, algebraic functions have beenused early: in Euler andSegner’s enumeration of triangulations (1753) as well as inSchroder’s famous “vier com-binatorische Probleme” described in [88, p. 177]. A major advance was the realizationby Chomsky and Schutzenberger that algebraic functions are the “exact” counterpart ofcontext-free grammars and languages (see again the historic paper [22]). A masterful sum-mary of the early theory appears in the proceedings edited byBerstel [10] while a modernand precise exposition forms the subject of Chapter 6 of Stanley’s book [88]. On theanalytic-asymptotic side, many researchers have long beenaware of the power of Puiseuxexpansions in conjunction with some version of singularityanalysis (often in the form ofthe Darboux–Polya method: see [79] based on Polya’s classic paper [78] of 1937). How-ever, there appeared to be difficulties in coping with the fully general problem of algebraiccoefficient asymptotics [18, 70]. We believe that Section 5.1 sketches the first completetheory. In the case of positive systems, the “Drmota-Lalley-Woods” theorem is the key tomost problems encountered in practice—its importance should be clear from the develop-ments of Section 5.2.

Bibliography

[1] A BHYANKAR , S.-S.Algebraic geometry for scientists and engineers. American Mathematical Society,1990.

[2] A BRAMOWITZ, M., AND STEGUN, I. A. Handbook of Mathematical Functions. Dover, 1973. A reprintof the tenth National Bureau of Standards edition, 1964.

[3] A HO, A. V., AND CORASICK, M. J. Efficient string matching: an aid to bibliographic search. Communi-cations of the ACM 18(1975), 333–340.

[4] A NDREWS, G. E.The Theory of Partitions, vol. 2 of Encyclopedia of Mathematics and its Applications.Addison–Wesley, 1976.

[5] BANDERIER, C. Combinatoire analytique des chemins et des cartes. PhD thesis, Universite Paris VI, Apr.2001. In preparation.

[6] BANDERIER, C., BOUSQUET-M ELOU, M., DENISE, A., FLAJOLET, P., GARDY, D., AND GOUYOU-BEAUCHAMPS, D. On generating functions of generating trees. InFormal Power Series and AlgebraicCombinatorics(June 1999), C. Martınez, M. Noy, and O. Serra, Eds., Universitat Politecnica de Catalunya,pp. 40–52. (Proceedings of FPSAC’99, Barcelona. Also available as INRIA Res. Rep. 3661, April 1999.).

[7] BARBOUR, A. D., HOLST, L., AND JANSON, S. Poisson approximation. The Clarendon Press OxfordUniversity Press, New York, 1992. Oxford Science Publications.

[8] BENDER, E. A. Asymptotic methods in enumeration.SIAM Review 16, 4 (Oct. 1974), 485–515.[9] BENTLEY, J., AND SEDGEWICK, R. Fast algorithms for sorting and searching strings. InEighth Annual

ACM-SIAM Symposium on Discrete Algorithms(1997), SIAM Press.[10] BERSTEL, J., Ed.Series Formelles. LITP, University of Paris, 1978. (Proceedings of a School,Vieux–

Boucau, France, 1977).[11] BERSTEL, J.,AND PERRIN, D. Theory of codes. Academic Press Inc., Orlando, Fla., 1985.[12] BERSTEL, J.,AND REUTENAUER, C. Les series rationnelles et leurs langages. Masson, Paris, 1984.[13] BIEBERBACH, L. Theorie der gewohnlichen Differentialgleichungen, vol. LXVI of Grundlehren der math-

ematischen Wissenschaften. Springer Verlag, Berlin, 1953.[14] BIGGS, N. Algebraic Graph Theory. Cambridge University Press, 1974.[15] BOUSQUET-M ELOU, M., AND PETKOVSEK, M. Linear recurrences with constant coefficients: the multi-

variate case.Discrete Mathematics ??(2000), ??? In press.[16] BROWN, W. G.,AND TUTTE, W. T. On the enumeration of rooted non-separable planar maps.Canadian

Journal of Mathematics 16(1964), 572–577.[17] BURGE, W. H. An analysis of binary search trees formed from sequences of nondistinct keys.JACM 23,

3 (July 1976), 451–454.[18] CANFIELD , E. R. Remarks on an asymptotic method in combinatorics.Journal of Combinatorial Theory,

Series A37 (1984), 348–352.[19] CARTIER, P., AND FOATA , D. Problemes combinatoires de commutation et rearrangements, vol. 85 of

Lecture Notes in Mathematics. Springer Verlag, 1969.[20] CAZALS , F. Monomer-dimer tilings.Studies in Automatic Combinatorics 2(1997). Electronic publication

http://algo.inria.fr/libraries/autocomb/autocomb.ht ml .[21] CHIHARA , T. S.An Introduction to Orthogonal Polynomials. Gordon and Breach, New York, 1978.[22] CHOMSKY, N., AND SCHUTZENBERGER, M. P. The algebraic theory of context–free languages. InCom-

puter Programing and Formal Languages(1963), P. Braffort and D. Hirschberg, Eds., North Holland,pp. 118–161.

[23] CHYZAK , F.,AND SALVY , B. Non-commutative elimination in Ore algebras proves multivariate identities.Journal of Symbolic Computation 26, 2 (1998), 187–227.

[24] CLEMENT, J., FLAJOLET, P., AND VALL EE, B. Dynamical sources in information theory: A generalanalysis of trie structures.Algorithmica 29, 1/2 (2001), 307–369.

[25] COMTET, L. Calcul pratique des coefficients de Taylor d’une fonction algebrique.Enseignement Math. 10(1964), 267–270.

[26] COMTET, L. Advanced Combinatorics. Reidel, Dordrecht, 1974.[27] COX, D., LITTLE , J.,AND O’SHEA, D. Ideals, Varieties, and Algorithms: an Introduction to Computa-

tional Algebraic Geometry and Commutative Algebra, 2nd ed. Springer, 1997.[28] DAVID , F. N.,AND BARTON, D. E.Combinatorial Chance. Charles Griffin, London, 1962.

91

92 BIBLIOGRAPHY

[29] DELEST, M.-P.,AND V IENNOT, G. Algebraic languages and polyominoes enumeration.Theoretical Com-puter Science 34(1984), 169–206.

[30] DOMB, C., AND BARRETT, A. Enumeration of ladder graphs.Discrete Mathematics 9(1974), 341–358.[31] DRMOTA, M. Systems of functional equations.Random Structures & Algorithms 10, 1–2 (1997), 103–124.[32] DUTOUR, I., AND FEDOU, J.-M. Object grammars and random generation.Discrete Mathematics and

Theoretical Computer Science 2(1998), 47–61.[33] FAYOLLE , G., AND IASNOGORODSKI, R. Two coupled processors: the reduction to a Riemann-Hilbert

problem.Zeitschrift fur Wahrscheinlichkeitstheorie und Verwandte Gebiete 47, 3 (1979), 325–351.[34] FAYOLLE , G., IASNOGORODSKI, R.,AND MALYSHEV, V. Random walks in the quarter-plane. Springer-

Verlag, Berlin, 1999.[35] FLAJOLET, P. Combinatorial aspects of continued fractions.Discrete Mathematics 32(1980), 125–161.[36] FLAJOLET, P., FRANCON, J., AND VUILLEMIN , J. Sequence of operations analysis for dynamic data

structures.Journal of Algorithms 1(1980), 111–141.[37] FLAJOLET, P.,AND GUILLEMIN , F. The formal theory of birth-and-death processes, lattice path combi-

natorics, and continued fractions.Advances in Applied Probability 32(2000), 750–778.[38] FLAJOLET, P., HATZIS, K., NIKOLETSEAS, S.,AND SPIRAKIS, P. On the robustness of interconnections

in random graphs: A symbolic approach.Theoretical Computer Science ??(2001), ???–??? Accepted forpublication. A preliminary version appears as INRIA Research Report4069, November 2000, 21 pages.

[39] FLAJOLET, P., KIRSCHENHOFER, P.,AND TICHY, R. F. Deviations from uniformity in random strings.Probability Theory and Related Fields 80(1988), 139–150.

[40] FLAJOLET, P.,AND NOY, M. Analytic combinatorics of non-crossing configurations. Discrete Mathemat-ics 204, 1-3 (1999), 203–229. (Selected papers in honor of Henry W. Gould).

[41] FLAJOLET, P.,AND ODLYZKO , A. The average height of binary trees and other simple trees. Journal ofComputer and System Sciences 25(1982), 171–213.

[42] FLAJOLET, P., AND ODLYZKO , A. Limit distributions for coefficients of iterates of polynomials withapplications to combinatorial enumerations.Mathematical Proceedings of the Cambridge PhilosophicalSociety 96(1984), 237–253.

[43] FROBENIUS, G. Uber Matrizen aus nicht negativen Elementen.Sitz.-Ber. Akad. Wiss., Phys-Math Klasse,Berlin (1912), 456–477.

[44] FULTON, W. Algebraic Curves. W.A. Benjamin, Inc., New York, Amsterdam, 1969.[45] GANTMACHER, F. R.Matrizentheorie. Deutscher Verlag der Wissenschaften, Berlin, 1986. A translation

of the Russian original Teoria Matriz, Nauka, Moscow, 1966.[46] GODSIL, C. D.Algebraic Combinatorics. Chapman and Hall, 1993.[47] GOOD, I. J. Random motion and analytic continued fractions.Proceedings of the Cambridge Philosophi-

cal Society 54(1958), 43–47.[48] GOULDEN, I. P.,AND JACKSON, D. M. Combinatorial Enumeration. John Wiley, New York, 1983.[49] GOURDON, X. Largest component in random combinatorial structures.Discrete Mathematics 180, 1-3

(1998), 185–209.[50] GRAHAM , R., KNUTH, D., AND PATASHNIK , O. Concrete Mathematics. Addison Wesley, 1989.[51] HENRICI, P.Applied and Computational Complex Analysis, vol. 2. John Wiley, New York, 1974.[52] HILLE , E.Analytic Function Theory. Blaisdell Publishing Company, Waltham, 1962. 2 Volumes.[53] JACKSON, D. Some results on product-weighted lead codes.Journal of Combinatorial Theory,Series A

25 (1978), 181–187.[54] KARLIN , S.,AND TAYLOR , H. A First Course in Stochastic Processes, second ed. Academic Press, 1975.[55] K IRSCHENHOFER, P., AND PRODINGER, H. The number of winners in a discrete geometrically dis-

tributed sample.The Annals of Applied Probability 6, 2 (1996), 687–694.[56] K IRWAN , F. Complex Algebraic Curves. No. 23 in London Mathematical Society Student Texts. Cam-

bridge University Press, 1992.[57] KNOPFMACHER, A., AND PRODINGER, H. On Carlitz compositions.European Journal of Combinatorics

19, 5 (1998), 579–589.[58] KNUTH, D. E. The Art of Computer Programming, vol. 1: Fundamental Algorithms. Addison-Wesley,

1968. Second edition, 1973.[59] KNUTH, D. E. The Art of Computer Programming, 3rd ed., vol. 1: Fundamental Algorithms. Addison-

Wesley, 1997.[60] KRASNOSELSKII, M. Positive solutions of operator equations. P. Noordhoff, Groningen, 1964.[61] LABELLE , J.,AND YEH, Y.-N. Generalized Dyck paths.Discrete Mathematics 82(1990), 1–6.[62] LAGARIAS, J. C., ODLYZKO , A. M., AND ZAGIER, D. B. On the capacity of disjointly shared networks.

Computer Netwoks 10, 5 (1985), 275–285.[63] LALLEY, S. P. Finite range random walk on free groups and homogeneous trees.Ann. Probab. 21, 4

(1993), 2087–2130.[64] LANG, S.Linear Algebra. Addison-Wesley Publishing Co., Inc., Reading, Mass.-DonMills, Ont., 1966.[65] LENARD, A. Exact statistical mechanics of a one–dimensional system with Coulomb forces.The Journal

of Mathematical Physics 2, 5 (Sept. 1961), 682–693.

BIBLIOGRAPHY 93

[66] LOTHAIRE, M. Combinatorics on Words, vol. 17 of Encyclopedia of Mathematics and its Applications.Addison–Wesley, 1983.

[67] LOUCHARD, G. Random walks, Gaussian processes and list structures.Theoretical Computer Science 53,1 (1987), 99–124.

[68] MACMAHON, P. A. Introduction to combinatory analysis. Chelsea Publishing Co., New York, 1955. Areprint of the first edition, Cambridge, 1920.

[69] MEIR, A., AND MOON, J. W. On the altitude of nodes in random trees.Canadian Journal of Mathematics30 (1978), 997–1015.

[70] MEIR, A., AND MOON, J. W. The asymptotic behaviour of coefficients of powers of certain generatingfunctions.European Journal of Combinatorics 11(1990), 581–587.

[71] MOHANTY, S. G.Lattice path counting and applications. Academic Press [Harcourt Brace JovanovichPublishers], New York, 1979. Probability and MathematicalStatistics.

[72] ODLYZKO , A. M. Periodic oscillations of coefficients of power seriesthat satisfy functional equations.Advances in Mathematics 44(1982), 180–205.

[73] ODLYZKO , A. M. Asymptotic enumeration methods. InHandbook of Combinatorics, R. Graham,M. Grotschel, and L. Lovasz, Eds., vol. II. Elsevier, Amsterdam, 1995, pp. 1063–1229.

[74] ODLYZKO , A. M., AND WILF, H. S. The editor’s corner:n coins in a fountain.American MatematicalMonthly 95(1988), 840–843.

[75] OTTER, R. The number of trees.Annals of Mathematics 49, 3 (1948), 583–599.[76] PERCUS, J. K.Combinatorial Methods, vol. 4 of Applied Mathematical Sciences. Springer-Verlag, 1971.[77] PERRON, O. Uber Matrizen.Mathematische Annalen 64(1907), 248–263.[78] POLYA , G. Kombinatorische Anzahlbestimmungen fur Gruppen, Graphen und chemische Verbindungen.

Acta Mathematica 68(1937), 145–254.[79] POLYA , G.,AND READ, R. C.Combinatorial Enumeration of Groups, Graphs and Chemical Compounds.

Springer Verlag, New York, 1987.[80] READ, R. C. The chord intersection problem.Annals of the New York Academy of Sciences 319(May

1979), 444–454.[81] ROBERT, P.Reseaux et files d’attente: methodes probabilistes, vol. 35 ofMathematiques & Applications.

Springer, Paris, 2000.[82] SALVY , B., AND ZIMMERMANN , P. GFUN: a Maple package for the manipulation of generatingand

holonomic functions in one variable.ACM Trans. Math. Softw. 20, 2 (1994), 163–167.[83] SCHAEFFER, G.Conjugaison d’arbres et cartes combinatoires aleatoires. Ph.d. thesis, Universite de Bor-

deaux I, Dec. 1998.[84] SEDGEWICK, R., AND FLAJOLET, P. An Introduction to the Analysis of Algorithms. Addison-Wesley

Publishing Company, 1996.[85] SLOANE, N. J. A. The On-Line Encyclopedia of Integer Sequences. 2000. Published electronically at

http://www.research.att.com/njas/sequences/ .[86] SLOANE, N. J. A., AND PLOUFFE, S. The Encyclopedia of Integer Sequences. Academic Press, 1995.

Available electronically athttp://www.research.att.com/njas/sequences/ .[87] STANLEY, R. P.Enumerative Combinatorics, vol. I. Wadsworth & Brooks/Cole, 1986.[88] STANLEY, R. P.Enumerative Combinatorics, vol. II. Wadsworth & Brooks/Cole, 1998.[89] STEIN, P. R.,AND WATERMAN , M. S. On some new sequences generalizing the Catalan and Motzkin

numbers.Discrete Mathematics 26, 3 (1979), 261–272.[90] SZEGO, G. Orthogonal Polynomials, vol. XXIII of Americam Mathematical Society Colloquium Publica-

tions. A.M.S, Providence, 1989.[91] TEMPERLEY, H. N. V. Graph theory and applications. Ellis Horwood Ltd., Chichester, 1981. Ellis Hor-

wood Series in Mathematics and its Applications.[92] TOUCHARD, J. Contribution a l’etude du probleme des timbres poste.Canadian Journal of Mathematics

2 (1950), 385–398.[93] TUTTE, W. T. A census of planar maps.Canadian Journal of Mathematics 15(1963), 249–271.[94] TUTTE, W. T. On the enumeration of planar maps.Bull. Amer. Math. Soc. 74(1968), 64–74.[95] TUTTE, W. T. On the enumeration of four-colored maps.SIAM Journal on Applied Mathematics 17

(1969), 454–460.[96] TUTTE, W. T. Planar enumeration. InGraph theory and combinatorics (Cambridge, 1983). Academic

Press, London, 1984, pp. 315–319.[97] VALL EE, B. Dynamical sources in information theory: Fundamental intervals and word prefixes.Algo-

rithmica 29, 1/2 (2001), 262–306.[98] VON ZUR GATHEN, J.,AND GERHARD, J.Modern computer algebra. Cambridge University Press, New

York, 1999.[99] WALL , H. S.Analytic Theory of Continued Fractions. Chelsea Publishing Company, 1948.

[100] WOODS, A. R. Coloring rules for finite trees, and probabilities of monadic second order sentences.Ran-dom Structures Algorithms 10, 4 (1997), 453–485.

[101] ZEILBERGER, D. Closed form (pun intended!).Contemporary mathematics 143(1988), 579–607.

94 BIBLIOGRAPHY

[102] ZEILBERGER, D. Symbol-crunching with the transfer-matrix method in order to count skinny phys-ical creatures.Integers 0(2000), Paper A9. Published electronically athttp://www.integers-ejcnt.org/vol0.html .

Contents

Chapter 8. Functional Equations—Rational and Algebraic Functions 1The rational, algebraic, and holonomic classes 21. Algebra of rational functions 51.1. Characterizations and elimination. 61.2. Closure properties and coefficients. 72. Analysis of rational functions 92.1. General rational functions. 92.2. Positive rational functions and Perron-Frobenius theory. 113. Combinatorial applications of rational functions 183.1. Regular specifications. 183.2. Regular languages. 193.3. Paths in graphs, automata, and transfer matrices. 233.4. Permutations and local constraints. 323.5. Lattice paths and walks on the line. 374. Algebra of algebraic functions 464.1. Characterization and elimination. 474.2. Closure properties and coefficients. 545. Analysis of algebraic functions 595.1. General algebraic functions. 595.2. Positive functions and positive systems. 706. Combinatorial applications of algebraic functions 756.1. Context-free specifications. 756.2. Context-free languages. 796.3. Simple families of trees. 806.4. Walks and the kernel method. 816.5. Maps and the quadratic method. 867. Notes 89

Bibliography 91

95

Unit�e de re her he INRIA Lorraine, Te hnopole de Nan y-Brabois, Campus s ienti�que,615 rue du Jardin Botanique, BP 101, 54600 VILLERS L�ES NANCYUnit�e de re her he INRIA Rennes, Irisa, Campus universitaire de Beaulieu, 35042 RENNES CedexUnit�e de re her he INRIA Rhone-Alpes, 655, avenue de l'Europe, 38330 MONTBONNOT ST MARTINUnit�e de re her he INRIA Ro quen ourt, Domaine de Volu eau, Ro quen ourt, BP 105,78153 LE CHESNAY CedexUnit�e de re her he INRIA Sophia-Antipolis, 2004 route des Lu ioles, BP 93, 06902 SOPHIA-ANTIPOLISCedex�EditeurINRIA, Domaine de Volu eau, Ro quen ourt, BP 105, 78153 LE CHESNAY Cedex(Fran e)

http://www.inria.frISSN 0249-6399

analytic combinatorics: functional equations, rational and ... · functional equations—rational...

Documents