Generating Functions
Generating functions are packages. A sequence lies in a messy pile on the floor, pieces
sharp and irregular jutting out in all directions. We pick up the pieces one by one and pack
them together just so . . . until suddenly we have what seems another object entirely. When
we find such an object, we can carry it, weigh it, discern its shape, find its center. We can
combine it with others like it to make new ones. And if we look at it just right, we can
discern the pieces that comprise it.
Generating functions are tools. They give an indirect representation of the sequence, but
one that is easy to manipulate in many respects. Some of the uses to which we can put
generating functions include the following:
1. Calculate sums.
2. Solve recurrences.
3. Characterize the asymptotic behavior of sequences.
4. Prove useful properties of sequences.
5. Find expected values, variances, and other moments and cumulants of distributions.
6. Establish relationships between distributions.
We will see examples of all of these this semester.
Generating functions are fun. The feeling of cracking a tough sum or recurrence is
better than . . . well, no, but it’s pretty darn good. The mathematics is beautiful and
connects diverse areas, including algebra, number theory, complex analysis, combinatorics,
and probability. And the knowledge will come in handy in surprising ways.
A generating function like this has two modes of existence depending on how we use it. First,
we can view it as a function of a complex variable. Second, we can view it as a formal power
series. We will exploit both perspectives.
But first, a word about boundaries. It is most common to deal with sequences g_n defined
for n ≥ 0 – let's call them one-sided sequences for lack of a better name. Even with a
one-sided sequence (g_n), it is convenient to extend the sequence to all integers by defining
g_n = 0 for n < 0. This leads to the following conventions.

Convention 2. If a sequence g_n is defined for n ≥ n_0 for some fixed integer n_0 (e.g., 0),
then we will assume – unless otherwise indicated – that g_n = 0 for n < n_0.
1 12 Feb 2006
Convention 3. If a sum is written with no limits specified for the summation variable (as
in Σ_n), the summation is taken to be over all integers.
Hence, we will prefer to write our generating functions in the following form:

    G(z) = Σ_n g_n z^n.   (2)
This is often useful, and all the theory is a direct generalization. I'll freely use the
multivariable form when needed.
Example 6. The series Σ_{n≥0} z^n/n! has R = ∞, which can be seen, for example, via
Stirling's approximation.
Given a power series G(z) = Σ_{n≥0} g_n z^n expressed as a function of a complex variable,
the radius of convergence of the series determines properties of the function.
Theorem 7. If G(z) = Σ_{n≥0} g_n z^n is a power series with radius of convergence R, then
G(z) is an analytic (a.k.a. holomorphic, which I tend to prefer) function on the disk |z| < R
and has at least one singularity on the circle |z| = R.
(Note: An analytic or holomorphic function on a region A in the complex plane is a
function that is complex differentiable at every point of the region or, equivalently, one
that has a convergent power series expansion in an open disk around every point of A.
These conditions have profound consequences for the behavior of the function.)
Example 8. If G(z) = Σ_{n≥0} z^n, then on the disk |z| < 1, we can write G(z) = 1/(1 − z),
which is an analytic function there. This function has a pole (a simple singularity) at z = 1.
It follows that the identification of a power series with a closed-form expression for the
function – as a function of a complex variable – only holds within the region of analyticity.
Consider 1/(1 − z) evaluated at z = 2, for example.
Given a power series G(z) as above that is analytic within |z| < R for some R > 0,
we can recover the coefficients by taking derivatives. For example, G(0) = g_0, G′(0) = g_1,
G′′(0)/2 = g_2, and in general G^{(k)}(0)/k! = g_k. Using Cauchy's formula from complex
analysis, we can express this same relationship as a contour integral – an integral of a
function along a closed curve γ: [0, 1] → C in the complex plane. Cauchy's formula states

    g_k = (1/2πi) ∫_γ G(z)/z^{k+1} dz,   (5)

for any closed curve γ around the origin that is contained in the open disk |z| < R. (For
example, γ(t) = (R/2)e^{2πit}.)
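As a sanity check on formula (5), we can approximate the contour integral numerically. The following Python sketch (my own, not from the notes) samples γ(t) = (R/2)e^{2πit} at evenly spaced points and recovers a known coefficient of 1/(1 − z):

```python
import cmath

def coefficient_via_contour(G, k, radius=0.5, steps=4096):
    """Approximate g_k = (1/2*pi*i) * integral of G(z)/z^(k+1) dz over the
    circle gamma(t) = radius * e^(2*pi*i*t), via a Riemann sum in t."""
    total = 0.0 + 0.0j
    for j in range(steps):
        t = j / steps
        z = radius * cmath.exp(2j * cmath.pi * t)
        dz = 2j * cmath.pi * z / steps          # gamma'(t) dt
        total += G(z) / z ** (k + 1) * dz
    return total / (2j * cmath.pi)

# G(z) = 1/(1 - z) has g_k = 1 for all k >= 0 and R = 1, so radius 0.5 is safe.
g3 = coefficient_via_contour(lambda z: 1 / (1 - z), 3)
```

For periodic integrands like this one, the plain Riemann sum converges extremely fast, so `g3` comes out equal to 1 to near machine precision.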
This contour integration approach has several other uses. For example, consider a power
series G(z) = Σ_{n∈Z} g_n (z − z_0)^n; then a contour integral around z_0 (say with
γ(t) = z_0 + e^{2πit}) gives

    g_{−1} = (1/2πi) ∫_γ G(z) dz,   (6)

the so-called residue of G at z_0, and for n > 0,

    g_{−n} = (1/2πi) ∫_γ G(z)(z − z_0)^{n−1} dz.   (7)
1.2. Generating Functions As Formal Power Series
The second perspective on generating functions is to view them as formal power series. That
is, we view the generating function as an algebraic expression for manipulating the sequence
of numbers – not as a function of a complex variable.
Definition 9. A formal power series (over C) is an algebraic “symbol”

    G(z) = Σ_n g_n z^n,   (8)

for complex-valued coefficients g_n. The sum above ranges over all integers, but we will
assume here that g_n is eventually zero as n → −∞. Two formal power series are equal if
and only if all of their coefficients are equal.
We also need a way to refer to a particular coefficient given a formal power series that
is expressed as a function rather than as a sequence.
Notation 10. If G(z) is a formal power series, then we write [z^n]G(z) to denote the
coefficient of z^n in the series. That is, if G(z) = Σ_n g_n z^n, then [z^n]G(z) = g_n. Of
course, the z in these expressions is a dummy variable; it's also true that [u^n]G(u) = g_n,
for example.

This notation is standard and, in the end, quite convenient, though it is rather clunky
and takes some getting used to. As you get comfortable with it, try expressing the
coefficients in terms of a specific sequence (as we did with “g_n” above) and making the
mapping explicit. Eventually, this notation will become familiar.
To make sense of the definition, let's consider formal power series as algebraic objects
and some of the operations we can perform on them. Let G(z) = Σ_n g_n z^n,
H(z) = Σ_n h_n z^n, and F(z) = Σ_n f_n z^n.
First, we can add formal power series to produce a new formal power series:

    G(z) + H(z) = Σ_n (g_n + h_n) z^n,   (9)

and this addition is commutative (G(z) + H(z) = H(z) + G(z)) and associative
((G(z) + H(z)) + F(z) = G(z) + (H(z) + F(z))) by the commutativity and associativity of
addition for the coefficients. Note that the set of one-sided series is closed under this
operation (that is, the sum of two one-sided series is itself one-sided).
Second, we can multiply formal power series to produce a new formal power series:

    G(z)H(z) = Σ_n ( Σ_k g_k h_{n−k} ) z^n.   (10)
That is, [z^n]H(z) = [z^{n+m}]G(z) for n ≥ 0. This is a truncated left shift, and the sum
above cannot in general be extended over all integers. A two-sided left shift is obtained
by [z^n]G(z)/z^m = [z^{n+m}]G(z); this is valid for all m but usually less useful.
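These operations are easy to prototype on truncated one-sided series represented as Python lists of coefficients. A minimal sketch (the function names are my own, not from the notes):

```python
def add(g, h):
    """Coefficientwise sum of two truncated one-sided series."""
    n = max(len(g), len(h))
    g = g + [0] * (n - len(g))
    h = h + [0] * (n - len(h))
    return [a + b for a, b in zip(g, h)]

def mul(g, h):
    """Cauchy product: [z^n] of the result is sum_k g_k h_{n-k}."""
    out = [0] * (len(g) + len(h) - 1)
    for i, a in enumerate(g):
        for j, b in enumerate(h):
            out[i + j] += a * b
    return out

def lshift(g, m):
    """Truncated left shift: [z^n] of the result is g_{n+m} for n >= 0."""
    return g[m:] or [0]

# (1 + z)^2 = 1 + 2z + z^2; shifting left by 1 leaves 2 + z.
sq = mul([1, 1], [1, 1])
```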
3. Derivatives. We define the derivative operator on formal power series by
G′(z) = Σ_n n g_n z^{n−1}.
4. Shift-Derivative Combinations. Even more useful is when we combine shift and derivative
operations. Let S be the right-shift operator SG(z) = zG(z) and D be the derivative
operator DG(z) = G′(z). Then, we find that

    DS G(z) = Σ_n (n + 1) g_n z^n    and    SD G(z) = Σ_n n g_n z^n.

(Try this out for yourself.) Both of these are useful operators in their own right. The
second identity generalizes easily to a useful form. If Q(z) is some polynomial, then
Q(SD) is an operator consisting of a linear combination of (SD)^k s. Using linearity, the
second equality above thus shows that

    Q(SD) G(z) = Σ_n Q(n) g_n z^n.
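The shift-derivative identity – that applying SD multiplies the n-th coefficient by n, so Q(SD) multiplies it by Q(n) – is easy to sanity-check on truncated coefficient lists (a quick Python sketch of my own):

```python
N = 10
g = [1] * N                        # truncation of 1/(1 - z): g_n = 1

def SD(coeffs):
    """S after D: SD G(z) = z G'(z), so [z^n] picks up a factor of n."""
    return [n * c for n, c in enumerate(coeffs)]

sd2 = SD(SD(g))                        # coefficients should be n^2
q = [a + b for a, b in zip(sd2, g)]    # Q(SD) with Q(x) = x^2 + 1: n^2 + 1
```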
    H(z) = (1/m) Σ_{j=0}^{m−1} G(z e^{2πij/m})   (26)
         = Σ_{n≥0} g_n z^n (1/m) Σ_{j=0}^{m−1} e^{2πinj/m}   (27)
         = Σ_{n≥0} g_{mn} z^{mn}   (28)
         = F(z^m),   (29)

for a power series F(z) with [z^n]F(z) = [z^{mn}]G(z). Denote the mapping from G to F by
the operator M_m.
The simplest example is the case m = 2, where the roots of unity are 1 and −1. In this
case, M_2 G(z) consists of only the even terms in the series G(z). To get the odd terms,
we apply the same trick but with shifts: (M_2 zG(z))/z does the job.

Thus, we can extract all subsequences with n mod m = k for k = 0, ..., m − 1 by

    F_k(z) = M_m(z^{m−k} G(z))/z   =⇒   [z^n]F_k(z) = [z^{mn+k}]G(z).   (30)

It looks worse than it is. Try it for m = 3.
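The roots-of-unity averaging behind this multisection trick can be checked numerically: averaging G(ω^j z) over the m-th roots of unity ω^j should zero out every coefficient whose index is not a multiple of m. A Python sketch (my own, not from the notes):

```python
import cmath

def section_coeffs(g, m):
    """Coefficients of H(z) = (1/m) * sum_j G(omega^j z), omega = e^(2 pi i/m).
    Since [z^n]G(omega^j z) = g_n * omega^(j n), averaging over j should kill
    every g_n whose index n is not a multiple of m."""
    return [gn * sum(cmath.exp(2j * cmath.pi * j * n / m)
                     for j in range(m)) / m
            for n, gn in enumerate(g)]

g = [2 ** n for n in range(9)]     # truncation of G(z) = 1/(1 - 2z)
h = section_coeffs(g, 3)           # keeps g_0, g_3, g_6; zeros the rest
```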
9. Composition. If G(z) and F(z) are two generating functions, we might wish to compute
the composition of the two functions. Is this well defined? Write

    G(F(z)) = Σ_{n≥0} g_n F(z)^n = Σ_{n≥0} g_n ( Σ_m f_m z^m )^n.   (31)

Notice that if f_0 ≠ 0, then every term g_n F(z)^n in the sum can contribute to each
coefficient of the composition, so [z^k]G(F(z)) is an infinite sum whenever g_n ≠ 0
infinitely often. The algebraic construction of formal power series does not support this
infinite sum (no limits, remember), so the composition is not well-defined as a formal
power series in this case. If f_0 = 0, however, then [z^n]G(F(z)) depends on only finitely
many terms in the sum (those with index at most n), which is well defined. The composition
operation is thus well defined for two formal power series only if f_0 = 0 or if G(z) has
only finitely many non-zero coefficients (i.e., it is a polynomial).
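The f_0 = 0 case is straightforward to implement on truncated series, since F(z)^n then has no terms below degree n. A Python sketch (my own illustration):

```python
def compose(g, f, N):
    """First N coefficients of G(F(z)) for truncated series; needs f[0] == 0,
    so that [z^n] of the composition involves only finitely many terms."""
    assert f[0] == 0, "composition is not well defined formally when f_0 != 0"
    out = [0] * N
    power = [1] + [0] * (N - 1)            # F(z)^0
    for gn in g[:N]:
        out = [a + gn * b for a, b in zip(out, power)]
        new = [0] * N                      # power <- power * f, truncated
        for i, a in enumerate(power):
            if a:
                for j, b in enumerate(f):
                    if i + j < N:
                        new[i + j] += a * b
        power = new
    return out

# G(z) = 1/(1 - z), F(z) = z + z^2: G(F(z)) = 1/(1 - z - z^2), the
# generating function of the Fibonacci numbers 1, 1, 2, 3, 5, 8, ...
fib = compose([1] * 8, [0, 1, 1], 6)
```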
These are the most commonly used manipulations, though by no means the only ones.
Example 11. Compute Σ_{k=1}^n k^2.

We can do this in two ways. First, [z^n](SD)^2 (1 − z)^{−1} = n^2, and by partial summation

    Σ_{k=1}^n k^2 = [z^n] (1/(1 − z)) (SD)^2 (1/(1 − z)) = [z^n] (z^2 + z)/(1 − z)^4
                  = [z^n] ( S^2 (1 − z)^{−4} + S (1 − z)^{−4} ).   (32)

By the binomial theorem, negating the upper index, and symmetry – as we saw in class and
on homework – we have

    (1 − z)^{−4} = Σ_{n≥0} \binom{−4}{n} (−1)^n z^n = Σ_{n≥0} \binom{n+3}{n} z^n
                 = Σ_{n≥0} \binom{n+3}{3} z^n.   (33)

Hence,

    Σ_{k=1}^n k^2 = \binom{n+1}{3} + \binom{n+2}{3} = n(1 + n)(1 + 2n)/6.   (34)
For another approach, begin with the polynomial

    Σ_{k=0}^n z^k = (1 − z^{n+1})/(1 − z).   (35)

Note that (SD)^2 Σ_{k=0}^n z^k = Σ_{k=0}^n k^2 z^k, and evaluating this polynomial at z = 1
gives us the first sum we seek. Hence

    Σ_{k=0}^n k^2 = (SD)^2 (1 − z^{n+1})/(1 − z) |_{z=1}.   (36)

Carrying out the algebra and evaluating (by taking limits via L'Hopital) yields:

    Σ_{k=0}^n k^2 = n(1 + n)(1 + 2n)/6.   (37)
    e^z = Σ_{n≥0} z^n/n!   (43)
    sin z = Σ_{n≥0} (−1)^n z^{2n+1}/(2n + 1)!   (44)
    cos z = Σ_{n≥0} (−1)^n z^{2n}/(2n)!   (45)
    log(1/(1 − z)) = Σ_{n≥1} z^n/n   (46)
    log(1 + z) = Σ_{n≥1} (−1)^{n+1} z^n/n   (47)
    (1 + z)^r = Σ_n \binom{r}{n} z^n   (48)
    1/(1 − z)^{m+1} = Σ_n \binom{n+m}{n} z^n = Σ_n \binom{m+n}{m} z^n   (49)
2. Solving Recurrences
One of the most powerful uses of generating functions is solving recurrence relations.
These are relations between elements of a sequence that enable us, in principle, to compute
every element. The problem with this approach is that we often want to compute elements
at arbitrary or very large indices, making sequential computation either useless or
infeasible. This leads to the idea of a closed form: an explicit expression for a sequence or
function in terms of the free variable. We will actually consider two types of closed forms.
The first is where we express the elements of the sequence exactly in terms of the free
variable; the second is where we find a closed-form expression for the generating function
of the sequence. In many cases, the latter is the best we can do, but it is quite often good
enough for what we need.
Let’s start with a very simple example to make the ideas concrete.
    T_0 = 0   (50)
    T_n = 2T_{n−1} + 1  for n > 0   (51)
The sequence as defined is one-sided, so by our convention, we also take Tn = 0 for n < 0.
As recurrences go, this one is fairly straightforward because we can guess the closed form
and prove it simply by induction. But let’s illustrate the basic method of using generating
functions for this kind of problem anyway.
Step 1. Express the recurrence relation as a single equation that is valid for all integers.
    T(z) = z / ((1 − z)(1 − 2z))   (59)
         = z ( A/(1 − z) + B/(1 − 2z) ).   (60)

Multiplying both sides by (1 − 2z) and evaluating at z = 1/2 gives B = 2. Multiplying both
(right hand) sides by (1 − z) and evaluating at z = 1 gives A = −1. Hence,

    T(z) = z ( 2/(1 − 2z) − 1/(1 − z) )   (61)
         = z Σ_{n≥0} (2^{n+1} − 1) z^n   (62)
         = Σ_{n≥0} (2^{n+1} − 1) z^{n+1}   (63)
         = Σ_{n≥1} (2^n − 1) z^n   (64)
         = Σ_{n≥0} (2^n − 1) z^n.   (65)
Thus, T_n = 2^n − 1 for n ≥ 0.
The process was somewhat pedantic in this case, but the key is that the same process
works just so in much, much harder cases.
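As a quick computational check of the closed form (a Python sketch of my own):

```python
# Iterate T_n = 2 T_{n-1} + 1 with T_0 = 0 and compare against 2^n - 1.
T = [0]
for n in range(1, 20):
    T.append(2 * T[-1] + 1)
assert all(T[n] == 2 ** n - 1 for n in range(20))
```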
Let’s use the tools developed in the last section on a related but slightly harder case.
    T_0 = 1   (66)
    T_n = 2T_{n−1} − 1 + n  for n > 0   (67)
so

    T(z) = 2zT(z) + z Σ_{n≥0} n z^n + 1   (70)
         = (1 + z Σ_{n≥0} n z^n) / (1 − 2z).   (71)

What to make of that Σ_{n≥0} n z^n? If we look at it for a moment and remember the last
section, we might recognize it as

    Σ_{n≥0} n z^n = SD (1/(1 − z)) = z/(1 − z)^2.   (72)
Excellent!
Step 3. Solve for T(z): It follows that

    T(z) = (1 + z^2/(1 − z)^2) / (1 − 2z)   (73)
         = ((1 − z)^2 + z^2) / ((1 − 2z)(1 − z)^2)   (74)
         = (1 − 2z + 2z^2) / ((1 − 2z)(1 − z)^2).   (75)
Step 4. Expand and find coefficients. Since neither 1 nor 1/2 is a root of 1 − 2z + 2z^2,
we need to solve

    (1 − 2z + 2z^2) / ((1 − 2z)(1 − z)^2) = A/(1 − 2z) + B/(1 − z)^2 + C/(1 − z).   (76)

Multiplying both sides by (1 − 2z) and evaluating at z = 1/2 yields A = 2. Multiplying both
sides by (1 − z)^2 and evaluating at z = 1 yields B = −1. Evaluating at z = 0 then yields
C = 1 − A − B = 0. Thus, using the common generating functions given earlier:

    T(z) = 2/(1 − 2z) − 1/(1 − z)^2   (77)
         = Σ_{n≥0} 2^{n+1} z^n − Σ_{n≥0} (n + 1) z^n   (78)
         = Σ_{n≥0} (2^{n+1} − (n + 1)) z^n,   (79)

so T_n = 2^{n+1} − (n + 1) for n ≥ 0.
If Q(z) has roots with multiplicity, it's not quite as easy, but we can often use this basic
approach to reduce the unknowns. For example, suppose Q(z) = (1 − ρ_1 z)^2 (1 − ρ_2 z) · · · (1 −
ρ_d z). Then, the partial fractions expansion looks like

    P(z)/Q(z) = B_1/(1 − ρ_1 z)^2 + Σ_{k=1}^d A_k/(1 − ρ_k z).   (83)

If we multiply both sides by (1 − ρ_1 z)^2 and evaluate at z = 1/ρ_1, we find B_1 but not A_1.
Assuming that zero is not a root of Q, however, then once we have B_1 and A_2, ..., A_d, we
can find A_1 by evaluating at z = 0 to get:

    A_1 = P(0)/Q(0) − B_1 − Σ_{k=2}^d A_k.   (84)
Another variation on the basic procedure comes when we have more than one generating
function to solve for simultaneously. This comes up, for instance, with coupled recurrences.
We saw this in the Pentagon Mole problem on homework, in solving for the random
walk return times, and on this week's homework as well. As this shows, introducing coupled
recurrences can sometimes make a problem easier.
Yet another variation is a recurrence in more than one variable. The ideas here are the
same; we just use multiple summations. Consider the following example.
    C_{n,0} = 1   (85)
    C_{n,m} = C_{n−1,m} + C_{n−1,m−1},   m > 0.   (86)

Define C_n(z) = Σ_m C_{n,m} z^m. Note that by our one-sided condition, C_{−1,m} = 0 for
all m, so C_0(z) = 1.
Step 1. Here the boundary condition has to be dealt with.
    C(y, z) = Σ_m ( Σ_n \binom{n}{m} y^n ) z^m   (96)
            = 1/(1 − y − yz)   (97)
            = (1/(1 − y)) · 1/(1 − (y/(1 − y))z)   (98)
            = (1/(1 − y)) Σ_{m≥0} (y/(1 − y))^m z^m.   (99)
By a similar stretch of the notation, [z^m] 1/(1 − (1 + z)y) = y^m/(1 − y)^{m+1}. This gives
us immediately two useful sums:

    Σ_m \binom{n}{m} z^m = (1 + z)^n   (100)
    Σ_n \binom{n}{m} y^n = y^m/(1 − y)^{m+1}.   (101)
Nice.
Double summation like this can be a useful technique for calculating sums. Our sum – like
Σ_m \binom{n}{m} z^m or Σ_n \binom{n}{m} y^n – is really a parameterized family of sums.
We do the power series trick – multiply by y^n or z^m, respectively, and sum – to get a
two-dimensional generating function. By interchanging the order of the sums and other
tricks, we can often reduce the sum to simpler form. Just as above.
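The two sums (100) and (101) can be verified mechanically; here is a Python check of my own. For (101), expanding 1/(1 − y)^{m+1} by repeated multiplication with 1/(1 − y) (which is just taking prefix sums of the coefficient list) and then shifting by m should reproduce the binomial coefficients:

```python
import math

N, m = 20, 3

# Build 1/(1-y)^(m+1) from 1/(1-y) = [1, 1, 1, ...] by prefix sums.
coeffs = [1] * N
for _ in range(m):
    total, out = 0, []
    for c in coeffs:
        total += c
        out.append(total)
    coeffs = out

# Shift by m to get y^m / (1-y)^(m+1); (101) says [y^n] = C(n, m).
shifted = [0] * m + coeffs[: N - m]
assert all(shifted[n] == math.comb(n, m) for n in range(N))

# (100) is the binomial theorem, checked here at z = 2.
n, z = 7, 2
assert sum(math.comb(n, k) * z ** k for k in range(n + 1)) == (1 + z) ** n
```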
Theorem 12. (The Lagrange Inversion Formula). Let F(u) and G(u) be generating
functions with G(0) = 1. Then there is a unique formal power series U = U(z) that satisfies
the functional equation

    U(z) = z G(U(z)).   (102)

Moreover,

    [z^n]F(U(z)) = (1/n) [u^{n−1}] (F′(u) G^n(u)).   (103)
This is a variant (see References) of a traditional theorem on formal power series known
as the Lagrange Inversion Theorem. The idea is that it shows how to compute the power
series expansion of a function that is defined implicitly in terms of a holomorphic function.
As it's traditionally written: Let u be defined implicitly by u = c + zG(u). Then, we can
expand F(u) in a power series about z = 0 (u = c) as

    F(u) = F(c) + Σ_{n=1}^∞ (z^n/n!) D^{n−1}(G^n · F′)(c).   (104)

In our case, we are taking c = 0, and notice that by our rules above

    (1/n!) D^{n−1} H(0) = (1/n) [u^{n−1}] H(u)   (105)

for any series H(u), and so,

    [z^n]F(u) = (1/n) [u^{n−1}] (G^n(u) F′(u)).   (106)
This isn’t a proof but it does connect it to the traditional form of the Lagrange theorem.
This theorem is applied in Statistics in developing Cornish-Fisher expansions and their gen-
eralizations.
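To see the formula in action, here is a numerical check on an example of my own choosing (not from the notes): take G(u) = 1/(1 − u) and F(u) = u, so the functional equation is U = z/(1 − U). The LIF predicts [z^n]U = (1/n)[u^{n−1}](1 − u)^{−n} = \binom{2n−2}{n−1}/n, the shifted Catalan numbers, which we can compare against a fixed-point iteration on truncated series:

```python
import math

N = 10

def recip_one_minus(u):
    """Coefficients of 1/(1 - U(z)) for a series U with U(0) = 0, truncated
    to N terms, via the usual series-division recurrence."""
    v = [1] + [-c for c in u[1:]]          # 1 - U(z)
    w = [0] * N
    w[0] = 1
    for n in range(1, N):
        w[n] = -sum(v[k] * w[n - k] for k in range(1, n + 1))
    return w

# Solve U = z * G(U) with G(u) = 1/(1 - u) by fixed-point iteration;
# each pass gets at least one more coefficient exactly right.
U = [0] * N
for _ in range(N):
    U = [0] + recip_one_minus(U)[: N - 1]      # multiply G(U) by z

# LIF prediction: [z^n]U = C(2n-2, n-1)/n for n >= 1.
predicted = [0] + [math.comb(2 * n - 2, n - 1) // n for n in range(1, N)]
```

The two lists agree: 0, 1, 1, 2, 5, 14, 42, ... (Catalan numbers, shifted by one).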
Exercise 13. Let t_n represent the number of labeled, rooted trees on n vertices. Let T(z)
be the generating function for this sequence. Then, T(z) satisfies

    T(z) = z e^{T(z)}.

Use the LIF to find t_n. (This is a fairly standard example for applying the theorem but is
good practice, cf. Homework 3.)
4. Other Kinds of Generating Functions
The generating functions we’ve seen so far are power series, but it turns out to be useful to
consider alternative forms.
A critical feature of power series is that G(z) = 0 if and only if [z^n]G(z) = 0 for all
n. In principle, if we identify component functions φ_n(z) such that

    A(z) = Σ_n a_n φ_n(z) = 0  ⟺  a_n = 0 for all n,   (108)

then we can consider generating functions based on the φ_n. The ordinary generating
functions are based on φ_n(z) = z^n.
Two other important examples are φ_n(z) = z^n/n!, which gives exponential generating
functions, and φ_n(z) = n^{−z} (n > 0), which gives Dirichlet generating functions. There
are others as well, but these are the most important.
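For a taste of why the exponential form is convenient: multiplying exponential generating functions corresponds to binomial convolution of the coefficient sequences, c_n = Σ_k \binom{n}{k} a_k b_{n−k}. A small Python illustration (my own, not from the notes): with a_n = b_n = 1, both EGFs are e^z, the product is e^{2z}, and the convolved sequence should be 2^n.

```python
import math
from fractions import Fraction

N = 8
a = [1] * N                        # EGF e^z: a_n = 1
b = [1] * N

# Multiply the EGFs as power series in z (coefficients a_n / n!) ...
A = [Fraction(x, math.factorial(n)) for n, x in enumerate(a)]
B = [Fraction(x, math.factorial(n)) for n, x in enumerate(b)]
C = [sum(A[k] * B[n - k] for k in range(n + 1)) for n in range(N)]

# ... and read the sequence back out: c_n = n! * [z^n] A(z)B(z).
c = [int(C[n] * math.factorial(n)) for n in range(N)]
```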
In each case, we can develop a formal theory and a set of manipulations that help us
solve problems, much as we did above. Depending on the situation, one or the other form of
generating functions might be more convenient. For our purposes now, the key thing is to
be aware of this as an option.
5. Asymptotics
In many applications, we are interested in (or are willing to accept) an approximation
to a sequence g_n for large n. The asymptotic behavior of the sequence is closely tied to the
location and nature of the singularities of G(z) and to the behavior of the function there.
Let's start with a basic relation. Suppose G(z) is a power series with radius of convergence
R. Then, by definition, lim sup_{n→∞} |a_n|^{1/n} = 1/R. By the definition of lim sup, this
implies that for every ε > 0 we have

    |a_n|^{1/n} < 1/R + ε   eventually   (109)
    |a_n|^{1/n} > 1/R − ε   infinitely often,   (110)

which tells us that

    |a_n| < (1/R + ε)^n   eventually   (111)
    |a_n| > (1/R − ε)^n   infinitely often.   (112)
The radius of convergence thus gives us a basic asymptotic bound on the sequence. But
notice that R is determined by the modulus of the singularity closest to 0.
Suppose we can find a function G̃(z) that is easy to work with and has the same singularities
as G(z) on the circle |z| = R in the complex plane. Then, G(z) − G̃(z) has its smallest
singularities farther than R from the origin and is thus analytic on a disk |z| < S for some
S > R. By the above, the coefficients [z^n](G(z) − G̃(z)) are eventually smaller than
(1/S + ε)^n for any ε > 0. But

    (1/S + ε)^n / (1/R + ε)^n = ((1/S + ε)/(1/R + ε))^n → 0.   (113)

So, [z^n]G(z) should be like [z^n]G̃(z) plus terms of smaller order.
A meromorphic function G(z) on an open subset of the complex plane is a function that
can be expressed as a ratio F(z)/H(z) of two holomorphic (analytic) functions on that
domain, with the restriction that H(z) not be identically 0. Such a G(z) is analytic except
at (at most countably many) isolated points called poles, where the function behaves like
a(z − z_0)^{−p} for some positive integer p.
If G(z) is meromorphic and z_0 is a pole of order p, then in some neighborhood of z_0 (but
excluding z_0 itself), we can expand G(z) as

    G(z) = Σ_{n=−p}^∞ g_n (z − z_0)^n.   (114)

This is called the Laurent expansion of G around z_0. The part of the above sum with
negative indices, call it G_−(z; z_0), is called the principal part of that expansion.
If G(z) has radius of convergence R and has finitely many poles z_1, ..., z_m on the circle
|z| = R, then G(z) − Σ_{k=1}^m G_−(z; z_k) is analytic on some disk of radius S > R, so by
the above argument, for any ε > 0,

    [z^n]G(z) = [z^n] Σ_{k=1}^m G_−(z; z_k) + O((1/S + ε)^n).   (115)
We could take this analysis even farther. Two famous results are Darboux’s method for
branch points and Hayman’s method for entire functions. These are discussed very nicely in
[Wilf 2005] and generalized further in [Flajolet and Odlyzko 1990]. See references below.
6. Selected References
Flajolet, P. and Odlyzko, A. (1990). Singularity Analysis of Generating Functions. SIAM
Journal on Discrete Mathematics, 3, 216.
[New and improved asymptotic methods for generating functions. Very technical overall, but the
introduction gives a clear feel for the results. Generalizations of Darboux's theorem. Available
on-line from Flajolet's web site, see next entry.]
Graham, R., Knuth, D., and Patashnik, O. (1994). Concrete Mathematics.
[This is one of my favorite books period. Clearly written, fun, inspiring, and filled with challenging
ideas and problems. I've only used the first edition, but the second edition is supposedly even
better. Chapter 7 on Generating Functions is a good place to start on the topic, but I'd
suggest reading the whole thing.]
Wilf, H. and Zeilberger, D. (1990) Rational Functions Certify Combinatorial Identities. Jour-
nal of the American Mathematical Society, 3, 147.
[The details on one method Wilf presents in his book. Quite readable; available on-line from
Carnegie Mellon.]