
O.H., Stochastic Processes III/IV, MATH 3251/4091 (M11)

Generating functions

Even quite straightforward counting problems can lead to laborious and lengthy
calculations. These are greatly simplified by using generating functions.^2

Definition 1.1. Given a collection of real numbers $(a_k)_{k\ge0}$, the function
$$ G(s) = G_a(s) \overset{\text{def}}{=} \sum_{k=0}^{\infty} a_k s^k \tag{1.1} $$
is called the generating function of $(a_k)_{k\ge0}$.


Definition 1.2. If $X$ is a discrete random variable with values in $\mathbb{Z}_+ \overset{\text{def}}{=} \{0, 1, 2, \dots\}$,
its (probability) generating function,
$$ G(s) \equiv G_X(s) \overset{\text{def}}{=} \mathsf{E}\, s^X = \sum_{k=0}^{\infty} s^k\, \mathsf{P}(X = k), \tag{1.2} $$
is just the generating function of the probability mass function of $X$.


One of the most useful applications of generating functions in probability
theory is the following observation:
Theorem 1.3. If $X$ and $Y$ are independent random variables with values in
$\{0, 1, 2, \dots\}$ and $Z \overset{\text{def}}{=} X + Y$, then their generating functions satisfy^3
$$ G_Z(s) = G_{X+Y}(s) = G_X(s)\, G_Y(s). $$
Example 1.4. Let $X_1, X_2, \dots, X_n$ be independent identically distributed
random variables^4 with values in $\{0, 1, 2, \dots\}$ and let $S_n = X_1 + \dots + X_n$. Then
$$ G_{S_n}(s) = G_{X_1}(s) \cdots G_{X_n}(s) = \bigl(G_X(s)\bigr)^n. $$

Example 1.5. Let $X_1, X_2, \dots$ be i.i.d.r.v. with values in $\{0, 1, 2, \dots\}$ and
let $N \ge 0$ be an integer-valued random variable independent of $\{X_k\}_{k\ge1}$. Then
$S_N \overset{\text{def}}{=} X_1 + \dots + X_N$ has generating function
$$ G_{S_N}(s) = G_N\bigl(G_X(s)\bigr). \tag{1.3} $$

Solution. This is a straightforward application of the partition theorem for expectations. Alternatively, the result follows from the standard properties of conditional
expectations:
$$ \mathsf{E}\, z^{S_N} = \mathsf{E}\Bigl[\mathsf{E}\bigl[z^{S_N} \mid N\bigr]\Bigr] = \mathsf{E}\Bigl[\bigl(G_X(z)\bigr)^N\Bigr] = G_N\bigl(G_X(z)\bigr). $$
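As a numerical aside (not part of the original notes), identity (1.3) can be spot-checked with a short Python sketch. The choices $N \sim \mathrm{Poi}(2)$ and $X_i \sim \mathrm{Ber}(0.3)$ below are assumptions made purely for illustration; the left-hand side is computed directly by conditioning on $N$.

```python
import numpy as np
from scipy.stats import poisson, binom

# Assumed illustrative parameters: N ~ Poi(2), X_i ~ Bernoulli(0.3).
lam, p = 2.0, 0.3

def G_N(s):                      # PGF of N ~ Poisson(lam)
    return np.exp(lam * (s - 1.0))

def G_X(s):                      # PGF of X ~ Bernoulli(p)
    return 1.0 + p * (s - 1.0)

def G_SN_direct(s, n_max=60):
    # E[s^{S_N}] computed by conditioning on N; given N = n, the sum S_n is Bin(n, p).
    total = 0.0
    for n in range(n_max):
        ks = np.arange(n + 1)
        total += poisson.pmf(n, lam) * np.sum(binom.pmf(ks, n, p) * s ** ks)
    return total

for s in (0.0, 0.25, 0.5, 0.75, 0.99):
    print(s, G_SN_direct(s), G_N(G_X(s)))    # the last two columns agree
```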
^2 introduced by de Moivre and Euler in the early eighteenth century.
^3 recall: if $X$ and $Y$ are independent discrete random variables, and $f, g : \mathbb{Z}_+ \to \mathbb{R}$ are arbitrary
functions, then $f(X)$ and $g(Y)$ are independent random variables and $\mathsf{E}\,f(X)g(Y) = \mathsf{E}f(X)\, \mathsf{E}g(Y)$.
^4 from now on we shall often abbreviate this to just i.i.d.r.v.


Example 1.6. [Renewals] Imagine a diligent janitor who replaces a light bulb
the same day as it burns out. Suppose the first bulb is put in on day 0 and let
$X_i$ be the lifetime of the $i$th light bulb. Assume that the individual lifetimes
$X_i$ are i.i.d.r.v.s with values in $\{1, 2, \dots\}$ having a common distribution with
generating function $G_f(s)$. Consider the probabilities
$$ r_n \overset{\text{def}}{=} \mathsf{P}\bigl(\text{a light bulb was replaced on day } n\bigr), \qquad f_k \overset{\text{def}}{=} \mathsf{P}\bigl(\text{the first light bulb was replaced on day } k\bigr). $$
We obviously have $r_0 = 1$, $f_0 = 0$, and for $n \ge 1$,
$$ r_n = f_1 r_{n-1} + f_2 r_{n-2} + \dots + f_n r_0 = \sum_{k=1}^{n} f_k r_{n-k}. $$
A standard computation implies that $G_r(s) = 1 + G_f(s)\, G_r(s)$ for all $|s| < 1$,
so that $G_r(s) = 1/\bigl(1 - G_f(s)\bigr)$.
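The renewal identity is easy to check numerically (a sketch, not from the notes): with an assumed lifetime distribution, uniform on $\{1,2,3\}$, the probabilities $r_n$ obtained from the recursion coincide with the power-series coefficients of $1/(1 - G_f(s))$.

```python
import sympy as sp

s = sp.symbols('s')
# Assumed lifetime distribution for illustration: uniform on {1, 2, 3}.
f = {1: sp.Rational(1, 3), 2: sp.Rational(1, 3), 3: sp.Rational(1, 3)}
G_f = sum(prob * s**k for k, prob in f.items())

# Replacement probabilities from the renewal recursion r_n = sum_k f_k r_{n-k}.
N = 10
r = [sp.Integer(1)] + [sp.Integer(0)] * N
for n in range(1, N + 1):
    r[n] = sum(f.get(k, 0) * r[n - k] for k in range(1, n + 1))

# Coefficients of G_r(s) = 1/(1 - G_f(s)).
G_r_series = sp.series(1 / (1 - G_f), s, 0, N + 1).removeO()
for n in range(N + 1):
    print(n, r[n], G_r_series.coeff(s, n))   # the two columns agree
```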
In general, we say a sequence $(c_n)_{n\ge0}$ is the convolution of $(a_k)_{k\ge0}$ and
$(b_m)_{m\ge0}$ (write $c = a \star b$), if
$$ c_n = \sum_{k=0}^{n} a_k b_{n-k}, \qquad n \ge 0. \tag{1.4} $$
The key property of convolutions is given by the following result.

Theorem 1.7. [Convolution thm] If $c = a \star b$, then $G_c(s) = G_a(s)\, G_b(s)$.
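A minimal numerical illustration (not part of the notes): convolving two arbitrary coefficient sequences with numpy and evaluating the three generating functions as polynomials shows $G_c(s) = G_a(s)G_b(s)$. The sequences below are arbitrary assumed examples.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Two arbitrary finite coefficient sequences (assumed example data).
a = np.array([1.0, 2.0, 0.5])
b = np.array([0.3, 0.0, 4.0, 1.0])

c = np.convolve(a, b)                        # c_n = sum_k a_k b_{n-k}, as in (1.4)

s = 0.7                                      # any evaluation point
print(P.polyval(s, c))                       # G_c(s)
print(P.polyval(s, a) * P.polyval(s, b))     # G_a(s) * G_b(s): same value
```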


Why do we care? If the generating function $G_a(s)$ of $(a_n)_{n\ge0}$ is analytic around
the origin, then there is a one-to-one correspondence between $G_a(s)$ and $(a_n)_{n\ge0}$;
namely, $a_k$ can be recovered via^5
$$ a_k = \frac{1}{k!} \, \frac{d^k}{ds^k} G_a(s)\Big|_{s=0}. \tag{1.5} $$
This result is often referred to as the uniqueness property of generating functions.
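A small sympy sketch of the inversion formula (1.5), using an assumed example generating function $G(s) = 1/(1 - s/2)$, whose coefficients are $a_k = 2^{-k}$:

```python
import sympy as sp

s = sp.symbols('s')
G = 1 / (1 - s / 2)       # assumed example: a_k = (1/2)^k, analytic for |s| < 2

for k in range(5):
    a_k = sp.diff(G, s, k).subs(s, 0) / sp.factorial(k)   # formula (1.5)
    print(k, sp.simplify(a_k))                             # 1, 1/2, 1/4, 1/8, 1/16
```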


Example 1.8. Let $X \sim \mathrm{Poi}(\lambda)$ and $Y \sim \mathrm{Poi}(\mu)$ be independent. Then $Z =
X + Y$ is $\mathrm{Poi}(\lambda + \mu)$.
Solution. A straightforward computation gives $G_X(s) = e^{\lambda(s-1)}$, and Theorem 1.3
implies
$$ G_Z(s) = G_X(s)\, G_Y(s) = e^{\lambda(s-1)} e^{\mu(s-1)} = e^{(\lambda+\mu)(s-1)}, $$
so that the result follows by uniqueness.
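A quick numeric spot-check of Example 1.8 (not in the original notes; the parameter values are assumptions): the convolution of $\mathrm{Poi}(\lambda)$ and $\mathrm{Poi}(\mu)$ mass functions coincides with the $\mathrm{Poi}(\lambda+\mu)$ mass function.

```python
import numpy as np
from scipy.stats import poisson

lam, mu, K = 1.5, 2.3, 60                    # assumed illustrative parameters
k = np.arange(K)

pmf_sum = np.convolve(poisson.pmf(k, lam), poisson.pmf(k, mu))[:K]
pmf_poi = poisson.pmf(k, lam + mu)

print(np.max(np.abs(pmf_sum - pmf_poi)))     # of the order of machine precision
```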

A similar argument implies the following result.


^5 if a power series $G_a(s)$ is finite for $|s| < r$ with $r > 0$, then it can be differentiated in the
disk $|s| < r$.


Example 1.9. If $X \sim \mathrm{Bin}(n, p)$ and $Y \sim \mathrm{Bin}(m, p)$ are independent, then
$X + Y \sim \mathrm{Bin}(n + m, p)$.
Another useful property of the probability generating function $G_X(s)$ is that it
can be used to compute moments of $X$:

Theorem 1.10. If $X$ has generating function $G(s)$, then^6
$$ \mathsf{E}\bigl[X(X-1)\cdots(X-k+1)\bigr] = G^{(k)}(1). $$

The quantity $\mathsf{E}\bigl[X(X-1)\cdots(X-k+1)\bigr]$ is called the $k$th factorial moment
of $X$. Notice also that
$$ \mathrm{Var}(X) = G_X''(1) + G_X'(1) - \bigl(G_X'(1)\bigr)^2. \tag{1.6} $$
Proof. Fix $s \in (0, 1)$ and differentiate $G(s)$ $k$ times^7 to get
$$ G^{(k)}(s) = \mathsf{E}\bigl[s^{X-k} X(X-1)\cdots(X-k+1)\bigr]. $$
Taking the limit $s \uparrow 1$ and using the Abel theorem,^8 we obtain
$$ G^{(k)}(s) \to \mathsf{E}\bigl[X(X-1)\cdots(X-k+1)\bigr]. $$
Notice also that
$$ \lim_{s \uparrow 1} G_X(s) \equiv \lim_{s \uparrow 1} \mathsf{E}\bigl[s^X\bigr] = \mathsf{P}(X < \infty). $$
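As a sketch of Theorem 1.10 and formula (1.6) (not in the notes), one can compute the derivatives symbolically; the Poisson($\lambda$) generating function from Example 1.8 is used here as an assumed test case, whose $k$th factorial moment is $\lambda^k$.

```python
import sympy as sp

s, lam = sp.symbols('s lam', positive=True)
G = sp.exp(lam * (s - 1))                  # PGF of Poi(lam), cf. Example 1.8

# k-th factorial moment E[X(X-1)...(X-k+1)] = G^{(k)}(1); for Poisson it is lam^k.
for k in range(1, 4):
    print(k, sp.simplify(sp.diff(G, s, k).subs(s, 1)))   # lam, lam**2, lam**3

# Variance via (1.6): Var(X) = G''(1) + G'(1) - (G'(1))^2 = lam.
G1 = sp.diff(G, s).subs(s, 1)
G2 = sp.diff(G, s, 2).subs(s, 1)
print(sp.simplify(G2 + G1 - G1**2))                       # lam
```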

 
Exercise 1.11. Let $S_N$ be defined as in Example 1.5. Use (1.3) to compute $\mathsf{E}\bigl[S_N\bigr]$
and $\mathrm{Var}\bigl[S_N\bigr]$ in terms of $\mathsf{E}[N]$, $\mathsf{E}[X]$, $\mathrm{Var}[X]$ and $\mathrm{Var}[N]$. Now check your result
for $\mathsf{E}\bigl[S_N\bigr]$ and $\mathrm{Var}\bigl[S_N\bigr]$ by directly applying the partition theorem for expectations.
Example 1.12. A bird lays $N$ eggs, each being pink with probability $p$ and
blue otherwise. Assuming that $N \sim \mathrm{Poi}(\lambda)$, find the distribution of the total
number $K$ of pink eggs.
Solution. a) A standard approach in the spirit of Core A gives the value $\mathsf{P}(K = k)$
directly; as a by-product of this computation one deduces that $K$ and $M = N - K$
are independent (do this!).
b) One can also think in terms of a two-stage probabilistic experiment similar to one
described in Example 1.5. Here $K$ has the same distribution as $S_N$ with $N \sim \mathrm{Poi}(\lambda)$
and $X \sim \mathrm{Ber}(p)$; consequently, $G_N(s) = e^{\lambda(s-1)}$ and $G_X(s) = 1 + p(s-1)$, so that
$G_{S_N}(s) = e^{\lambda p(s-1)}$, i.e., $K \sim \mathrm{Poi}(\lambda p)$.
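A quick simulation sketch of Example 1.12 (illustrative only; the parameters $\lambda = 3$ and $p = 0.4$ are assumptions): thinning a $\mathrm{Poi}(\lambda)$ egg count with probability $p$ produces an empirical distribution close to $\mathrm{Poi}(\lambda p)$.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)
lam, p, trials = 3.0, 0.4, 200_000        # assumed illustrative parameters

N = rng.poisson(lam, size=trials)         # number of eggs laid in each trial
K = rng.binomial(N, p)                    # each egg is pink with probability p

k = np.arange(10)
empirical = np.array([(K == j).mean() for j in k])
print(np.max(np.abs(empirical - poisson.pmf(k, lam * p))))   # small (sampling error)
```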
^6 here, $G^{(k)}(1)$ is the shorthand for $G^{(k)}(1) \equiv \lim_{s\uparrow1} G^{(k)}(s)$, the limiting value of the
$k$th derivative of $G(s)$ at $s = 1$;
^7 As $|G_X(s)| \le \mathsf{E}|s^X| \le 1$ for all $|s| \le 1$, the generating function $G_X(s)$ can be differentiated
many times for all $s$ inside the disk $|s| < 1$.
^8 The Abel theorem says: If $a_k$ is non-negative for all $k$ and $G_a(s)$ is finite for $|s| < 1$, then
$\lim_{s\uparrow1} G_a(s) = \sum_{k=0}^{\infty} a_k$, whether this sum is finite or equals $+\infty$. By the previous remark, the
Abel theorem applies to all probability generating functions.


Exercise 1.13. A mature individual produces immature offspring according to a
probability generating function $F(s)$.
a) Suppose we start with a population of k immature individuals, each of which
grows to maturity with probability p and then reproduces, independently of
other individuals. Find the probability generating function of the number of
immature individuals in the next generation.
b) Find the probability generating function of the number of mature individuals
in the next generation, given that there are k mature individuals in the parent
generation.
c) Show that the distributions in a) and b) above have the same mean, but not
necessarily the same variance.
[Hint: You might prefer to first consider the case k = 1, and then generalise.]

The next example is very important for applications.


Example 1.14. Let $X_k$, $k \ge 1$, be i.i.d.r.v.s with the common distribution
$$ \mathsf{P}(X_k = 1) = p, \qquad \mathsf{P}(X_k = -1) = q = 1 - p. $$
Define the simple random walk $S_n$, $n \ge 0$, via $S_0 = 0$ and, for $n \ge 1$,
$$ S_n = X_1 + X_2 + \dots + X_n. $$
Let $T$ be the first time this random walk hits $1$, i.e., $T \overset{\text{def}}{=} \inf\{n \ge 1 : S_n = 1\}$.
Find the generating function of $T$, i.e., $G_T(s) \equiv \mathsf{E}\, s^T$.
Solution. Write $p_k = \mathsf{P}(T = k)$, so that $G_T(s) \equiv \mathsf{E}[s^T] = \sum_{k\ge0} s^k p_k$. Conditioning
on the outcome of the first step, and applying the partition theorem for expectations,
we get
$$ G_T(s) \equiv \mathsf{E}\bigl[s^T\bigr] = \mathsf{E}\bigl[s^T \mid X_1 = 1\bigr]\, p + \mathsf{E}\bigl[s^T \mid X_1 = -1\bigr]\, q = ps + qs\, \mathsf{E}\bigl[s^{T_2}\bigr], $$
where $T_2$ is the moment of first visit to state $1$ starting from $-1$. By partitioning on
the moment $T_1$ of the first visit to $0$, decompose
$$ \mathsf{P}(T_2 = m) = \sum_{k=1}^{m-1} \mathsf{P}\bigl(T_1 = k,\, T_2 = m\bigr). $$
Of course, on the event $\{S_0 = -1\}$,
$$ \mathsf{P}(T_2 = m \mid S_0 = -1) \equiv \mathsf{P}\bigl(S_1 < 1, \dots, S_{m-1} < 1, S_m = 1 \mid S_0 = -1\bigr), $$
$$ \mathsf{P}(T_1 = k \mid S_0 = -1) \equiv \mathsf{P}\bigl(S_1 < 0, \dots, S_{k-1} < 0, S_k = 0 \mid S_0 = -1\bigr), $$
where by translation invariance the last probability is just $\mathsf{P}(T = k \mid S_0 = 0) = p_k$.
We also observe that
$$ \mathsf{P}\bigl(T_2 = m \mid T_1 = k\bigr) = \mathsf{P}\bigl(\text{first hit } 1 \text{ from } 0 \text{ after } m - k \text{ steps}\bigr) \equiv p_{m-k}. $$

The partition theorem now implies
$$ \mathsf{P}(T_2 = m) = \sum_{k=1}^{m-1} \mathsf{P}\bigl(T_2 = m \mid T_1 = k\bigr)\, \mathsf{P}(T_1 = k) = \sum_{k=1}^{m-1} p_k\, p_{m-k} = \sum_{k=0}^{m} p_k\, p_{m-k} $$
(the last equality holds since $p_0 = 0$), i.e., $G_{T_2}(s) = \bigl(G_T(s)\bigr)^2$. We deduce that $G_T(s)$ solves the quadratic equation
$G_T(s) = ps + qs\,\bigl(G_T(s)\bigr)^2$, so that^9
$$ G_T(s) = \frac{1 - \sqrt{1 - 4pqs^2}}{2qs} = \frac{2ps}{1 + \sqrt{1 - 4pqs^2}}\,. $$
We finally deduce that
$$ \mathsf{P}(T < \infty) \equiv G_T(1) = \frac{1 - |p - q|}{2q} = \begin{cases} 1, & p \ge q, \\ p/q, & p < q. \end{cases} $$
In particular, $\mathsf{E}(T) = \infty$ for $p < q$ (because $\mathsf{P}(T = \infty) = (q - p)/q > 0$). For $p \ge q$ we
obtain
$$ \mathsf{E}(T) \equiv G_T'(1) = \frac{1 - |p - q|}{2q\,|p - q|} = \begin{cases} \dfrac{1}{p - q}, & p > q, \\ \infty, & p = q, \end{cases} $$
i.e., $\mathsf{E}(T) < \infty$ if $p > q$ and $\mathsf{E}(T) = \infty$ otherwise.
Notice that at criticality ($p = q = 1/2$), the variable $T$ is finite with probability 1,
but has infinite expectation.
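The closed form for $G_T(s)$ can be checked deterministically (a sketch with an assumed value $p = 3/5 > q$, not part of the notes): the first-passage probabilities $p_k = \mathsf{P}(T = k)$ computed by dynamic programming match the series coefficients of $G_T(s)$, and the mean agrees with $1/(p - q) = 5$.

```python
import sympy as sp

s = sp.symbols('s')
p_val = sp.Rational(3, 5)                  # assumed example with p > q
q_val = 1 - p_val

G_T = 2*p_val*s / (1 + sp.sqrt(1 - 4*p_val*q_val*s**2))
coeffs = sp.series(G_T, s, 0, 8).removeO()

# Dynamic programming: track the distribution of S_n restricted to {T > n}
# (all mass sits on x <= 0); a first passage to +1 at time n happens exactly
# when S_{n-1} = 0 and the next step is +1.
N = 7
state = {0: sp.Integer(1)}
first_passage = [sp.Integer(0)] * (N + 1)
for n in range(1, N + 1):
    first_passage[n] = state.get(0, 0) * p_val
    new_state = {}
    for x, prob in state.items():
        if x + 1 <= 0:                     # an up-step from x = 0 would hit +1
            new_state[x + 1] = new_state.get(x + 1, 0) + prob * p_val
        new_state[x - 1] = new_state.get(x - 1, 0) + prob * q_val
    state = new_state

for n in range(N + 1):
    print(n, first_passage[n], coeffs.coeff(s, n))     # the two columns agree

print(sp.simplify(sp.diff(G_T, s).subs(s, 1)))         # E[T] = 1/(p - q) = 5
```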

Example 1.15. In a sequence of independent Bernoulli experiments with success probability $p \in (0, 1)$, let $D$ be the first moment of two consecutive successful outcomes. Find the generating function of $D$.
Solution. Every sequence contributing to $d_n = \mathsf{P}(D = n)$ contains at least two successes. The probability of having exactly two successes is $q^{n-2} p^2$ (of course, $n \ge 2$).
Alternatively, let $1 \le k < n - 2$ be the position of the first success in the sequence; it
necessarily is followed by a failure, and the remaining $n - k - 1$ results produce any
possible sequence corresponding to the event $\{D = n - k - 1\}$. By independence, we
deduce the following renewal relation:
$$ d_n = q^{n-2} p^2 + \sum_{k=1}^{n-3} q^{k-1} p q\, d_{n-k-1}, $$
and the standard method now implies
$$ G_D(s) = \frac{p^2 s^2}{1 - qs} + \frac{pqs^2}{1 - qs}\, G_D(s), \qquad \text{or} \qquad G_D(s) = \frac{p^2 s^2}{1 - qs - pqs^2}\,. $$
A straightforward computation gives $G_D'(1) = \frac{1+p}{p^2}$, so that on average it takes $42$
tosses of a standard symmetric dice until the first two consecutive sixes appear.
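A numerical sketch of the dice computation (assuming $p = 1/6$; not part of the notes): iterating the renewal relation for $d_n$ and summing $n\,d_n$ reproduces $\mathsf{E}[D] = (1+p)/p^2 = 42$ up to truncation of the tail.

```python
p = 1 / 6                                  # assumed: chance of a six per toss
q = 1 - p

# d_n = P(D = n) computed from the renewal relation of Example 1.15.
N = 1500
d = [0.0] * (N + 1)
for n in range(2, N + 1):
    d[n] = q ** (n - 2) * p ** 2
    for k in range(1, n - 2):
        d[n] += q ** (k - 1) * p * q * d[n - k - 1]

print(sum(d))                                   # ~ 1.0: total probability mass
print(sum(n * dn for n, dn in enumerate(d)))    # ~ 42.0: E[D], up to truncation
print((1 + p) / p ** 2)                         # exact value (1 + p)/p^2 = 42
```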

Exercise 1.16. Generalize the argument above to the first moment of m consecutive successes.
^9 by recalling the fact that $G_T(s) \to 0$ as $s \to 0$;


Exercise 1.17. In the setup of Example 1.15, show that $d_2 = p^2$, $d_3 = qp^2$, and,
conditioning on the value of the first outcome, that
$$ d_n = q\, d_{n-1} + pq\, d_{n-2}, \qquad n > 3. $$
Use these relations to again derive the generating function $G_D(s)$.


Exercise 1.18. Use the method of Exercise 1.17 to derive an alternative solution
to Exercise 1.16. Compare the obtained expectation to that from Example 1.15.

Exercise 1.19. A biased coin showing heads with probability $p \in (0, 1)$ is flipped
repeatedly. Let $C_w$ be the first moment when the word $w$ appears in the observed
sequence of results. Find the generating function of $C_w$ and the expectation $\mathsf{E}\bigl[C_w\bigr]$
for each of the following words: HH, HT, TH and TT.
Probability generating functions can also be used to check convergence of
distributions; this is justified by the following continuity result:

Theorem 1.20. Suppose that for every fixed $n$ the sequence $a_{0,n}, a_{1,n}, \dots$ is a probability
distribution, i.e., $a_{k,n} \ge 0$ and $\sum_{k\ge0} a_{k,n} = 1$, and let $G_n(s)$ be the
corresponding generating function, $G_n(s) = \sum_{k\ge0} a_{k,n} s^k$. In order that for
every fixed $k$
$$ \lim_{n\to\infty} a_{k,n} = a_k \tag{1.7} $$
it is necessary and sufficient that for every fixed $s \in [0, 1)$,
$$ \lim_{n\to\infty} G_n(s) = G(s), $$
where $G(s) = \sum_{k\ge0} a_k s^k$ is the generating function of the limiting sequence $(a_k)$.

Remark 1.20.1. The convergence in (1.7) is known as convergence in distribution!


Example 1.21. If $X_n \sim \mathrm{Bin}(n, p)$ with $p = p_n$ satisfying $n\, p_n \to \lambda$ as $n \to \infty$,
then for every fixed $s \in [0, 1)$
$$ G_{X_n}(s) = \bigl(1 + p_n(s - 1)\bigr)^n \to \exp\{\lambda(s - 1)\}, $$
so that the distribution of $X_n$ converges to that of $X \sim \mathrm{Poi}(\lambda)$.
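A small numerical illustration of Example 1.21 (assuming $\lambda = 2$ and $s = 0.5$, chosen purely for the demonstration): both the generating functions and the mass functions of $\mathrm{Bin}(n, \lambda/n)$ approach their $\mathrm{Poi}(\lambda)$ counterparts as $n$ grows.

```python
import numpy as np
from scipy.stats import binom, poisson

lam, s = 2.0, 0.5                          # assumed illustrative values, s in [0, 1)
k = np.arange(30)

for n in (10, 100, 1000, 10000):
    p_n = lam / n
    gf_gap = abs((1 + p_n * (s - 1)) ** n - np.exp(lam * (s - 1)))
    pmf_gap = np.max(np.abs(binom.pmf(k, n, p_n) - poisson.pmf(k, lam)))
    print(n, gf_gap, pmf_gap)              # both gaps shrink as n grows
```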
Exercise 1.22. For $n \ge 1$ let $X_k^{(n)}$, $k = 1, \dots, n$, be independent Bernoulli
random variables with
$$ \mathsf{P}\bigl(X_k^{(n)} = 1\bigr) = 1 - \mathsf{P}\bigl(X_k^{(n)} = 0\bigr) = p_k^{(n)}. $$
Assume that as $n \to \infty$
$$ \varepsilon(n) \overset{\text{def}}{=} \max_{1 \le k \le n} p_k^{(n)} \to 0 \qquad \text{and} \qquad \sum_{k=1}^{n} p_k^{(n)} \to \lambda \in (0, \infty). $$
Using generating functions or otherwise, show that the distribution of $\sum_{k=1}^{n} X_k^{(n)}$
converges to that of a Poisson($\lambda$) random variable. This result is known as the law
of rare events.
