Branching Processes and Their Applications
Abstract
This report studies family lineage and how surnames can become extinct over
time. Section 1 covers the history of branching processes, which began with the
question of how likely a surname is to become extinct within a region.
Sections 2.1 and 2.2 give the definition of branching processes and their
properties respectively. We then explore the Galton-Watson process through
generating functions in Section 3.1, the total number of individuals in
Section 3.2, and ultimate extinction in Section 3.3. Finally, other applications
of branching processes are discussed: nuclear chain reactions in Section 4.1
and genes and mutations in Section 4.2.
1 Introduction
In 1873, Francis Galton proposed a problem regarding the likelihood of extinction of
a surname within a region. The problem was originally proposed in a similar manner
to the following:
A large population of men, where each man in the first generation has a different
surname, has reproduction laws where a_k% of men of a generation have k male
children. Find what proportion of these surnames are extinct after a set number of
generations; and how many instances there are of the same surname being held by a
set number of people in a generation.
Whilst the original problem specified a maximum of 5 male children, Reverend H.
W. Watson and Galton extended it to an unknown large number, which is so large it
is inconsequential to the distribution. Since Watson and Galton’s initial discussion
in their 1874 paper [1] this problem has been further discussed and expanded by
reducing assumptions to produce a more accurate model whilst still following some
of the initial ideas.
In this report we will first define branching processes, in particular the Galton-
Watson process, in the context of the original problem and discuss its properties.
We will then explore the Galton-Watson process as detailed in the original paper
[1] and how it relates to the ultimate extinction theorem. Finally we will discuss
further applications of branching processes outside the extinction of surnames. These
applications include nuclear chain reactions and genetic mutations.
2 Galton-Watson Process
2.1 Definition
Branching processes are mathematical representations depicting how populations
grow over multiple generations [2]. In each generation, n, the members give birth to
a certain number of offspring (subject to laws of chance), making up the (n + 1)th
generation.
In order for us to study these branching processes, a number of assumptions
(outlined by Fewster in her lecture notes [3] from the University of Auckland) must
be made.
1. Firstly, it is vital to assume that members reproduce independently of one
another in order to uphold the probabilities of reproduction, and therefore of
extinction.
2. Another assumption is that the reproduction law does not depend on the sizes
of the previous generations.
3. In addition, we assume that the numbers of offspring of different individuals
are independent, identically distributed random variables.
In practical terms this means that the way a generation reproduces does not depend
on the sizes of the generations that came before it. In addition to this, the number
of offspring a particular member has is not affected by the number of members
currently present.
In reality, however, these may not always be realistic assumptions. For example, an
individual with fewer siblings may be more likely to have fewer children themselves,
as this is the type of family environment they are used to. Similarly, the number of
offspring of a cancerous cell may be restricted by the number of cancerous cells
currently surrounding it. Nonetheless, the assumptions are invaluable when trying to
model real-world applications.
One particular branching process is the Galton-Watson process, which can be
expressed using a fundamental formula, given by Zitkovic [4] from the University of
Texas at Austin.
Definition 2.1. Under the above assumptions, a discrete time process, Z_t, is called
a (Galton-Watson) branching process if Z_0 = 1 and the population of the nth
generation, Z_n for n ≥ 1, is given by the formula:
\[ Z_n = \sum_{j=1}^{Z_{n-1}} Z_{(n-1)j}, \]
where Z_{(n-1)j} denotes the number of offspring of the jth member of the (n − 1)th
generation.
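The formula in Definition 2.1 can be read as a simulation recipe: each generation is obtained by drawing one offspring count per current member and summing. The following is a minimal sketch in Python under assumptions that are not part of the original text: the function name `simulate_generations` and the offspring distribution `[0.25, 0.5, 0.25]` are illustrative choices only.

```python
import random


def simulate_generations(offspring_pmf, n_generations, rng):
    """Return [Z_0, Z_1, ..., Z_n] for one realisation of the process.

    offspring_pmf[k] is the assumed probability that a single member
    has exactly k offspring (a hypothetical input, not from the report).
    """
    support = list(range(len(offspring_pmf)))
    sizes = [1]  # Z_0 = 1, as in Definition 2.1
    for _ in range(n_generations):
        current = sizes[-1]
        # One independent draw per member of the current generation;
        # with current == 0 this is an empty list, so extinction persists.
        offspring = rng.choices(support, weights=offspring_pmf, k=current)
        sizes.append(sum(offspring))
    return sizes


# Illustrative run: 0, 1 or 2 offspring with probabilities 1/4, 1/2, 1/4.
print(simulate_generations([0.25, 0.5, 0.25], 10, random.Random(0)))
```

Passing the random number generator explicitly keeps runs reproducible, which is convenient when comparing simulated generation sizes with the theoretical results of Section 3.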
2.2 Properties
Branching processes satisfy a number of properties, particularly relating to Z_n, the
size of the population at the nth generation. The properties below are inspired by
lecture notes by Zitkovic [4] from the University of Texas at Austin and Fewster [3]
from the University of Auckland.
1. When a new generation is formed, the members of the previous generation die
out leaving only the members of this new generation.
2. Relating to the population sizes, Zn , of the generations n = 1 onwards:
• The generation n = 1 will die out if Z1 = 0, and generations n ≥ 2 will
not exist, resulting in Zn = 0 for all n ≥ 2.
• If Z_1 ≠ 0, then each of the Z_1 members of generation n = 1 will produce a
random number of members who, collectively, will make up the generation
n = 2, Z_2. For example, the first member produces Z_{11} members, the second
member produces Z_{12} members, etc. In general, the jth member produces Z_{1j}
members, where the largest value of j is Z_1. Following from the assumption
in Section 2.1 that the numbers of descendants of different members of
a given generation are independent and identically distributed, we know
that the number of members of the generation n = 2, Z_2, is:
\[ Z_2 = \sum_{j=1}^{Z_1} Z_{1j}. \qquad (2.1) \]
couples to reproduce. However we shall not review this here and shall instead focus
our efforts into exploring the original Galton-Watson Process detailed in 1873.
Proof. Every member of the (m + n)th generation has a unique ancestor in the mth
generation. Therefore,
\[ Z_{m+n} = X_1 + X_2 + \ldots + X_{Z_m}, \]
where X_j is the number of members of the (m + n)th generation descended from
the jth member of the mth generation. By the assumptions above, the random
variables X_j are independent and identically distributed, with the same distribution
as Z_n (the number of nth generation particles originating from the very first particle
in Z_0). We can then obtain that G_{m+n}(s) = G_m(G_{X_1}(s)) where G_{X_1}(s) = G_n(s).
If we iterate this, we get
\[ G_n(s) = G_1(G_{n-1}(s)) = G_1(G_1(G_{n-2}(s))) = G_1(G_1(\ldots(G_1(s))\ldots)). \qquad (3.1) \]
Note that G_1(s) is equivalent to G(s) as before, therefore the theorem is proven.
This theorem demonstrates that, in principle, we can learn a lot about Z_n and its
distribution using generating functions.
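To make the iteration in (3.1) concrete, the generating function of the nth generation can be evaluated numerically by applying G repeatedly. The sketch below is illustrative only: the helper names `pgf` and `pgf_generation` and the offspring probabilities are assumptions, not taken from the report. It uses the standard fact that a probability generating function evaluated at s = 0 gives the probability of the value 0, so G_n(0) = P(Z_n = 0).

```python
def pgf(offspring_pmf, s):
    """Evaluate G(s) = sum_k p_k s^k using Horner's rule."""
    total = 0.0
    for p_k in reversed(offspring_pmf):
        total = total * s + p_k
    return total


def pgf_generation(offspring_pmf, n, s):
    """Evaluate G_n(s) by composing G with itself n times, as in (3.1)."""
    value = s
    for _ in range(n):
        value = pgf(offspring_pmf, value)
    return value


# Hypothetical offspring law: 0, 1 or 2 offspring with prob. 1/4, 1/2, 1/4.
example_pmf = [0.25, 0.5, 0.25]
for n in range(1, 6):
    # G_n(0) is the probability that the line is extinct by generation n.
    print(n, pgf_generation(example_pmf, n, 0.0))
```

Only n evaluations of G are needed, so this avoids expanding the nested polynomial explicitly.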
3.2 Total Number of Individuals
To eventually be able to evaluate the total number of individuals with ease, we
must firstly introduce the moments of Zn as follows. This allows us to calculate the
expectation and variance of the number of people in a given generation.
Lemma 3.2. Let µ = E(Z_1) and σ^2 = Var(Z_1). Then,
\[ E(Z_n) = \mu^n \qquad (3.2) \]
\[ \mathrm{Var}(Z_n) = \begin{cases} n\sigma^2 & \mu = 1; \\ \sigma^2(\mu^n - 1)\mu^{n-1}(\mu - 1)^{-1} & \mu \neq 1. \end{cases} \qquad (3.3) \]
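Proof. We will prove this lemma using ideas from E. J. McCoy [7]. For the
expectation, recall that µ = E(Z_1), which agrees with (3.2) when n = 1. Then,
\[ E(Z_n) = G_n'(1), \quad \text{and in particular} \quad \mu = G'(1). \]
Recall that G_n(s) = G_1(G_{n-1}(s)) from (3.1). Differentiating the last equality gives:
\[ G_n'(s) = G'(G_{n-1}(s))\, G_{n-1}'(s), \]
so that, since G_{n-1}(1) = 1, we have E(Z_n) = µ E(Z_{n-1}), and iterating gives
E(Z_n) = µ^n.
For the variance, recall that σ^2 = Var(Z_1), which agrees with (3.3) when n = 1.
Furthermore, let σ_n^2 = Var(Z_n). Differentiating the generating function
G_n(s) = G_{n-1}(G(s)), which also follows from (3.1), twice gives:
\[ G_n''(s) = G_{n-1}''(G(s))\, G'(s)^2 + G_{n-1}'(G(s))\, G''(s). \qquad (3.4) \]
Since Var(Z_n) = G_n''(1) + G_n'(1) − G_n'(1)^2, we have
\[ G''(1) = \sigma^2 - \mu + \mu^2 \]
and
\[ G_{n-1}''(1) = \sigma_{n-1}^2 - \mu^{n-1} + \mu^{2n-2}. \]
From (3.4),
\[ G_n''(1) = G_{n-1}''(1)\, G'(1)^2 + G_{n-1}'(1)\, G''(1) \]
\[ \sigma_n^2 - \mu^n + \mu^{2n} = (\sigma_{n-1}^2 - \mu^{n-1} + \mu^{2n-2})\mu^2 + \mu^{n-1}(\sigma^2 - \mu + \mu^2). \]
Simplifying gives σ_n^2 = µ^2 σ_{n-1}^2 + µ^{n-1} σ^2, and iterating from σ_1^2 = σ^2
yields (3.3).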
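The moment formulas (3.2) and (3.3) can be checked numerically for a small offspring distribution by expanding the generating function of Z_n as a polynomial, whose coefficients are the probabilities P(Z_n = k). The sketch below is only an illustration under stated assumptions: the helper names and the offspring law `[0.2, 0.3, 0.5]` are hypothetical and do not come from the report.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def compose(outer, inner):
    """Coefficients of outer(inner(s)), built up with Horner's rule."""
    result = [0.0]
    for c in reversed(outer):
        result = poly_mul(result, inner)
        result[0] += c
    return result


def generation_pmf(offspring_pmf, n):
    """pmf of Z_n, i.e. the coefficients of G_n(s), starting from G_0(s) = s."""
    pmf = [0.0, 1.0]
    for _ in range(n):
        pmf = compose(offspring_pmf, pmf)
    return pmf


def mean_and_variance(pmf):
    mean = sum(k * p for k, p in enumerate(pmf))
    second_moment = sum(k * k * p for k, p in enumerate(pmf))
    return mean, second_moment - mean ** 2


# Hypothetical supercritical offspring law (mu = 1.3), checked at n = 4.
example_pmf = [0.2, 0.3, 0.5]
mu, sigma2 = mean_and_variance(example_pmf)
n = 4
exact_mean, exact_var = mean_and_variance(generation_pmf(example_pmf, n))
print("mean:", exact_mean, "formula (3.2):", mu ** n)
print("variance:", exact_var,
      "formula (3.3):", sigma2 * (mu ** n - 1) * mu ** (n - 1) / (mu - 1))
```

The degree of the polynomial grows geometrically with n, so this direct expansion is only practical for a few generations; it is meant as a check, not as an efficient method.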
The above lemma details how to obtain the expected number of people in a given
generation. Subsequently, we shall calculate the number of people to have ever
carried a particular surname. Let T_n be the total number of individuals up to and
including the nth generation, that is, T_n = Z_0 + Z_1 + Z_2 + . . . + Z_n. Then, by
the linearity of expectation and inspired by lecture notes [7] from Imperial College
London,
\[ E(T_n) = E(Z_0) + E(Z_1) + E(Z_2) + \ldots + E(Z_n) = 1 + \mu + \mu^2 + \ldots + \mu^n = \sum_{i=0}^{n} \mu^i. \]
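As a brief worked step, the sum above is a finite geometric series and so has a standard closed form. The numerical check with µ = 1/2 and n = 3 is an illustrative choice, not a value from the original text.

```latex
\[
E(T_n) = \sum_{i=0}^{n} \mu^{i} =
\begin{cases}
\dfrac{1 - \mu^{n+1}}{1 - \mu} & \mu \neq 1, \\[2ex]
n + 1 & \mu = 1.
\end{cases}
\]
% Illustrative check with \mu = 1/2 and n = 3:
\[
1 + \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} = \tfrac{15}{8}
= \frac{1 - (1/2)^{4}}{1 - 1/2}.
\]
```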
This allows us to conclude the following theorem, as stated by Feller [8].
Theorem 3.3 (Ultimate Extinction). If µ < 1, the probability of ultimate extinction
is 1 as extinction is guaranteed. If µ = 1, extinction is also guaranteed (unless Zn = 1
at every generation, n). If µ > 1, the probability of ultimate extinction is between 0
and 1.
For the cases when µ < 1 or µ > 1, we can clearly show why this would be the case
by using Equation (3.5) and letting n tend to infinity:
\[ \lim_{n\to\infty} E(T_n) = \begin{cases} \dfrac{1}{1-\mu} & \mu < 1; \\ \infty & \mu > 1. \end{cases} \]
At first glance, it would appear that the number of people with a specific surname
would either eventually be 0, if µ < 1, or increase unboundedly, if µ > 1. However,
when µ = 1 it is more complicated. In general, when µ = 1 the surname will die
out because at some point Z_n < 1, i.e. Z_n will be 0 for one of the generations. But
clearly when Z_n = 1 for every generation this is not the case, as there will always
be 1 male to carry on the family name, and therefore the surname will not become
extinct.
Having looked at the ideas behind the above theorem, we can now formally prove
this using graphical methods with a proof inspired by Chapter 7 of Fewster’s lecture
notes [9].
Proof. In order to prove Theorem 3.3, we shall study graphs we have drawn depicting
curves of G(s) for the different values of µ.
In each instance, the graph of G(s) satisfies the following underlying conditions:
1. G(s) is increasing and strictly convex, as long as Z_1 can be ≥ 2.
2. G(0) = P(Z_1 = 0) ≥ 0.
3. G(1) = 1.
4. G'(1) = µ, so the slope of G(s) at s = 1 gives the value µ.
5. The extinction probability, γ, is the smallest value ≥ 0 for which G(s) = s.
By evaluating each possible value of µ individually, we can determine the extinction
probability in each instance.
Conversely, when µ > 1, the gradient of G(s) is steeper than the line y = s at
the point s = 1. Therefore, the curve sits below the line immediately to the left of
this point, meaning that in order to intersect the y axis at P(Z_1 = 0), the curve
must once again cut the line at another point. As a result, there are two roots to
the equation G(s) = s, namely 1 and s_1 where 0 < s_1 < 1. Hence, here the
extinction probability, γ, is a positive value less than 1.
Figure 3: Graph of y = G(s) when µ > 1.
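The graphical argument suggests a simple numerical recipe: since γ is the smallest non-negative solution of G(s) = s, it can be approximated by starting at s = 0 and applying G repeatedly, which is a standard fixed-point iteration. The sketch below is illustrative only; the function name, tolerance and offspring distributions are assumptions rather than part of the report.

```python
def extinction_probability(offspring_pmf, tol=1e-12, max_iter=100_000):
    """Approximate the smallest root of G(s) = s in [0, 1].

    Iterates gamma <- G(gamma) from gamma = 0; the iterates increase
    towards the smallest fixed point. Convergence is slow when mu = 1.
    """
    gamma = 0.0
    for _ in range(max_iter):
        nxt = sum(p * gamma ** k for k, p in enumerate(offspring_pmf))
        if abs(nxt - gamma) < tol:
            return nxt
        gamma = nxt
    return gamma


# Hypothetical supercritical law with mu = 1.25:
# G(s) = 1/4 + s/4 + s^2/2, and G(s) = s has roots s = 1/2 and s = 1.
print(extinction_probability([0.25, 0.25, 0.5]))

# Hypothetical subcritical law with mu = 0.7: the only root in [0, 1] is 1.
print(extinction_probability([0.5, 0.3, 0.2]))
```

Starting from 0 matters: starting from 1 would return the root s = 1 immediately, which is not the extinction probability when µ > 1.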
This section has explored many theorems relating to branching processes in the
context of passing on family surnames to successive generations. However, as stated
earlier, this is not the only application of branching processes - an idea which will
be studied in further detail in the following section.
4 Further Applications
There are numerous other applications of branching processes in addition to Galton’s
investigation into the extinction of family names. This section gives a brief overview
of some of these.
The processes described in Sections 4.1 and 4.2 each operate under this simple model
proposed by Feller [8]:
(i) the first generation consists of 1 particle,
(ii) the probability of a particle producing exactly k particles is fixed and denoted
p_k (k = 0, 1, 2, . . .),
(iii) the (n + 1)th generation is formed from the offspring of the nth,
(iv) the particles act independently of each other.
4.1 Nuclear Chain Reactions
The above model can be used to describe the behaviour of neutrons in nuclear fission
reactions, such as that in an atomic bomb. In this case, the particles being considered
are the neutrons. Their offspring are the neutrons released from the splitting of larger
nuclei into smaller nuclei by collision with a neutron. This problem was initially
discussed by Feller [8].
Let Z_{nj} denote the number of offspring of the jth particle of the nth generation.
Assume that all collisions produce the same number of neutrons, say m. Also
assume that each neutron collides with a larger nucleus with probability p. Then the
probability P(Z_{nj} = m) = p_m that a particle (neutron) produces m offspring is equal
to p. Also, the probability P(Z_{nj} = 0) = p_0 that each particle produces zero offspring
is given by 1 − p. Clearly, the only possible numbers of offspring are 0 and m, giving
a probability mass function of
\[ P(Z_{nj} = i) = p_i = \begin{cases} p & i = m \\ 1 - p & i = 0 \\ 0 & \text{otherwise.} \end{cases} \]
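As a short worked consequence of this probability mass function, and assuming the generating function results of Section 3 carry over unchanged to neutrons, the offspring mean and the extinction probability can be written out. The case m = 2 is an illustrative choice and is not taken from the original text.

```latex
\begin{align*}
G(s) &= (1 - p) + p\,s^{m}, \\
\mu  &= G'(1) = mp. \\
\intertext{Illustration with $m = 2$: solving $G(s) = s$ gives}
0 &= p s^{2} - s + (1 - p) = (s - 1)\bigl(p s - (1 - p)\bigr), \\
\gamma &= \frac{1 - p}{p} \quad \text{when } p > \tfrac{1}{2}
\quad (\text{i.e. } \mu = 2p > 1).
\end{align*}
```

Under these assumptions the reaction can sustain itself with positive probability only when mp > 1, which mirrors the µ > 1 case of Theorem 3.3.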
Clearly Z_{nj} ∼ Bin(200, 1/200), and so
\[ P(Z_{nj} = i) = p_i = \binom{200}{i} \left(\frac{1}{200}\right)^{i} \left(1 - \frac{1}{200}\right)^{200-i}. \]
Figure 4: Bin(200,1/200)
Figure 5: Poisson(1)
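Figures 4 and 5 place Bin(200, 1/200) next to Poisson(1), which suggests the two distributions are being compared. Assuming that is the intent, the sketch below tabulates both probability mass functions side by side using exact formulas rather than random samples. The function names are illustrative and do not come from the report.

```python
from math import comb, exp, factorial


def binomial_pmf(i, n=200, p=1 / 200):
    """P(X = i) for X ~ Bin(n, p)."""
    return comb(n, i) * p ** i * (1 - p) ** (n - i)


def poisson_pmf(i, lam=1.0):
    """P(Y = i) for Y ~ Poisson(lam)."""
    return exp(-lam) * lam ** i / factorial(i)


# Print the two pmfs for small counts; lam = n * p = 1 matches the means.
for i in range(6):
    print(i, round(binomial_pmf(i), 4), round(poisson_pmf(i), 4))
```

Working with exact probabilities avoids the sampling noise present in histograms of 10000 random draws.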
5 Conclusion
Throughout this report we have seen how branching processes and their properties
allow us to explore the growth of various populations over multiple generations.
Through the use of generating functions, we have shown how to calculate the
expected number of members in a particular generation, allowing for the exploration
of when a particular population is expected to become extinct (if ever). In general,
we conclude that when the expected number of offspring per individual, µ, is less
than or equal to 1, it is certain that the population will eventually die out (apart
from the trivial case in which every individual has exactly one offspring). A
population must therefore have µ greater than 1 in order to have a chance of
surviving.
When determining the extinction probability of a population, this report has mainly
focused on the Galton-Watson process - a model used to determine the expected
lifetime of a particular family name, carried by reproducing males. But Section 4
outlines a mere handful of other applications that branching processes can be used
to model. Other notable applications not included in this report are the spread of
bacteria and the expected time people spend waiting in lines.
However, many assumptions have been made in order for us to study these, with
a number of such assumptions being unrealistic compared to how these populations
act in real-life situations. We therefore acknowledge that it would be an interesting
exercise to repeat our investigation without these assumptions and compare these
new results with those explored here.
References
[1] H. W. Watson and F. Galton. On the Probability of the Extinction of Families.
The Journal of the Anthropological Institute of Great Britain and Ireland, 4, 1875.
Pages: 138–144.
[2] T. E. Harris. The Theory of Branching Processes, volume 119 of Grundlehren
der mathematischen Wissenschaften. Springer-Verlag Berlin Heidelberg, 1963.
[3] R. Fewster. STATS 325 Stochastic Processes, 2014. Chapter 6: Branching
Processes: The Theory of Reproduction.
[4] G. Zitkovic. M362K Intro to Stochastic Processes, Fall 2014. Lecture 7:
Branching Processes.
[5] H. Wilf. generatingfunctionology. Academic Press, second edition, 1994.
[6] G. R. Grimmett and D. R. Stirzaker. Probability and Random Processes. Oxford
University Press, 1992.
[7] E. J. McCoy. M3S4/M4S4 Applied Probability, 2008. Chapter 4: Branching
processes.
[8] W. Feller. An Introduction to Probability Theory and Its Applications, Vol 1.
John Wiley & Sons, 1950.
[9] R. Fewster. STATS 325 Stochastic Processes, 2014. Chapter 7: Extinction in
Branching Processes.
Appendix A R Code for Figure 4
## Binomial sample: 10000 draws from Bin(200, 1/200)
bin_sample <- rbinom(n = 10000, size = 200, prob = 1 / 200)
## Histogram of the sample (Figure 4)
hist(bin_sample, col = "palegreen", xlab = "representations",
     main = "Histogram of number of mutations of mutant gene")
This is the R Code used to generate Figure 4. A random sample of size 10000 is
taken from a binomial distribution with parameters n = 200 and p = 1/200. A
histogram is then generated from this sample.
This is the R Code used to generate Figure 5. Here a random sample of 10000 is
taken from a Poisson distribution with parameter λ = 1. This sample is then also
used to generate a histogram.