
Stochastic Processes

Dr. Louis Asiedu

2020
Chapter 1

Introduction

Mathematical models can be categorized as probabilistic or deterministic. In situations where probabilistic models are more suitable, a better representation is often obtained by considering a collection (a family) of random variables indexed by a parameter such as time or space. Such a collection is known as a stochastic process (a random or chance process).
Suppose that we have some system which moves from state to state over "time". We define the random variable Xt or X(t) as the state of the process at time t.

Definition 1.0.0.1. A stochastic process is a set of random variables {X(t), t ∈ T}, where T is called the parameter space of the process.

NB: t is called an indexing parameter. The values assumed by the process are known as states.
Examples:

1. Consumer preferences on a monthly basis

State space S = {MTN, TIGO, AIRTEL}
Parameter (time) space T = {January, February, ..., December}

2. Tossing a coin sequentially; Xn, n = 1, 2, ...

State space S = {H, T}
Parameter space T = {1, 2, 3, ...}

3. Time series data: X(t), t ∈ [0, ∞) or Xt, t = 0, 1, ...


The above introduction shows that there are four types of stochastic processes.

(a) Discrete parameter and state spaces


e.g: Consumer preferences observed on a monthly basis.

(b) Continuous parameter space and discrete state space


e.g: Number of students waiting for the bus at any time of the day.

(c) Discrete parameter space and continuous state space


e.g: Waiting time of the nth student arriving at the bus stop.

(d) Continuous parameter and state spaces


e.g: Content of a dam over an interval of time

STAT 451 & 611 Page 2 of 61


Chapter 2

Probability Distributions

Let (t1 , t2 , ..., tn ) with t1 < t2 < t3 < ... < tn be a discrete set of points within T. The joint distribution
for the process X(t) at these points can be defined as:

P (X(t1 ) ≤ x1 , X(t2 ) ≤ x2 , .., X(tn ) ≤ xn )

Let t0 and t1 be two points in T such that t0 ≤ t1; then we may define the conditional transition distribution function as:

F(x0, x1; t0, t1) = P[X(t1) ≤ x1 / X(t0) = x0]   ...(∗)

When a stochastic process has discrete parameter and state spaces, we may define the transition
probabilities as:
P_{i,j}^{(m,n)} = P(Xn = j / Xm = i),  m ≤ n,

where i and j are states in S. These probabilities are called transition probabilities.

Definition 2.0.0.1. A stochastic process X(t), t ∈ T is said to be time-homogeneous or parameter-homogeneous if the transition distribution function (∗) depends only on the difference t1 − t0 rather than on t1 and t0 separately.

Then we have:

F(x0, x; t0, t0 + t) = F(x0, x; 0, t)   ...(∗∗)

for t0 ∈ T. For convenience, we can write (∗∗) as F(x0, x; t). The corresponding expression for the process Xn, n = 0, 1, 2, ... would then be P_{i,j}^{(n)} (the probability of moving from i to j in n steps):

P_{i,j}^{(t,t+n)} = P(X_{n+t} = j / X_t = i)                  (2.1)
                  = P(Xn = j / X_0 = i),  by time-homogeneity  (2.2)
                  = P_{i,j}^{(0,n)}                            (2.3)
                  = P_{i,j}^{(n)}                              (2.4)

2.1 Markov Processes and Renewal Processes

Consider a finite (or countably infinite) set of points (t0, t1, ..., tn, t) with t0 < t1 < t2 < ... < tn < t and t, tr ∈ T (r = 0, 1, 2, ..., n), where T is the parameter space of the process X(t), t ∈ T. The process is said to exhibit Markov dependence if the conditional distribution of X(t), given the values X(t0), X(t1), ..., X(tn), depends only on X(tn), the most recent known value of the process; i.e. if:

P [X(t) ≤ x/X(tn ) = xn , X(tn−1 ) = xn−1 , · · · , X(t0 ) = x0 ] = P [X(t) ≤ x/X(tn ) = xn ] (2.5)

= F (xn , x; tn , t) (2.6)

A stochastic process satisfying

P[X(t) ≤ x / X(tn) = xn, X(tn−1) = xn−1, · · · , X(t0) = x0] = P[X(t) ≤ x / X(tn) = xn]   ...(∗)

for t0 < t1 < t2 < · · · < t is called a Markov process.
For a Markov process with discrete state and parameter spaces, (∗) becomes:

P [Xn = j/Xn1 = i1 , Xn2 = i2 , · · · , Xnk = ik ] = P [Xn = j/Xn1 = i1 ]

∀n > n1 > n2 > ... > nk and n, n1 , n2 , ..., nk ∈ T and i, j ∈ S.


For a Markov process, therefore, if the state is known for a specific value of the time parameter t, that
information is sufficient to predict the behavior of the process beyond that point. As a consequence of
this, we have the following relation:

F(x0, x; t0, t) = ∫_{y∈S} F(y, x; T, t) dF(x0, y; t0, T),

where t0 < T < t.

2.2 The Chapman-Kolmogorov equations

1. For a Markov chain with discrete parameter and state spaces,

P_{ij}^{(m,n)} = Σ_{k∈S} P_{ik}^{(m,r)} P_{kj}^{(r,n)}

where m < r < n, ∀ i, j ∈ S (the forward equation).

2. For a Markov process with a discrete state space and a continuous parameter space T,

P_{ij}(t + s) = Σ_{k∈S} P_{ik}(t) P_{kj}(s)

∀ s ≥ 0, t ≥ 0, where

P_{ij}(t + s) = P(X_{t+s} = j / X_0 = i)



Proof:

P_{ij}^{(m,n)} = P(Xn = j / Xm = i)                                              (2.7)
= Σ_{k∈S} P(Xn = j, Xr = k / Xm = i)   by the total probability rule              (2.8)
= Σ_{k∈S} P[Xn = j / Xr = k, Xm = i] P[Xr = k / Xm = i]                           (2.9)
= Σ_{k∈S} P[Xn = j / Xr = k] P[Xr = k / Xm = i]   by the Markov dependency property (2.10)
= Σ_{k∈S} P_{kj}^{(r,n)} P_{ik}^{(m,r)}                                           (2.11)
= Σ_{k∈S} P_{ik}^{(m,r)} P_{kj}^{(r,n)}                                           (2.12)

Similarly, for the continuous-parameter case:

P_{ij}(t+s) = P(X_{t+s} = j / X_0 = i)                                            (2.14)
= Σ_{k∈S} P(X_{t+s} = j, X_t = k / X_0 = i)   by the total probability rule       (2.15)
= Σ_{k∈S} P[X_{t+s} = j / X_t = k, X_0 = i] P[X_t = k / X_0 = i]                  (2.16)
= Σ_{k∈S} P[X_{t+s} = j / X_t = k] P[X_t = k / X_0 = i]   by Markov dependency    (2.17)
= Σ_{k∈S} P_{kj}^{(t,t+s)} P_{ik}^{(0,t)}                                         (2.18)
= Σ_{k∈S} P_{ik}^{(0,t)} P_{kj}^{(t,t+s)}                                         (2.19)
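As a quick numerical sanity check, the Chapman-Kolmogorov equation for a time-homogeneous chain (where P^(m,n) = P^(n−m) is just a matrix power) can be verified directly; the 3-state matrix below is an arbitrary illustrative example, not one taken from the text:

```python
import numpy as np

# Illustrative 3-state transition matrix (rows sum to 1)
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

def n_step(P, k):
    """k-step transition matrix P^(k) = P^k for a homogeneous chain."""
    return np.linalg.matrix_power(P, k)

# Chapman-Kolmogorov: P^(m,n) = P^(m,r) P^(r,n), here with m=0, r=2, n=5
m, r, n = 0, 2, 5
lhs = n_step(P, n - m)
rhs = n_step(P, r - m) @ n_step(P, n - r)
assert np.allclose(lhs, rhs)
print(lhs.round(4))
```

The identity holds for every intermediate r because matrix multiplication is associative, which is exactly what the summation over the intermediate state k expresses.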

2.2.1 Some Conventions in Markov Chains

1. First order dependency axiom ( Markovian Postulate)


P(Xn+1 = j / Xn = i, Xn−1 = in−1, ..., X0 = i0) = P(Xn+1 = j / Xn = i), ∀ i0, i1, ..., in−1, i, j ∈ S and ∀ n ≥ 0.

2. Stationary or time homogeneous postulate.


P(Xn = j / Xm = i) = P(Xk = j / Xl = i)
i.e. P_{i,j}^{(m,n)} = P_{i,j}^{(l,k)}
provided n > m, k > l, n ≠ k, m ≠ l and n − m = k − l, ∀ i, j ∈ S.



2.2.2 More Conventions

1. The state space is usually represented with the index set {0, 1, 2, ..., m − 1} or {1, 2, ..., m}, assuming m discrete states. A similar convention holds for a discrete parameter space.

2. P_{i,j}^{(t)} = P(Xt = j / Xt−1 = i), the one-step dependency assumption. This is the probability of state j at time t given state i at time t − 1.
From the time-homogeneity assumption, we have

P_{i,j}^{(t)} = P_{i,j}  ∀ t ∈ T

If the equation in (2) does not hold, we have a non-homogeneous first-order Markov chain; otherwise, we have a stationary (time-homogeneous) first-order Markov chain.
In the latter case, let Pj(t) = P(Xt = j) be the unconditional probability of state j at time t, and let Pi,j = P(Xt+1 = j / Xt = i) be the one-step transition probability. Then, by the total probability rule,

Pj(t) = Σ_{i∈S} Pi(t − 1) Pi,j ,   j = 1, 2, ..., m;  t = 0, 1, 2, ...

with t = 0 being the initial time and Pi(t) = P(Xt = i).
In matrix form, P(t) = P(0) P^t, where P(t) is the row vector of state probabilities at time t and P^t is the matrix P raised to the power t.

P = (Pi,j) satisfies the following postulates:

i. 0 ≤ Pi,j ≤ 1, ∀ i, j ∈ S.

ii. Σ_{j∈S} Pi,j = 1, for each i ∈ S.

A square matrix which satisfies these two properties is said to be a stochastic matrix or a transition matrix.
E.g:

[ 0.4  0.6 ]
[ 0.2  0.8 ]   is a stochastic matrix.
 



If in addition Σ_{i∈S} Pi,j = 1 for each j ∈ S, then P is said to be a doubly stochastic matrix.
E.g:

[ 0.4  0.6 ]
[ 0.6  0.4 ]   is a doubly stochastic matrix.

2.3 Some common Stochastic Processes

1. Bernoulli Process: This is a process with discrete state and parameter spaces. Denote by Sn the number of successes in n trials; clearly {Sn} is a stochastic process with state space {0, 1, 2, ...} and

P(Sn = k) = (n choose k) p^k q^{n−k},  k = 0, 1, 2, ..., n

The process {Sn} is called the Bernoulli process.

2. Poisson Process: This is a process with discrete state space and continuous parameter space. Consider events occurring under the following postulates:

* Events occurring in non-overlapping intervals of time are independent of each other.

* There is a constant λ such that the probabilities of occurrence of events in a small interval of length ∆t are given as follows:
  - P[number of events occurring in (t, t + ∆t) = 0] = 1 − λ∆t + o(∆t)
  - P[one event occurring in (t, t + ∆t)] = λ∆t + o(∆t)

Then P(X(t) = k) = e^{−λt}(λt)^k / k!,  k = 0, 1, 2, ...
The time intervals between consecutive occurrences of events in a Poisson process are independent random variables identically distributed with pdf:

f(x) = λ e^{−λx},  x > 0



∴ the Poisson Process is also called the renewal counting process.

3. Gaussian Process: This is a process with continuous state and parameter spaces.

4. Wiener Process: This is a process with continuous state and parameter spaces.

Consider a stochastic process X(t) with the following properties:

1. The process X(t), t ≥ 0 has stationary independent increments. This means that for t1, t2 ∈ T with t1 < t2, the distribution of X(t2) − X(t1) is the same as that of X(t2 + h) − X(t1 + h) for any h > 0, and the increments over any non-overlapping time intervals (t1, t2) and (t3, t4) with t1 < t2 ≤ t3 < t4 are independent.

2. For any given time interval (t1 , t2 ), X(t2 ) − X(t1 ) is normally distributed with mean 0 and variance
σ 2 (t2 − t1 ).
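Since the interarrival times of a Poisson process are iid Exponential(λ), as noted above, one way to simulate the process is to accumulate exponential gaps until the horizon is passed; a minimal sketch (the rate and horizon are arbitrary illustrative choices):

```python
import random

def poisson_counts(lam, t_end, n_paths, seed=0):
    """Simulate X(t_end) for a rate-lam Poisson process by summing
    iid Exponential(lam) interarrival times until t_end is exceeded."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_paths):
        t, n = 0.0, 0
        while True:
            t += rng.expovariate(lam)   # next interarrival gap ~ Exp(lam)
            if t > t_end:
                break
            n += 1
        counts.append(n)
    return counts

lam, t_end = 2.0, 3.0
counts = poisson_counts(lam, t_end, 20000)
mean = sum(counts) / len(counts)
print(round(mean, 2))   # sample mean should be close to lam * t_end = 6
```

The empirical mean of X(t) matches λt, consistent with X(t) ~ Poisson(λt).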

The n-step transition probability matrix

Let P be the transition probability matrix of a Markov chain Xn, n = 0, 1, 2, ..., as defined earlier.
Let Pj^(n) be the probability that the process is in state j after n transitions (the unconditional probability).
Denote by the row vector P^(n) the vector of probabilities Pj^(n), j ∈ S. The n-step transition probabilities Pi,j^(n) and the unconditional probabilities Pj^(n), i, j ∈ S, are determined as follows.

Theorem: P^(n) = P^n (the matrix of n-step transition probabilities is the nth power of P) and P^(n) = P^(0) P^n (for the row vector of unconditional probabilities).
Proof:
From the Chapman-Kolmogorov equation,

P_{i,j}^{(r+s)} = Σ_{k∈S} P_{i,k}^{(r)} P_{k,j}^{(s)}

for given r and s.

Let r = 1 and s = 1:

P_{i,j}^{(2)} = Σ_{k∈S} P_{i,k} P_{k,j}

Clearly, P_{i,j}^{(2)} is the (i, j)th element of the matrix product P·P = P².

Assume P^(r) = P^r, r = 1, 2, ..., n − 1.

Let r = n − 1, s = 1:

P_{i,j}^{(n)} = Σ_{k∈S} P_{i,k}^{(n−1)} P_{k,j}

which is the (i, j)th element of P^{n−1}·P = P^n.

∴ P^(n) = P^n

For a two-state chain with state space S = {0, 1}, the transition matrix is denoted by:

P = [ P00  P01 ]
    [ P10  P11 ]

Theorem 2.3.0.1. For a two-state chain with transition matrix P as above, the n-step transition matrix is

P^(n) = [ P00^(n)  P01^(n) ]
        [ P10^(n)  P11^(n) ]

Theorem 2.3.0.2. For a two-state Markov chain with one-step probability transition matrix

P = [ 1−a    a  ]
    [  b    1−b ] ,   0 ≤ a, b ≤ 1,  |1 − a − b| < 1,

P^n = [ b/(a+b) + a(1−a−b)^n/(a+b)    a/(a+b) − a(1−a−b)^n/(a+b) ]
      [ b/(a+b) − b(1−a−b)^n/(a+b)    a/(a+b) + b(1−a−b)^n/(a+b) ]

This theorem can be proven easily by induction.
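One way to convince oneself of Theorem 2.3.0.2 without writing out the induction is to compare the closed form against a direct matrix power; a sketch with arbitrary illustrative values of a, b and n:

```python
import numpy as np

def two_state_power(a, b, n):
    """Closed form for P^n when P = [[1-a, a], [b, 1-b]] (Theorem 2.3.0.2)."""
    s = a + b
    r = (1.0 - a - b) ** n
    return (1.0 / s) * np.array([[b + a * r, a - a * r],
                                 [b - b * r, a + b * r]])

a, b, n = 0.3, 0.5, 7
P = np.array([[1 - a, a],
              [b, 1 - b]])
assert np.allclose(np.linalg.matrix_power(P, n), two_state_power(a, b, n))
print(two_state_power(a, b, n).round(4))
```

As n → ∞ the term (1 − a − b)^n vanishes, so both rows tend to (b/(a+b), a/(a+b)), anticipating the limiting-distribution result below.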



Definition 2.3.0.3. The probability vector (π0, π1, π2, ..., πn) is said to be a limiting distribution or a steady-state (long-run) distribution of a Markov chain with transition probability matrix P if

lim_{n→∞} P^n = π   (2.21)

exists, (π0, π1, π2, ..., πn) is a row of π (where π is stable), and:

(π0, π1, π2, ..., πn) P = (π0, π1, π2, ..., πn)

Note: If a Markov chain has a long-run or steady-state distribution, then the long-run distribution is also stationary; however, a Markov chain can have a stationary distribution without a limiting distribution.

Example: For a two-state chain with transition matrix

P = [ 0  1 ]
    [ 1  0 ]

πP = π  ⟹  (1/2, 1/2) [ 0  1 ; 1  0 ] = (1/2, 1/2)

so π = (1/2, 1/2) is the stationary distribution. However,

P^n = [ 0  1 ]  if n is odd,    P^n = [ 1  0 ]  if n is even,
      [ 1  0 ]                        [ 0  1 ]

so lim_{n→∞} P^n does not exist. Hence there is no long-run (limiting) distribution.
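The example can be reproduced numerically; the sketch below checks that (1/2, 1/2) is stationary while the powers of P merely oscillate between P and the identity:

```python
import numpy as np

P = np.array([[0.0, 1.0],
              [1.0, 0.0]])          # period-2 "flip" chain
pi = np.array([0.5, 0.5])

# (1/2, 1/2) is stationary: pi P = pi
assert np.allclose(pi @ P, pi)

# ...but P^n oscillates, so lim P^n does not exist
assert np.allclose(np.linalg.matrix_power(P, 7), P)          # odd n -> P
assert np.allclose(np.linalg.matrix_power(P, 8), np.eye(2))  # even n -> I
```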

Questions:

1. Prove that if P is a k × k stochastic matrix, then P^n is a stochastic matrix ∀ n = 1, 2, ...

2. Prove that if

P = [ 1−a    a  ]
    [  b    1−b ] ,

then lim_{n→∞} P^n = π exists and πP = Pπ = π, where 0 < a, b < 1 and |1 − a − b| < 1.

3. Prove that if P and Q are k × k stochastic matrices, then PQ is a stochastic matrix.

Definition 2.3.0.4. A stochastic matrix with identical rows is said to be stable.

e.g:

[ 1/4  3/4 ]
[ 1/4  3/4 ]   is a stable stochastic matrix.

Definition 2.3.0.5. A vector (P1, P2, ..., Pm) is said to be a probability vector if, ∀ i = 1, 2, ..., m, 0 ≤ Pi ≤ 1 and Σ_{i=1}^{m} Pi = 1.

Definition 2.3.0.6. A probability vector (P0, P1, P2, ..., Pm) is said to be stationary with respect to a stochastic matrix P if (P0, P1, ..., Pm) P = (P0, P1, ..., Pm).

2.4 Modes of transition and Classification of states.

Definition 2.4.0.1. State j is said to be accessible from state i if j can be reached from i in a finite
number of steps. If two states i and j are accessible to each other, then they are said to communicate.

In probability terms, these definitions imply:

i → j (j accessible from i) if for some n > 0, P_{i,j}^{(n)} > 0
j → i (i accessible from j) if for some n > 0, P_{j,i}^{(n)} > 0
i ←→ j (i and j communicate) if for some n > 0, P_{i,j}^{(n)} > 0 and for some m > 0, P_{j,i}^{(m)} > 0



i ↛ j (j not accessible from i) if ∀ n ≥ 0, P_{i,j}^{(n)} = 0
j ↛ i (i not accessible from j) if ∀ n ≥ 0, P_{j,i}^{(n)} = 0
i ↮ j (i and j do not communicate) if P_{i,j}^{(n)} = 0 ∀ n ≥ 0 or P_{j,i}^{(m)} = 0 ∀ m ≥ 0

2.4.1 Properties of the Communication Relation

1. Reflexivity: i ←→ i, since

P_{i,j}^{(0)} = δ_{i,j} = 1 if i = j, 0 if i ≠ j

2. Symmetry: if i ←→ j then j ←→ i.

3. Transitivity: if i ←→ j and j ←→ k, then i ←→ k.

Proof: ∃ r, s ∈ Z+ such that P_{i,j}^{(r)} > 0 and P_{j,k}^{(s)} > 0.
But P_{i,k}^{(r+s)} = Σ_{l∈S} P_{i,l}^{(r)} P_{l,k}^{(s)} ≥ P_{i,j}^{(r)} P_{j,k}^{(s)} > 0.
Therefore i → k; a symmetric argument gives k → i, hence i ←→ k.
Reflexivity, symmetry and transitivity together make communication an equivalence relation. The set of all states of a Markov chain that communicate (with each other) can therefore be grouped into a single equivalence class.
Markov chains may have more than one such equivalence class. If there is more than one, states in different equivalence classes cannot communicate. However, it is possible to have states in one class that are accessible from another class.

Definition 2.4.1.1. If a Markov Chain has all its states belonging to one equivalence class, it is said to
be irreducible.

Clearly, in an irreducible chain, all states communicate. The period of a state i is defined as the greatest common divisor of all integers n ≥ 1 for which P_{i,i}^{(n)} > 0.



  
0 1



if n is odd


  
  
0 1 1 0

  


Example: If P =   then P n =   State space S=0,1
∼ ∼
1 0 1 0
  


if n is even



  
 0 1

  

(2)
P0,0 = 1 > 0, P0,0 (4) = 1 > 0, P0,0 (6) = 1 > 0, ..., P0,0 (2k) = 1 > 0, ...k = 1, 2, 3.... The greatest common
divisor of 2,4,6,... is 2. Hence the period of state 0 is 2.( which is the same for state 1).
Note: If two states communicate, then they have the same period. A state is said to be aperiodic if it
has period 1.

Theorem 2.4.1.2. If i and j are states of a Markov Chain and i ←→ j, then i and j have the same
period.

It follows from the above theorem that periodicity is also a class property. A class of states with period 1 is said to be aperiodic.
If all the states of a Markov chain communicate and have period 1, then the chain is said to be irreducible and aperiodic. If P is the transition matrix of a finite Markov chain which is irreducible and aperiodic, it is easy to show that ∃ n ∈ Z+, n ≥ 1, for which P^(n) has no zero element (all states are accessible). The matrix P is then said to be regular or primitive, and the chain is then also said to be regular.

Theorem: Let P be the transition probability matrix of an irreducible, aperiodic m-state finite time-homogeneous Markov chain. Then

lim_{n→∞} P^n = π = [ α ]
                    [ α ]
                    [ ⋮ ]
                    [ α ]   (a stable matrix whose every row is α), where α = (π1, π2, ..., πm),

with 0 < πj < 1 and Σ_{j=1}^{m} πj = 1, and:



(a) P(t) π = α for t = 0, 1, 2, ..., where P(t) = [P1(t), P2(t), ..., Pm(t)] is the row vector of unconditional state probabilities at time t (any probability vector multiplied by the stable matrix π gives α).

(b) ∃ constants c and r (c > 0, 0 < r < 1) such that |P_{i,j}^{(n)} − πj| ≤ c r^n, ∀ i, j = 1, 2, ..., m.

(c) Pπ = πP = π

Note: The convergence property in (b) is known as geometric ergodicity. A chain with this type of property is said to be strongly ergodic (we are sure that the chain will converge, and quickly).
Any finite irreducible, time-homogeneous Markov chain {Xm, m ∈ T} with transition probability matrix P has a stationary distribution V given by

V P = V,

where V is a row probability vector which can be found using this equation together with the constraint

V I = 1,

with I being a column vector of the same dimension as V and all entries equal to unity:

V = [v0, v1, ..., vm],   I = [1, 1, ..., 1]^T

V I = v0 + v1 + ... + vm = Σ_j vj = 1

A finite irreducible aperiodic Markov chain {Xm, m ∈ T} with a regular transition probability matrix P has a long-run and stationary distribution α given by αP = α, where α = (π1, π2, ..., πm) and

πi = lim_{n→∞} P(Xn = i)

Using the equation αP = α together with αI = 1, α can be found.

Note: A stable stochastic matrix is one that has identical rows.
e.g:

P = [ 0.4  0.3  0.3 ]
    [ 0.4  0.3  0.3 ]
    [ 0.4  0.3  0.3 ]

Every limiting distribution is a stable matrix.


e.g: Suppose a finite homogeneous Markov chain has the transition probability matrix:

P = [  0    2/3   1/3 ]
    [ 3/8   1/8   1/2 ]
    [ 1/2   1/2    0  ]

1. Show that the chain is regular and find the long-run distribution.

2. Show that the chain is ergodic and find lim_{n→∞} P^n.

Solution:
S = {0, 1, 2}.
The chain is irreducible because all the states {0, 1, 2} communicate.
P_{11}^{(1)} = 1/8 > 0, so the period of state 1 is 1 and state 1 is aperiodic. Since periodicity is a class property, the chain is aperiodic. Since the chain is aperiodic and all its states communicate, the chain is regular.
Note: If the chain is regular, then the limiting distribution exists.
∴ lim_{n→∞} P^n exists.
By inference, the chain is ergodic.



Now, let the first row of π be (π0, π1, π2). Then

(π0  π1  π2) [  0    2/3   1/3 ]
             [ 3/8   1/8   1/2 ]  =  (π0  π1  π2)
             [ 1/2   1/2    0  ]

and π0 + π1 + π2 = 1.
Solving the system of equations yields (π0, π1, π2) = (0.3, 0.4, 0.3).

∴ lim_{n→∞} P^n = π = [ 0.3  0.4  0.3 ]
                      [ 0.3  0.4  0.3 ]
                      [ 0.3  0.4  0.3 ]

Assignment: Prove that if a Markov chain is irreducible and aperiodic with a doubly stochastic transition matrix and has m states, then the limiting probabilities are given by:

πj = 1/m;  j = 1, 2, ..., m

2.5 Classification of States

2.5.1 Recurrence and Transient States

Let {Xn} be a Markov chain with state space S = {0, 1, 2, ..., m − 1}. Define

f_{ij}^{(n)} = P(Xn = j, Xr ≠ j, r = 1, 2, ..., n − 1 | X0 = i)

This means that at time 0 the process was in state i and first reached state j at time n, never visiting j at any intermediate step; and

f_{ii}^{(∗)} = Σ_{n=1}^{∞} f_{ii}^{(n)}   and   μ_{ij} = Σ_{n=1}^{∞} n f_{ij}^{(n)}

where

f_{ii}^{(n)} = P(Xn = i, Xr ≠ i, r = 1, 2, ..., n − 1 | X0 = i)



1. f_{ii}^{(∗)} is the probability that, starting from state i, the return to the initial state occurs in "finite time".

2. f_{ij}^{(n)} is the distribution of the first passage time from i to j; when j = i, we shall call f_{ii}^{(n)} the recurrence time distribution of state i.

3. μ_{ij} is the expected value of the first passage time; μ_{ii} = μ_i = mean recurrence time.

Definition 2.5.1.1. A state i is said to be recurrent if and only if, starting from i, the eventual return to state i is certain: f_{ii}^{(∗)} = 1.

Types of recurrence

1. State i is null recurrent if μ_i = ∞

2. State i is positive recurrent if μ_i < ∞

2.5.2 Directed Multigraphs

Partition the state space so that the transient states are r+1, ..., m, let Q be the submatrix of one-step transition probabilities among the transient states, and let R be the submatrix of transition probabilities from the transient states to the recurrent (absorbing) states. Then

M = (I − Q)^{−1},   F = M R = (f_{ij})

and the matrix of variances is

(σ_{ij}²) = M(2M_D − I) − M2,   i, j ∈ T

where

M_D = diag(M) = diag(μ_{r+1,r+1}, μ_{r+2,r+2}, ..., μ_{m,m})

and M2 is the matrix of entrywise squares of the entries μ_{ij} of M:

M2 = [ μ²_{r+1,r+1}  μ²_{r+1,r+2}  ...  μ²_{r+1,m} ]
     [      ⋮                            ⋮        ]
     [ μ²_{m,r+1}        ...        μ²_{m,m}       ]
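Under the usual absorbing-chain reading of these formulas (Q = transient-to-transient block, R = transient-to-absorbing block), M and F can be computed directly; the 2-transient-state matrices below are hypothetical illustrative values, not from the text:

```python
import numpy as np

# Hypothetical chain: one absorbing state, two transient states.
# Q holds transient->transient probabilities, R transient->absorbing.
Q = np.array([[0.5, 0.3],
              [0.2, 0.4]])
R = np.array([[0.2],
              [0.4]])

M = np.linalg.inv(np.eye(2) - Q)   # fundamental matrix M = (I - Q)^-1
F = M @ R                          # absorption probabilities f_ij

# M[i, j] = expected number of visits to transient state j starting from
# transient state i; row sums give expected steps until absorption.
steps = M.sum(axis=1)
assert np.allclose(F, 1.0)         # one absorbing state: absorption is certain
print(M.round(4), steps.round(4))
```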



Chapter 3

Branching Process

Definition 3.0.0.1. Consider a population of individuals which gives rise to a new population. Assume that the probability that an individual, in his lifetime, gives rise to r new individuals (offspring) is Pr, for r = 0, 1, . . ., and that individuals reproduce independently of one another. The new population forms the first generation, which in turn reproduces a second generation, which in turn produces a third generation, etc. For n = 0, 1, 2, . . ., let Xn be the size of the nth generation, so that X0, the zeroth generation, is the initial population. Then {Xn ; n = 0, 1, . . .} is a Markov chain called a branching process. Its state space is {0, 1, 2, . . .}.

Note that 0 is a recurrent (absorbing) state since clearly, if P = (Pij) is the TPM, then P00 = 1.
Also, if P0 > 0, it can be shown that all other states are transient.

Let f(Z) = Σ_{r=0}^{∞} Pr Z^r, |Z| ≤ 1, be the probability generating function (p.g.f.).
Note: f(0) = P0, where Pr is the coefficient of Z^r in the expansion of f(Z):

f(Z) = P0 Z^0 + P1 Z^1 + P2 Z^2 + . . . = P0 + P1 Z + P2 Z^2 + . . .

f(0) = P0

f′(Z) = df/dZ = P1 + 2P2 Z + 3P3 Z² + . . .

f′(1) = P1 + 2P2 + 3P3 + . . . = Σ_{r=0}^{∞} r Pr

= mean number of offspring of an individual in a single generation

Let fn(Z) denote the p.g.f. of Xn, and consider that

X_{n+1} = Σ_{i=1}^{Xn} Zi

where Zi is the number of offspring of the ith member of the nth generation.
Thus the Zi (i ≥ 1) are independent and identically distributed rvs with distribution Pr = P(Zi = r), r = 0, 1, . . ., and Σ_{r=0}^{∞} Pr = 1.
We assume that X0 = 1. The p.g.f. of X_{n+1} is given by

f_{n+1}(t) = E(t^{X_{n+1}})
= E(E[t^{X_{n+1}} | Xn])
= Σ_{j=0}^{∞} E(t^{X_{n+1}} | Xn = j) P(Xn = j)
= Σ_{j=0}^{∞} E[t^{Σ_{i=1}^{Xn} Zi} | Xn = j] P(Xn = j)
= Σ_{j=0}^{∞} E[t^{Z1 + Z2 + ... + Zj} | Xn = j] P(Xn = j)
= Σ_{j=0}^{∞} [ Π_{i=1}^{j} E(t^{Zi} | Xn = j) ] P(Xn = j)



Since the Zi (i = 1, 2, . . . , j) are independent and identically distributed with p.g.f. f(t),

f_{n+1}(t) = Σ_{j=0}^{∞} [f(t)]^j P(Xn = j) = fn(f(t))

Iterating this relation, we obtain

f_{n+1}(t) = fn[f(t)] = f_{n−1}[f2(t)] = f_{n−2}[f3(t)]

It follows by mathematical induction that, ∀ k = 0, 1, ..., n,

f_{n+1}(t) = f_{n−k}[f_{k+1}(t)]

In particular, with k = n − 1, we have

f_{n+1}(t) = f1[fn(t)] = f[fn(t)]

since X0 = 1 implies f1 = f.

Underlying assumptions of a Branching Process

1. X0 = 1

2. None of the probabilities {Pr } is equal to 1



3. 0 < P0 < 1

4. P0 + P1 < 1

With these assumptions, the function f(t) is strictly convex on the unit interval of the real axis, as depicted in Figure 3.1.



Assume the mean and variance of X1 are finite, and let

m = E(X1) = Σ_{r=0}^{∞} r Pr

σ² = Var(X1) = E(X1²) − [E(X1)]² = Σ_{r=0}^{∞} r² Pr − m²

Now f_{n+1}(t) = fn[f(t)] = f[fn(t)], so by the chain rule

f′_{n+1}(t) = f′(fn(t)) f′n(t)   ........(⋆)

However, f′n(1) = Σ_{r=1}^{∞} r P(Xn = r | X0 = 1).

If n = 1:

f′1(1) = Σ_{r=1}^{∞} r P(X1 = r | X0 = 1) = Σ_{r=1}^{∞} r P(X1 = r) = E(X1) = m

Putting t = 1 in (⋆) gives f′_{n+1}(1) = f′(1) f′n(1) = m f′n(1), so by induction

E(Xn) = f′n(1) = m^n

If we differentiate equation (⋆) and put t = 1, we find that

f″_{n+1}(t) = f′(fn(t)) f″n(t) + f′n(t)[f″(fn(t)) f′n(t)]

f″_{n+1}(1) = f′(1) f″n(1) + f′n(1)[f″(1) f′n(1)]

Note f(t) = E(t^Z), so

f′(t) = E[Z t^{Z−1}],   f′(1) = E(Z) = m

f″(t) = E[Z(Z−1) t^{Z−2}],   f″(1) = E(Z²) − E(Z) = σ² + m² − m

⟹ f″_{n+1}(1) = m f″n(1) + m^{2n}[σ² + m² − m]   ........(1)

Now:

X_{n+1} = Σ_{r=1}^{Xn} Zr   and   Pij = P(X_{n+1} = j | Xn = i)

Pij = P( Σ_{r=1}^{Xn} Zr = j | Xn = i ) = P( Σ_{r=1}^{i} Zr = j )

where Zr is as defined previously. It is clear that X_{n+1} is a r.v., and since the Zr, r = 1, 2, ..., are iid, we get

E[X_{n+1}] = E[Xn] E[Zr]   .....(2)

Var[X_{n+1}] = E(Xn) Var(Zr) + Var(Xn)[E(Zr)]²   .....(3)

Suppose the population starts with one organism (X0 = 1), and let E(Zr) = μ and Var(Zr) = σ², r = 1, 2, . . .



From (2):

E(X_{n+1}) = E(Xn) E(Zr), so

E(Xn) = E(X_{n−1}) E(Zr) = E(X_{n−1}) μ
= [E(X_{n−2}) E(Zr)] μ = E(X_{n−2}) μ²
⋮
= E(X1) μ^{n−1} = E(X0) μ^n = μ^n,  since X0 = 1

E(Xn) = μ^n;  n ≥ 1   .....(4)

Also, from (3):

Var[Xn] = E(X_{n−1}) Var(Zr) + Var(X_{n−1})[E(Zr)]²
= μ^{n−1} σ² + μ² Var(X_{n−1})
= μ^{n−1} σ² + μ²[μ^{n−2} σ² + μ² Var(X_{n−2})]
= μ^{n−1} σ² + μ^n σ² + μ⁴ Var(X_{n−2})
= μ^{n−1} σ² + μ^n σ² + μ⁴[μ^{n−3} σ² + μ² Var(X_{n−3})]
= μ^{n−1} σ² + μ^n σ² + μ^{n+1} σ² + μ⁶ Var(X_{n−3})
⋮
= μ^{n−1} σ² + μ^n σ² + μ^{n+1} σ² + . . . + μ^{2n−2} σ²
= μ^{n−1} σ² [1 + μ + μ² + . . . + μ^{n−1}]
= μ^{n−1} σ² Σ_{r=0}^{n−1} μ^r
= μ^{n−1} σ² (1 − μ^n)/(1 − μ),  μ ≠ 1
= n σ²,  if μ = 1

Var(Xn) = { μ^{n−1} σ² (1 − μ^n)/(1 − μ),  μ ≠ 1
          { n σ²,                           μ = 1   .....(5)


From (4) and (5), it is clear that the mean and variance of Xn increase or decrease geometrically according as μ > 1 or μ < 1.
Now, by Chebyshev's inequality, since E(Xn) = μ^n < ∞ and Var(Xn) < ∞, for any ε > 0,

P(|Xn − E(Xn)| > ε) ≤ Var(Xn)/ε²

When μ < 1, E(Xn) = μ^n → 0 as n → ∞, and

lim_{n→∞} P(|Xn − E(Xn)| > ε) ≤ lim_{n→∞} Var(Xn)/ε² = 0

since Var(Xn) = μ^{n−1} σ² (1 − μ^n)/(1 − μ) → 0 (because μ^{n−1} → 0). Hence

P(|Xn − 0| > ε) → 0,  i.e.  P(Xn = 0) → 1

When μ < 1, it is therefore certain that the population of size Xn will become extinct as n → ∞.
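The conclusion that extinction is certain when μ < 1 can be checked by direct simulation; a sketch with an arbitrary subcritical offspring distribution (μ = 0·0.5 + 1·0.25 + 2·0.25 = 0.75 < 1):

```python
import random

def simulate_branching(p, n_gens, n_runs, seed=1):
    """Simulate a branching process with offspring pmf p[r] = P_r, X0 = 1.
    Returns the fraction of runs that are extinct by generation n_gens."""
    rng = random.Random(seed)
    offspring = list(range(len(p)))
    extinct = 0
    for _ in range(n_runs):
        x = 1
        for _ in range(n_gens):
            # each of the x current individuals reproduces independently
            x = sum(rng.choices(offspring, weights=p)[0] for _ in range(x))
            if x == 0:
                break
        extinct += (x == 0)
    return extinct / n_runs

frac = simulate_branching([0.5, 0.25, 0.25], n_gens=30, n_runs=5000)
print(round(frac, 3))   # close to 1: extinction is certain when mu < 1
```

Since P(Xn > 0) ≤ E(Xn) = μ^n, by generation 30 the survival probability is at most 0.75^30 ≈ 2 × 10⁻⁴, so essentially every simulated run should be extinct.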



When μ ≥ 1, the probability of extinction of the population is not as easy to obtain. In this case we may use a generating function approach, which is also useful in the determination of the n-step transition probabilities of the branching process.

Probability of Ultimate Extinction (q)

If the probability of ultimate extinction exists, let it be q. Clearly

q = lim_{n→∞} P(Xn = 0 | X0 = 1)

Now the p.g.f. of Xn is

fn(t) = Σ_{r=0}^{∞} t^r P(Xn = r | X0 = 1) = P(Xn = 0 | X0 = 1) + Σ_{r=1}^{∞} t^r P(Xn = r | X0 = 1)

so fn(0) = P(Xn = 0 | X0 = 1), and

lim_{n→∞} fn(0) = lim_{n→∞} P(Xn = 0 | X0 = 1) = q

∴ q = lim_{n→∞} fn(0)

Generally f_{n+1}(t) = f(fn(t)), so by continuity of f,

q = lim_{n→∞} f_{n+1}(0) = f( lim_{n→∞} fn(0) ) = f(q)

⟹ q = f(q)

which implies that the probability of ultimate extinction q satisfies the equation t = f(t).

Theorem 3.0.0.2. The probability of ultimate extinction q exists.

Proof:
f_{n+1}(t) = f[fn(t)]. Let qn = P(Xn = 0) = fn(0). Then

q_{n+1} = f_{n+1}(0) = f[fn(0)] = f(qn)

Since f(t) is a power series with non-negative coefficients and 0 < P0 < 1, f(t) is strictly increasing in t on [0, 1].

q1 = f1(0) = f(0) = P0 > 0

q2 = f(q1) > f(0) = q1

Assume qn > qn−1; then q_{n+1} = f(qn) > f(qn−1) = qn.
This shows that q1, q2, q3, ... is a monotone increasing sequence bounded above by 1. Hence q = lim_{n→∞} qn exists and 0 < q ≤ 1.

Theorem 3.0.0.3. If the mean number of offspring born to an individual, μ = E(X1 | X0 = 1) = f′(1), is less than or equal to 1, then the probability of ultimate extinction of the population is surely one (1).

If the mean number is greater than one, then the probability of ultimate extinction is the unique non-negative solution less than 1 of the equation t = f(t), where f(t) = Σ_{r=0}^{∞} t^r Pr.

Example: Suppose in a branching process P0 = 1/2, P1 = 1/4, P2 = 1/4, Pn = 0 for n ≥ 3. Determine the probability of ultimate extinction (the probability that the population will die out).

Solution: E(X) = Σ_{∀x} x P(x) = 0 × 1/2 + 1 × 1/4 + 2 × 1/4 + 3 × 0 = 3/4

E(X) = 3/4 = μ < 1

∴ q = lim_{n→∞} P(Xn = 0) = 1

Example: Suppose in a branching process P0 = 1/4, P1 = 1/4, P2 = 1/2, Pn = 0 ∀ n ≥ 3. Show that q = 1/2.

Solution: E(X) = 0 × 1/4 + 1 × 1/4 + 2 × 1/2 = 5/4

E(X) = 5/4 = μ > 1, so q solves q = f(q):

f(q) = Σ_{r=0}^{∞} q^r Pr = q⁰ P0 + q¹ P1 + q² P2 = P0 + q P1 + q² P2 = 1/4 + q/4 + q²/2

Setting f(q) = q:

1/4 + q/4 + q²/2 = q  ⟹  4q = 1 + q + 2q²  ⟹  2q² − 3q + 1 = 0  ⟹  q = 1/2 or q = 1

But we cannot choose q = 1 because, when μ > 1, the probability of ultimate extinction is the root less than 1; so q = 1/2.
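Since qn = P(Xn = 0) satisfies q_{n+1} = f(q_n) and increases to q, the extinction probability of this example can also be obtained by fixed-point iteration, which converges to the smallest non-negative root of t = f(t); a sketch:

```python
def f(t):
    """Offspring p.g.f. for P0 = 1/4, P1 = 1/4, P2 = 1/2."""
    return 0.25 + 0.25 * t + 0.5 * t * t

# Iterate q_{n+1} = f(q_n) starting from q_0 = 0; q_n = P(X_n = 0) increases to q.
q = 0.0
for _ in range(100):
    q = f(q)

print(round(q, 6))   # -> 0.5
assert abs(q - 0.5) < 1e-6
```

The iteration converges to 1/2, not to the other root 1, because starting from 0 the increasing sequence qn is trapped below the smallest fixed point.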



Chapter 4

The Poisson Process

Let the process X(t) represent the number of times an event occurs in the time interval (0, t]. For s < t, define

Pij(s, t) = P(X(t) = j | X(s) = i)

Further, let the events occur under the following postulates.

1. Events occurring in non-overlapping intervals are independent.

2. For a sufficiently small ∆t, there is a constant λ such that the probabilities of occurrence of events in (t, t + ∆t] are given as follows:

(a) Pii(t, t + ∆t) = 1 − λ∆t + o(∆t)

(b) Pi,i+1(t, t + ∆t) = λ∆t + o(∆t)

(c) Σ_{j≥i+2} Pij(t, t + ∆t) = o(∆t)

(d) Pij(t, t + ∆t) = 0,  j < i

where o(∆t) contains all terms that tend to zero faster than ∆t, i.e.

lim_{∆t→0} o(∆t)/∆t = 0

with o(∆t) + o(∆t) = o(∆t) and c·o(∆t) = o(∆t) for any constant c.

Theorem 4.0.0.1. Under the above postulates, the number of events occurring in any interval of length t is a Poisson random variable with parameter λt. Thus,

Pn(t) = P(X(t) = n | X(0) = 0) = e^{−λt}(λt)^n / n!,  n = 0, 1, ...

By the independence assumption of postulate 1 and the options under postulate 2, in conjunction with the Chapman-Kolmogorov equation, we have

P0(t + ∆t) = [1 − λ∆t + o(∆t)] P0(t)                                  (4.1)

Pn(t + ∆t) = Pn−1(t)[λ∆t + o(∆t)] + Pn(t)[1 − λ∆t + o(∆t)]           (4.2)
           = λ∆t Pn−1(t) + (1 − λ∆t) Pn(t) + o(∆t);  n > 0

From (4.1):

P0(t + ∆t) = P0(t) − λ∆t P0(t) + o(∆t)

P0(t + ∆t) − P0(t) = −λ∆t P0(t) + o(∆t)

[P0(t + ∆t) − P0(t)]/∆t = −λ P0(t) + o(∆t)/∆t                         (4.3)

From (4.2):

Pn(t + ∆t) − Pn(t) = λ∆t Pn−1(t) − λ∆t Pn(t) + o(∆t)

[Pn(t + ∆t) − Pn(t)]/∆t = λ Pn−1(t) − λ Pn(t) + o(∆t)/∆t              (4.4)



From (4.3):

lim_{∆t→0} [P0(t + ∆t) − P0(t)]/∆t = −λ P0(t) + lim_{∆t→0} o(∆t)/∆t

P0′(t) = −λ P0(t)                                                     (4.5)

From (4.4):

lim_{∆t→0} [Pn(t + ∆t) − Pn(t)]/∆t = λ Pn−1(t) − λ Pn(t) + lim_{∆t→0} o(∆t)/∆t

Pn′(t) = λ Pn−1(t) − λ Pn(t)                                          (4.6)

With initial conditions

Pn(0) = 1 if n = 0;  Pn(0) = 0 if n > 0

Equations (4.5) and (4.6) form a system of differential equations that can be solved recursively as follows.
Multiply both sides of (4.5) and (4.6) by e^{λt}:

e^{λt} P0′(t) = −λ e^{λt} P0(t)                                       (4.7)

e^{λt} Pn′(t) = λ e^{λt} Pn−1(t) − λ e^{λt} Pn(t)                     (4.8)

Let Qn(t) = e^{λt} Pn(t). Then

Qn′(t) = λ e^{λt} Pn(t) + e^{λt} Pn′(t)                               (4.9)

so, by (4.7), Q0′(t) = λ e^{λt} P0(t) + e^{λt} P0′(t) = 0.
From (4.8), e^{λt} Pn′(t) + λ e^{λt} Pn(t) = λ e^{λt} Pn−1(t), so by (4.9)

Qn′(t) = λ e^{λt} Pn−1(t) = λ Qn−1(t)



With the boundary conditions
Q0(0) = P0(0) = 1 and Qn(0) = Pn(0) = 0, n > 0.
Now Q0′(t) = 0, so ∫ Q0′(t) dt = ∫ 0 dt and Q0(t) = c.
Let t = 0: Q0(0) = 1 ⟹ c = 1



Q0(t) = 1

Q1′(t) = λ Q0(t) = λ
∫ Q1′(t) dt = ∫ λ dt  ⟹  Q1(t) = λt + c;  Q1(0) = c = 0
Q1(t) = λt

Q2′(t) = λ Q1(t) = λ² t
∫ Q2′(t) dt = ∫ λ² t dt  ⟹  Q2(t) = λ²t²/2 + c;  Q2(0) = c = 0
Q2(t) = λ²t²/2 = (λt)²/2!

Q3′(t) = λ Q2(t) = λ³t²/2
Q3(t) = λ³t³/6 + c;  Q3(0) = c = 0
Q3(t) = λ³t³/6 = (λt)³/3!
⋮
Qn−1(t) = λ^{n−1} t^{n−1}/(n − 1)!



Hence

Qn′(t) = λ Qn−1(t) = λ [ λ^{n−1} t^{n−1}/(n − 1)! ] = λ^n t^{n−1}/(n − 1)!

Qn(t) = ∫ λ^n t^{n−1}/(n − 1)! dt = λ^n t^n/n! + c

Qn(0) = c = 0

Qn(t) = λ^n t^n/n!

Putting Qn(t) = e^{λt} Pn(t):

λ^n t^n/n! = e^{λt} Pn(t)

Pn(t) = e^{−λt}(λt)^n/n!,  n ≥ 0



Chapter 5

The Renewal Counting Process

WAITING TIME OF THE nth EVENT:

Let Tn denote the waiting time until the nth event of a Poisson process with rate λ.
Then Tn is less than or equal to t iff the number of events that have occurred by time t is at least n.
That is, if X(t) = the number of events in (0, t], then P(Tn ≤ t) = P(X(t) ≥ n).


    Fn(t) = P(Tn ≤ t) = Σ_{j=n}^{∞} P(X(t) = j) = Σ_{j=n}^{∞} (λt)^j·e^{−λt}/j!

    fn(t) = d/dt Σ_{j=n}^{∞} (λt)^j·e^{−λt}/j!

          = Σ_{j=n}^{∞} [ j·(λt)^{j−1}·λ·e^{−λt}/j! + (λt)^j·(−λe^{−λt})/j! ]

          = Σ_{j=n}^{∞} [ λe^{−λt}·(λt)^{j−1}/(j − 1)! − λe^{−λt}·(λt)^j/j! ]

          = λe^{−λt} Σ_{j=n}^{∞} [ (λt)^{j−1}/(j − 1)! − (λt)^j/j! ]

          = λe^{−λt}·[ (λt)^{n−1}/(n − 1)! + (λt)^n/n! + (λt)^{n+1}/(n + 1)! + ... − (λt)^n/n! − (λt)^{n+1}/(n + 1)! − ... ]

The series telescopes, leaving only the first term:

    fn(t) = λe^{−λt}·(λt)^{n−1}/(n − 1)! = λ^n·t^{n−1}·e^{−λt}/Γ(n)   ∼ Γ(n, λ)

Therefore T1 ∼ exp(λ), the time of the first event.
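The gamma law for Tn follows because Tn is a sum of n independent Exp(λ) inter-arrival times. The following sketch (an illustrative check only; λ = 1.5, n = 4 are arbitrary) verifies the Gamma(n, λ) mean n/λ and variance n/λ² by simulation.

```python
import random

random.seed(0)

lam, n, trials = 1.5, 4, 20000

# T_n is the sum of n i.i.d. Exp(lam) inter-arrival times, so it follows a
# Gamma(n, lam) law with mean n/lam and variance n/lam^2.
samples = [sum(random.expovariate(lam) for _ in range(n)) for _ in range(trials)]

mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
```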

Note: It is easy to show that if the initial observation of the process is made at time s, s > 0, at which
time X(s) = i, then

    Pin^{s,t} = P(X(t) = n | X(s) = i) = e^{−λ(t−s)}·[λ(t − s)]^{n−i}/(n − i)!

This is true as long as λ is neither time nor state dependent.


Theorem: If X0 = k (and not 1) and q is the probability of ultimate extinction for X0 = 1, then the
probability of ultimate extinction corresponding to X0 = k is equal to q^k.

Proof:
Since the population will die out iff the families of each of the members of the initial generation die out,
and since each family is assumed to act independently, the desired probability is q^k.

Question: Suppose that in a discrete branching process, the probability of an individual having k
offspring is given by

    Pk = e^{−λ}·λ^k/k!,   k = 0, 1, . . .

However, because the population of interest is heterogeneous, λ itself is a random variable distributed
according to a gamma distribution

    g(λ) = (q/p)^α·λ^{α−1}·exp{−(q/p)λ}/Γ(α),   λ ≥ 0;   g(λ) = 0, otherwise,

where q, p, α are strictly positive constants and p + q = 1. Find the p.g.f of the number of offspring of a
single individual in the population. Hence find the probability of ultimate extinction.

Solution:


    X|λ ∼ P(λ)

    f(x, λ) = f(x|λ)·g(λ)

    h(x) = ∫_{∀λ} f(x|λ)·g(λ) dλ

(the p.g.f. will then be E(t^X) = Σ_{∀x} t^x·h(x)). Here

    g(λ) = (q/p)^α·λ^{α−1}·e^{−(q/p)λ}/Γ(α),   λ > 0   =⇒   λ ∼ Γ(α, q/p)

so

    h(x) = ∫_0^∞ [e^{−λ}·λ^x/x!]·[(q/p)^α·λ^{α−1}·e^{−(q/p)λ}/Γ(α)] dλ

         = [(q/p)^α/(Γ(α)·x!)] ∫_0^∞ λ^{α+x−1}·e^{−λ(q/p + 1)} dλ

but

    ∫_0^∞ λ^r·e^{−λx} dλ = r!/x^{r+1} = Γ(r + 1)/x^{r+1}

so

    ∫_0^∞ λ^{α+x−1}·e^{−λ(q/p + 1)} dλ = Γ(α + x)/(q/p + 1)^{α+x}

    h(x) = [(q/p)^α/(Γ(α)·x!)]·Γ(α + x)/(q/p + 1)^{α+x} = [Γ(α + x)/(Γ(α)·x!)]·(q/p)^α/(q/p + 1)^{α+x}


    h(x) = [(α + x − 1)!/((α − 1)!·x!)] · (q^α/p^α) ÷ ((p + q)/p)^{α+x}

Since p + q = 1, ((p + q)/p)^{α+x} = p^{−(α+x)}, so

    h(x) = C(α + x − 1, x)·(q^α/p^α)·p^{α+x}

         = C(α + x − 1, x)·q^α·p^x

         = C(α + x − 1, x)·q^α·(1 − q)^x,   x = 0, 1, . . .

which is a negative binomial distribution.
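The Poisson-gamma mixture can be checked by simulation. The sketch below (illustrative only; α = 3, q = 0.6 are arbitrary, and the small Poisson sampler is Knuth's multiplication method, adequate for moderate rates) draws λ from Γ(α, q/p) and then X|λ from Poisson(λ), and compares the result with the negative binomial mean αp/q and P(X = 0) = q^α.

```python
import math
import random

random.seed(1)

alpha, q = 3.0, 0.6
p = 1 - q
trials = 20000

def poisson(lam):
    """Knuth's multiplication method; fine for small lam."""
    L, k, prod = math.exp(-lam), 0, random.random()
    while prod > L:
        k += 1
        prod *= random.random()
    return k

# lam ~ Gamma(shape=alpha, rate=q/p); note gammavariate takes shape and *scale*.
samples = [poisson(random.gammavariate(alpha, p / q)) for _ in range(trials)]

# The mixture should be negative binomial: mean alpha*p/q, P(X=0) = q^alpha.
mean = sum(samples) / trials
p0_empirical = samples.count(0) / trials
p0_theoretical = q ** alpha
```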

    f(t) = E(t^X) = Σ_{∀x} t^x·h(x) = Σ_{x=0}^{∞} t^x·C(α + x − 1, x)·q^α·(1 − q)^x

Identity: C(α + x − 1, x) = (−1)^x·C(−α, x)

    f(t) = Σ_{x=0}^{∞} t^x·(−1)^x·C(−α, x)·q^α·(1 − q)^x

         = q^α Σ_{x=0}^{∞} C(−α, x)·(−tp)^x,   p = 1 − q

Identity: (1 − t)^{−α} = Σ_{∀x} C(−α, x)·(−t)^x

    =⇒ f(t) = q^α·(1 − tp)^{−α}



    f(t) = q^α/(1 − tp)^α = [q/(1 − tp)]^α

To find the probability of extinction, we first find E(X) = f'(1).

If it is greater than 1, then we solve f(t) = t for t to get the probability of extinction;
otherwise the probability of extinction is 1.
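The extinction probability is the smallest root of f(t) = t in [0, 1], which can be found by fixed-point iteration from t = 0 (a standard argument for branching processes). The sketch below uses the illustrative values α = 3, q = 0.6, for which the mean offspring number f'(1) = αp/q = 2 exceeds 1.

```python
alpha, q = 3.0, 0.6
p = 1 - q

def f(t):
    # p.g.f. of the negative binomial offspring distribution
    return (q / (1 - p * t)) ** alpha

mean_offspring = alpha * p / q  # f'(1)

# When the mean exceeds 1, the extinction probability is the smallest root of
# f(t) = t in [0, 1]; iterating t <- f(t) from 0 converges to it.
t = 0.0
for _ in range(10000):
    t = f(t)
extinction_prob = t if mean_offspring > 1 else 1.0
```

For these values the iteration settles near 0.33, strictly below 1 as expected for a supercritical process.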


Chapter 6

The Pure(Simple) Birth Process

The assumption of a constant parameter λ in the Poisson process may not be realistic in physical
phenomena such as population growth. A more general process, the pure birth process, can be obtained
by making the parameter λ dependent on the state of the process.

Consider a process of events occurring under the following postulates. Suppose the event has occurred n
times in time (0, t]. Then the occurrence or non-occurrence of the event during (t, t + ∆t], for a sufficiently
small ∆t, is independent of the time since the last occurrence. Further, the probabilities of events are
given as follows:

1. Probability that the event occurs once is λn·∆t + 0(∆t)

2. Probability that the event does not occur is 1 − λn·∆t + 0(∆t)

3. Probability that the event occurs more than once is 0(∆t) (negligible)

Let Pn(t) = P(X(t) = n) be the probability that n events occur in time (0, t]. Using the Chapman-
Kolmogorov equation for transitions in the intervals of time (0, t] and (t, t + ∆t], we have

    Pn(t + ∆t) = Pn(t)·[1 − λn·∆t + 0(∆t)] + Pn−1(t)·[λn−1·∆t + 0(∆t)]

    Pn(t + ∆t) = Pn(t) − λn·∆t·Pn(t) + λn−1·∆t·Pn−1(t) + 0(∆t)

    [Pn(t + ∆t) − Pn(t)]/∆t = λn−1·Pn−1(t) − λn·Pn(t) + 0(∆t)/∆t

    lim_{∆t→0} [Pn(t + ∆t) − Pn(t)]/∆t = λn−1·Pn−1(t) − λn·Pn(t) + lim_{∆t→0} 0(∆t)/∆t

    Pn'(t) = dPn(t)/dt = λn−1·Pn−1(t) − λn·Pn(t)        (6.1)

For the Yule process, λn = nλ, and (6.1) becomes

    Pn'(t) = (n − 1)λ·Pn−1(t) − nλ·Pn(t)        (6.2)


Consider the initial population size X(0) = j. Then we have the initial conditions

    Pm(0) = P(X(0) = m) = 1 if m = j,   0 if m ≠ j.

From (6.2), Pj'(t) = (j − 1)λ·Pj−1(t) − jλ·Pj(t), but (j − 1)λ·Pj−1(t) = 0, because if you start from j,
you cannot go back to j − 1 in a birth process.

    Pj'(t) = −λj·Pj(t)

    Pj'(t)/Pj(t) = −λj

    ∫ Pj'(t)/Pj(t) dt = ∫ −λj dt

    ln Pj(t) = −λjt + c

    Pj(t) = e^{−λjt}·e^c

At t = 0, Pj(0) = 1, so 1 = e^0·e^c and e^c = 1,

    =⇒ Pj(t) = e^{−λjt}        (6.3)

When n = j + 1, from (6.2),

    Pj+1'(t) = λj·Pj(t) − λ(j + 1)·Pj+1(t)

    Pj+1'(t) = λj·e^{−λjt} − λ(j + 1)·Pj+1(t)

    Pj+1'(t) + λ(j + 1)·Pj+1(t) = λj·e^{−λjt}

We use the integrating factor to solve this differential equation:

    Integrating factor I = e^{∫ λ(j+1) dt} = e^{λ(j+1)t}

This gives us

    e^{λ(j+1)t}·Pj+1'(t) + λ(j + 1)·e^{λ(j+1)t}·Pj+1(t) = λj·e^{−λjt}·e^{λ(j+1)t}

    d/dt [ e^{λ(j+1)t}·Pj+1(t) ] = λj·e^{λt}

Taking the integral of both sides,

    e^{λ(j+1)t}·Pj+1(t) = ∫ λj·e^{λt} dt = j·e^{λt} + c

At t = 0, Pj+1(0) = 0 =⇒ 0 = j + c =⇒ c = −j

    e^{λ(j+1)t}·Pj+1(t) = j·e^{λt} − j

    Pj+1(t) = e^{−λ(j+1)t}·[j·e^{λt} − j]

    Pj+1(t) = j·e^{−λjt} − j·e^{−λ(j+1)t}

    Pj+1(t) = j·e^{−λjt}·(1 − e^{−λt})        (6.4)

Thus, proceeding in an iterative manner, we can infer that

    Pj+k(t) = C(j + k − 1, j − 1)·e^{−λjt}·(1 − e^{−λt})^k,   k = 0, 1, . . .   → negative binomial

or equivalently,

    Pn(t) = C(n − 1, j − 1)·e^{−λjt}·(1 − e^{−λt})^{n−j},   n = j, j + 1, . . .

which shows that the population size at time t has a negative binomial distribution in which the
probability of success in a single trial is e^{−λt}.
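The negative binomial law, and in particular the mean j·e^{λt}, can be verified by simulating the Yule process directly: when the population is n, the holding time until the next birth is Exp(nλ). The parameter values below are illustrative choices.

```python
import math
import random

random.seed(7)

lam, j, t, trials = 0.8, 3, 1.0, 20000

def yule_size(lam, j, t):
    """Simulate a Yule (pure birth) process: with n individuals the next
    birth occurs after an Exp(n*lam) holding time."""
    n, clock = j, 0.0
    while True:
        clock += random.expovariate(n * lam)
        if clock > t:
            return n
        n += 1

sizes = [yule_size(lam, j, t) for _ in range(trials)]
mean = sum(sizes) / trials
expected_mean = j * math.exp(lam * t)  # E[X(t) | X(0) = j] = j e^{lam t}
```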


Questions

1. Show that for the pure birth process

(a) E[X(t)|X(0) = j] = jeλt

(b) V ar[[X(t)|X(0) = j] = jeλt (eλt − 1)

2. For the pure birth process, find 1(a) and 1(b) by the method of p.g.f.

3. Derive for the pure birth process the results in 1(a) and 1(b) using the difference differential
equation Pn 0(t) = λ(n − 1)Pn−1 (t) − λnPn (t)

Solution:

1. The pure birth process is negative binomial with success probability e^{−λt} and j successes, so:

   (a) E[X(t)|X(0) = j] = j/e^{−λt} = j·e^{λt}

   (b) Var[X(t)|X(0) = j] = j(1 − e^{−λt})/(e^{−λt})^2 = j·e^{λt}·(e^{λt} − 1)


Chapter 7

The Pure(Simple) Death Process

Suppose an initial size, say i > 0, of individuals die at a certain rate, eventually reducing the size to zero.
When the population size is n, let µn be the death rate defined as follows: in an interval (t, t + ∆t], the
probability that one death occurs in the interval is µn·∆t + 0(∆t), while the probability that no death
occurs is 1 − µn·∆t + 0(∆t), and all other probabilities are negligible, 0(∆t). Also assume that the
occurrence of death in the interval (t, t + ∆t] is independent of the time since the last death. Let
Pn(t) = P[X(t) = n] be the probability that there are n individuals in the population within the time
interval (0, t]. By the Chapman-Kolmogorov equations for transitions in the intervals of time (0, t] and
(t, t + ∆t], we have:

    Pn(t + ∆t) = Pn(t)·[1 − µn·∆t + 0(∆t)] + Pn+1(t)·[µn+1·∆t + 0(∆t)]

    Pn(t + ∆t) = Pn(t) − µn·∆t·Pn(t) + µn+1·∆t·Pn+1(t) + 0(∆t)

    [Pn(t + ∆t) − Pn(t)]/∆t = µn+1·Pn+1(t) − µn·Pn(t) + 0(∆t)/∆t

    lim_{∆t→0} [Pn(t + ∆t) − Pn(t)]/∆t = µn+1·Pn+1(t) − µn·Pn(t) + lim_{∆t→0} 0(∆t)/∆t

    Pn'(t) = µn+1·Pn+1(t) − µn·Pn(t)        (7.1)

By analogy with the Yule process, take µn = nµ:

    Pn'(t) = µ(n + 1)·Pn+1(t) − µn·Pn(t)        (7.2)

Consider the initial conditions

    Pm(0) = 1 if m = j,   0 if m ≠ j.

From (7.2), Pj'(t) = µ(j + 1)·Pj+1(t) − µj·Pj(t).

The term µ(j + 1)·Pj+1(t) is 0 because you cannot reach j + 1 from j in a death process.

    Pj'(t) = −µj·Pj(t)

    Pj'(t)/Pj(t) = −µj

    ∫ Pj'(t)/Pj(t) dt = ∫ −µj dt

    ln Pj(t) = −µjt + c

    Pj(t) = e^{−µjt}·e^c

At t = 0, Pj(0) = 1, so e^c = 1, which implies Pj(t) = e^{−µjt} = (e^{−µt})^j.


NOTE: After the first death, the population moves to j − 1, not j + 1. When n = j − 1,

    Pj−1'(t) = µj·Pj(t) − µ(j − 1)·Pj−1(t)

    Pj−1'(t) + µ(j − 1)·Pj−1(t) = µj·Pj(t)

    I = e^{∫ µ(j−1) dt} = e^{µ(j−1)t}

    e^{µ(j−1)t}·Pj−1'(t) + µ(j − 1)·e^{µ(j−1)t}·Pj−1(t) = µj·e^{µ(j−1)t}·Pj(t)

    e^{µ(j−1)t}·Pj−1(t) = ∫ µj·e^{µ(j−1)t}·Pj(t) dt

                        = ∫ µj·e^{µ(j−1)t}·e^{−µjt} dt

                        = ∫ µj·e^{−µt} dt

                        = −j·e^{−µt} + c

At t = 0, Pj−1(0) = 0, so c = j.

    e^{µ(j−1)t}·Pj−1(t) = −j·e^{−µt} + j

    Pj−1(t) = e^{−µ(j−1)t}·(−j·e^{−µt} + j)

    Pj−1(t) = −j·e^{−µjt} + j·e^{−µ(j−1)t}

    Pj−1(t) = j·e^{−µ(j−1)t}·(1 − e^{−µt})

Hence, proceeding in an iterative manner, we infer that

    Pn(t) = C(j, n)·(e^{−µt})^n·(1 − e^{−µt})^{j−n},   n = 0, 1, . . . , j   ∼ bin(j, e^{−µt})

This shows that the population size is binomially distributed with mean and variance given by

1. E[X(t)|X(0) = j] = j·e^{−µt}

2. Var[X(t)|X(0) = j] = j·e^{−µt}·(1 − e^{−µt})
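The binomial law has a direct probabilistic reading: each of the j initial individuals has an Exp(µ) lifetime and survives to time t independently with probability e^{−µt}. The sketch below (illustrative values only) uses this equivalent individual-lifetime view of the pure death process to check the mean j·e^{−µt}.

```python
import math
import random

random.seed(3)

mu, j, t, trials = 0.5, 10, 1.2, 20000

# Each of the j initial individuals survives to time t independently with
# probability e^{-mu t}, so X(t) ~ Bin(j, e^{-mu t}); simulate the
# exponential lifetimes directly and compare.
sizes = []
for _ in range(trials):
    alive = sum(1 for _ in range(j) if random.expovariate(mu) > t)
    sizes.append(alive)

p_survive = math.exp(-mu * t)
mean = sum(sizes) / trials
expected_mean = j * p_survive
```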


Chapter 8

Birth And Death Process

Birth is an event which signifies an increase in the population. Death is an event which signifies a
decrease in the population. Suppose these two types of events occur under the following postulates:

1. Birth: If the population size is n (≥ 0) at time t, then during the following infinitesimal interval
   (t, t + ∆t] the probability that a birth will occur is λn·∆t + 0(∆t), where lim_{∆t→0} 0(∆t)/∆t = 0;
   hence the probability of no birth is 1 − λn·∆t + 0(∆t).
   Births occurring in (t, t + ∆t] are independent of the time since the last occurrence.

2. Death: If the population size is n > 0 at time t, then during the following infinitesimal interval
   (t, t + ∆t] the probability that a death will occur is µn·∆t + 0(∆t), and the probability of no death
   in (t, t + ∆t] is 1 − µn·∆t + 0(∆t). Deaths occurring in (t, t + ∆t] are independent of the time
   since the last occurrence.

3. When the population size is 0 at time t, the probability is 0 that a death occurs during (t, t + ∆t].
   Note: The probability that more than one change (birth or death) takes place in (t, t + ∆t] is 0(∆t).

4. For the same population size, births and deaths occur independently of each other.
Let X(t) be the population size at time t. Define

    Pi,n^{s,t} = P(X(t) = n | X(s) = i)

This process is time homogeneous and therefore we shall use the definition given by

    Pn(t) = P(X(t) = n | X(0) = i)

As a consequence of postulates 1-4, we may write

    Pn,n−1^{t,t+∆t} = [µn·∆t + 0(∆t)]·[1 − λn·∆t + 0(∆t)] = µn·∆t + 0(∆t)        (1)

    Pn,n^{t,t+∆t} = [1 − µn·∆t + 0(∆t)]·[1 − λn·∆t + 0(∆t)] = 1 − λn·∆t − µn·∆t + 0(∆t)        (2)

    Pn,n+1^{t,t+∆t} = [λn·∆t + 0(∆t)]·[1 − µn·∆t + 0(∆t)] = λn·∆t + 0(∆t)        (3)

    Σ_{j ≠ n−1, n, n+1} Pnj(t, t + ∆t) = 0(∆t)        (4)

Any other step that does not reduce the population by 1 or increase it by 1 has probability of order
0(∆t), e.g. transitions to n − 2 or n + 2.
For transitions occurring in the non-overlapping intervals (0, t] and (t, t + ∆t], based on equations (1)
through (4), the Chapman-Kolmogorov equations take the form


Pn(t + ∆t) = [1 − λn·∆t − µn·∆t + 0(∆t)]·Pn(t) + [λn−1·∆t + 0(∆t)]·Pn−1(t) + [µn+1·∆t + 0(∆t)]·Pn+1(t)

Pn(t + ∆t) = Pn(t) − (λn·∆t + µn·∆t)·Pn(t) + λn−1·∆t·Pn−1(t) + µn+1·∆t·Pn+1(t) + 0(∆t)

    lim_{∆t→0} [Pn(t + ∆t) − Pn(t)]/∆t = −(λn + µn)·Pn(t) + λn−1·Pn−1(t) + µn+1·Pn+1(t) + lim_{∆t→0} 0(∆t)/∆t

    Pn'(t) = −(λn + µn)·Pn(t) + λn−1·Pn−1(t) + µn+1·Pn+1(t),   n = 0, 1, 2, . . .

with initial conditions Pn(0) = 1 if n = i and Pn(0) = 0 if n ≠ i (and with the convention P−1(t) = 0).
If λn = nλ and µn = nµ, then the simple linear birth and death process becomes

    Pn'(t) = −n(λ + µ)·Pn(t) + (n − 1)λ·Pn−1(t) + (n + 1)µ·Pn+1(t)        (6)

    E[X(t)|X(0) = i] = E[X(t)] = Σ_{n=0}^{∞} n·Pn(t)

    d/dt E[X(t)] = Σ_{n=0}^{∞} n·(d/dt)Pn(t) = Σ_{n=0}^{∞} n·Pn'(t)

                 = Σ_{n=0}^{∞} n·[−n(λ + µ)·Pn(t) + (n − 1)λ·Pn−1(t) + (n + 1)µ·Pn+1(t)]

                 = −(λ + µ) Σ_{n=0}^{∞} n^2·Pn(t) + λ Σ_{n=0}^{∞} n(n − 1)·Pn−1(t) + µ Σ_{n=0}^{∞} n(n + 1)·Pn+1(t)

Re-indexing the last two sums (replace n − 1 by n and n + 1 by n respectively):

                 = −(λ + µ) Σ_{n=0}^{∞} n^2·Pn(t) + λ Σ_{n=0}^{∞} (n + 1)n·Pn(t) + µ Σ_{n=0}^{∞} (n − 1)n·Pn(t)

                 = Σ_{n=0}^{∞} [−(λ + µ)n^2 + λ(n^2 + n) + µ(n^2 − n)]·Pn(t) = (λ − µ) Σ_{n=0}^{∞} n·Pn(t)

    d/dt E[X(t)] = (λ − µ)·E[X(t)]

where (λ − µ) is the intrinsic growth rate.

    dE[X(t)]/E[X(t)] = (λ − µ) dt

Taking the integral of both sides,

    ∫ dE[X(t)]/E[X(t)] = ∫ (λ − µ) dt

    =⇒ ln E[X(t)] = (λ − µ)t + c

    E[X(t)] = e^{(λ−µ)t}·e^c

Since X(0) = i: i = e^{(λ−µ)·0}·e^c =⇒ e^c = i

    E[X(t)|X(0) = i] = i·e^{(λ−µ)t}
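The mean i·e^{(λ−µ)t} can be checked by simulating the linear birth-death chain directly: with n individuals, the next event occurs after an Exp(n(λ+µ)) holding time and is a birth with probability λ/(λ+µ). The parameter values in this sketch are illustrative.

```python
import math
import random

random.seed(11)

lam, mu, i, t, trials = 1.0, 0.7, 5, 1.0, 20000

def bd_size(lam, mu, n, t):
    """Simulate the linear birth-death process: with n individuals the next
    event occurs after Exp(n*(lam+mu)) and is a birth w.p. lam/(lam+mu)."""
    clock = 0.0
    while n > 0:
        clock += random.expovariate(n * (lam + mu))
        if clock > t:
            break
        n += 1 if random.random() < lam / (lam + mu) else -1
    return n

sizes = [bd_size(lam, mu, i, t) for _ in range(trials)]
mean = sum(sizes) / trials
expected_mean = i * math.exp((lam - mu) * t)  # i e^{(lam-mu) t}
```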


It can further be shown from equation (6) that, for X(0) = 1,

    P0(t) = ξt = µ·[1 − e^{−(λ−µ)t}]/[λ − µ·e^{−(λ−µ)t}]

    Pn(t) = (1 − ξt)·(1 − ηt)·ηt^{n−1},   n ≥ 1

where

    ηt = (λ/µ)·ξt = λ·[1 − e^{−(λ−µ)t}]/[λ − µ·e^{−(λ−µ)t}]

Clearly Pn(t) has a geometric distribution modified by the initial term, and lim_{t→∞} P0(t) is the
probability of ultimate extinction of the population.

Let ρ = λ/µ. If λ > µ, then e^{−(λ−µ)t} → 0 as t → ∞, so

    lim_{t→∞} ξt = µ·(1 − 0)/(λ − 0) = µ/λ = ρ^{−1}

If λ ≤ µ, write e^{−(λ−µ)t} = e^{(µ−λ)t} and divide through by it:

    lim_{t→∞} ξt = lim_{t→∞} µ·(e^{−(µ−λ)t} − 1)/(λ·e^{−(µ−λ)t} − µ) = µ·(0 − 1)/(0 − µ) = 1

    =⇒ lim_{t→∞} P0(t) = 1 if ρ ≤ 1 (i.e. λ ≤ µ),   ρ^{−1} if ρ > 1 (i.e. λ > µ)
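The two limiting regimes of the extinction probability P0(t) = µ(1 − e^{−(λ−µ)t})/(λ − µe^{−(λ−µ)t}) can be checked numerically; the rates below are illustrative choices, evaluated at a large t as a stand-in for t → ∞.

```python
import math

def xi(lam, mu, t):
    """P0(t) for the linear birth-death process with X(0) = 1."""
    e = math.exp(-(lam - mu) * t)
    return mu * (1 - e) / (lam - mu * e)

# With mu > lam, xi(t) approaches 1; with lam > mu it approaches mu/lam.
p_extinct_sub = xi(0.6, 1.0, 50.0)    # lam < mu  -> 1
p_extinct_super = xi(1.0, 0.6, 50.0)  # lam > mu  -> mu/lam = 0.6
```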

This means that ultimate extinction is certain if the death rate is at least as large as the birth rate.

Question: The p.g.f. of X(t), the number of individuals in the population at time t of the simple linear
birth and death process with X(0) = 1, is

    φ(z, t) = [µ(1 − α) − (λ − µα)z]/[µ − λα − λ(1 − α)z],   if λ ≠ µ

    φ(z, t) = [1 − (λt − 1)(z − 1)]/[1 − λt(z − 1)],   if λ = µ

where α = e^{(λ−µ)t}. Use the above p.g.f. to derive an expression for

1. P0(t) when λ ≠ µ.

2. P0(t) when λ = µ.


3. Pn(t), n ≥ 1, when λ ≠ µ.

Hence show that

    lim_{t→∞} P0(t) = 1 if λ ≤ µ,   µ/λ if λ > µ

and comment on the results with special reference to the intrinsic growth rate (λ − µ).


Comments: The generalized birth and death process is a continuous-time Markov process. It is said to be
time-homogeneous because the parameters λn and µn do not depend on time; they depend only on the
size of the population (n = 0, 1, 2, . . . ). If λ0 = 0, then state 0 becomes an absorbing state (recurrent),
and all other states are transient; that is, the process is not irreducible in this case. In the case where
λn = nλ and µn = nµ (n ≥ 1), the simple birth and death or simple linear growth process, where λ0 = 0,
the ultimate extinction probability for X(0) = 1 is

    lim_{t→∞} P0(t) = 1 if λ ≤ µ,   µ/λ if λ > µ

Theorem 8.0.0.1. When X(0) = i, λn = nλ and µn = nµ (n ≥ 1), the corresponding
absorption (ultimate extinction) probabilities are

    lim_{t→∞} P0(t) = 1 if λ ≤ µ,   (µ/λ)^i if λ > µ,   where i is the initial size of the population.

The preceding theorem can be proved by noting that, since we have assumed independence and no
interaction among the members, we may view the population as the sum of i independent simple birth
and death processes, each beginning with a single member. Thus if φ(z, t) is the p.g.f. corresponding to
X(0) = 1, then the p.g.f. corresponding to X(0) = i is

    ψ(z, t) = [φ(z, t)]^i = { [µ(1 − α) − (λ − µα)z]/[µ − λα − λ(1 − α)z] }^i,   if λ ≠ µ

    ψ(z, t) = { [1 − (λt − 1)(z − 1)]/[1 − λt(z − 1)] }^i,   if λ = µ

where α = e^{(λ−µ)t}.

Definition 8.0.0.2. A birth and death process is called a linear growth process with immigration if
λn = nλ + ε and µn = nµ, with λ, µ, ε > 0. Such processes occur naturally in the study of biological
reproduction and population growth.

If the state n describes the current population size, then the average instantaneous rate of growth is
λn + ε. The probability of the state of the process decreasing by one in a small time interval of length ∆t
is µn·∆t + 0(∆t). The factor λn represents the natural growth of the population owing to its current
size, while the second factor ε may be interpreted as the infinitesimal rate of increase of the population
due to an external source such as immigration.

Note: This process is irreducible and therefore there is no absorbing state. The probability of ultimate
extinction is zero since λ0 = ε > 0. Consider a population subject to death, birth, emigration and
immigration under the following assumptions:

1. In a small time interval of length ∆t, the probability for a given individual to

   (a) die or emigrate is µ∆t

   (b) give birth to a new individual is λ∆t

2. Immigrants are subject to death, emigration and giving birth in the same manner as existing
   members.

3. In a small time interval of length ∆t, the probability of the population being

   (a) increased by one immigrant is ε∆t + 0(∆t).

   (b) increased or decreased by more than one is 0(∆t).

Question: Based on the above assumptions, if X(t) is the total population size at time t and
Pn(t) = P[X(t) = n], n = 0, 1, 2, . . . , prove that

    d/dt Pn(t) = [λ(n − 1) + ε]·Pn−1(t) + µ(n + 1)·Pn+1(t) − [n(λ + µ) + ε]·Pn(t)


Deduce that if Pn(0) = 1 for n = i and 0 for n ≠ i, then

    E[X(t)|X(0) = i] = [ε/(λ − µ)]·[e^{(λ−µ)t} − 1] + i·e^{(λ−µ)t},   if λ ≠ µ

    E[X(t)|X(0) = i] = εt + i,   if λ = µ

Solution:
Let

    Pin^{(s,t)} = P[X(t) = n | X(s) = i]

    Pn(t) = Pin^{(0,t)} = P[X(t) = n | X(0) = i]

    Pn,n−1(t, t + ∆t) = [µn·∆t + 0(∆t)]·[1 − λn·∆t + 0(∆t)]·[1 − ε∆t + 0(∆t)]

                      = µn·∆t + 0(∆t)

    Pn,n(t, t + ∆t) = [1 − µn·∆t + 0(∆t)]·[1 − λn·∆t + 0(∆t)]·[1 − ε∆t + 0(∆t)] + 0(∆t)

                    = 1 − [n(λ + µ) + ε]·∆t + 0(∆t)

    Pn,n+1(t, t + ∆t) = [1 − µn·∆t + 0(∆t)]·[λn·∆t + 0(∆t)]·[1 − ε∆t + 0(∆t)]
                        + [1 − µn·∆t + 0(∆t)]·[1 − λn·∆t + 0(∆t)]·[ε∆t + 0(∆t)]

                      = (λn + ε)·∆t + 0(∆t)

    Σ_{j ≠ n−1, n, n+1} Pnj(t, t + ∆t) = 0(∆t)        (8.1)

from the Chapman-Kolmogorov equation for a Markov process with discrete state space.


But

    Pn−1,n(t, t + ∆t) = [λ(n − 1)·∆t + 0(∆t)]·[1 − ε∆t + 0(∆t)]·[1 − µ(n − 1)·∆t + 0(∆t)]
                        + [1 − λ(n − 1)·∆t + 0(∆t)]·[1 − µ(n − 1)·∆t + 0(∆t)]·[ε∆t + 0(∆t)]

                      = [λ(n − 1) + ε]·∆t + 0(∆t)

This implies

Pn(t + ∆t) = [1 − [n(λ + µ) + ε]·∆t + 0(∆t)]·Pn(t) + [[λ(n − 1) + ε]·∆t + 0(∆t)]·Pn−1(t)
             + [µ(n + 1)·∆t + 0(∆t)]·Pn+1(t)

           = Pn(t) − [n(λ + µ) + ε]·∆t·Pn(t) + [λ(n − 1) + ε]·∆t·Pn−1(t) + µ(n + 1)·∆t·Pn+1(t) + 0(∆t)

    lim_{∆t→0} [Pn(t + ∆t) − Pn(t)]/∆t = −[n(λ + µ) + ε]·Pn(t) + [λ(n − 1) + ε]·Pn−1(t)
                                          + µ(n + 1)·Pn+1(t) + lim_{∆t→0} 0(∆t)/∆t

    d/dt Pn(t) = −[n(λ + µ) + ε]·Pn(t) + [λ(n − 1) + ε]·Pn−1(t) + µ(n + 1)·Pn+1(t)

For the mean,

    E[X(t)|X(0) = i] = Σ_{n=0}^{∞} n·Pn(t)

    d/dt E[X(t)] = Σ_{n=0}^{∞} n·(d/dt)Pn(t)

                 = Σ_{n=0}^{∞} n·[−[n(λ + µ) + ε]·Pn(t) + [λ(n − 1) + ε]·Pn−1(t) + µ(n + 1)·Pn+1(t)]

                 = −(λ + µ) Σ n^2·Pn(t) − ε Σ n·Pn(t) + λ Σ n(n − 1)·Pn−1(t) + ε Σ n·Pn−1(t)
                   + µ Σ n(n + 1)·Pn+1(t)

Re-indexing the sums as before and collecting terms gives

    d/dt E[X(t)] = (λ − µ)·E[X(t)] + ε

which, with E[X(0)] = i, integrates to the expressions stated above.
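The deduced mean ε/(λ−µ)·[e^{(λ−µ)t} − 1] + i·e^{(λ−µ)t} can be checked by simulating the chain with birth rate nλ + ε and death rate nµ; the parameter values below are illustrative (note λ < µ, so the process is subcritical but never absorbed, since the immigration rate keeps state 0 non-absorbing).

```python
import math
import random

random.seed(5)

lam, mu, eps, i, t, trials = 0.8, 1.0, 0.5, 2, 1.0, 20000

def bdi_size(lam, mu, eps, n, t):
    """Linear birth-death with immigration: up-rate n*lam + eps,
    down-rate n*mu; the total rate is always positive, so the loop is safe."""
    clock = 0.0
    while True:
        up, down = n * lam + eps, n * mu
        clock += random.expovariate(up + down)
        if clock > t:
            return n
        n += 1 if random.random() < up / (up + down) else -1

sizes = [bdi_size(lam, mu, eps, i, t) for _ in range(trials)]
mean = sum(sizes) / trials
expected_mean = (eps / (lam - mu) * (math.exp((lam - mu) * t) - 1)
                 + i * math.exp((lam - mu) * t))
```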

Theorem 8.0.0.3. If a Markov process is irreducible, then the limiting distribution lim_{t→∞} Pn(t) = Pn
exists and is independent of the initial conditions of the process. The limits (Pn, n ∈ S, the state space)
are such that they either vanish identically, i.e. Pn = 0 ∀n ∈ S, or are all positive and form a probability
distribution, i.e. Pn > 0 ∀n ∈ S and Σ_{n∈S} Pn = 1.


Note: To check whether a steady-state distribution exists for the birth and death process with
immigration, observe that if X(0) = i and µ > λ, then Pn'(t) → 0 as t → ∞, and the limiting
probabilities satisfy

    [λ(n − 1) + ε]·Pn−1 + (n + 1)µ·Pn+1 − [n(µ + λ) + ε]·Pn = 0
