STAT611
2020
Chapter 1
Introduction
Definition 1.0.0.1. A stochastic process is a set of random variables X(t), t ∈ T, where T is called the parameter space of the process.
NB: t is called an indexing parameter. The values assumed by the process are known as states.
Examples: for repeated coin tosses, the state space is SS = {H, T} and the parameter space is TS = {1, 2, 3, ...}.
Probability Distributions
Let (t1, t2, ..., tn) with t1 < t2 < t3 < ... < tn be a discrete set of points within T. The joint distribution for the process X(t) at these points can then be defined. Let t0 and t1 be two points in T such that t0 ≤ t1; then we may define the conditional transition distribution function as:
F(x0, x1; t0, t1) = P[X(t1) ≤ x1 | X(t0) = x0] ... (*)
When a stochastic process has discrete parameter and state spaces, we may define the transition probabilities as:
P_ij^(m,n) = P(X_n = j | X_m = i), m ≤ n,
where i and j are states. These probabilities are called transition probabilities.
Then we have (**) for t0 ∈ T. For convenience, we can write (**) as F(x0, x; t). The corresponding expression for the process X_n, n = 0, 1, 2, ... would then be P_ij^(t,t+n) (you are at i, moving to j in n steps):
P_ij^(t,t+n) = P(X_{n+t} = j | X_t = i)    (2.1)
Consider a finite (or countably infinite) set of points (t0, t1, ..., tn, t), t0 < t1 < t2 < ... < tn < t, with t, tr ∈ T (r = 0, 1, 2, ..., n), where T is the parameter space. The process X(t), t ∈ T, is said to have Markov dependence if the conditional distribution of X(t), given the values X(t1), X(t2), ..., X(tn), depends only on X(tn), the most recent known value of the process; i.e. if:
F(x | x1, ..., xn; t1, ..., tn, t) = F(xn, x; tn, t)    (2.6)
The stochastic process then satisfies the Chapman-Kolmogorov equation: for t0 < τ < t,
F(x0, x; t0, t) = ∫_{y∈S} F(y, x; τ, t) dF(x0, y; t0, τ).
2. For a Markov process with a discrete state space and a continuous parameter space T, the Chapman-Kolmogorov equation becomes
P_ij(t + s) = Σ_{k∈S} P_ik(t) P_kj(s), ∀ s ≥ 0, t ≥ 0,
where P_ij(t + s) = P(X_{t+s} = j | X_0 = i).
P_ij^(m,n) = P(X_n = j | X_m = i)    (2.7)
= Σ_{k∈S} P(X_n = j, X_r = k | X_m = i), by the total probability rule    (2.8)
= Σ_{k∈S} P[X_n = j | X_r = k, X_m = i] P[X_r = k | X_m = i]    (2.9)
= Σ_{k∈S} P[X_n = j | X_r = k] P[X_r = k | X_m = i], by the Markov dependency property    (2.10)
= Σ_{k∈S} P_kj^(r,n) P_ik^(m,r)    (2.11)
= Σ_{k∈S} P_ik^(m,r) P_kj^(r,n)    (2.12)
for any r with m ≤ r ≤ n.
Similarly, for the continuous-parameter case:
P_ij^(t+s) = P(X_{t+s} = j | X_0 = i)    (2.14)
= Σ_{k∈S} P(X_{t+s} = j, X_t = k | X_0 = i), by the total probability rule    (2.15)
= Σ_{k∈S} P[X_{t+s} = j | X_t = k, X_0 = i] P[X_t = k | X_0 = i]    (2.16)
= Σ_{k∈S} P[X_{t+s} = j | X_t = k] P[X_t = k | X_0 = i], by Markov dependency    (2.17)
= Σ_{k∈S} P_kj^(t,t+s) P_ik^(0,t)    (2.18)
= Σ_{k∈S} P_ik^(0,t) P_kj^(t,t+s)    (2.19)
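The Chapman-Kolmogorov identity derived above can be checked numerically. A minimal sketch in Python: the 2×2 matrix is a hypothetical example, and since the chain is time-homogeneous, P^(m,n) depends only on n − m, so the identity P^(0,5) = P^(0,2) P^(2,5) becomes P^5 = P^2 · P^3.

```python
# Numerical check of Chapman-Kolmogorov: P^(m,n) = P^(m,r) P^(r,n).
# The transition matrix below is a made-up illustrative example.

def matmul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(P, n):
    """P raised to the n-th power (n >= 1)."""
    out = P
    for _ in range(n - 1):
        out = matmul(out, P)
    return out

P = [[0.4, 0.6],
     [0.2, 0.8]]

# m = 0, r = 2, n = 5: P^(0,5) should equal P^(0,2) P^(2,5) = P^2 P^3.
lhs = matpow(P, 5)
rhs = matmul(matpow(P, 2), matpow(P, 3))
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```

The same check works for any intermediate time r with m ≤ r ≤ n, mirroring equation (2.12).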
1. The state space is usually represented with the index set {0, 1, 2, ..., m − 1} or {1, 2, ..., m}, assuming m discrete states. A similar convention holds for a discrete parameter space.
2. P_ij^(t) = P(X_t = j | X_{t−1} = i), the one-step dependency assumption. This is the probability of state j at time t given state i at time t − 1.
From the time-homogeneous assumption, we have
P_ij^(t) = P_ij, ∀ t ∈ T.
If the equation in (2) does not hold, we have a non-homogeneous first order Markov Chain, otherwise, we
have a stationary (time homogeneous) first order Markov Chain.
In the latter case, if P_j(t) = P(X_t = j), then it can be shown that P_j(t) = Σ_{i∈S} P_i(t − 1) P_ij, j = 1, 2, ..., m, where P_ij = P(X_{t+1} = j | X_t = i) is the one-step transition probability.
Now, by the total probability rule,
P_j(t) = Σ_{i∈S} P_i(t − 1) P_ij, j = 1, 2, ...; t = 0, 1, 2, ...
with t = 0 being the initial time and P_i(t) = P(X_t = i).
In matrix form, P(t) = P(0) P^t, where P^t is the matrix P raised to the power t.
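The matrix relation P(t) = P(0) P^t can be illustrated with a short sketch; the two-state matrix and the three-step horizon below are illustrative choices, not from the text.

```python
# Sketch of P(t) = P(0) P^t: push an initial distribution through t steps
# of a hypothetical two-state chain via the total probability rule.

def step(dist, P):
    """One step of P_j(t) = sum_i P_i(t-1) P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

P = [[0.4, 0.6],
     [0.2, 0.8]]
dist = [1.0, 0.0]      # P(0): start in state 0 with certainty

for _ in range(3):     # t = 3
    dist = step(dist, P)

# The result stays a probability vector at every t.
assert abs(sum(dist) - 1.0) < 1e-12
assert abs(dist[0] - 0.256) < 1e-12   # hand-computed (0.4 -> 0.28 -> 0.256)
```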
(i) 0 ≤ P_ij ≤ 1, ∀ i, j ∈ S;
(ii) Σ_{j∈S} P_ij = 1, for each i ∈ S.
A square matrix which satisfies these two properties is said to be a stochastic matrix or a transition matrix.
E.g.:
( 0.4  0.6 )
( 0.2  0.8 )
is a stochastic matrix.
E.g.:
( 0.4  0.6 )
( 0.6  0.4 )
is a doubly stochastic matrix (both its rows and its columns sum to 1).
1. Bernoulli Process: This is a process with discrete state and parameter spaces. Denote by S_n the number of successes in n trials; clearly {S_n} is a stochastic process with state space {0, 1, 2, ...} and
P(S_n = k) = C(n, k) p^k q^{n−k}, k = 0, 1, 2, ..., n.
2. Poisson Process: This is a process with discrete state and continuous parameter space. Consider events occurring under the following postulates:
* There is a constant λ such that the probabilities of occurrence of events in a small interval of length ∆t are given as follows:
- P[number of events occurring in (t, t + ∆t] = 0] = 1 − λ∆t + o(∆t)
- P[one event occurring in (t, t + ∆t]] = λ∆t + o(∆t)
and P(X(t) = k) = e^{−λt} (λt)^k / k!, k = 0, 1, 2, ...
The time intervals between consecutive occurrences of events in a Poisson Process are independent random variables, identically distributed with pdf f(t) = λe^{−λt}, t > 0 (the exponential distribution).
3. Gaussian Process: This is a process with continuous state and parameter spaces.
4. Wiener Process: This is a process with continuous state and parameter spaces.
1. The process X(t), t ≥ 0, has stationary independent increments. This means that for t1, t2 ∈ T with t1 < t2, the distribution of X(t2) − X(t1) is the same as that of X(t2 + h) − X(t1 + h) for any h > 0 (stationarity), and that increments over non-overlapping time intervals (t1, t2) and (t3, t4), with t1 < t2 < t3 < t4, are independent.
2. For any given time interval (t1, t2), X(t2) − X(t1) is normally distributed with mean 0 and variance σ²(t2 − t1).
P_ij^(r+s) = Σ_{k∈S} P_ik^(r) P_kj^(s),
i.e. in matrix form, P^(r) · P^(s) = P^(r+s).    (2)
Let r = n − 1, s = 1; then
P_ij^(n) = Σ_{k∈S} P_ik^(n−1) P_kj,
which is the (i, j)th element of P^{n−1} · P = P^n.
∴ P^(n) = P^n.
For a two-state chain with state space S = {0, 1}, the transition matrix is denoted by:
P = ( P00  P01 )
    ( P10  P11 )
Theorem 2.3.0.2. For a two-state Markov Chain with one-step probability transition matrix
P = ( 1−a   a  )
    (  b   1−b ),  0 ≤ a, b ≤ 1, |1 − a − b| < 1,
the n-step transition matrix is
P^n = 1/(a+b) × ( b + a(1−a−b)^n    a − a(1−a−b)^n )
                ( b − b(1−a−b)^n    a + b(1−a−b)^n )
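The closed form in this theorem can be checked against direct matrix multiplication; the values a = 0.3, b = 0.5, n = 7 below are illustrative, not from the text.

```python
# Check the two-state closed form for P^n against direct multiplication.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

a, b, n = 0.3, 0.5, 7          # illustrative values with |1 - a - b| < 1
d = 1.0 - a - b
s = a + b

# Closed form from the theorem.
closed = [[(b + a * d**n) / s, (a - a * d**n) / s],
          [(b - b * d**n) / s, (a + b * d**n) / s]]

# Direct computation of P^n.
P = [[1 - a, a], [b, 1 - b]]
Pn = P
for _ in range(n - 1):
    Pn = matmul(Pn, P)

assert all(abs(closed[i][j] - Pn[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```

As n grows, (1 − a − b)^n → 0 and both rows of P^n approach (b/(a+b), a/(a+b)), the stationary distribution.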
lim_{n→∞} P^n = π    (2.21)
and the stationary distribution satisfies
(π0, π1, π2, ..., πn) P = (π0, π1, π2, ..., πn).
Note: If a Markov Chain has a long-run or steady-state distribution, then the long-run distribution is also stationary. However, a Markov chain can have a stationary distribution without a limiting distribution.
Example: For a two-state chain with transition matrix
P = ( 0  1 )
    ( 1  0 ),
π P = π ⟹ (1/2, 1/2) P = (1/2, 1/2),
so π = (1/2, 1/2) is the stationary distribution.
However,
P^n = ( 0  1 )   if n is odd,    P^n = ( 1  0 )   if n is even.
      ( 1  0 )                         ( 0  1 )
Hence lim_{n→∞} P^n does not exist, and there is no long-run limiting distribution.
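Both claims in the example can be verified mechanically: the uniform vector is stationary, yet the powers of P oscillate, so no limit exists.

```python
# The flip chain P = [[0,1],[1,0]]: stationary distribution exists,
# limiting distribution does not.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.0, 1.0], [1.0, 0.0]]
I = [[1.0, 0.0], [0.0, 1.0]]

# pi P = pi: the uniform vector is stationary.
pi = [0.5, 0.5]
assert [sum(pi[i] * P[i][j] for i in range(2)) for j in range(2)] == pi

# P^n oscillates: even powers are the identity, odd powers are P itself.
assert matmul(P, P) == I
assert matmul(matmul(P, P), P) == P
```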
Questions:
lim_{n→∞} P^n = π    (2.22)
3. Prove that if P and Q are k×k stochastic matrices, then PQ is a stochastic matrix.
Definition 2.3.0.6. A probability vector (P0, P1, P2, ..., Pm) is said to be stationary with respect to a stochastic matrix P if (P0, P1, ..., Pm) P = (P0, P1, ..., Pm).
Definition 2.4.0.1. State j is said to be accessible from state i if j can be reached from i in a finite
number of steps. If two states i and j are accessible to each other, then they are said to communicate.
1. Reflexivity: i ←→ i, since
P_ij^(0) = δ_ij = { 1, i = j; 0, i ≠ j }.
2. Symmetry: if i ←→ j then j ←→ i.
3. Transitivity: if i −→ j and j −→ k, then i −→ k.
Proof: ∃ r, s ∈ Z+ such that P_ij^(r) > 0 and P_jk^(s) > 0.
But P_ik^(r+s) = Σ_{l∈S} P_il^(r) P_lk^(s) ≥ P_ij^(r) P_jk^(s) > 0.
Therefore, i → k.
The reflexivity, symmetry and transitivity properties together make communication an equivalence relation. The set of all states of a Markov Chain that communicate (with each other) can therefore be grouped into a single equivalence class.
Markov Chains may have more than one such equivalence class. If there is more than one, it is not possible to have communicating states in different equivalence classes. However, it is possible to have states in one class that are accessible from another class.
Definition 2.4.1.1. If a Markov Chain has all its states belonging to one equivalence class, it is said to
be irreducible.
Clearly, in an irreducible chain, all states communicate. The period of a state i is defined as the greatest common divisor of all integers n ≥ 1 for which P_ii^(n) > 0.
Theorem 2.4.1.2. If i and j are states of a Markov Chain and i ←→ j, then i and j have the same
period.
It follows from the above theorem that periodicity is also a class property. A class of states with period 1 is said to be aperiodic.
If all the states of a Markov Chain communicate and have period 1, then the chain is said to be irreducible and aperiodic. If P is the transition matrix of a finite Markov Chain which is irreducible and aperiodic, it is easy to show that ∃ n ∈ Z+, n ≥ 1, for which P^(n) has no zero element (all states are accessible). The matrix P is then said to be regular or primitive, and the chain is then also said to be regular.
Theorem: Let P be the transition probability matrix of an irreducible, aperiodic, m-state, finite, time-homogeneous Markov Chain. Then:
a) lim_{n→∞} P^n = π, where π is the matrix whose rows are each equal to α = (π1, π2, ..., πm);
b) ∃ constants c and r (c > 0, 0 < r < 1) such that |P_ij^(n) − πj| ≤ c r^n, ∀ i, j = 1, 2, ..., m;
c) P π = π P = π.
Note: The convergence property in b) is known as geometric ergodicity. A chain with this type of property is said to be strongly ergodic (we are sure that the chain will converge, and geometrically fast).
Any finite, irreducible, time-homogeneous Markov Chain {X_m | m ∈ T} with transition probability matrix P has a stationary distribution V satisfying
V P = V,
where V is a row probability vector which can be found using this equation together with the constraint V I = 1, with I being a column vector of the same dimension as V and all entries equal to unity:
V I = v0 + v1 + ... + vm = Σ_j vj = 1.
A finite, irreducible, aperiodic Markov Chain {X_m | m ∈ T} with a regular transition probability matrix P has a long-run and stationary distribution α given by α P = α, where α = (π1, π2, ..., πm) and
πi = lim_{n→∞} P(X_n = i).
1. Show that the chain is regular and find the long-run distribution.
Solution
The state space is S = {0, 1, 2}.
The chain is irreducible because all the states {0, 1, 2} communicate.
P_11^(1) = 1/8 > 0, so the period of state 1 is 1; hence state 1 is aperiodic. Since periodicity is a class property, the chain is aperiodic. Since it satisfies the aperiodicity property and all the states communicate, the chain is regular.
Note: If the chain is regular, then the limiting distribution exists.
∴ lim_{n→∞} P^n = π exists.
By inference, the chain is ergodic.
(π0 π1 π2) ( 0    2/3  1/3 )
           ( 3/8  1/8  1/2 ) = (π0 π1 π2)
           ( 1/2  1/2  0   )
and π0 + π1 + π2 = 1.
Solving the system of equations yields
(π0, π1, π2) = (0.3, 0.4, 0.3).
∴ lim_{n→∞} P^n = π = ( 0.3  0.4  0.3 )
                      ( 0.3  0.4  0.3 )
                      ( 0.3  0.4  0.3 )
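A quick numerical cross-check of this worked example: since the chain is regular, iterating π ← πP from any starting probability vector converges to the long-run distribution.

```python
# Re-derive the long-run distribution of the worked 3-state example by
# fixed-point iteration pi <- pi P (converges because the chain is regular).

P = [[0.0, 2/3, 1/3],
     [3/8, 1/8, 1/2],
     [1/2, 1/2, 0.0]]

pi = [1/3, 1/3, 1/3]   # any initial probability vector works
for _ in range(200):
    pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]

# Matches the solution obtained algebraically above.
assert all(abs(x - y) < 1e-9 for x, y in zip(pi, (0.3, 0.4, 0.3)))
```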
Assignment: Prove that if a Markov Chain is irreducible and aperiodic with a doubly stochastic transition matrix and has m states, then the limiting probabilities are given by:
πj = 1/m, j = 1, 2, ..., m.
Let {X_n} be a Markov Chain with state space S = {0, 1, 2, ..., m − 1}. Define
f_ij^(n) = P(X_n = j, X_r ≠ j, r = 1, 2, ..., n − 1 | X_0 = i).
This means that at time 0 the process was in state i, and in all intermediate steps r the process never reached j until time n; and
f_ii^(*) = Σ_{n=1}^∞ f_ii^(n)  and  μ_ij = Σ_{n=1}^∞ n f_ij^(n),
where
f_ii^(n) = P(X_n = i, X_r ≠ i, r = 1, 2, ..., n − 1 | X_0 = i).
2. f_ij^(n) is the first passage time distribution; when j = i, we shall call f_ii^(n) the recurrence time distribution of state i.
3. μ_ij is the expected value of the first passage time; μ_ii = μ_i is the mean recurrence time.
Definition 2.5.1.1. A state i is said to be recurrent if and only if, starting from i, the eventual return to state i is certain, i.e. f_ii^(*) = 1.
Types of recurrence
M = (I − Q)^{−1},  F = M R = (f_ij),
‖σ_ij²‖ (the matrix of variances) = M(2 M_D − I) − M₂, i, j ∈ T,
where
M_D = diag(M) = diag(μ_{r+1,r+1}, ..., μ_{m,m})
and M₂ is the matrix of squared entries of M:
M₂ = ( μ²_{r+1,r+1}  μ²_{r+1,r+2}  ...  μ²_{r+1,m} )
     (     ...            ...      ...     ...     )
     ( μ²_{m,r+1}         ...      ...  μ²_{m,m}   )
Branching Process
Definition 3.0.0.1. Consider a population of individuals which gives rise to a new population. Assume that the probability that an individual in his lifetime gives rise to r new individuals (offspring) is P_r for r = 0, 1, . . ., and that individuals reproduce independently of one another. The new population forms a first generation, which in turn reproduces a second generation, which in turn produces a third generation, etc. For n = 0, 1, 2, . . . let X_n be the size of the nth generation, so that X_0, the zeroth generation, is the initial population; then {X_n; n = 0, 1, . . .} is a Markov Chain called a Branching Process. Its state space is {0, 1, 2, . . .}.
Note that 0 is a recurrent (absorbing) state since, clearly, if P = (P_ij) is the TPM, then P_00 = 1.
Also, if P_0 > 0, it can be shown that all other states are transient.
Let f(Z) = Σ_{r=0}^∞ P_r Z^r, |Z| ≤ 1, be the probability generating function (p.g.f.).
Note: f(0) = P_0, where P_r is the coefficient of Z^r in the expansion of f(Z):
f(Z) = P_0 Z^0 + P_1 Z^1 + P_2 Z^2 + . . . = P_0 + P_1 Z + P_2 Z^2 + . . .
f′(Z) = df/dZ = P_1 + 2P_2 Z + 3P_3 Z² + . . .
Since
X_{n+1} = Σ_{i=1}^{X_n} Z_i,
the p.g.f. of X_{n+1} is
f_{n+1}(t) = E(t^{X_{n+1}}) = E(E[t^{X_{n+1}} | X_n])
= Σ_{j=0}^∞ E(t^{X_{n+1}} | X_n = j) · P(X_n = j)
= Σ_{j=0}^∞ E[t^{Σ_{i=1}^{X_n} Z_i} | X_n = j] · P(X_n = j)
= Σ_{j=0}^∞ E[t^{Z_1 + Z_2 + ... + Z_j} | X_n = j] · P(X_n = j)
= Σ_{j=0}^∞ [Π_{i=1}^{j} E(t^{Z_i} | X_n = j)] · P(X_n = j)
⟹ f_{n+1}(t) = Σ_{j=0}^∞ [f(t)]^j P(X_n = j) = f_n[f(t)] = f[f_n(t)].
Since X_0 = 1, f_1 = f.
Assumptions:
1. X_0 = 1
4. P_0 + P_1 < 1
With these assumptions, the function f(t) is strictly convex on the unit interval of the real axis, as depicted in Figure 3.1.
m = E(X_1) = Σ_{r=0}^∞ r P_r
σ² = Var(X_1) = E(X_1²) − [E(X_1)]² = Σ_{r=0}^∞ r² P_r − m²
However,
f_n′(1) = Σ_{r=1}^∞ r P(X_n = r | X_0 = 1).
If n = 1:
f′(1) = Σ_{r=1}^∞ r P(X_1 = r | X_0 = 1) = Σ_{r=1}^∞ r P(X_1 = r) = E(X_1) = m, by independence,
and E(X_n) = m^n by induction.
f″_{n+1}(t) = f′(f_n(t)) f″_n(t) + f′_n(t)[f″(f_n(t)) f′_n(t)]
f″_{n+1}(1) = f′(f_n(1)) f″_n(1) + f′_n(1)[f″(f_n(1)) f′_n(1)]
Now:
X_{n+1} = Σ_{r=1}^{X_n} Z_r and P_ij = P(X_{n+1} = j | X_n = i), so
P_ij = P(Σ_{r=1}^{X_n} Z_r = j | X_n = i) = P(Σ_{r=1}^{i} Z_r = j),
where Z_r is as defined previously. It is clear that X_{n+1} is a r.v., and since Z_r, r = 1, 2, ..., are iid, we get
E(X_n) = μ E(X_{n−1})
= μ² E(X_{n−2})
= · · ·
= μ^{n−1} E(X_1)
= μ^n E(X_0) = μ^n, since X_0 = 1.
E(X_n) = μ^n; n ≥ 1 . . . . . . (4)
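The relation E(X_n) = μ^n can be checked numerically through the pgf composition f_{n+1}(t) = f[f_n(t)], since E(X_n) = f_n′(1). The offspring law used below (P_0 = 1/4, P_1 = 1/4, P_2 = 1/2, so μ = 5/4) is an illustrative choice.

```python
# Check E(X_n) = m^n for a branching process via pgf composition,
# estimating f_n'(1) with a one-sided finite difference.

probs = [0.25, 0.25, 0.5]                      # illustrative P_r, r = 0, 1, 2
m = sum(r * p for r, p in enumerate(probs))    # mean offspring count = 1.25

def f(t):
    """Offspring pgf f(t) = sum_r P_r t^r."""
    return sum(p * t**r for r, p in enumerate(probs))

def f_n(t, n):
    """n-fold composition: pgf of the generation size X_n (X_0 = 1)."""
    for _ in range(n):
        t = f(t)
    return t

n, h = 3, 1e-7
mean_n = (f_n(1.0, n) - f_n(1.0 - h, n)) / h   # numerical f_n'(1)
assert abs(mean_n - m**n) < 1e-4
```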
Also, from (3):
Var(X_n) = μ^{n−1} σ² + μ² Var(X_{n−1})
= μ^{n−1} σ² + μ^n σ² + μ⁴ Var(X_{n−2})
= · · ·
= μ^{n−1} σ² [1 + μ + μ² + . . . + μ^{n−1}]
= μ^{n−1} σ² Σ_{r=0}^{n−1} μ^r
Hence
Var(X_n) = μ^{n−1} σ² (1 − μ^n)/(1 − μ), if μ ≠ 1;
Var(X_n) = n σ², if μ = 1. . . . . . (5)
From (4) and (5), it is clear that the mean and variance of X_n increase or decrease geometrically according as μ > 1 or μ < 1.
Now recall Chebyshev's inequality: since E(X_n) = μ^n < ∞ and Var(X_n) < ∞, then ∀ ε > 0,
P(|X_n − E(X_n)| > ε) ≤ Var(X_n)/ε².
As n → ∞ with μ < 1, E(X_n) → 0 and Var(X_n) → 0, so
P(X_n = 0) → 1.
When μ < 1, then, it is certain that the population of size X_n will become extinct as n → ∞.
∴ q = lim_{n→∞} f_n(0), and q = f(q).
Proof:
f_{n+1}(t) = f[f_n(t)].
Now let q_n = P(X_n = 0) = f_n(0). Then
q_{n+1} = f_{n+1}(0) = f[f_n(0)] = f(q_n).
Theorem 3.0.0.3. If the mean number of offspring born to an individual, μ = E(X_1 | X_0 = 1) = f′(1), is less than or equal to 1, then the probability of ultimate extinction of the population is one: extinction is certain.
If the mean number is greater than one, then the probability of ultimate extinction is the unique non-negative solution less than 1 of the equation t = f(t), where f(t) = Σ_{r=0}^∞ t^r P_r.
Solution: E(X) = 0 × 1/4 + 1 × 1/4 + 2 × 1/2 = 5/4,
so E(X) = 5/4 = μ > 1.
q = f(q), where
f(q) = Σ_{r=0}^∞ q^r P_r = q⁰ P_0 + q¹ P_1 + q² P_2 = 1/4 + q/4 + q²/2.
Thus q = 1/4 + q/4 + q²/2, i.e. 2q² − 3q + 1 = 0, with roots q = 1 and q = 1/2; the probability of ultimate extinction is the smaller root, q = 1/2.
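The proof above suggests a direct numerical route: iterating q_{n+1} = f(q_n) from q_0 = 0 converges to the extinction probability. A sketch for the offspring law P_0 = 1/4, P_1 = 1/4, P_2 = 1/2 of this example:

```python
# Ultimate-extinction probability by fixed-point iteration q <- f(q),
# for the example offspring law P0 = 1/4, P1 = 1/4, P2 = 1/2 (m = 5/4 > 1).

def f(q):
    return 0.25 + 0.25 * q + 0.5 * q**2

q = 0.0
for _ in range(200):
    q = f(q)

# q = f(q) reduces to 2q^2 - 3q + 1 = 0, whose root below 1 is q = 1/2.
assert abs(q - 0.5) < 1e-9
```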
Let the process X(t) represent the number of times an event occurs in the time interval (0, t]. For s < t, define
P_ij(s, t) = P(X(t) = j | X(s) = i).
2. For a sufficiently small ∆t, there is a constant λ such that the probabilities of occurrence of events in (t, t + ∆t] are given as follows:
where o(∆t) contains all terms that tend to zero faster than ∆t, i.e.
lim_{∆t→0} o(∆t)/∆t = 0.
o(∆t) + o(∆t) = o(∆t) and c·o(∆t) = o(∆t), c a constant.
Theorem 4.0.0.1. Under the above postulates, the number of events occurring in any interval of length t is a Poisson random variable with parameter λt. Thus,
P_n(t) = P(X(t) = n | X(0) = 0) = e^{−λt}(λt)^n / n!, n = 0, 1, ...
By the independence assumption of postulate 1 and the third option under postulate 2, in conjunction with the Chapman-Kolmogorov equation, we have:
From (1): P_0(t + ∆t) = P_0(t) − λ∆t·P_0(t) + o(∆t)
From (2): P_n(t + ∆t) − P_n(t) = λ∆t·P_{n−1}(t) − λ∆t·P_n(t) + o(∆t)
From (4):
lim_{∆t→0} [P_n(t + ∆t) − P_n(t)]/∆t = λP_{n−1}(t) − λP_n(t) + lim_{∆t→0} o(∆t)/∆t
Eqn (5) and Eqn (6) form a system of difference-differential equations that can be solved recursively as follows:
Multiply both sides of (5) and (6) by e^{λt}. From (8):
e^{λt} P_n′(t) + λe^{λt} P_n(t) = λe^{λt} P_{n−1}(t)
Writing Q_n(t) = e^{λt} P_n(t), this becomes Q_n′(t) = λ Q_{n−1}(t).
For n = 0: Q_0′(t) = 0, so Q_0(t) = c. Let t = 0: Q_0(0) = 1 ⟹ c = 1.
For n = 1: Q_1′(t) = λ, so ∫ Q_1′(t) dt = ∫ λ dt and Q_1(t) = λt + c; Q_1(0) = c, ∴ c = 0 and
Q_1(t) = λt.
Q_2′(t) = λQ_1(t) = λ²t, so
Q_2(t) = λ²t²/2 + c,  Q_2(0) = c = 0,
Q_2(t) = λ²t²/2 = (λt)²/2!.
Q_3′(t) = λQ_2(t) = λ³t²/2, so
Q_3(t) = λ³t³/6 + c,  Q_3(0) = 0 = c,
Q_3(t) = λ³t³/6 = (λt)³/3!.
...
Q_{n−1}(t) = λ^{n−1}t^{n−1}/(n − 1)!
and, with Q_n(0) = c = 0,
Q_n(t) = λⁿtⁿ/n!.
Putting Q_n(t) = e^{λt} P_n(t):
λⁿtⁿ/n! = e^{λt} P_n(t)
P_n(t) = e^{−λt}(λt)ⁿ/n!, n ≥ 0.
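The recursion Q_n′(t) = λQ_{n−1}(t) and the resulting Poisson probabilities can be sanity-checked numerically; λ = 2 and t = 1.5 below are illustrative values.

```python
# Check P_n(t) = e^{-lam t}(lam t)^n / n! against the recursion
# Q_n'(t) = lam Q_{n-1}(t), where Q_n(t) = (lam t)^n / n!.

import math

lam, t = 2.0, 1.5

def P(n, t):
    return math.exp(-lam * t) * (lam * t)**n / math.factorial(n)

# The probabilities sum to 1 (truncating a negligible tail).
assert abs(sum(P(n, t) for n in range(60)) - 1.0) < 1e-12

# Q_n' = lam Q_{n-1}, checked with a central finite difference at n = 4.
Qn = lambda n, t: (lam * t)**n / math.factorial(n)
n, h = 4, 1e-6
dQ = (Qn(n, t + h) - Qn(n, t - h)) / (2 * h)
assert abs(dQ - lam * Qn(n - 1, t)) < 1e-6
```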
Then T_n is less than or equal to t iff the number of events that have occurred by time t is at least n. That is, if X(t) = the number of events in (0, t], then
P(T_n ≤ t) = P(X(t) ≥ n) = Σ_{j=n}^∞ P(X(t) = j) = Σ_{j=n}^∞ (λt)^j e^{−λt}/j!
so
F(t) = Σ_{j=n}^∞ (λt)^j e^{−λt}/j!
f_n(t) = d/dt Σ_{j=n}^∞ (λt)^j e^{−λt}/j!
= Σ_{j=n}^∞ [ j(λt)^{j−1}·λ·e^{−λt}/j! + (λt)^j(−λe^{−λt})/j! ]
= Σ_{j=n}^∞ [ λe^{−λt}(λt)^{j−1}/(j − 1)! − λe^{−λt}(λt)^j/j! ]
= λe^{−λt} Σ_{j=n}^∞ [ (λt)^{j−1}/(j − 1)! − (λt)^j/j! ]
The sum telescopes, leaving only the first term:
f_n(t) = λe^{−λt}(λt)^{n−1}/(n − 1)!, t > 0,
which is the gamma (Erlang) density with parameters n and λ.
Note: It is easy to show that if the initial observation of the process is made at s, s > 0, at which time X(s) = i, then
P_in^{s,t} = P(X(t) = n | X(s) = i) = e^{−λ(t−s)}[λ(t − s)]^{n−i}/(n − i)!, n ≥ i.
Proof:
Since the population will die out iff the families of each of the k members of the initial generation die out, and since each family is assumed to act independently, the desired probability is q^k.
Question: Suppose that in a discrete branching process, the probability of an individual having k offspring is given by
P_k = e^{−λ} λ^k / k!, k = 0, 1, . . .
However, because the population of interest is heterogeneous, λ itself is a random variable, distributed according to a gamma distribution:
g(λ) = (q/p)^α λ^{α−1} exp{−(q/p)λ} / Γ(α), λ ≥ 0;
g(λ) = 0, otherwise,
where q, p, α are strictly positive constants and p + q = 1. Find the p.g.f. of the number of offspring of a single individual in the population. Hence find the probability of ultimate extinction.
Solution:
f(x, λ) = f(x | λ) · g(λ)
∫_{∀λ} f(x, λ) dλ = ∫_{∀λ} f(x | λ) · g(λ) dλ
h(x) = ∫_{∀λ} f(x | λ) · g(λ) dλ
...
f(t) = Σ_{∀x} t^x · h(x)
g(λ) = (q/p)^α λ^{α−1} e^{−(q/p)λ} / Γ(α), λ > 0
⟹ λ ~ Gamma(α, q/p)
h(x) = ∫_{∀λ} f(x | λ) · g(λ) dλ
= ∫_0^∞ [e^{−λ} λ^x / x!] · [(q/p)^α λ^{α−1} e^{−(q/p)λ} / Γ(α)] dλ
= [(q/p)^α / (Γ(α) x!)] ∫_0^∞ λ^{α+x−1} e^{−λ(q/p + 1)} dλ
but
∫_0^∞ λ^r e^{−cλ} dλ = Γ(r + 1)/c^{r+1},
so
∫_0^∞ λ^{α+x−1} e^{−λ(q/p+1)} dλ = Γ(α + x)/(q/p + 1)^{α+x}
h(x) = [(q/p)^α / (Γ(α) x!)] · Γ(α + x)/(q/p + 1)^{α+x}
= [Γ(α + x)/(Γ(α) x!)] · (q/p)^α/(q/p + 1)^{α+x}
= C(α + x − 1, x) · q^α p^x    (since p + q = 1 gives q/p + 1 = 1/p)
= C(α + x − 1, x) · q^α (1 − q)^x, x = 0, 1, . . .
The p.g.f. is then
f(t) = Σ_{x=0}^∞ t^x · C(α + x − 1, x) · q^α (1 − q)^x
Identity: C(α + x − 1, x) = (−1)^x C(−α, x)
f(t) = Σ_{x=0}^∞ t^x (−1)^x C(−α, x) q^α (1 − q)^x
= q^α Σ_{x=0}^∞ C(−α, x)(−tp)^x, p = 1 − q
Identity: (1 − t)^{−α} = Σ_{x=0}^∞ C(−α, x)(−t)^x
⟹ f(t) = q^α (1 − tp)^{−α}
The assumption of a constant parameter λ in the Poisson Process may not be realistic in physical phenomena such as population growth. A more general pure birth process can be obtained by making the parameter λ depend on the state of the process.
Consider a process of events occurring under the following postulates. Suppose the event has occurred n times in time (0, t]. Then the occurrence or non-occurrence of the event during (t, t + ∆t], for a sufficiently small ∆t, is independent of the time since the last occurrence. Further, the probabilities of events are given as follows:
Let P_n(t) = P(X(t) = n) be the probability that n events occur in time (0, t]. Using the Chapman-Kolmogorov equation for transitions in the intervals of time (0, t] and (t, t + ∆t], we have
P_n′(t) = dP_n(t)/dt = λ_{n−1} P_{n−1}(t) − λ_n P_n(t) . . . (6.1)
Yule Process ⟹ λ_n = nλ:
P_n′(t) = (n − 1)λ P_{n−1}(t) − nλ P_n(t)    (6.2)
P_m(0) = P(X(0) = m) = { 1, m = j; 0, m ≠ j }
From (6.2), P_j′(t) = (j − 1)λ P_{j−1}(t) − jλ P_j(t), but (j − 1)λ P_{j−1}(t) = 0, because P_{j−1}(t) = 0: starting from j, a pure birth process can never reach j − 1.
P_j′(t) = −λj P_j(t)
P_j′(t)/P_j(t) = −λj
∫ [P_j′(t)/P_j(t)] dt = ∫ −λj dt
ln P_j(t) = −λjt + c
P_j(t) = e^{−λjt} · e^c
At t = 0, P_j(0) = 1, so 1 = e^0 · e^c and e^c = 1; hence P_j(t) = e^{−λjt}.
For n = j + 1, the integrating factor is I = e^{∫λ(j+1)dt} = e^{λ(j+1)t}. This gives us
e^{λ(j+1)t} P_{j+1}(t) = ∫ λj e^{λt} dt = j e^{λt} + c.
At t = 0, P_{j+1}(0) = 0 ⟹ 0 = j + c ⟹ c = −j.
Continuing recursively,
P_{j+k}(t) = C(j + k − 1, j − 1) e^{−λjt} (1 − e^{−λt})^k, k = 0, 1, . . . → negative binomial
or, equivalently,
P_n(t) = C(n − 1, j − 1) e^{−λjt} (1 − e^{−λt})^{n−j}, n = j, j + 1, . . .
which shows that the population size at time t has a negative binomial distribution in which the probability of success in a single trial is e^{−λt}.
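The negative binomial form of the Yule-process law can be verified numerically: the probabilities over n ≥ j sum to 1 and the mean is j e^{λt}. The values j = 3, λ = 0.7, t = 1.2 below are illustrative.

```python
# Check P_n(t) = C(n-1, j-1) e^{-lam j t} (1 - e^{-lam t})^{n-j} is a
# probability distribution with mean j e^{lam t}.

import math

j, lam, t = 3, 0.7, 1.2
p = math.exp(-lam * t)               # per-trial "success" probability

def P(n):
    return math.comb(n - 1, j - 1) * p**j * (1 - p)**(n - j)

N = 2000                             # truncation of the infinite sums
total = sum(P(n) for n in range(j, N))
mean = sum(n * P(n) for n in range(j, N))

assert abs(total - 1.0) < 1e-9
assert abs(mean - j * math.exp(lam * t)) < 1e-6
```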
2. For the pure birth process, find 1(a) and 1(b) by the method of p.g.f.
3. Derive for the pure birth process the results in 1(a) and 1(b) using the difference-differential equation P_n′(t) = λ(n − 1)P_{n−1}(t) − λn P_n(t).
Solution:
(a) E[X(t) | X(0) = j] = j/e^{−λt} = j e^{λt}
(b) Var[X(t) | X(0) = j] = j(1 − e^{−λt})/(e^{−λt})² = (j − j e^{−λt})·(e^{λt})² = j e^{λt}(e^{λt} − 1)
Suppose a population of initial size i > 0 individuals dies off at a certain rate, eventually reducing the size to zero. When the population size is n, let μ_n be the death rate, defined as follows: in an interval (t, t + ∆t], the probability that one death occurs is μ_n∆t + o(∆t), while the probability that no death occurs is 1 − μ_n∆t + o(∆t), and all other probabilities are negligible, o(∆t). Also assume that the occurrence of a death in the interval (t, t + ∆t] is independent of the time since the last death. Let P_n(t) = P[X(t) = n] be the probability that there are n individuals in the population at time t. By the Chapman-Kolmogorov equations for transitions in the intervals of time (0, t] and (t, t + ∆t], we have:
P_n′(t) = μ_{n+1} P_{n+1}(t) − μ_n P_n(t)    (7.1)
For n = j, the term μ_{j+1} P_{j+1}(t) is 0 because P_{j+1}(t) = 0: in a pure death process starting from j, the population can never exceed j. As before,
ln P_j(t) = −μjt + c, so P_j(t) = e^{−μjt} · e^c = e^{−μjt}.
For n = j − 1, the integrating factor is I = e^{∫μ(j−1)dt} = e^{μ(j−1)t}, giving
e^{μ(j−1)t} P_{j−1}(t) = ∫ μj e^{−μt} dt = −j e^{−μt} + c.
At t = 0, P_{j−1}(0) = 0, so c = j.
Continuing recursively,
P_n(t) = C(j, n) (e^{−μt})^n (1 − e^{−μt})^{j−n}, n = 0, 1, . . . , j, i.e. X(t) ~ Bin(j, e^{−μt}).
This shows that the population size is binomially distributed with mean and variance given by
1. E[X(t)|X(0) = j] = je−µt
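The binomial law for the pure death process is easy to verify numerically; j = 5, μ = 0.4, t = 2 below are illustrative values.

```python
# Check the pure-death result X(t) ~ Bin(j, e^{-mu t}): probabilities
# sum to 1 and the mean is j e^{-mu t}.

import math

j, mu, t = 5, 0.4, 2.0
p = math.exp(-mu * t)                # per-individual survival probability

probs = [math.comb(j, n) * p**n * (1 - p)**(j - n) for n in range(j + 1)]

assert abs(sum(probs) - 1.0) < 1e-12
assert abs(sum(n * q for n, q in zip(range(j + 1), probs)) - j * p) < 1e-12
```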
Birth is an event which signifies an increase in the population; death is an event which signifies a decrease. Suppose these two types of events occur under the following postulates:
1. Birth: If the population size is n (≥ 0) at time t, then during the following infinitesimal interval (t, t + ∆t] the probability that a birth will occur is λ_n∆t + o(∆t), where lim_{∆t→0} o(∆t)/∆t = 0; hence the probability of no birth is 1 − λ_n∆t + o(∆t). Births occurring in (t, t + ∆t] are independent of the time since the last occurrence.
2. Death: If the population size is n > 0 at time t, then during the following infinitesimal interval (t, t + ∆t] the probability that a death will occur is μ_n∆t + o(∆t), and 1 − μ_n∆t + o(∆t) is the probability that no death occurs in (t, t + ∆t]. Deaths occurring in (t, t + ∆t] are independent of the time since the last occurrence.
3. When the population size is 0 at time t, the probability is 0 that a death occurs during (t, t + ∆t].
Note: More than one change (birth or death) takes place in (t, t + ∆t] with probability o(∆t).
4. For the same population size, births and deaths occur independently of each other.
Let X(t) be the population size at time t. Define
P_{i,n}^{s,t} = P(X(t) = n | X(s) = i).
This process is time-homogeneous, and therefore we shall use the definitions
P_{n,n−1}^{t,t+∆t} = [μ_n∆t + o(∆t)][1 − λ_n∆t + o(∆t)] = μ_n∆t − μ_nλ_n(∆t)² + o(∆t) = μ_n∆t + o(∆t) ... (1)
P_{n,n}^{t,t+∆t} = [1 − μ_n∆t + o(∆t)][1 − λ_n∆t + o(∆t)] = 1 − (λ_n + μ_n)∆t + o(∆t) ... (2)
P_{n,n+1}^{t,t+∆t} = [λ_n∆t + o(∆t)][1 − μ_n∆t + o(∆t)] = λ_n∆t + o(∆t) ... (3)
P_{n,j}^{t,t+∆t} = o(∆t), j ≠ n − 1, n, n + 1 ... (4)
Any other step that does not reduce the population by 1 or increase it by 1 has probability of order o(∆t); e.g. transitions to n − 2 or n + 2 are o(∆t).
For transitions occurring in the non-overlapping intervals (0, t] and (t, t + ∆t], based on equations (1) through (4), the Chapman-Kolmogorov equations take the form
P_n(t + ∆t) = (1 − λ_n∆t − μ_n∆t)P_n(t) + [λ_{n−1}∆t]P_{n−1}(t) + [μ_{n+1}∆t]P_{n+1}(t) + o(∆t)
= P_n(t) − (λ_n + μ_n)∆t·P_n(t) + λ_{n−1}∆t·P_{n−1}(t) + μ_{n+1}∆t·P_{n+1}(t) + o(∆t)
P_n′(t) = −(λ_n + μ_n)P_n(t) + λ_{n−1}P_{n−1}(t) + μ_{n+1}P_{n+1}(t), n = 0, 1, 2, ...
with initial conditions P_n(0) = 1 if n = i, 0 if n ≠ i, and the convention P_{−1}(t) = 0. Now take λ_n = nλ and μ_n = nμ.
Then
d/dt E[X(t)] = d/dt Σ_{n=0}^∞ n P_n(t) = Σ_{n=0}^∞ n P_n′(t)
= Σ_{n=0}^∞ n[−n(λ + μ)P_n(t) + (n − 1)λP_{n−1}(t) + (n + 1)μP_{n+1}(t)]
= −(λ + μ) Σ_{n=0}^∞ n² P_n(t) + λ Σ_{n=0}^∞ n(n − 1)P_{n−1}(t) + μ Σ_{n=0}^∞ n(n + 1)P_{n+1}(t)
which simplifies to
d/dt E[X(t)] = (λ − μ)E[X(t)]
dE[X(t)]/E[X(t)] = (λ − μ)dt
E[X(t)] = e^{(λ−μ)t} · e^c
E[X(t) | X(0) = i] = i e^{(λ−μ)t}
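The mean growth law E[X(t) | X(0) = i] = i e^{(λ−μ)t} can be checked by integrating the master equations P_n′(t) directly on a truncated state space with a simple Euler scheme; λ, μ, i and the grid below are illustrative choices, and the tolerance is loose to absorb the Euler discretization error.

```python
# Euler integration of P_n'(t) = -(n*lam + n*mu) P_n + lam(n-1) P_{n-1}
# + mu(n+1) P_{n+1} on states 0..N, compared with i e^{(lam-mu)t}.

import math

lam, mu, i = 0.5, 0.3, 2
N, dt, T = 200, 1e-3, 1.0            # truncation, step size, horizon

P = [0.0] * (N + 1)
P[i] = 1.0                           # P_n(0) = 1 if n = i, else 0

for _ in range(int(T / dt)):
    dP = [0.0] * (N + 1)
    for n in range(N + 1):
        dP[n] = -(lam * n + mu * n) * P[n]
        if n >= 1:
            dP[n] += lam * (n - 1) * P[n - 1]
        if n + 1 <= N:
            dP[n] += mu * (n + 1) * P[n + 1]
    P = [p + dt * d for p, d in zip(P, dP)]

mean = sum(n * p for n, p in enumerate(P))
assert abs(mean - i * math.exp((lam - mu) * T)) < 1e-2
```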
For the extinction probability (with X(0) = 1),
P_0(t) = μ[1 − e^{−(λ−μ)t}] / [λ − μe^{−(λ−μ)t}]
and
P_n(t) = [1 − P_0(t)][1 − η_t] η_t^{n−1}, where η_t = λ[1 − e^{−(λ−μ)t}] / [λ − μe^{−(λ−μ)t}].
Clearly P_n(t) has a geometric distribution modified by the initial term, and lim_{t→∞} P_0(t) is the probability of ultimate extinction.
Let ρ = λ/μ. When λ < μ, e^{−(λ−μ)t} = e^{(μ−λ)t} → ∞ and
lim_{t→∞} P_0(t) = lim_{t→∞} μ(−e^{(μ−λ)t})/(−μe^{(μ−λ)t}) = 1,
while when λ > μ, e^{−(λ−μ)t} → 0 and lim_{t→∞} P_0(t) = μ/λ = ρ^{−1}.
⟹ lim_{t→∞} P_0(t) = { 1, if ρ ≤ 1 (λ ≤ μ); ρ^{−1}, if ρ > 1 (λ > μ) }.
That is, ultimate extinction is certain if the death rate is at least as large as the birth rate.
Question: The p.g.f. of X(t), the number of individuals in the population at time t of the simple linear pure birth and death process with X(0) = 1, is
φ(z, t) = [μ(1 − α) − (λ − μα)z] / [μ − λα − λ(1 − α)z], if λ ≠ μ;
φ(z, t) = [1 − (λt − 1)(z − 1)] / [1 − λt(z − 1)], if λ = μ.
Find:
1. P_0(t) when λ ≠ μ.
2. P_0(t) when λ = μ.
and comment on the results with special reference to the intrinsic growth rate (λ − μ).
lim_{t→∞} P_0(t) = { 1, if λ ≤ μ; μ/λ, if λ > μ }
and, for initial population size i,
lim_{t→∞} P_0(t) = { 1, if λ ≤ μ; (μ/λ)^i, if λ > μ }.
The preceding theorem can be proved by noting that, since we have assumed independence and no interaction among the members, we may view the population as the sum of i independent simple birth and death processes, each beginning with a single member. Thus, if φ(z, t) is the p.g.f. corresponding to X(0) = 1, then the p.g.f. corresponding to X(0) = i is
ψ(z, t) = [ (μ(1 − α) − (λ − μα)z) / (μ − λα − λ(1 − α)z) ]^i, if λ ≠ μ;
ψ(z, t) = [ (1 − (λt − 1)(z − 1)) / (1 − λt(z − 1)) ]^i, if λ = μ,
where α = e^{(λ−μ)t}.
Definition 8.0.0.2. A birth and death process is called a linear growth process with immigration if λ_n = nλ + ν and μ_n = nμ, with λ, μ, ν > 0, where ν denotes the immigration rate. Such processes occur naturally in the study of biological reproduction and population growth.
Note: This process is irreducible and therefore there is no absorbing state. The probability of ultimate extinction is zero, since λ_0 = ν > 0. Consider a population subject to death, birth, emigration and immigration under the following assumptions:
1. In a small time interval of length ∆t, the probability for a given individual to
2. Immigrants are subject to death, emigration and give birth in the same manner as existing
members.
3. In a small time interval of length ∆t, the probability of the population being
Question: Based on the above assumptions, if X(t) is the total population size at time t and P_n(t) = P[X(t) = n], n = 0, 1, 2, . . ., prove that
d/dt P_n(t) = [λ(n − 1) + ν]P_{n−1}(t) + μ(n + 1)P_{n+1}(t) − [n(λ + μ) + ν]P_n(t)
and that
E[X(t) = n | X(0) = i] = (ν/(λ − μ))[e^{(λ−μ)t} − 1] + i e^{(λ−μ)t}, if λ ≠ μ;
E[X(t) = n | X(0) = i] = νt + i, if λ = μ.
Solution:
Let
P_{i,n}^{(s,t)} = P[X(t) = n | X(s) = i],
from the Chapman-Kolmogorov equation for a Markov process with discrete state space. Then
P_{n,n−1}(t, t + ∆t) = [μn∆t + o(∆t)][1 − λn∆t + o(∆t)][1 − ν∆t + o(∆t)] = μn∆t + o(∆t)
P_{n,n}(t, t + ∆t) = [1 − μn∆t + o(∆t)][1 − λn∆t + o(∆t)][1 − ν∆t + o(∆t)] + o(∆t) = 1 − [n(λ + μ) + ν]∆t + o(∆t)
P_{n−1,n}(t, t + ∆t) = [λ(n − 1)∆t + ν∆t + o(∆t)] = [λ(n − 1) + ν]∆t + o(∆t)
P_{n,j}(t, t + ∆t) = o(∆t), j ≠ n − 1, n, n + 1.
This implies
P_n(t + ∆t) = [1 − [n(λ + μ) + ν]∆t + o(∆t)]·P_n(t) + [[λ(n − 1) + ν]∆t + o(∆t)]·P_{n−1}(t) + [μ(n + 1)∆t + o(∆t)]·P_{n+1}(t) + o(∆t)
= P_n(t) − [n(λ + μ) + ν]∆t·P_n(t) + [λ(n − 1) + ν]∆t·P_{n−1}(t) + μ(n + 1)∆t·P_{n+1}(t) + o(∆t)
E[X(t) | X(0) = i] = Σ_{n=0}^∞ n · P_n(t)
d/dt E[X(t)] = Σ_{n=0}^∞ n · d/dt P_n(t)
= Σ_{n=0}^∞ n[−[n(λ + μ) + ν]P_n(t) + [λ(n − 1) + ν]P_{n−1}(t) + μ(n + 1)P_{n+1}(t)]
= −(λ + μ) Σ_{n=0}^∞ n² P_n(t) − ν Σ_{n=0}^∞ n P_n(t) + λ Σ_{n=0}^∞ n(n − 1)P_{n−1}(t) + ν Σ_{n=0}^∞ n P_{n−1}(t) + μ Σ_{n=0}^∞ n(n + 1)P_{n+1}(t)
Theorem 8.0.0.3. If a Markov Process is irreducible, then the limiting distribution lim_{n→∞} P_n exists and is independent of the initial conditions of the process. The limits (P_n, n ∈ S, the state space) are such that they either vanish identically, i.e. P_n = 0 ∀ n ∈ S, or are all positive and form a probability distribution, i.e. P_n > 0 ∀ n ∈ S and Σ_{n∈S} P_n = 1.