SMSTC (2024/25)
Stochastic Processes
Chapter 3: Markov chains in continuous time
Mateusz Majka, Heriot-Watt University ([email protected])
www.smstc.ac.uk

Contents
3.1 Markov property and Q-matrices
3.2 The Chapman–Kolmogorov and Kolmogorov forward/backward equations
3.3 Construction of the Markov chain
3.4 Stationary distribution
3.5 Ergodic theorem
3.6 Applications
3.6.1 Birth-and-death processes
3.6.2 Simple Markovian queueing models
3.7 Exercises
3.1 Markov property and Q-matrices

Let $(X(t))_{t \ge 0}$ be a time-homogeneous Markov chain on a countable state space $S$, with transition probabilities
$$p_{ij}(t) = \mathbb{P}(X(t+s) = j \mid X(s) = i), \qquad i, j \in S, \ s, t \ge 0. \tag{3.1}$$
The law of the Markov chain $(X(t))_{t \ge 0}$ is determined by its initial distribution and the $p_{ij}$:
$$\mathbb{P}(X(t) = j) = \sum_{i \in S} \mathbb{P}(X(0) = i)\, p_{ij}(t).$$
We also have
$$p_{ij}(0) = \delta_{ij}, \qquad \text{where } \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \tag{3.2}$$
In this course we shall only consider the case where, for each fixed $t$ and $i$, $\sum_{j \in S} p_{ij}(t) = 1$.
In continuous time there is no smallest time step, and hence we can no longer speak of a one-step transition matrix. However, we can often condense the information in the functions $p_{ij}(t)$ into a single fundamental matrix associated with the Markov chain, which serves as an analogue of the $P$-matrix in the discrete theory. This is the Q-matrix.
To proceed we also need to assume some regularity. We call the process standard if the transition probabilities are continuous at $0$, i.e., if
$$\lim_{t \downarrow 0} p_{ij}(t) = p_{ij}(0). \tag{3.3}$$
Lemma 3.1. For a standard process, pij (t) is a continuous function of t for all i, j.
We leave the proof as an exercise. Henceforth we assume the continuity of the functions $p_{ij}$. This also implies, via a non-trivial argument (see e.g. [4]), differentiability of the $p_{ij}$.
For all $i, j \in S$, define $q_{ij} = p'_{ij}(0)$. Then for all $t, h \ge 0$,
$$\mathbb{P}(X(t+h) = j \mid X(t) = i) = p_{ij}(h) = p_{ij}(0) + q_{ij} h + o(h) = \delta_{ij} + q_{ij} h + o(h) \quad \text{as } h \downarrow 0. \tag{3.4}$$
Here, for $i \neq j$, $q_{ij}$ is the (instantaneous) transition rate of the process from state $i$ to state $j$. The matrix $Q = (q_{ij})_{i,j \in S}$ is called the transition rate matrix, the generator matrix, or simply the Q-matrix of the Markov chain.
We shall assume the following:
$$0 \le q_{ij} < \infty \quad \text{for all } i, j \text{ with } j \neq i, \tag{3.5}$$
$$0 \le -q_{ii} < \infty \quad \text{for all } i, \tag{3.6}$$
$$\sum_{j \in S} q_{ij} = 0 \quad \text{for all } i. \tag{3.7}$$
These conditions are satisfied for all reasonable processes. (For example, for finite $S$, (3.7) follows since $\sum_{j \in S} p_{ij}(t) = 1$ for all $t$ implies $\sum_{j \in S} p'_{ij}(0) = 0$.)
It is convenient to define $q_i = -q_{ii}$ for all $i$ (so $q_i \ge 0$). Then, by (3.7), $q_i = \sum_{j \neq i} q_{ij}$.
The matrix Q effectively describes the dynamics of the Markov chain. It plays a role analogous
to that of the transition matrix of a discrete-time Markov chain. In particular, under reasonable
conditions (see below) the functions pij (·) are uniquely determined by Q. Thus, in applica-
tions, the distribution of a Markov process is usually defined via the Q-matrix and the initial
distribution.
Conversely, given a matrix Q = (qij )i,j∈S satisfying (3.5), (3.6), (3.7) above, there always exists
a homogeneous Markov process with Q as transition rate matrix. This fact can be proved by
actually constructing the paths of such a process: see Section 3.3 below.
Example 3.1 (Birth-and-death process). Here $S = \{0, 1, 2, \dots\}$ and $X(t)$ may be thought of as, for example, a population size at time $t$. We have
$$q_{i,i+1} = \lambda_i, \quad i \ge 0 \quad \text{(birth rate in state } i\text{)},$$
$$q_{i,i-1} = \mu_i, \quad i \ge 1 \quad \text{(death rate in state } i\text{)},$$
$$q_{ij} = 0 \text{ for all other } j \neq i, \qquad q_{00} = -\lambda_0, \qquad q_{ii} = -(\lambda_i + \mu_i), \quad i \ge 1.$$
For a linear birth-and-death process we would take $\lambda_i = \lambda i$ (so $\lambda$ can be thought of as a birth rate per individual) and $\mu_i = \mu i$ (so $\mu$ can be thought of as a death rate per individual).
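As a computational aside, the following Python sketch builds the Q-matrix of a linear birth-and-death process truncated to a finite state space $\{0, \dots, N\}$ so that it can be handled numerically. The function name, the truncation level N and the rates lam, mu are illustrative choices, and suppressing births in state N is one possible truncation convention.

```python
import numpy as np

def linear_bd_q_matrix(N, lam, mu):
    """Q-matrix of a linear birth-and-death process, truncated to
    S = {0, ..., N}: q_{i,i+1} = lam * i, q_{i,i-1} = mu * i, with
    births suppressed in state N so that rows still sum to zero."""
    Q = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        if i < N:
            Q[i, i + 1] = lam * i    # birth rate lambda_i = lam * i
        if i > 0:
            Q[i, i - 1] = mu * i     # death rate mu_i = mu * i
        Q[i, i] = -Q[i].sum()        # enforce condition (3.7)
    return Q

Q = linear_bd_q_matrix(N=5, lam=1.0, mu=2.0)
assert np.allclose(Q.sum(axis=1), 0.0)       # rows sum to zero
```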
3.2 The Chapman–Kolmogorov and Kolmogorov forward/backward equations

We write $\mathbb{P}_i(\,\cdot\,) = \mathbb{P}(\,\cdot \mid X(0) = i)$. From the transition probabilities one can recover the full information about the law of the Markov chain: for $0 < t_1 < \cdots < t_n$ and $j_1, \dots, j_n \in S$,
$$\mathbb{P}_i(X(t_1) = j_1, \dots, X(t_n) = j_n) = p_{i j_1}(t_1)\, p_{j_1 j_2}(t_2 - t_1) \cdots p_{j_{n-1} j_n}(t_n - t_{n-1}).$$
As in the case of discrete time, it is convenient to express things in matrix notation. Let $P(t) = (p_{ij}(t))_{i,j \in S}$. Then the Chapman–Kolmogorov equations (3.8) can be written as
$$P(t+s) = P(t)\,P(s).$$
Theorem 3.1. For all $i, j \in S$ and $t \ge 0$,
$$p'_{ij}(t) = \sum_{k \in S} p_{ik}(t)\, q_{kj} \quad \text{(Kolmogorov forward equations)}, \tag{3.10}$$
$$p'_{ij}(t) = \sum_{k \in S} q_{ik}\, p_{kj}(t) \quad \text{(Kolmogorov backward equations)}. \tag{3.11}$$
The Kolmogorov forward and backward equations can also be written in matrix form, namely
$$P'(t) = P(t)\,Q \qquad \text{and} \qquad P'(t) = Q\,P(t).$$
Theorem 3.1 (and Theorem 3.4 below) is stated for a countable state space, but is only proved here in the finite case. The difficulty in the infinite case is one of interchanging limits; the proofs can be found in, e.g., [4].
Usually, and always when S is finite, given Q, the transition probabilities (pij (·), i, j ∈
S) are uniquely determined by either (3.10) or (3.11), together with the initial condi-
tion pij (0) = δij (that is, P (0) = I in matrix notation). Hence one way to “solve” these
equations is to guess the answer (sometimes easy) and to verify that it is indeed a solution.
For the case of a scalar function $p(t)$ and a scalar $q$ instead of the matrix-valued function $P$ and matrix $Q$, the corresponding equation
$$p'(t) = q\,p(t), \qquad p(0) = 1,$$
would have the unique solution $p(t) = e^{tq}$. In the matrix case one can argue similarly, under suitable conditions, defining
$$e^{Qt} = \sum_{k=0}^{\infty} \frac{Q^k t^k}{k!},$$
with $Q^0 = I$ (identity). Then we get the transition matrix function $P(t) = e^{Qt}$, which satisfies the Kolmogorov forward and backward equations.
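As a computational aside, the following Python sketch (using scipy.linalg.expm; the 3-state Q-matrix is an arbitrary illustrative choice) computes $P(t) = e^{Qt}$ and checks the semigroup property and both Kolmogorov equations by finite differences.

```python
import numpy as np
from scipy.linalg import expm

# An arbitrary 3-state Q-matrix: non-negative off-diagonal rates,
# rows summing to zero (conditions (3.5)-(3.7)).
Q = np.array([[-5.0,  2.0,  3.0],
              [ 5.0, -6.0,  1.0],
              [ 1.0,  2.0, -3.0]])

def P(t):
    return expm(Q * t)           # transition matrix function P(t) = e^{Qt}

t, s, h = 0.7, 0.4, 1e-6
assert np.allclose(P(t).sum(axis=1), 1.0)    # each row is a distribution
assert np.allclose(P(t + s), P(t) @ P(s))    # Chapman-Kolmogorov (3.8)
dP = (P(t + h) - P(t)) / h                   # finite-difference P'(t)
assert np.allclose(dP, P(t) @ Q, atol=1e-4)  # forward equation (3.10)
assert np.allclose(dP, Q @ P(t), atol=1e-4)  # backward equation (3.11)
```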
Example 3.2 (General birth process). A general birth process is a Markov process on the state space $S = \{0, 1, 2, \dots\}$ with the transition rate matrix (generator)
$$Q = \begin{pmatrix} -q_0 & q_0 & 0 & 0 & \cdots \\ 0 & -q_1 & q_1 & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$$
Let us now consider a birth process which has constant intensity of births, namely $q_i = q$ for all $i \in S$. The forward equation yields
$$\frac{dp_{jk}(t)}{dt} = -q\,p_{jk}(t) + q\,p_{j,k-1}(t),$$
where we interpret $p_{j,-1}(t) \equiv 0$. In particular, if $j = 0$, we have
$$\frac{dp_{0k}(t)}{dt} = -q\,p_{0k}(t) + q\,p_{0,k-1}(t). \tag{3.12}$$
The initial conditions are assumed to be $p_{00}(0) = 1$ and $p_{0i}(0) = 0$ for $i \ge 1$, so that the process starts in state $0$. In order to solve this differential equation, we will attempt to convert it into a partial differential equation for the probability generating function
$$G(s; t) = \mathbb{E}\,s^{X(t)} = \sum_{k \ge 0} p_{0k}(t)\, s^k, \qquad |s| < 1.$$
Multiplying (3.12) by $s^k$ and summing over $k \ge 0$, we obtain
$$\frac{\partial G(s; t)}{\partial t} = -q\,G(s; t) + qs\,G(s; t),$$
where we can interchange the derivative and the sum because $|s| < 1$ and the derivatives are bounded. For a fixed value of $s$ we see that
$$\frac{\partial G(s; t)}{\partial t} = -q(1-s)\,G(s; t),$$
so that
$$G(s; t) = G(s; 0)\, e^{-q(1-s)t}.$$
From the initial condition $G(s; 0) = 1$, we see that $X(t)$ follows a Poisson distribution with mean $qt$. In general,
$$p_{ij}(t) = p_{ij}(s, s+t) = e^{-qt}\,\frac{(qt)^{j-i}}{(j-i)!}, \qquad j \ge i.$$
We observe that the process we just derived is the Poisson process discussed in the previous chapter.
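As a numerical sanity check (an illustrative Python sketch; the rate q, time t and truncation level N are arbitrary choices), one can truncate the constant-rate birth process at a level N large enough that the neglected mass is negligible, and compare the first row of $e^{Qt}$ with the Poisson probabilities:

```python
import numpy as np
from scipy.linalg import expm
from scipy.stats import poisson

q, t, N = 2.0, 1.5, 60               # illustrative rate, time, truncation level
Q = np.zeros((N + 1, N + 1))
for i in range(N):
    Q[i, i + 1] = q                  # q_{i,i+1} = q
    Q[i, i] = -q                     # q_{ii} = -q (state N left absorbing)
p0 = expm(Q * t)[0]                  # p_{0k}(t) for k = 0, ..., N
assert np.allclose(p0[:20], poisson.pmf(np.arange(20), q * t), atol=1e-10)
```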
3.3 Construction of the Markov chain

By (3.4), as $h \downarrow 0$,
$$\mathbb{P}(X(t+h) \neq i \mid X(t) = i) = q_i h + o(h)$$
and, for $j \neq i$,
$$\mathbb{P}(X(t+h) = j \mid X(t) = i) = q_{ij} h + o(h),$$
so that, for $q_i \neq 0$,
$$\mathbb{P}(X(t+h) = j \mid X(t) = i,\ X(t+h) \neq i) = \frac{q_{ij} h + o(h)}{q_i h + o(h)} = \frac{q_{ij}}{q_i} + o(1) \quad \text{as } h \to 0.$$
Suppose the chain starts in a fixed state $X(0) = i$ for $i \in S$. Let $T_0 = 0$ and define recursively, for $n \ge 0$,
$$T_{n+1} = \inf\{t \ge T_n : X(t) \neq X(T_n)\}.$$
Thus $T_n$ is the $n$th jump time of $X$, that is, the $n$th time at which the process changes its state.
Theorem 3.2. Under the law $\mathbb{P}_i$ of the Markov chain started in $X(0) = i$, the random variables $T_1$ and $X(T_1)$ are independent. The distribution of $T_1$ is exponential with rate $q_i := \sum_{j \neq i} q_{ij}$, which means
$$\mathbb{P}_i(T_1 > t) = e^{-q_i t} \qquad \text{for } t \ge 0.$$
Moreover,
$$\mathbb{P}_i(X(T_1) = j) = \frac{q_{ij}}{q_i},$$
and the chain starts afresh at time $T_1$.
Let $X^*_n = X(T_n)$. Then $(X^*_n)_{n \in \mathbb{Z}_+}$ defines a discrete-time Markov chain, called the jump chain associated with $X$, with one-step transition matrix $P^*$ given by
$$p^*_{ij} = \begin{cases} q_{ij}/q_i & \text{if } i \neq j, \\ 0 & \text{if } i = j. \end{cases}$$
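Theorem 3.2 translates directly into a simulation algorithm: hold in state $i$ for an Exp($q_i$) time, then jump to $j$ with probability $q_{ij}/q_i$. The following Python sketch (the generator Q and the function name are illustrative choices) implements this jump-chain construction:

```python
import numpy as np

def simulate_ctmc(Q, i0, t_max, rng):
    """Simulate a path of the chain with generator Q started at i0, via
    exponential holding times and the jump chain (Theorem 3.2)."""
    times, states = [0.0], [i0]
    t, i = 0.0, i0
    while True:
        qi = -Q[i, i]
        if qi == 0.0:                        # absorbing state: stay put
            return times, states
        t += rng.exponential(1.0 / qi)       # holding time ~ Exp(q_i)
        if t > t_max:
            return times, states
        p_star = Q[i].copy()
        p_star[i] = 0.0
        p_star /= qi                         # jump probabilities q_{ij}/q_i
        i = int(rng.choice(len(p_star), p=p_star))
        times.append(t)
        states.append(i)

Q = np.array([[-5.0,  2.0,  3.0],
              [ 5.0, -6.0,  1.0],
              [ 1.0,  2.0, -3.0]])
times, states = simulate_ctmc(Q, i0=0, t_max=10.0, rng=np.random.default_rng(1))
```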
Exponential times
(Please also refer to Lecture 2 for an explanation of where the exponential distribution comes from.) Why does the exponential distribution play a special role for continuous-time Markov chains? Recall the Markov property in the following way: suppose $X(0) = i$, let $T_1$ be the time of the first jump, and let $t, h > 0$. Then, using the Markov property in the second step,
$$\mathbb{P}(T_1 > t + h \mid T_1 > t) = \mathbb{P}(T_1 > t + h \mid T_1 > t,\ X(t) = i) = \mathbb{P}(T_1 > t + h \mid X(t) = i) = \mathbb{P}(T_1 > h).$$
Hence the time T1 we have to wait for a jump satisfies the lack of memory property: if you have
waited for t time units and no jump has occurred, the remaining waiting time has the same
distribution as the original waiting time.
The only distribution with the lack of memory property is the exponential distribution. Re-
minder: if X is exponential with parameter λ, then P(X > x) = e−λx , x ≥ 0, and E(X) = 1/λ.
Another important property of the exponential distribution is the following:
Theorem 3.3. If $S$ and $T$ are independent exponentially distributed random variables with positive rates $\alpha$ and $\beta$, then their minimum $S \wedge T$ is also exponentially distributed, with rate $\alpha + \beta$, and it is independent of the event $\{S \wedge T = S\}$. Moreover,
$$\mathbb{P}(S \wedge T = S) = \frac{\alpha}{\alpha+\beta} \qquad \text{and} \qquad \mathbb{P}(S \wedge T = T) = \frac{\beta}{\alpha+\beta}.$$
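Both claims of Theorem 3.3 are easy to check by simulation; in the following Python sketch the rates and the sample size are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, n = 2.0, 3.0, 200_000             # illustrative rates, sample size
S = rng.exponential(1.0 / alpha, size=n)
T = rng.exponential(1.0 / beta, size=n)
m = np.minimum(S, T)
print(m.mean(), 1.0 / (alpha + beta))          # S ∧ T ~ Exp(alpha + beta): ~0.2
print((S < T).mean(), alpha / (alpha + beta))  # P(S ∧ T = S): ~0.4
```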
Recall the results from discrete-time Markov chain theory that (a) a non-closed class is
necessarily transient (all states are transient) and (b) a finite closed class is recurrent (all
states are recurrent). It follows from the above that these carry over into the continuous-
time setting.
Note that there are never any problems of periodicity with the process X(·) (although
there may be with the associated jump chain).
Example 3.3. Take the state space $S = \{1, 2, 3, 4, 5\}$ and the matrix of transition rates to be given by
$$Q = \begin{pmatrix} -4 & 1 & 0 & 0 & 3 \\ 2 & -6 & 3 & 1 & 0 \\ 0 & 0 & -5 & 5 & 0 \\ 0 & 0 & 4 & -4 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$
Then the transition matrix $P^*$ of the jump chain is given by
$$P^* = \begin{pmatrix} 0 & \tfrac14 & 0 & 0 & \tfrac34 \\ \tfrac13 & 0 & \tfrac12 & \tfrac16 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$
(with the convention $p^*_{55} = 1$ for the absorbing state $5$, where $q_5 = 0$).
[Transition diagram: states 1, 2, 3 in the top row and 5, 4 below, with arrows 1 → 2, 2 → 1, 2 → 3, 1 → 5, 2 → 4, 3 → 4 and 4 → 3.]
Hence we also have the following division of the state space into classes:
{1, 2}: non-closed; hence transient.
{3, 4}: closed, finite; hence recurrent.
{5}: closed, finite (a single absorbing state); hence recurrent.
Calculation of absorption probabilities: Let
$$y_i = \mathbb{P}_i(X(\cdot) \text{ eventually absorbed in state } 5), \qquad i \in S.$$
Clearly,
$$y_5 = 1, \qquad y_3 = y_4 = 0.$$
Moreover, writing $P^* = (p^*_{ij})$ for the transition matrix of the jump chain, we have, by the first-step analysis,
$$y_i = \sum_{j \in S} p^*_{ij}\, y_j, \qquad i = 1, 2,$$
that is, $y_1 = \tfrac14 y_2 + \tfrac34$ and $y_2 = \tfrac13 y_1$, which gives $y_1 = \tfrac{9}{11}$ and $y_2 = \tfrac{3}{11}$. Hence
$$\mathbb{P}_i(X(\cdot) \text{ eventually absorbed in class } \{3, 4\}) = 1 - y_i.$$
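The same first-step system can be solved mechanically: restrict $P^*$ to the transient states $T = \{1, 2\}$ and solve $(I - P^*_{TT})\, y_T = b$, where $b_i = p^*_{i5}$. A Python sketch for Example 3.3 (the variable names are illustrative):

```python
import numpy as np

# Jump-chain transition matrix P* of Example 3.3 (states 1..5 -> indices 0..4).
P_star = np.array([[0,   1/4, 0,   0,   3/4],
                   [1/3, 0,   1/2, 1/6, 0  ],
                   [0,   0,   0,   1,   0  ],
                   [0,   0,   1,   0,   0  ],
                   [0,   0,   0,   0,   1  ]])
T = [0, 1]                                   # transient states 1 and 2
b = P_star[np.ix_(T, [4])].ravel()           # one-step jumps into state 5
A = np.eye(len(T)) - P_star[np.ix_(T, T)]
y = np.linalg.solve(A, b)                    # absorption probabilities y_1, y_2
print(y)                                     # [9/11, 3/11] ≈ [0.818, 0.273]
```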
The process $X(\cdot)$ is called irreducible if
$$p_{ij}(t) > 0 \quad \text{for all } t > 0 \text{ and for all } i, j \in S \tag{3.13}$$
(that is, it is possible to get from any state to any other state in any given time $t > 0$). In fact, it is easy to see from Section 3.3 that if $q_{uv} > 0$ for some $u, v \in S$, then $p_{uv}(t) > 0$ for all $t > 0$. Now, if the jump chain is irreducible then, for any $i, j \in S$, there are some states $i_0, i_1, \dots, i_n$ with $i_0 = i$ and $i_n = j$ such that $q_{i_0 i_1} > 0, \dots, q_{i_{n-1} i_n} > 0$. Hence, by the Chapman–Kolmogorov equations,
$$p_{ij}(t) \ge p_{i_0 i_1}(t/n) \cdots p_{i_{n-1} i_n}(t/n) > 0.$$
Recall also that S is recurrent/transient with respect to the process X(·) if and only if it is so
with respect to the jump chain.
3.4 Stationary distribution

A distribution $\pi = (\pi_i)_{i \in S}$ on $S$ is called stationary if
$$\sum_{i \in S} \pi_i\, p_{ij}(t) = \pi_j \qquad \text{for all } j \in S \text{ and all } t \ge 0. \tag{3.14}$$
This definition should be compared with that for the stationary distribution of a discrete-time Markov chain, where it is only necessary to make the definition for one time step (the extension to any number of time steps then being automatic). However, in the continuous-time setting there is no smallest time step, so the definition must be made for all times. It follows that this definition is of no use for checking whether a distribution is stationary or for finding the stationary distribution. Fortunately we can use instead Theorem 3.4 below.
First, the following result gives a useful property of the stationary distribution of an irreducible
process.
Lemma 3.2. Let π be stationary. Then πj > 0 for all j ∈ S.
Theorem 3.4. A distribution $\pi$ on $S$ is stationary if and only if
$$\pi Q = 0 \tag{3.15}$$
(i.e. $\sum_{i \in S} \pi_i q_{ij} = 0$ for all $j \in S$).
Proof [for $S$ finite]. Suppose first that $\pi$ is stationary. Then, differentiating (3.14), we obtain
$$\sum_{i \in S} \pi_i\, p'_{ij}(t) = 0 \qquad \text{for all } j \in S \text{ and all } t \ge 0.$$
Putting $t = 0$ we obtain $\sum_{i \in S} \pi_i q_{ij} = 0$ for all $j \in S$, as required.
P
To prove the converse, suppose that (3.15) holds, i.e. that i∈S πi qij = 0 for all j ∈ S. Recall
the backward differential equations (3.11):
X
p′ij (t) = qik pkj (t) for all t ≥ 0 and all i, j ∈ S.
k∈S
Putting t = 0, and recalling that pij (0) = δij for all i, it follows that the above constant is
necessarily equal to πj as required. □
Example 3.4. Take $S = \{1, 2, 3\}$ and
$$Q = \begin{pmatrix} -5 & 2 & 3 \\ 5 & -6 & 1 \\ 1 & 2 & -3 \end{pmatrix}.$$
Then the process is irreducible and its stationary distribution $\pi$ is given by the solution of $\pi Q = 0$, i.e. by
$$-5\pi_1 + 5\pi_2 + \pi_3 = 0,$$
$$2\pi_1 - 6\pi_2 + 2\pi_3 = 0,$$
$$3\pi_1 + \pi_2 - 3\pi_3 = 0,$$
together with
$$\pi_1 + \pi_2 + \pi_3 = 1$$
(since $\pi$ is required to be a distribution). We thus obtain that the stationary distribution is given by $\pi = (1/3, 1/4, 5/12)$.
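Numerically, $\pi$ can be found as the normalised left null vector of $Q$, e.g. by replacing one of the (redundant) equations of $\pi Q = 0$ with the normalisation $\sum_i \pi_i = 1$. A Python sketch for this example:

```python
import numpy as np

Q = np.array([[-5.0,  2.0,  3.0],
              [ 5.0, -6.0,  1.0],
              [ 1.0,  2.0, -3.0]])
# pi Q = 0 gives |S| equations of which one is redundant; replace the
# last one by the normalisation sum(pi) = 1 and solve the linear system.
A = np.vstack([Q.T[:-1], np.ones(3)])
b = np.array([0.0, 0.0, 1.0])
pi = np.linalg.solve(A, b)
print(pi)   # [0.3333..., 0.25, 0.4166...] = (1/3, 1/4, 5/12)
```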
Note that the equation $\pi Q = 0$ may also be written as
$$\sum_{i \neq j} \pi_i\, q_{ij} = \sum_{i \neq j} \pi_j\, q_{ji} \qquad \text{for all } j \in S. \tag{3.16}$$
3.6 Applications
3.6.1 Birth-and-death processes
We take the state space to be S = {0, 1, . . . }, i.e. the nonnegative integers.
Example 3.5 (Simple continuous-time random walk with reflection at the origin). Let S =
{0, 1, 2, . . .}, and let the transition rate matrix Q be given by
M/M/1 queue. In this model there is a single server with Exp($\sigma$) service-time distributions, and all (Poisson, rate $\nu$) arrivals are accepted, queueing if necessary. Recall that $X(t)$ is defined to be the total number of individuals in the system at time $t$, including the one being served. Then the state space for the Markov process $X(\cdot)$ is $S = \{0, 1, 2, \dots\}$ and the transition rate matrix $Q$ is given by
$$q_{i,i+1} = \nu, \quad i \ge 0, \qquad q_{i,i-1} = \sigma, \quad i \ge 1.$$
For the general M/M/1 queue, the detailed balance equations (3.18) become
$$\pi_i \nu = \pi_{i+1} \sigma, \qquad i \ge 0.$$
Hence
$$\pi_{i+1} = \rho\, \pi_i = \rho^{i+1} \pi_0,$$
where $\rho = \nu/\sigma$. It follows that a stationary distribution exists if and only if $\rho < 1$, and the stationary distribution is a geometric distribution given by
$$\pi_i = (1 - \rho)\rho^i, \qquad i \ge 0.$$
Further, the expectation of $X(t)$ under this stationary distribution is given by
$$\mathbb{E}_\pi(X) = \sum_{i \ge 0} i\,\pi_i = \frac{\rho}{1-\rho} = \frac{\nu}{\sigma - \nu}.$$
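As a numerical check (an illustrative Python sketch; the rates ν = 1, σ = 2 and the truncation level N are arbitrary choices), truncating the M/M/1 generator at a large level N and solving $\pi Q = 0$ reproduces the geometric distribution and the mean $\rho/(1-\rho)$:

```python
import numpy as np

nu, sigma, N = 1.0, 2.0, 200     # illustrative rates and truncation level
rho = nu / sigma
Q = np.zeros((N + 1, N + 1))
for i in range(N + 1):
    if i < N:
        Q[i, i + 1] = nu         # arrival in state i
    if i > 0:
        Q[i, i - 1] = sigma      # service completion in state i
    Q[i, i] = -Q[i].sum()        # rows sum to zero
# Solve pi Q = 0 with the normalisation sum(pi) = 1.
A = np.vstack([Q.T[:-1], np.ones(N + 1)])
b = np.zeros(N + 1); b[-1] = 1.0
pi = np.linalg.solve(A, b)
assert np.allclose(pi[:10], (1 - rho) * rho ** np.arange(10), atol=1e-8)
print(pi @ np.arange(N + 1), rho / (1 - rho))   # mean: both ~ 1.0
```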
Simple Erlang loss system. Consider a system with finite capacity $C$. Hence the state space for the process $X(\cdot)$ describing the number of individuals in the system is $S = \{0, 1, \dots, C\}$. As usual, arrivals form a Poisson process with rate $\nu$; however, an arrival is accepted into the system if and only if the current state $i$ (immediately prior to the arrival) is less than $C$, and is rejected if the current state $i$ is equal to $C$. There is no queueing, and as usual service times are independent with Exp($\sigma$) distributions and are independent of the arrivals process.
Hence $(X(t))_{t \ge 0}$ is a birth-and-death Markov process on $S$ with transition rate matrix given by
$$q_{i,i+1} = \begin{cases} \nu, & i = 0, \dots, C-1, \\ 0, & i = C, \end{cases} \qquad q_{i,i-1} = i\sigma, \quad i = 1, \dots, C,$$
with $q_{ij} = 0$ for all other $j \neq i$.
Blocking probability. Since arrivals are Poisson, when the system is in equilibrium the blocking probability, i.e. the probability that a typical arrival is rejected, is simply the stationary probability that the system is full:
$$\pi_C = \frac{\rho^C / C!}{\sum_{i=0}^{C} \rho^i / i!}. \tag{3.21}$$
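Formula (3.21) is the Erlang B formula. It can be evaluated stably via the standard recursion $B_0 = 1$, $B_k = \rho B_{k-1} / (k + \rho B_{k-1})$, as in the following Python sketch (the values ρ = 5 and C = 10 are illustrative):

```python
def erlang_b(rho, C):
    """Blocking probability pi_C of (3.21), computed via the standard
    stable recursion B_0 = 1, B_k = rho*B_{k-1} / (k + rho*B_{k-1})."""
    B = 1.0
    for k in range(1, C + 1):
        B = rho * B / (k + rho * B)
    return B

print(erlang_b(rho=5.0, C=10))   # ~0.0184 for these illustrative values
```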
3.7 Exercises
3–1. A general two-state Markov chain is a process on $S = \{0, 1\}$ with
$$Q = \begin{pmatrix} -\lambda & \lambda \\ \mu & -\mu \end{pmatrix}$$
for some $\lambda, \mu > 0$. The forward differential equations (3.10) for the initial state $i$ become
$$p'_{i0}(t) = -\lambda\, p_{i0}(t) + \mu\, p_{i1}(t), \qquad p'_{i1}(t) = \lambda\, p_{i0}(t) - \mu\, p_{i1}(t),$$
and we also have the initial condition $p_{ij}(0) = \delta_{ij}$. Show that
$$p_{00}(t) = \frac{\mu}{\lambda+\mu} + \frac{\lambda}{\lambda+\mu}\, e^{-(\lambda+\mu)t}, \qquad p_{01}(t) = \frac{\lambda}{\lambda+\mu} - \frac{\lambda}{\lambda+\mu}\, e^{-(\lambda+\mu)t},$$
$$p_{10}(t) = \frac{\mu}{\lambda+\mu} - \frac{\mu}{\lambda+\mu}\, e^{-(\lambda+\mu)t}, \qquad p_{11}(t) = \frac{\lambda}{\lambda+\mu} + \frac{\mu}{\lambda+\mu}\, e^{-(\lambda+\mu)t}.$$
Hence show that
$$P(t) \to \begin{pmatrix} \frac{\mu}{\lambda+\mu} & \frac{\lambda}{\lambda+\mu} \\ \frac{\mu}{\lambda+\mu} & \frac{\lambda}{\lambda+\mu} \end{pmatrix} \qquad \text{as } t \to \infty.$$
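These closed-form expressions can be verified numerically against the matrix exponential $P(t) = e^{Qt}$; in the following Python sketch the values of λ, µ and t are arbitrary illustrative choices:

```python
import numpy as np
from scipy.linalg import expm

lam, mu, t = 2.0, 3.0, 0.8                   # illustrative parameters
Q = np.array([[-lam, lam],
              [ mu, -mu]])
r = lam + mu
e = np.exp(-r * t)
P_formula = np.array([[mu / r + lam / r * e, lam / r - lam / r * e],
                      [mu / r - mu  / r * e, lam / r + mu  / r * e]])
assert np.allclose(expm(Q * t), P_formula)   # matches the closed form
```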
Compute
$$y_i = \mathbb{P}_i(X(\cdot) \text{ eventually absorbed in state } 5), \qquad i \in S.$$
3–5. Let the Markov process $(X(t))_{t \ge 0}$ on the state space $S = \{1, 2, 3, 4\}$ have the matrix of transition rates
$$Q = \begin{pmatrix} -6 & 1 & 2 & 3 \\ 3 & -5 & 1 & 1 \\ 1 & 2 & -3 & 0 \\ 3 & 2 & 2 & -7 \end{pmatrix}.$$
Find its stationary distribution.
3–6. For the Markov chain in the previous question, find the one-step transition matrix of the
associated jump chain. Calculate the stationary distribution for this discrete-time chain.
Why is it different to the stationary distribution of the continuous-time process?
3–7. A continuous-time Markov process has state space $S = \{0, 1, \dots, c\}$. Its matrix of transition rates $Q = (q_{ij})$ is given by
References
[1] W. J. Anderson, Continuous-Time Markov Chains, Springer, 1991.
[2] K. L. Chung, Markov Chains with Stationary Transition Probabilities, 2nd edition, Springer, 1967.