1 IEOR 4701: Continuous-Time Markov Chains
Pij (t) is an example of a transition probability for the CTMC and represents the probability
that the chain will be in state j, t time units from now, given it is in state i now. As for discrete-
time Markov chains, we are assuming here that the distribution of the future, given the present
state X(s), does not depend on the present time s, but only on the present state X(s) = i,
whatever it is, and the amount of time that has elapsed, t, since time s. In particular, Pij (t) =
P (X(t) = j|X(0) = i). (This is the continuous-time analog of time-stationary transition
probabilities, P (Xn+k = j|Xn = i) = P (Xk = j|X0 = i), n ≥ 0, k ≥ 0, for discrete-time
Markov chains.)
But unlike the discrete-time case, there is no smallest "next time" until the next transition;
there is a continuum of such possible times t. For each fixed i and j, Pij (t), t ≥ 0, defines a
function which in principle can be studied by use of calculus and differential equations. Although
this makes the analysis of CTMCs more difficult/technical than for discrete-time chains, we
will nonetheless find that many similarities with discrete-time chains carry over, and many useful
results can be obtained.
¹ Pii > 0 is allowed, meaning that a transition back into state i from state i can occur. Each time this happens,
though, a new Hi, independent of the past, determines the new length of time spent in state i. See Section 1.9
for details.
A little thought reveals that the holding times must have the memoryless property and thus
are exponentially distributed. To see this, suppose that X(t) = i. Time t lies somewhere in
the middle of the holding time Hi for state i. The future after time t tells us, in particular, the
remaining holding time in state i, whereas the past before time t, tells us, in particular, the age
of the holding time (how long the process has been in state i). In order for the future to be
independent of the past given X(t) = i, we deduce that the remaining holding time must only
depend (in distribution) on i and be independent of its age; the memoryless property follows.
Since an exponential distribution is completely determined by its rate we conclude that for each
i ∈ S, there exists a constant (rate) ai > 0, such that the chain, when entering state i, remains
there, independent of the past, for an amount of time Hi ∼ exp(ai ):
A CTMC makes transitions from state to state, independent of the past, according to
a discrete-time Markov chain, but once entering a state remains in that state, inde-
pendent of the past, for an exponentially distributed amount of time before changing
state again.
Thus a CTMC can simply be described by a transition matrix P = (Pij ), describing how the
chain changes state step-by-step at transition epochs, together with a set of rates {ai : i ∈ S},
the holding time rates. Each time state i is visited, the chain spends, on average, E(Hi ) = 1/ai
units of time there before moving on. ai can be interpreted as the rate out of state i given that
X(t) = i; the intuitive idea being that the holding time will end, independent of the past, in
the next dt units of time w.p. ai dt.
Letting τn denote the time at which the nth change of state (transition) occurs, we see
that Xn = X(τn ), the state right after the nth transition, defines the underlying discrete-time
Markov chain, called the embedded Markov chain. {Xn } keeps track, consecutively, of the
states visited right after each transition, and moves from state to state according to the one-
step transition probabilities Pij = P (Xn+1 = j|Xn = i). This transition matrix (Pij ), together
with the holding-time rates {ai }, completely determines the CTMC.
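Since the pair (P, {ai}) completely determines the CTMC, it also tells us how to simulate one:
repeatedly draw an exponential holding time at the current state's rate, then draw the next state
from the embedded chain. Below is a minimal simulation sketch in Python (numpy assumed
available; the two-state example at the end is purely illustrative):

    import numpy as np

    def simulate_ctmc(P, a, x0, t_end, rng=None):
        """Simulate a CTMC path: embedded chain P plus holding-time rates a."""
        rng = rng or np.random.default_rng()
        t, x = 0.0, x0
        times, states = [0.0], [x0]
        while t < t_end:
            t += rng.exponential(1.0 / a[x])   # holding time H_x ~ exp(a[x])
            x = rng.choice(len(a), p=P[x])     # next state from the embedded chain
            times.append(t)
            states.append(x)
        return np.array(times), np.array(states)

    # Illustrative two-state chain: P01 = P10 = 1, rates a0 = 1, a1 = 2.
    P = np.array([[0.0, 1.0], [1.0, 0.0]])
    a = np.array([1.0, 2.0])
    times, states = simulate_ctmc(P, a, x0=0, t_end=100.0)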
As for discrete-time chains, the proof involves first conditioning on what state k the chain
is in at time s given that X(0) = i, yielding Pik (s), and then using the Markov property to
compute the probability that the chain, now in state k, would then be in state j after an
additional t time units, Pkj (t).
1. Poisson counting process: Let {N (t)} be the counting process of a Poisson process at
rate λ, with arrival times tn. Then ai = λ, i ≥ 0: if N (t) = i then, by the memoryless
property, the next arrival, arrival i + 1, will, independent of the past, occur after an
exponentially distributed amount of time at rate λ. The holding time in state i is simply
the interarrival time, ti+1 − ti, and τn = tn since N (t) only changes state at an arrival
time. Assuming that N (0) = 0, we conclude that Xn = N (tn) = n, n ≥ 0; the embedded
chain is deterministic. This is a very special kind of CTMC for several reasons: (1) all
holding times Hi have the same rate ai = λ, and (2) N (t) is a non-decreasing process; it
increases by one at each arrival time, and remains constant otherwise. As t → ∞,
N (t) → ∞ step by step. The sample paths of {N (t)} are step functions, since they look
like the steps of a stairway. This is an example of a Pure Birth process, since we can view
each arrival as a new birth in a population in which no one ever dies.
2. Consider the rat in the closed maze, in which at each transition the rat is equally likely
to move to one of its two neighboring cells, but where now we assume that the holding
time, Hi, in cell i is exponential at rate ai = i, i = 1, 2, 3, 4. Time is in minutes (say).
Let X(t) denote the cell that the rat is in at time t. Given the rat is now in cell 2 (say),
he will remain there, independent of the past, for an exponential amount of time with
mean 1/2, and then move, independent of the past, to either cell 1 or 4 w.p. 1/2. The
other transitions are similarly explained. {X(t)} forms a CTMC. Note how cell 4 has the
shortest holding time (mean 1/4 minutes), and cell 1 has the longest (mean 1 minute). Of
intrinsic interest is to calculate the long-run proportion of time (continuous time now)
that the rat spends in each cell:
pi def= lim_{t→∞} (1/t) ∫_0^t I{X(s) = i} ds, i = 1, 2, 3, 4.
We will learn how to compute these later; they serve as the continuous-time analog to the
discrete-time stationary probabilities πi for discrete-time Markov chains. (p1 , p2 , p3 , p4 ) is
called the stationary distribution for the CTMC.
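These pi can already be estimated by simulation, even before we learn to compute them: run
the chain for a long time and record the fraction of time spent in each cell. A minimal sketch
(assuming, for illustration, the 2×2 maze layout in which cell 1 neighbors cells 2 and 3, and
cell 4 neighbors cells 2 and 3):

    import numpy as np

    rng = np.random.default_rng(0)
    neighbors = {1: (2, 3), 2: (1, 4), 3: (1, 4), 4: (2, 3)}  # assumed maze layout
    a = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}                      # holding rates a_i = i

    t_end, t, cell = 10_000.0, 0.0, 1
    time_in = {i: 0.0 for i in a}
    while t < t_end:
        h = rng.exponential(1.0 / a[cell])       # holding time in the current cell
        time_in[cell] += min(h, t_end - t)       # truncate the final holding time
        t += h
        cell = neighbors[cell][rng.integers(2)]  # move to either neighbor w.p. 1/2

    p = {i: time_in[i] / t_end for i in a}       # estimates of p_1, ..., p_4
    print(p)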
3. FIFO M/M/1 queue: Arrivals to a single-server queue are Poisson at rate λ. There is
one line (queue) to wait in, and customers independently (and independent of the Poisson
arrival process) have service times {Sn } that are exponentially distributed at rate µ. We
assume that customers join the tail of the queue, and hence begin service in the order
that they arrive (FIFO). Let X(t) denote the number of customers in the system at time
t, where “system” means the line plus the service area. So (for example), X(t) = 2 means
that there is one customer in service and one waiting in line. Note that a transition can
only occur at customer arrival or departure times, and that departures occur whenever a
service completion occurs. At an arrival time X(t) jumps up by the amount 1, whereas
at a departure time X(t) jumps down by the amount 1.
Determining the rates ai : If X(t) = 0 then only an arrival can occur next, so the holding
time is given by H0 ∼ exp(λ) the time until the next arrival; a0 = λ, the arrival rate.
If X(t) = i ≥ 1, then the holding time is given by Hi = min{Sr , X} where Sr is the
remaining service time of the customer in service, and X is the time until the next
arrival. The memoryless property for both service times and interarrival times implies
that Sr ∼ exp(µ) and X ∼ exp(λ) independent of the past. Also, they are independent
r.v.s. because the service times are assumed independent of the Poisson arrival process.
Thus Hi ∼ exp(λ + µ) and ai = λ + µ, i ≥ 1. The point here is that each of the two
independent events "service completion will occur", "new arrival will occur" is competing
to be the next event so as to end the holding time.
The transition probabilities Pij for the embedded discrete-time chain are derived as fol-
lows: Xn denotes the number of customers in the system right after the nth transition.
Transitions are caused only by arrivals and departures.
If Xn = 0, then someone just departed leaving the system empty (for it is not possible for
the system to be empty right after an arrival). Thus P (Xn+1 = 1|Xn = 0) = 1 since only
an arrival can occur next if the system is empty. But whenever Xn = i ≥ 1, Xn+1 = i + 1
w.p. P (X < Sr ) = λ/(λ + µ), and Xn+1 = i − 1 w.p. P (Sr < X) = µ/(λ + µ), depending
on whether an arrival or a departure is the first event to occur next. So, P0,1 = 1, and
for i ≥ 1, Pi,i+1 = p = λ/(λ + µ), and Pi,i−1 = 1 − p = µ/(λ + µ). We conclude that
The embedded Markov chain for a FIFO M/M/1 queue is a simple random walk
(“up” probability p = λ/(λ + µ), “down” probability 1 − p = µ/(λ + µ)) that is
restricted to be non-negative (P0,1 = 1).
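The competing-exponentials facts used above, P (X < Sr) = λ/(λ + µ) and min{X, Sr} ∼
exp(λ + µ), are easy to confirm numerically. A quick Monte Carlo sketch (λ = 1, µ = 2 are
illustrative values):

    import numpy as np

    rng = np.random.default_rng(1)
    lam, mu, n = 1.0, 2.0, 10**6
    X = rng.exponential(1.0 / lam, n)      # times until the next arrival
    Sr = rng.exponential(1.0 / mu, n)      # remaining service times

    print(np.mean(X < Sr))                 # ~ lam/(lam + mu) = 1/3
    print(np.minimum(X, Sr).mean())        # ~ 1/(lam + mu) = 1/3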
This CTMC is an example of a Birth and Death process, since we can view each arrival
as a birth, and each departure as a death in a population. The birth rate when in state
i, denoted by λi , is λ for all i ≥ 0, since the time until the next birth (arrival) is always
exponentially distributed with rate λ. The death rate when in state i ≥ 1, denoted by µi ,
is µ since the time until the next death (departure) is always exponentially distributed
with rate µ. (µ0 = 0 since there can be no death when the system is empty.) Whenever
X(t) = i, the rate out of state i is the holding time rate, the sum of the birth and death
rates, ai = λi + µi .
4. M/M/c multi-server queue: This is the same as the FIFO M/M/1 queue except there
are now c servers working in parallel. As in a U.S. post office, arrivals wait in one line
(queue) and enter service at the first available free server. Once again we let X(t) denote
the number of customers in the system at time t. For illustration, let’s assume c = 2.
Then, for example, X(t) = 4 means that two customers are in service (each with their
own server) and two others are waiting in line. When X(t) = i ∈ {0, 1}, the holding times
are the same as for the M/M/1 model; a0 = λ, a1 = λ + µ. But when X(t) = i ≥ 2, both
remaining service times, denoted by Sr1 and Sr2 , compete to determine the next departure.
Since they are independent exponentials at rate µ, we deduce that the time until the next
departure is given by min{Sr1 , Sr2 } ∼ exp(2µ). The time until the next arrival is given
by X ∼ exp(λ) and is independent of both remaining service times. We conclude that
the holding time in any state i ≥ 2 is given by Hi = min{X, Sr1 , Sr2 } ∼ exp(λ + 2µ).
For the general case of c ≥ 2, the rates are determined analogously: ai = λ + iµ, 0 ≤
i ≤ c, ai = λ + cµ, i > c. This CTMC is another example of a Birth and Death process;
λi = λ, i ≥ 0 and µi = iµ, 0 ≤ i ≤ c, µi = cµ, i > c. Whereas the birth rate remains
constant, the death rate depends on how many busy servers there are, but is never larger
than cµ because c is the maximum possible number of busy servers at any given time.
5. M/M/∞ infinite-server queue: Here we have an M/M/c queue with c = ∞; a special case
of the M/G/∞ queue. Letting X(t) denote the number of customers in the system at
time t, we see that ai = λ + iµ, i ≥ 0 since there is no limit on the number of busy
servers. This CTMC is yet another example of a Birth and Death process; λi = λ, i ≥ 0
and µi = iµ, i ≥ 0. Thus the death rate is proportional to the population size, whereas
the birth rate remains constant.
For the embedded chain: P0,1 = 1 and Pi,i+1 = λ/(λ + iµ), Pi,i−1 = iµ/(λ + iµ), i ≥
1. This is an example of a simple random walk with state-dependent “up”, “down”
probabilities: at each step, the probabilities for the next increment depend on i, the
current state. Note how, as i increases, the down probability increases, and approaches 1
as i → ∞: when the system is heavily congested, departures occur rapidly.
the chain will visit state i at time H0 + H1 + · · · + Hi−1 , the sum of the first i holding times.
Thus the chain will visit all of the states by time

T = Σ_{i=0}^∞ Hi

(the sum of all the holding times), and we conclude that on average all states i ≥ 0 have been
visited by time E(T) = 2, a finite amount of time! But this implies that, w.p.1., all states
will be visited in a finite amount of time; P (T < ∞) = 1. Consequently, w.p.1., X(T + t) =
∞, t ≥ 0. This is an example of an explosive Markov chain: the number of transitions in a
finite interval of time is infinite.
We shall rule out this kind of behavior in the rest of our study, and assume from now on that
all CTMCs considered are non-explosive, by which we mean that the number of transitions in
any finite interval of time is finite. This will always hold for any CTMC with a finite state space,
or any CTMC for which there are only a finite number of distinct values for the rates ai, and
more generally whenever sup{ai : i ∈ S} < ∞. Every example given in the previous section
was non-explosive. Only the M/M/∞ queue needs some clarification, since ai = λ + iµ → ∞ as
i → ∞. But only arrivals and departures determine transitions, and the arrivals come from the
Poisson process at fixed rate λ, so the arrivals cannot cause an explosion; N (t) < ∞, t ≥ 0.
Now observe that during any interval of time (s, t], the number of departures can be no larger
than N (t), the total number of arrivals thus far, so they too cannot cause an explosion. In
short, the number of transitions in any interval (s, t] is bounded from above by 2N (t) < ∞; the
non-explosive condition is satisfied. This method of bounding the number of transitions by the
underlying Poisson arrival process works for essentially any CTMC queueing model.
Notions of recurrence, transience, and positive recurrence are analogous to those for discrete-time
chains: Let Tii denote the amount of (continuous) time until the chain re-visits state i (at a
later transition) given that X(0) = i (defined to be ∞ if it never does return); the return time to
state i. The chain will make its first transition at time Hi (holding time in state i), so Tii ≥ Hi.
State i is called recurrent if, w.p.1., the chain re-visits state i, that is, if P (Tii < ∞) = 1. The
state is called transient otherwise. This (with a little thought) is seen to be the same property
as for the embedded chain (because X(t) returns to state i for some t if and only if Xn does so
for some n):
Communication classes all have the same type of states: all together they are transient or
all together they are recurrent.
State i is called positive recurrent if, in addition to being recurrent, E(Tii ) < ∞; the
expected amount of time to return is finite. State i is called null recurrent if, in addition
to being recurrent, E(Tii ) = ∞; the expected amount of time to return is infinite. Unlike
recurrence, positive (or null) recurrence is not equivalent to that for the embedded chain: It
is possible for a state i to be positive recurrent for the CTMC and null recurrent for the
embedded chain (and vice versa). But positive and null recurrence are still class properties, so
in particular:
For an irreducible CTMC, all states together are transient, positive recurrent, or
null recurrent.
A CTMC is called positive recurrent if it is irreducible and all states are positive recurrent.
As for discrete-time chains (where we can use “π = πP ” to determine positive recurrence), it
turns out that determining positive recurrence for an irreducible CTMC is equivalent to finding
a probability solution {Pj : j ∈ S} to a set of linear equations called balance equations. These
probabilities are the stationary (limiting) probabilities for the CTMC; and can be interpreted
(regardless of initial condition X(0) = i) as the long-run proportion of time the chain spends
in state j:
Pj = lim_{t→∞} (1/t) ∫_0^t I{X(s) = j | X(0) = i} ds, w.p.1. (1)
As in discrete-time,
1.6 Positive recurrence implies existence of Pj
As for discrete-time Markov chains, positive recurrence implies the existence of stationary
probabilities by use of the SLLN. The basic idea is that for fixed state j, we can break up the
evolution of the CTMC into i.i.d. cycles, where a cycle begins every time the chain makes a
transition into state j. This yields an example of what is called a regenerative process because
we say it regenerates every time a cycle begins. The cycle lengths are i.i.d. distributed as Tjj ,
and during a cycle, the chain spends an amount of time in state j equal in distribution to the
holding time Hj . This leads to
Proposition 1.1 If {X(t)} is a positive recurrent CTMC, then the stationary probabilities Pj
as defined by Equation (1) exist and are given by
Pj = E(Hj)/E(Tjj) = (1/aj)/E(Tjj) > 0, j ∈ S.
In words: “The long-run proportion of time the chain spends in state j equals the expected
amount of time spent in state j during a cycle divided by the expected cycle length (between
visits to state j)”.
Proof : Fixing state j, we can break up the evolution of the CTMC into i.i.d. cycles, where a
cycle begins every time the chain makes a transition into state j. This follows by the Markov
property, since every time the chain enters state j, the chain starts over again from scratch
stochastically, and is independent of the past. Letting τn (j) denote the nth time at which
the chain makes a transition into state j, with τ0 (j) = 0, the cycle lengths, Tn (j) = τn (j) −
τn−1 (j), n ≥ 1, are i.i.d., distributed as the return time Tjj . {τn (j) : n ≥ 1} forms a renewal
point process, and we let Nj (t) denote the number of such points during (0, t]. From the
Elementary Renewal Theorem,
lim_{t→∞} Nj(t)/t = 1/E(Tjj). (5)
Letting

Jn = ∫_{τn−1(j)}^{τn(j)} I{X(s) = j} ds,
(the amount of time spent in state j during the nth cycle) we conclude that {Jn } forms an i.i.d.
sequence of r.v.s. distributed as the holding time Hj ; E(J) = E(Hj ). Thus
∫_0^t I{X(s) = j} ds ≈ Σ_{n=1}^{Nj(t)} Jn,

(1/t) ∫_0^t I{X(s) = j} ds ≈ (Nj(t)/t) × (1/Nj(t)) Σ_{n=1}^{Nj(t)} Jn.
Letting t → ∞ yields
Pj = E(Hj)/E(Tjj),
where the denominator is from (5) and the numerator is from the SLLN applied to {Jn }. Pj > 0
since E(Tjj ) < ∞ (positive recurrence assumption).
That (3) also holds, is a consequence of the fact that the cycle length distribution is contin-
uous; the distribution of Tjj has a density. In general, a positive recurrent regenerative process
with a continuous cycle-length distribution converges in the sense of (3). The details of this are
beyond the scope of the present course.
Theorem 1.1 An irreducible (and non-explosive) CTMC is positive recurrent if and only if
there is a (necessarily unique) probability solution {Pj : j ∈ S} (that is, Pj ≥ 0, j ∈ S, and
Σ_{j∈S} Pj = 1) to the set of “balance equations”:

aj Pj = Σ_{i≠j} Pi ai Pij, j ∈ S.

In words, “the rate out of state j is equal to the rate into state j, for each state j.”
In this case, Pj > 0, j ∈ S, and the Pj are the stationary (limiting) probabilities for the CTMC;
Equations (1)-(4) hold.
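For a finite state space the balance equations, together with the normalization Σ_j Pj = 1, form
a small linear system that can be solved directly. A minimal sketch (numpy assumed); it uses
the convention, discussed in Section 1.9, of counting a transition j → j on both sides, which
yields the same solution:

    import numpy as np

    def stationary_distribution(P, a):
        """Solve the balance equations a_j P_j = sum_i P_i a_i P_ij with sum_j P_j = 1."""
        P, a = np.asarray(P, float), np.asarray(a, float)
        A = (P * a[:, None]).T - np.diag(a)  # row j: sum_i a_i P_ij pi_i - a_j pi_j = 0
        A[-1, :] = 1.0                       # replace one redundant equation by normalization
        b = np.zeros(len(a))
        b[-1] = 1.0
        return np.linalg.solve(A, b)

    # The M/M/1 loss system treated below: P01 = P10 = 1, a0 = lam, a1 = mu.
    lam, mu = 1.0, 2.0
    print(stationary_distribution([[0, 1], [1, 0]], [lam, mu]))  # [1/(1+rho), rho/(1+rho)]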
We will not prove all of Theorem 1.1 here, but will be satisfied with proving one direction:
We already know from Section 1.6 that positive recurrence implies the existence of the stationary
probabilities Pj . In Section 1.13 a proof is provided (using differential equations) as to why
these Pj must satisfy the balance equations.
As for discrete-time Markov chains, when the state space is finite, we obtain a useful and
simple special case:
Theorem 1.2 An irreducible CTMC with a finite state space is positive recurrent; there is
always a unique probability solution to the balance equations.
That aj Pj represents the long-run rate out of state j (that is, the long-run number of times
per unit time that the chain makes a transition out of state j) is argued as follows:
Pj is the proportion of time spent in state j, and whenever X(t) = j, independent of the
past, the remaining holding time in state j has an exponential distribution with rate aj .
Similarly, that Σ_{i≠j} Pi ai Pij represents the long-run rate into state j is argued as follows:
The chain will enter state j from states i ≠ j. Pi is the proportion of time spent in state
i. Whenever X(t) = i, independent of the past, the remaining holding time in state i has an
exponential distribution with rate ai, and then the chain will make a transition into state j
with probability Pij. Summing over all states i ≠ j yields the result.
The idea here is that whenever X(t) = j, independent of the past, the probability that
the chain will make a transition (hence leave state j) within the next (infinitesimal) interval of
length dt is given by aj dt (it is here we are using the Markov property). Thus aj is interpreted
as the instantaneous rate out of state j given that X(t) = j. Similarly whenever X(t) = i,
independent of the past, the probability that the chain will make a transition from i to j
within the next (infinitesimal) interval of length dt is given by ai Pij dt (it is here we are using
the Markov property), so ai Pij is interpreted as the instantaneous rate into state j given that
X(t) = i. (All of this is analogous to the Bernoulli trials interpretation of a Poisson process at
rate λ; λdt is the probability of an arrival occurring in the next dt time units, independent of
the past.)
It is important to understand that whereas “the rate out of state j equals the rate into
state j” statement would hold even for non-Markov chains (it is a deterministic sample-path
statement, essentially saying “what goes up must come down”), the fact that these rates can
be expressed in a simple form merely in terms of aj Pj and Pi ai Pij depends crucially on the
Markov property. Thus, in the end, it is the Markov property that allows us to solve for the
stationary probabilities in such a nice algebraic fashion, a major reason why CTMCs are so
useful in applications. See Section 1.13 for the proof of why positive recurrence implies that
the Pj must satisfy the balance equations.
Final Comment
We point out that π = πP for a discrete-time MC also has the same “rate out = rate in”
interpretation, and therefore can be viewed as discrete-time balance equations. To see why,
first observe that since time is discrete, the chain spends exactly one unit of time in a state
before making a next transition. Thus πj can be interpreted not only as the long-run proportion
of time the chain spends in state j but also as the long-run rate (long-run number of times per
unit time) at which the chain enters state j and as the long-run rate at which the chain leaves
state j. (Every time the chain enters state j, in order to do so again it must first leave state
j.) πi Pij thus can be interpreted as the rate at which the chain makes a transition i → j, and
so Σ_{i∈S} πi Pij can be interpreted as the rate at which the chain makes a transition into state j.
So

πj = Σ_{i∈S} πi Pij,
says that “the rate out of state j equals the rate into state j”.
1. M/M/1 loss system: This is the M/M/1 queueing model, except there is no waiting room;
any customer arriving when the server is busy is “lost”, that is, departs without being
served. In this case S = {0, 1} and X(t) = 1 if the server is busy and X(t) = 0 if the
server is free, at time t. Since P01 = 1 = P10 , (after all, the chain alternates forever
between being empty and having one customer) the chain is irreducible. Since the state
space is finite we conclude from Theorem 1.2 that the chain is positive recurrent for any
λ > 0 and µ > 0. We next solve for P0 and P1 . We let ρ = λ/µ. There is only one balance
equation, λP0 = µP1 (since the other one is identical to this one): Whenever X(t) = 0, λ
is the rate out of state 0, and whenever X(t) = 1, µ is the rate into state 0 (equivalently,
out of state 1). So P1 = ρP0 and since P0 + P1 = 1, we conclude that P0 = 1/(1 + ρ),
P1 = ρ/(1 + ρ). So the long-run proportion of time that the server is busy is ρ/(1 + ρ)
and the long-run proportion of time that the server is free (idle) is 1/(1 + ρ).
2. A three state chain: Consider a CTMC with three states 0, 1, 2 in which whenever it is
in a given state, it is equally likely next to move to any one of the remaining two states.
Assume that a0 , a1 , a2 are given non-zero holding-time rates.
This chain is clearly irreducible and has a finite state space, so it is positive recurrent by
Theorem 1.2. The balance equations are
a0 P0 = (1/2)a1 P1 + (1/2)a2 P2
a1 P1 = (1/2)a0 P0 + (1/2)a2 P2
a2 P2 = (1/2)a0 P0 + (1/2)a1 P1 .
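A quick numerical check of these equations (with hypothetical rates a0 = 1, a1 = 2, a2 = 3;
any non-zero rates work the same way), again replacing one redundant equation by the
normalization:

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])           # hypothetical holding-time rates
    P = np.array([[0.0, 0.5, 0.5],
                  [0.5, 0.0, 0.5],
                  [0.5, 0.5, 0.0]])         # equally likely to move to either other state

    A = (P * a[:, None]).T - np.diag(a)     # balance equations: A @ pi = 0
    A[-1, :] = 1.0                          # normalization: P_0 + P_1 + P_2 = 1
    pi = np.linalg.solve(A, np.array([0.0, 0.0, 1.0]))
    print(pi)                               # here P_j is proportional to 1/a_j: [6/11, 3/11, 2/11]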
3. FIFO M/M/1 queue: X(t) denotes the number of customers in the system at time t. Here,
irreducibility is immediate since as pointed out earlier, the embedded chain is a simple
random walk (hence irreducible), so, from Theorem 1.1, we will have positive recurrence
if and only if we can solve the balance equations:
λP0 = µP1
(λ + µ)P1 = λP0 + µP2
(λ + µ)P2 = λP1 + µP3
⋮
(λ + µ)Pj = λPj−1 + µPj+1 , j ≥ 1.
These equations are derived as follows: Given X(t) = 0, the rate out of state 0 is the
arrival rate a0 = λ, and the only way to enter state 0 is from state i = 1, from which a
departure must occur (rate µ). This yields the first equation. Given X(t) = j ≥ 1, the
rate out of state j is aj = λ + µ (either an arrival or a departure can occur), but there are
two ways to enter such a state j: either from state i = j − 1 (an arrival occurs (rate λ)
when X(t) = j − 1, causing the transition j − 1 → j), or from state i = j + 1 (a departure
occurs (rate µ) when X(t) = j + 1, causing the transition j + 1 → j). This yields the other
equations.
Note that since λP0 = µP1 (first equation), the second equation reduces to λP1 = µP2,
which in turn causes the third equation to reduce to λP2 = µP3, and in general the
balance equations reduce to

λPj = µPj+1, j ≥ 0, (6)

or

Pj+1 = ρPj, j ≥ 0,

from which we recursively obtain P1 = ρP0, P2 = ρP1 = ρ^2 P0, and in general Pj = ρ^j P0.
Using the fact that the probabilities must sum to one yields
1 = P0 Σ_{j=0}^∞ ρ^j,
from which we conclude that there is a solution if and only if the geometric series converges,
that is, if and only if ρ < 1 (equivalently λ < µ, “the arrival rate is less than the service
rate”), in which case 1 = P0 (1 − ρ)^{−1}, or P0 = 1 − ρ.
Thus Pj = ρ^j(1 − ρ), j ≥ 0, and we obtain a geometric stationary distribution.
Summarizing:
The FIFO M/M/1 queue is positive recurrent if and only if ρ < 1, in which case
its stationary distribution is geometric with parameter ρ: Pj = ρ^j(1 − ρ), j ≥ 0.
(If ρ = 1 it can be shown that the chain is null recurrent, and transient if ρ > 1.)
When ρ < 1 we say that the queueing model is stable, unstable otherwise. Stability
intuitively means that the queue length doesn’t grow and get out of control over time,
but instead reaches an equilibrium in distribution.
When the queue is stable, we can take the mean of the stationary distribution to obtain
the average number of customers in the system
L = Σ_{j=0}^∞ j Pj (7)

= Σ_{j=0}^∞ j(1 − ρ)ρ^j (8)

= ρ/(1 − ρ). (9)
Then, from L = λw, we can solve for the average sojourn time of a customer, w = L/λ or
w = (1/µ)/(1 − ρ). (10)
Using (10) we can go on to compute average delay d (time spent in line before entering
service), because w = d + 1/µ, or d = w − 1/µ;
d = ρ/(µ(1 − ρ)). (11)
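For concreteness, here is a quick numerical illustration of (9)-(11), with illustrative rates
λ = 1 and µ = 2 (so ρ = 1/2 and the queue is stable):

    lam, mu = 1.0, 2.0             # illustrative arrival and service rates
    rho = lam / mu                 # 0.5 < 1, so the queue is stable

    L = rho / (1 - rho)            # average number in system = 1.0
    w = (1 / mu) / (1 - rho)       # average sojourn time = 1.0 (equals L / lam)
    d = rho / (mu * (1 - rho))     # average delay in line = 0.5 (equals w - 1/mu)
    print(L, w, d)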
4. Birth and Death processes: The fact that the balance equations for the FIFO M/M/1
queue reduced to “for each state j, the rate from j to j + 1 equals the rate from j + 1 to
j” is not a coincidence, and in fact this reduction holds for any Birth and Death process.
For in a Birth and Death process, the balance equations are:
λ0 P0 = µ1 P1
(λ1 + µ1 )P1 = λ0 P0 + µ2 P2
(λ2 + µ2 )P2 = λ1 P1 + µ3 P3
⋮
(λj + µj )Pj = λj−1 Pj−1 + µj+1 Pj+1 , j ≥ 1.
Plugging the first equation into the second yields λ1 P1 = µ2 P2 which in turn can be
plugged into the third yielding λ2 P2 = µ3 P3 and so on. We conclude that for any Birth
and Death process, the balance equations reduce to

λj Pj = µj+1 Pj+1, j ≥ 0. (12)

Solving (12) recursively yields Pj = [(λ0 × · · · × λj−1)/(µ1 × · · · × µj)] P0, j ≥ 1, so a
probability solution exists if and only if Σ_{j=1}^∞ (λ0 × · · · × λj−1)/(µ1 × · · · × µj) < ∞,
in which case

P0 = 1 / (1 + Σ_{j=1}^∞ (λ0 × · · · × λj−1)/(µ1 × · · · × µj)),

and

Pj = [(λ0 × · · · × λj−1)/(µ1 × · · · × µj)] / (1 + Σ_{n=1}^∞ (λ0 × · · · × λn−1)/(µ1 × · · · × µn)), j ≥ 1. (13)
Specializing to the M/M/1 queue (λj = λ, µj = µ, so the products become ρ^j) recovers
Pj = ρ^j(1 − ρ), which agrees with our previous analysis.
We note in passing that the statement “for each state j, the rate from j to j + 1 equals
the rate from j + 1 to j” holds for any deterministic function x(t), t ≥ 0, in which changes
of state are only of magnitude 1; up by 1 or down by 1. Arguing along the same lines
as when we introduced the balance equations, every time this kind of function goes up
from j to j + 1, the only way it can do so again is by first going back down from j + 1
to j. Thus the number of times during the interval (0, t] that it makes an “up” transition
from j to j + 1 differs by at most one, from the number of times during the interval (0, t]
that it makes a “down” transition from j + 1 to j. We conclude (by dividing by t and
letting t → ∞) that the long-run rate at which the function goes from j to j + 1 equals
the long-run rate at which the function goes from j + 1 to j. Of course, as for the balance
equations, being able to write this statement as λj Pj = µj+1 Pj+1 crucially depends on
the Markov property.
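Formula (13) translates directly into code: accumulate the products (λ0 × · · · × λj−1)/(µ1 × · · · × µj),
truncating the infinite sum once the terms are negligible. A sketch taking the birth and
death rates as functions of the state (the tolerance and the M/M/1 check at the end are
illustrative):

    def bd_stationary(lam, mu, tol=1e-12, max_states=10**6):
        """Stationary distribution of a Birth and Death process via formula (13).

        lam(j): birth rate in state j; mu(j): death rate in state j >= 1.
        """
        terms, prod, j = [1.0], 1.0, 0
        while prod > tol and j < max_states:
            prod *= lam(j) / mu(j + 1)   # (lam_0 ... lam_j) / (mu_1 ... mu_{j+1})
            terms.append(prod)
            j += 1
        total = sum(terms)               # 1 + the sum in the denominator of (13)
        return [p / total for p in terms]

    # M/M/1 check: lam_j = 1, mu_j = 2 recovers the geometric P_j = (1/2)^j (1/2).
    print(bd_stationary(lambda j: 1.0, lambda j: 2.0)[:5])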
5. M/M/∞ queue: X(t) denotes the number of customers (busy servers) in the system at
time t. Being a Birth and Death process we need only consider the Birth and Death
balance equations (12) which take the form
λPj = (j + 1)µPj+1 , j ≥ 0.
Irreducibility follows from the fact that the embedded chain is an irreducible simple
random walk, so positive recurrence will follow if we can solve the above equations.
As is easily seen by recursion, Pj = (ρ^j/j!) P0. Forcing these to sum to one (via the
Taylor series expansion for e^x), we obtain 1 = e^ρ P0, or P0 = e^{−ρ}. Thus Pj = e^{−ρ} ρ^j/j!
and we end up with the Poisson distribution with mean ρ:
The M/M/∞ queue is always positive recurrent for any λ > 0, µ > 0; its
stationary distribution is Poisson with mean ρ = λ/µ.
The above result should not be surprising, for we already studied (earlier in this course)
the more general M/G/∞ queue, and obtained the same stationary distribution. But
because we now assume exponential service times, we are able to obtain the result using
CTMC methods. (For a general service time distribution we could not do so because then
X(t) does not form a CTMC; so we had to use other, more general, methods.)
6. M/M/c loss queue: This is the M/M/c model except there is no waiting room; any arrival
finding all c servers busy is lost. This is the c−server analog of Example 1. With X(t)
denoting the number of busy servers at time t, we have, for any λ > 0 and µ > 0, an
irreducible B&D process with a finite state space S = {0, . . . , c}, so positive recurrence
follows from Theorem 1.2. The B&D balance equations (12) are
λPj = (j + 1)µPj+1 , 0 ≤ j ≤ c − 1,
or Pj+1 = Pj ρ/(j + 1), 0 ≤ j ≤ c − 1; the first c equations of the M/M/∞ queue.
Solving, we get Pj = (ρ^j/j!) P0, 0 ≤ j ≤ c, and summing to one yields

1 = P0 (1 + Σ_{j=1}^c ρ^j/j!),
yielding

P0 = (1 + Σ_{j=1}^c ρ^j/j!)^{−1}.

Thus

Pj = (ρ^j/j!) (1 + Σ_{n=1}^c ρ^n/n!)^{−1}, 0 ≤ j ≤ c. (14)
In particular,

Pc = (ρ^c/c!) (1 + Σ_{n=1}^c ρ^n/n!)^{−1}, (15)
the proportion of time that all servers are busy. Later we will see, from a result called
PASTA, that Pc is also the proportion of lost customers, that is, the proportion of arrivals
who find all c servers busy. This turns out to be a very important result because the
solution in (14), in particular the formula for Pc in (15), holds even if the service times
are not exponential (the M/G/c loss queue), a famous queueing result called Erlang's
Loss Formula.
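Numerically, (15) is usually evaluated with the standard stable recursion B(0) = 1,
B(k) = ρB(k−1)/(k + ρB(k−1)), k = 1, . . . , c, which avoids large factorials and gives
B(c) = Pc. A sketch, checked against a direct evaluation of (15) (the values c = 3, ρ = 2
are illustrative):

    from math import factorial

    def erlang_b(c, rho):
        """Blocking probability P_c of the M/M/c loss queue (Erlang's Loss Formula)."""
        b = 1.0
        for k in range(1, c + 1):
            b = rho * b / (k + rho * b)   # standard Erlang-B recursion
        return b

    c, rho = 3, 2.0
    direct = (rho**c / factorial(c)) / sum(rho**n / factorial(n) for n in range(c + 1))
    print(erlang_b(c, rho), direct)       # both ~ 0.2105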
7. Population model with family immigration: Here we start with a general B&D process,
(birth rates λi , death rates µi ) but allow another source of population growth, in addition
to the births. Suppose that at each of the times from a Poisson process at rate γ,
independently, a family of random size B joins the population (immigrates). Let bi =
P (B = i), i ≥ 1 denote corresponding family size probabilities. Letting X(t) denote the
population size at time t, we no longer have a B&D process now since the arrival of a
family can cause a jump larger than size one. The balance equations (“the rate out of
state j equals the rate into state j”) are:
(λ0 + γ)P0 = µ1 P1
(λ1 + µ1 + γ)P1 = (λ0 + γb1 )P0 + µ2 P2
(λ2 + µ2 + γ)P2 = γb2 P0 + (λ1 + γb1 )P1 + µ3 P3
⋮

(λj + µj + γ)Pj = λj−1 Pj−1 + µj+1 Pj+1 + γ Σ_{i=0}^{j−1} bj−i Pi, j ≥ 1.
The derivation is as follows: When X(t) = j, any one of three events can happen next:
A death (rate µj ), a birth (rate λj ) or a family immigration (rate γ). This yields the rate
out of state j. There are j additional ways to enter state j, besides a birth from state
j − 1 or a death from state j + 1, namely from each state i < j a family of size j − i could
immigrate (rate γbj−i ). This yields the rate into state j.
When Pjj > 0, the chain can make a transition from a state j back into state j, and it is
not obvious whether such a transition from j to j is to be interpreted as having left/entered
state j: “did the chain really leave (or enter) state j when it made such a transition?”
setting up balance equations. But it turns out that as long as one is consistent (on both sides of
the equations), then the same equations arise in the end. We illustrate with a simple example:
A CTMC with two states, 0, 1, in which Pij = 1/2, i, j = 0, 1. a0 and a1 are given non-zero
holding-time rates. It is important to observe that, by definition, ai is the holding time rate
when in state i, meaning that after the holding time Hi ∼ exp(ai ) is completed, the chain will
make a transition according to the transition matrix P = (Pij ). If we interpret a transition
j → j as both a transition into and out of state j, then the balance equations are
a0 P0 = (1/2)a0 P0 + (1/2)a1 P1
a1 P1 = (1/2)a0 P0 + (1/2)a1 P1 .
As the reader can check, these equations reduce to the one equation

a0 P0 = a1 P1.

If instead we interpret a transition j → j as neither a transition into nor out of state j,
then the balance equations are

(1/2)a0 P0 = (1/2)a1 P1,

which again reduces to

a0 P0 = a1 P1.
So, it makes no difference. This is how it works out for any CTMC.
1. Tandem queue: Consider a queueing model with two servers in tandem: Each customer,
after waiting in line and completing service at the first single-server facility, immediately
waits in line at a second single-server facility. Upon completion of the second service,
the customer finally departs. In what follows we assume that the first facility is a FIFO
M/M/1, and the second server has exponential service times and also serves under FIFO,
in which case this system is denoted by

FIFO M/M/1 −→ /M/1.
Besides the Poisson arrival rate λ, we now have two service-time rates (one for each
server), µ1 and µ2 . Service times at each server are assumed i.i.d. and independent of
each other and of the arrival process.
Letting X(t) = (X1 (t), X2 (t)), where Xi (t) denotes the number of customers in the ith
facility, i = 1, 2, it is easily seen that {X(t)} satisfies the Markov property. This is an
example of an irreducible two-dimensional CTMC. Balance equations (rate out of a state
equals rate into the state) can be set up and used to solve for stationary probabilities.
Letting Pn,m denote the long-run proportion of time there are n customers at the first
facility and m at the second (a joint probability),
λP0,0 = µ2 P0,1 ,
because the only way the chain can make a transition into state (0, 0) is from (0, 1) (no one
is at the first facility, exactly one customer is at the second facility, and this one customer
departs (rate µ2)). Similarly, when n ≥ 1, m ≥ 1,

(λ + µ1 + µ2)Pn,m = λPn−1,m + µ1 Pn+1,m−1 + µ2 Pn,m+1,

because either a customer arrives, a customer completes service at the first facility and
thus goes to the second, or a customer completes service at the second facility and
leaves the system. The remaining balance equations are also easily derived. Letting
ρi = λ/µi, i = 1, 2, it turns out that the solution is

Pn,m = (1 − ρ1)ρ1^n × (1 − ρ2)ρ2^m, n ≥ 0, m ≥ 0,

provided that ρi < 1, i = 1, 2. This means that as t → ∞, X1(t) and X2(t) become
independent r.v.s. each with a geometric distribution. This result is quite surprising
because, after all, the two facilities are certainly dependent at any time t, and why should
the second facility have a stationary distribution as if it were itself an M/M/1 queue? (For
example, why should departures from the first facility be treated as a Poisson process at
rate λ?) The proof is merely a “plug in and check” proof using Theorem 1.2: Plug in
the given solution (e.g., treat it as a “guess”) into the balance equations and verify that
they work. Since they do work, they are the unique probability solution, and the chain is
positive recurrent.
It turns out that there is a nice way of understanding part of this result. The first
facility is an M/M/1 queue, so we know that X1(t) by itself is a CTMC with stationary
distribution Pn = (1 − ρ1)ρ1^n, n ≥ 0. If we start off X1(0) with this stationary distribution
(P(X1(0) = n) = Pn, n ≥ 0), then we know that X1(t) will have this same distribution
for all t ≥ 0, that is, X1(t) is stationary. It turns out that when stationary, the departure
process is itself a Poisson process at rate λ, and so the second facility (in isolation) can be
treated itself as an M/M/1 queue when X1(t) is stationary. This at least explains why
X2(t) has the geometric stationary distribution, (1 − ρ2)ρ2^m, m ≥ 0, but more analysis is
required to prove the independence part.
2. Jackson network:
Consider two FIFO single-server facilities (indexed by 1 and 2), each with exponential
service at rates µ1 and µ2 respectively. For simplicity we refer to each facility as a
“node”. Each node has its own queue. There is one exogenous Poisson arrival process
at rate λ which gets partitioned with probabilities p1 and p2 = 1 − p1 (yielding rates
λ1 = p1 λ and λ2 = p2 λ to the nodes respectively). Type 1 arrivals join the queue at node
1 and type 2 do so at node 2. This is equivalent to each node having its own independent
Poisson arrival process. Whenever a customer completes service at node i, they next go
to the queue at node j with probability Qij , independent of the past, i = 1, 2, j = 0, 1, 2.
j = 0 refers to departing the system. So typically, a customer gets served a couple of
times, back and forth between the two nodes, before finally departing. In general, we allow
feedback, which means that a customer can return to a given node (perhaps many times)
before departing the system. The tandem queue does not have feedback; it is the special
case when Q1,2 = 1 and Q2,0 = 1 and p2 = 0, an example of a feedforward network. In
general, Q = (Qij ) is called the routing transition matrix, and represents the transition
matrix of a Markov chain. We always assume that states 1 and 2 are transient, and state 0
is absorbing. Letting X(t) = (X1 (t), X2 (t)), where Xi (t) denotes the number of customers
in the ith node, i = 1, 2, {X(t)} yields an irreducible CTMC. Like the tandem queue, it
turns out that the stationary distribution for the Jackson network is of the product form
Pn,m = (1 − α1)α1^n × (1 − α2)α2^m, n ≥ 0, m ≥ 0,
provided that αi < 1, i = 1, 2. Here

αi = (λi/µi) E(Ni),
where E(Ni ) is the expected number of times that a customer attends the ith facility.
E(Ni ) is completely determined by the routing matrix Q: Each customer, independently,
is routed according to the discrete-time Markov chain with transition matrix Q, and since
0 is absorbing (and states 1 and 2 transient), the chain will visit each state i = 1, 2 only a
finite number of times before getting absorbed. Notice that λi E(Ni ) represents the total
arrival rate to the ith node. So αi < 1, i = 1, 2, just means that the total arrival rate must
be smaller than the service rate at each node. As with the tandem queue, the proof can
be carried out by the “plug in and check” method.
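The total arrival rate Λi to each node, which the text writes as λi E(Ni), can also be computed
directly by solving the linear “traffic equations” Λj = λpj + Σi Λi Qij, so that αi = Λi/µi.
A sketch with illustrative routing numbers (all the numerical values below are made up):

    import numpy as np

    lam = 1.0
    p = np.array([0.7, 0.3])      # illustrative split probabilities p1, p2
    mu = np.array([2.0, 2.0])     # illustrative service rates
    Q = np.array([[0.0, 0.5],     # illustrative routing matrix Q[i][j] for j = 1, 2;
                  [0.3, 0.0]])    # the remaining mass Q[i][0] departs the system

    # Traffic equations: Lambda_j = lam * p_j + sum_i Lambda_i * Q_ij.
    Lam = np.linalg.solve(np.eye(2) - Q.T, lam * p)
    alpha = Lam / mu              # product form holds if every alpha_i < 1
    print(Lam, alpha)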
πja def= lim_{N→∞} (1/N) Σ_{n=1}^N I{X(tn−) = j},

where X(tn−) denotes the number in system found by the nth arrival.
On the one hand, λπja is the long-run rate (number of times per unit time) that X(t) makes
a transition j → j + 1. After all, arrivals occur at rate λ, and such transitions can only happen
when arrivals find j customers in the system. On the other hand, from the B&D balance
equations (6), λPj is also the same rate in question. Thus λπja = λPj , or
πja = Pj , j ≥ 0,
which asserts that
the proportion of Poisson arrivals who find j customers in the system is equal to the
proportion of time there are j customers in the system.
This is an example of Poisson Arrivals See Time Averages (PASTA), and it turns out that
PASTA holds for any queueing model in which arrivals are Poisson, no matter how complex,
as long as a certain (easy to verify) condition, called LAC, holds. For example PASTA holds
for the M/G/c queue, the M/G/c loss queue, and essentially any queueing network in which
arrivals are Poisson.
Moreover, PASTA holds for more general quantities of interest besides number in system.
For example, the proportion of Poisson arrivals to a queue who, upon arrival, find a particular
server busy serving a customer with a remaining service time exceeding x (time units) is equal
to the proportion of time that this server is busy serving a customer with a remaining service
time exceeding x. In general, PASTA will not hold if the arrival process is not Poisson.
To state PASTA more precisely, let {X(t) : t ≥ 0} be any stochastic process, and ψ = {tn :
n ≥ 0} a Poisson process. Both processes are assumed on the same probability space. We have
in mind that X(t) denotes the state of some “queueing” process with which the Poisson arriving
“customers” are interacting/participating. The state space S can be general, such as multi-
dimensional Euclidean space. We assume that the sample paths of X(t) are right-continuous
with left-hand limits.²
The lack of anticipation condition (LAC) that we will need to place on the Poisson process
asserts that for each fixed t > 0, the future increments of the Poisson process after time t,
{N (t + s) − N (t) : s ≥ 0}, be independent of the joint past, {(N (u), X(u)) : 0 ≤ u ≤ t}. This
condition is stronger than the independent increments property of the Poisson process, for it
requires that any future increment be independent not only of its own past but of the past of
the queueing process as well. If the Poisson process is completely independent of the queueing
process, then LAC holds, but we have in mind the case when the two processes are dependent
via the arrivals being part of and participating in the queueing system.
Let f (x) be any bounded real-valued function on S, and consider the real-valued process
f (X(t)). We are now ready to state PASTA. (The proof, omitted, is beyond the scope of this
course.)
Theorem 1.3 (PASTA) If the Poisson process satisfies LAC, then w.p.1.,
lim_{N→∞} (1/N) Σ_{n=1}^N f(X(tn−)) = lim_{t→∞} (1/t) ∫_0^t f(X(s)) ds,
in the sense that if either limit exists, then so does the other and they are equal.
A standard example when X(t) is the number of customers in a queue, would be to let
f denote an indicator function; f (x) = I{x = j}, so that f (X(t)) = I{X(t) = j}, and
f (X(tn −)) = I{X(tn −) = j}. This would, for example, yield πja = Pj for the M/M/1 queue.
The reader should now go back to Example 6 in Section 1.8, the M/M/c-loss queue, where
we first mentioned PASTA in the context of Erlang’s Loss Formula.
² A function x(t), t ≥ 0, is right-continuous if for each t ≥ 0, x(t+) def= lim_{h↓0} x(t + h) = x(t). It has left-hand
limits if for each t > 0, x(t−) def= lim_{h↓0} x(t − h) exists (but need not equal x(t)). If x(t−) ≠ x(t+), then the
function is said to be discontinuous at t, or to have a jump at t. Queueing processes typically have jumps at arrival
times and departure times.
Final Remarks on PASTA
1. To see why Poisson arrivals are needed to assert that arrivals see time averages, consider
a single-server queue in which all interarrival times are exactly of length 2, and all service
times are exactly of length 1, a deterministic queue. Note that every customer completes
service before the next arrival, and so every arrival finds the system empty. This means
that π0a = 1, the proportion of arrivals who find an empty system is 1. But since ρ = 0.5
(λ = 0.5 and µ = 1), the proportion of time the system is empty is P0 = 1 − ρ = 0.5. So
π0a 6= P0 here. This is the norm rather than the exception, unless arrivals are Poisson.
2. The simple proof of PASTA that we presented for the M/M/1 queue (λπja = λPj ) works
essentially for any queueing model (multidimensional complex networks even) for which
the only source of customer arrivals is a Poisson process at rate λ. Letting X(t) denote
the number of customers in the system at time t (in total, no matter where/how they are
distributed within the system), all that is needed is: (1) Only a Poisson arrival can cause
X(t) to jump from j to j + 1. (2) For any n ≥ 0, given X(t) = n, λ is the (instantaneous)
rate at which the next arrival will occur, independent of the past. (1) implies that λπja is
the rate at which {X(t)} makes a transition j → j + 1, and (2) implies that λPj is this
very same rate.
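Both points are easy to see in simulation. The sketch below runs an M/M/1 queue (illustrative
λ = 0.5, µ = 1) and compares the fraction of arrivals finding the system empty with the
fraction of time it is empty; by PASTA both should be close to P0 = 1 − ρ = 0.5:

    import numpy as np

    rng = np.random.default_rng(2)
    lam, mu, t_end = 0.5, 1.0, 200_000.0
    t, x = 0.0, 0                            # current time and number in system
    empty_time, arrivals, arrivals_empty = 0.0, 0, 0

    while t < t_end:
        rate = lam if x == 0 else lam + mu   # holding-time rate a_x
        h = rng.exponential(1.0 / rate)
        if x == 0:
            empty_time += min(h, t_end - t)  # truncate the final holding time
        t += h
        if x == 0 or rng.random() < lam / (lam + mu):
            if x == 0:
                arrivals_empty += 1          # this arrival finds the system empty
            arrivals += 1
            x += 1                           # the next event is an arrival
        else:
            x -= 1                           # the next event is a departure

    print(arrivals_empty / arrivals, empty_time / t_end)  # both ~ P_0 = 0.5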
which says that given X(0) = i, the probability of a transition i → k ≠ i in the next dt time
units is the rate of such a transition, conditional on X(0) = i, multiplied by dt (remember
the Poisson process and, more generally, the balance equations), and the probability that no
transition takes place is 1 minus the probability that a transition does take place.
The Chapman-Kolmogorov equations: for all t ≥ 0, s ≥ 0, i, j ∈ S,

Pij(t + s) = Σ_{k∈S} Pik(s) Pkj(t).

Taking s = dt and using (16)-(18), these yield the set of linear differential equations

P′ij(t) = −ai Pij(t) + Σ_{k≠i} ai Pik Pkj(t), i, j ∈ S, (22)

known as the Kolmogorov backward equations, which can thus be solved accordingly (as we
shall do below). The word backward refers to the fact that in our use of the Chapman-Kolmogorov
equations, we chose to place the s = dt first and the t second, that is, the dt was placed “in
back”. The derivation above can be rigorously justified for any non-explosive CTMC.
If we define rik = ai Pik, k ≠ i, rii = −ai, then in matrix form (22) becomes

P′(t) = RP(t), P(0) = I,

where P(t) = (Pij(t)), P′(t) = (P′ij(t)), R = (rij), and I is the identity matrix (recall that
Pii(0) = 1, Pij(0) = 0, j ≠ i). The unique solution is thus of the exponential form

P(t) = e^{Rt},

where

e^{Rt} def= Σ_{n=0}^∞ (Rt)^n/n!.
It is rare that we can explicitly compute the infinite sum, but there are various numerical
recipes for estimating e^{Rt} to any desired level of accuracy. For example, since
e^M = lim_{n→∞} (I + M/n)^n for any matrix M, one can choose n large and use e^{Rt} ≈ (I + (Rt)/n)^n.
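In Python, scipy provides a robust matrix exponential (scipy.linalg.expm). A sketch for the
two-state M/M/1 loss system, with R built from rii = −ai and rik = ai Pik (λ = 1, µ = 2
illustrative), also checking the crude (I + Rt/n)^n approximation:

    import numpy as np
    from scipy.linalg import expm

    lam, mu = 1.0, 2.0
    R = np.array([[-lam, lam],            # r_00 = -a_0, r_01 = a_0 P_01
                  [mu, -mu]])             # r_10 = a_1 P_10, r_11 = -a_1

    t = 1.5
    P_t = expm(R * t)                     # P(t) = e^{Rt}; rows sum to 1
    n = 10**5
    approx = np.linalg.matrix_power(np.eye(2) + R * t / n, n)
    print(np.max(np.abs(P_t - approx)))   # small

    # As t grows, every row approaches the stationary distribution (2/3, 1/3).
    print(expm(R * 50.0))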
The Kolmogorov forward equations are derived similarly, by placing the t first and the s = dt
second, that is, with the dt “in front” instead of in back, when using the Chapman-Kolmogorov
equations (together with (16)-(18)):
Pij(t + dt) − Pij(t) = −Pij(t) + Σ_{k∈S} Pik(t) Pkj(dt) (25)

= −Pij(t) + Pij(t) Pjj(dt) + Σ_{k≠j} Pik(t) Pkj(dt) (26)

= dt [−Pij(t) aj + Σ_{k≠j} Pik(t) ak Pkj], (27)

yielding

P′ij(t) = −aj Pij(t) + Σ_{k≠j} ak Pkj Pik(t), i, j ∈ S, (28)
which is P′(t) = P(t)R, P(0) = I, in matrix form. Although it turns out that the above
method of derivation of the forward equations is not always justified (whereas our analogous
derivation for the backward equations is justified), it does not matter, since the unique solution
P(t) = e^{Rt} to the backward equations is also the unique solution to the forward equations, and
thus both equations are valid.
For a (non-explosive) CTMC, the transition probabilities Pij (t) are the unique so-
lution to both the Kolmogorov backward and forward equations.
As an application, suppose the CTMC is positive recurrent. Then (recall (3)) for all i, j ∈ S,
lim_{t→∞} Pij(t) = Pj, which in turn implies that lim_{t→∞} P′ij(t) = 0. Thus from the forward
equations (28) we obtain

0 = −aj Pj + Σ_{i≠j} ai Pij Pi, j ∈ S,
the balance equations. We conclude that if a (non-explosive) CTMC is positive recurrent, then
the stationary probabilities must satisfy the balance equations (this then proves one direction
of Theorem 1.1).