Report
Bachelor's Degree in Mathematics
Facultat de Matemàtiques
Universitat de Barcelona
MARKOV CHAINS
Acknowledgements
During the last months, the guidance and support of Dr. David Márquez, the director of this thesis, have been extremely important. My most sincere thanks for his time and for the confidence he has shown in me. I would also like to acknowledge the support provided by my family and friends, especially in the most difficult moments. Their unconditional support has made this project possible.
Contents
1 Introduction
2 Discrete time Markov chains
3 Chapman-Kolmogorov equation
4 Classification of states
4.1 Basic concepts
4.2 Classification of states
4.3 Periodicity and cyclic classes
4.4 Example: an analysis of the random walk on Z
5 Hitting times
6 Distribution and measure
7 Ergodic theory
7.1 Ergodic theorem
7.2 Finite state space
8 Conclusions
B Examples
1 Introduction
In the field of probability theory we find the concept of a stochastic process: a collection of random variables that describes the evolution of a random system over time.
One of the simplest cases of a stochastic process is that in which each outcome depends only on the immediately preceding one. This type of process is known as a Markov chain and is the subject of this bachelor's degree final thesis.
The defining property of a Markov chain, the Markov property, is that the values it can take in the future depend only on its present value and not on past values. For this reason Markov chains are said to have no memory: the path followed up to the present does not affect the future path.
These chains take their name from the Russian mathematician Andrey Andreyevich Markov, who devoted much of his work to stochastic processes and was also an active participant in the political and social events that took place in Europe at the beginning of the 20th century.
Markov chains have many useful applications in research. Although their applications are numerous, they are particularly important in chemistry, medicine, biology and physics, perhaps the four most important fields of science. A simple example of these applications is the Ehrenfest chain, which models the exchange of gas molecules between two bodies. However, this project does not focus on the practical applications of Markov chains; instead, we carry out a theoretical study of discrete time Markov chains.
During the bachelor's degree we take some courses in probability and statistics. However, the time devoted to these topics is limited, so I found it interesting to focus this thesis on this field. After considering several options, I chose Markov chains because their lack of memory caught my attention.
Once several useful concepts have been introduced in the first chapters, in chapter 4 we focus on the behaviour of the individual states of a Markov chain. In particular, we analyze the possibility of returning to each state.
Having studied the possible return to the original state, in chapter 5 we again focus on reaching given states, but now the target is a subset of states of the Markov chain in which we may not initially be.
Finally, the last two chapters are strongly related. In them we study the long term behaviour of a Markov chain and the invariant distributions of its stochastic matrix. In chapter 6 we see some applications of these distributions, and in chapter 7 we prove the uniqueness of the invariant distribution.
2 Discrete time Markov chains
We already know that Markov chains play an important role within probability theory. In this chapter we focus on discrete time Markov chains, in particular on homogeneous chains, whose transition probabilities do not depend on time. These chains accompany us throughout the chapter, where we study an important result that allows us to identify which chains satisfy the homogeneity property. At the end of the chapter we also include some basic examples of homogeneous Markov chains.
To begin, we give some basic definitions of Markov theory which will be of great help in obtaining the results of later chapters.
We are going to study the discrete case, that is to say T = Z+ = {0, 1, 2, ...}; as
a result we have the process defined as {Xt ; t ≥ 0}. Moreover, we will focus on the
stochastic process known as Markov chain. This process has no memory because
its future behaviour is only affected by the present one.
The element p_{i,j} is called the transition probability: it is the probability that at time k the process is in state j, given that at time k − 1 it was in state i.
Now we will see, with the help of examples, that there is a strong relationship
between diagrams and the stochastic matrix described above.
Example 2.1.3.
1. We start with a general example in order to clearly see that the stochastic
matrix satisfies its own properties. Given a, b ∈ [0, 1], we have the diagram
and the stochastic matrix is
\[ \Pi = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix}. \]
2. Now, we are going to see a numerical example, in this case we have the diagram
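Referring back to the general matrix of item 1, the following short Python sketch (an illustration, not part of the original text; the values a = 0.3 and b = 0.6 are arbitrary) builds the 2x2 stochastic matrix and checks the defining property that every row sums to 1.

import numpy as np

def two_state_matrix(a, b):
    """Return the 2x2 stochastic matrix of the two-state chain above."""
    return np.array([[1 - a, a],
                     [b, 1 - b]])

Pi = two_state_matrix(0.3, 0.6)
# every entry lies in [0, 1] and every row sums to 1
assert np.all(Pi >= 0) and np.all(Pi <= 1)
assert np.allclose(Pi.sum(axis=1), 1.0)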
Then, using the concepts defined so far we are going to study two important
issues for the Markov theory.
Definition 2.1.4. Given a stochastic process {Xn ; n ≥ 0} that takes values in the
state space I, we have
So the stochastic process {Xn ; n ≥ 0} is a Markov chain with initial distribution
ν = {νi ; i ∈ I} and transition matrix Π = (pi,j ; i, j ∈ I).
The last equality is known as the Markov property, which tells us that the probability of a future event depends only on the present state and not on the whole evolution of the system. This implies that Markov chains are memoryless processes.
is independent of n.
Now we prove the converse implication. Assuming (2.1), we have to show that Xn is a Markov chain and that it is homogeneous. Firstly, we analyze the initial distribution; in this way, ∀i0 ∈ I we obtain
\[ P(X_0 = i_0) = \sum_{j \in I} \nu_{i_0} p_{i_0,j} = \nu_{i_0} \sum_{j \in I} p_{i_0,j} = \nu_{i_0}. \]
Secondly, we have to check the Markov property; in this case we use conditional probability, so ∀i0, ..., in−1, i, j ∈ I we have
In order to prove the property we are studying, the value of the last expression must be equal to P(Xn+1 = j | Xn = i); we check that this is indeed the case:
\begin{align*}
P(X_{n+1} = j \mid X_n = i) &= \frac{P(X_{n+1} = j, X_n = i)}{P(X_n = i)} \\
&= \frac{\sum_{i_0,\dots,i_{n-1}\in I} P(X_0 = i_0, \dots, X_n = i, X_{n+1} = j)}{\sum_{i_0,\dots,i_{n-1}\in I} P(X_0 = i_0, \dots, X_n = i)} \\
&= \frac{\sum_{i_0,\dots,i_{n-1}\in I} \nu_{i_0} p_{i_0,i_1} \cdots p_{i_{n-1},i}\, p_{i,j}}{\sum_{i_0,\dots,i_{n-1}\in I} \nu_{i_0} p_{i_0,i_1} \cdots p_{i_{n-1},i}}
= p_{i,j}\, \frac{\sum_{i_0,\dots,i_{n-1}\in I} \nu_{i_0} p_{i_0,i_1} \cdots p_{i_{n-1},i}}{\sum_{i_0,\dots,i_{n-1}\in I} \nu_{i_0} p_{i_0,i_1} \cdots p_{i_{n-1},i}} = p_{i,j}.
\end{align*}
The result we have just obtained is very important, since it is useful for checking whether a Markov chain is homogeneous or not. We use this result to prove that a homogeneous Markov chain can be shifted in time, obtaining in this way another chain that is also homogeneous.
Proposition 2.1.7. Given {Xn ; n ≥ 0} a HMC(ν, Π), then for every m ≥ 0 the shifted process {Xm+n ; n ≥ 0} is again a HMC(η, Π), where η is the law of Xm; that is to say, ∀n ≥ 0 and ∀j0, ..., jn ∈ I
\[ P(X_m = j_0, X_{m+1} = j_1, \dots, X_{m+n} = j_n) = P(X_m = j_0)\, p_{j_0,j_1} \cdots p_{j_{n-1},j_n}. \]
2.2 Basic examples
In this section we study some basic examples of Markov chains, in particular homogeneous ones. These chains are among the most important examples for getting started with Markov theory.
1. Random walks on Z:
Given the stochastic process {Xn ; n ≥ 0}, we define X0 as the initial position, which is constant (X0 = 0), and the process
\[ X_n = X_0 + \xi_1 + \cdots + \xi_n = X_0 + \sum_{i=1}^{n} \xi_i. \]
Proof. In this case we have to prove the Markov property and the homogeneity property in order to see that Xn is a homogeneous Markov chain. Firstly, let us check the first property using conditional probability, so ∀n ≥ 0 and ∀i0, ..., in−1, i, j ∈ I we have
\begin{align*}
P(X_{n+1} = j \mid X_n = i) &= \frac{P(f(X_n, Z_{n+1}) = j, X_n = i)}{P(X_n = i)} = \frac{P(f(i, Z_{n+1}) = j, X_n = i)}{P(X_n = i)} \\
&= \frac{P(f(i, Z_{n+1}) = j)\, P(X_n = i)}{P(X_n = i)} = P(f(i, Z_{n+1}) = j) := q_{i,j}.
\end{align*}
In order to satisfy the property we are studying, the value of the last expression must be equal to P(Xn+1 = j | X0 = i0, ..., Xn−1 = in−1, Xn = i); we are going to check that this is indeed the case.
With that, we see that given a set of independent and identically distributed random variables we can build different Markov chains satisfying the properties defined at the beginning.
To conclude this example we give the transition matrix, where p and q = 1 − p are the probabilities defined above (p ∈ (0, 1)):
\[
\Pi = \begin{pmatrix}
 & \ddots & & & & & \\
\cdots & 0 & p & 0 & 0 & 0 & \cdots \\
\cdots & q & 0 & p & 0 & 0 & \cdots \\
\cdots & 0 & q & 0 & p & 0 & \cdots \\
\cdots & 0 & 0 & q & 0 & p & \cdots \\
 & & & & & \ddots &
\end{pmatrix}
\]
2. Random walk on Z with absorbing barriers:
Consider two players A and B who play heads or tails, with capitals a and b respectively. The game ends when one of the players runs out of money.
The game consists of tossing a coin. Player A wins 1 coin when the result is heads, which happens with probability p, and loses 1 coin when the result is tails, which happens with probability q = 1 − p.
Let {Xn ; n ≥ 0} be the stochastic process that describes the evolution of the capital of player A. We define X0 = a as the initial condition and the process Xn+1 = Xn + ξn+1, with Xn : Ω −→ {0, 1, ..., a + b}, where {ξi ; i ≥ 1} is a set of independent and identically distributed random variables.
To conclude this example we give its transition matrix, where p and q = 1 − p are the probabilities defined above:
\[
\Pi = \begin{pmatrix}
1 & 0 & 0 & 0 & \cdots & 0 \\
q & 0 & p & 0 & \cdots & 0 \\
0 & q & 0 & p & \cdots & 0 \\
0 & 0 & q & 0 & \cdots & 0 \\
 & & \ddots & & \ddots & \\
0 & 0 & \cdots & q & 0 & p \\
0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}
\]
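As a hedged illustration of the two examples above (not part of the original text; all parameter values are arbitrary), the following Python sketch simulates the free random walk on Z and builds the finite transition matrix with absorbing barriers at 0 and a + b.

import random
import numpy as np

def simulate_walk(p=0.5, n_steps=20, x0=0, seed=1):
    """Simulate X_n = X_0 + xi_1 + ... + xi_n for the random walk on Z."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        step = 1 if rng.random() < p else -1   # xi_i = +1 with prob p, -1 with prob q
        path.append(path[-1] + step)
    return path

def absorbing_walk_matrix(a, b, p):
    """Transition matrix on {0, 1, ..., a+b} with absorbing barriers at 0 and a+b."""
    q = 1 - p
    n = a + b + 1
    Pi = np.zeros((n, n))
    Pi[0, 0] = 1.0           # ruin of player A is absorbing
    Pi[n - 1, n - 1] = 1.0   # ruin of player B is absorbing
    for i in range(1, n - 1):
        Pi[i, i - 1] = q     # lose one coin
        Pi[i, i + 1] = p     # win one coin
    return Pi

print(simulate_walk())
Pi = absorbing_walk_matrix(a=2, b=3, p=0.5)
assert np.allclose(Pi.sum(axis=1), 1.0)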
3 Chapman-Kolmogorov equation
One of the most relevant tools when introducing Markov chains is the Chapman-Kolmogorov equation. In order to study this equation, in this chapter we focus on the simple way in which the probability of moving from a state i to another state j in m steps can be decomposed into the sum of the probabilities of the trajectories that go from state i to state j passing through any other state at an intermediate point of time.
In this chapter, we should take into account a probability space (Ω, F, P ) and a
family of discrete random variables Xn : Ω −→ I where {Xn ; n ≥ 0} is a HMC(ν, Π)
and I is a countable set called state space.
Definition 3.0.1. Given {Xn ; n ≥ 0} a time homogeneous Markov chain, we define the m-step transition probability, with m > 1, as
\[ p^{(m)}_{i,j} = P(X_m = j \mid X_0 = i) = P(X_{n+m} = j \mid X_n = i), \quad \forall i, j \in I, \]
which is the probability of going from i to j in m steps. In this case, the transition matrix is given by Π_m = Π^(m) := (p^(m)_{i,j} ; i, j ∈ I).
Observation 3.0.2. Considering the last definition, for m = 1 we have the transition probability p^(1)_{i,j} = p_{i,j} = P(Xn+1 = j | Xn = i), which goes from i to j in one step. Now, the transition matrix is Π^(1) = Π.
Moreover, we check that the m-step transition probability is well defined; hence we have to calculate p^(2)_{i,j} = P(Xn+2 = j | Xn = i). In order to have Xn+2 = j, Xn+1 must go through some state k:
\begin{align*}
p^{(2)}_{i,j} = P(X_{n+2} = j \mid X_n = i) &= \sum_{k \in I} P(X_{n+2} = j, X_{n+1} = k \mid X_n = i) \\
&= \sum_{k \in I} \frac{P(X_{n+2} = j, X_{n+1} = k, X_n = i)}{P(X_n = i)} \\
&= \sum_{k \in I} \frac{P(X_{n+2} = j, X_{n+1} = k, X_n = i)}{P(X_{n+1} = k, X_n = i)} \cdot \frac{P(X_{n+1} = k, X_n = i)}{P(X_n = i)} \\
&= \sum_{k \in I} P(X_{n+2} = j \mid X_{n+1} = k, X_n = i)\, P(X_{n+1} = k \mid X_n = i) = \sum_{k \in I} p_{i,k}\, p_{k,j},
\end{align*}
where the last result is the i, j-th entry of the transition matrix Π².
After that, doing mathematical induction on the number of steps, we get the following result.
Proposition 3.0.3. The m-step transition matrix is the m-th power of the transition matrix Π, that is to say, Π^(m) = Π^m = Π ⋯ Π (m factors), ∀m > 2.
Proof. To prove this proposition, we just need to prove the equality
\[ p^{(m)}_{i,j} = \sum_{k \in I} p^{(m-1)}_{i,k}\, p_{k,j}, \quad \forall m > 2, \]
which implies that Π^m = Π^{m−1} Π, because of the first definition. Using conditional probability we get
\begin{align*}
p^{(m)}_{i,j} = P(X_{n+m} = j \mid X_n = i) &= \sum_{k \in I} P(X_{n+m} = j, X_{n+m-1} = k \mid X_n = i) \\
&= \sum_{k \in I} \frac{P(X_{n+m} = j, X_{n+m-1} = k, X_n = i)}{P(X_{n+m-1} = k, X_n = i)} \cdot \frac{P(X_{n+m-1} = k, X_n = i)}{P(X_n = i)} \\
&= \sum_{k \in I} P(X_{n+m} = j \mid X_{n+m-1} = k, X_n = i)\, P(X_{n+m-1} = k \mid X_n = i) \\
&= \sum_{k \in I} p^{(m-1)}_{i,k}\, p_{k,j}.
\end{align*}
Now, considering the items defined and the results obtained in this introduction, we are ready to study the Chapman-Kolmogorov equation, which states that p^(m+n)_{i,j} = \sum_{k∈I} p^(m)_{i,k} p^(n)_{k,j} for all m, n ≥ 0. To prove it, for all i, j ∈ I we have
\begin{align*}
p^{(m+n)}_{i,j} = P(X_{m+n} = j \mid X_0 = i) &= \sum_{k \in I} P(X_{m+n} = j, X_m = k \mid X_0 = i) \\
&= \sum_{k \in I} P(X_{m+n} = j \mid X_m = k, X_0 = i)\, P(X_m = k \mid X_0 = i) \\
&= \sum_{k \in I} p^{(m)}_{i,k}\, p^{(n)}_{k,j}.
\end{align*}
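As a quick numerical illustration (not part of the original text; the matrix and the values m = 3, n = 5 are arbitrary), the following Python sketch checks the Chapman-Kolmogorov relation Π^(m+n) = Π^(m) Π^(n) on a small stochastic matrix.

import numpy as np

Pi = np.array([[0.7, 0.3],
               [0.6, 0.4]])   # illustrative two-state stochastic matrix

m, n = 3, 5
lhs = np.linalg.matrix_power(Pi, m + n)                              # Pi^(m+n)
rhs = np.linalg.matrix_power(Pi, m) @ np.linalg.matrix_power(Pi, n)  # Pi^(m) Pi^(n)
assert np.allclose(lhs, rhs)   # p(m+n)_{i,j} = sum_k p(m)_{i,k} p(n)_{k,j}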
Then we study the effect of considering that the initial state of the HMC(ν, Π) is generated randomly, so ∀j ∈ I we can write
\[ P(X_n = j) = \sum_{i \in I} P(X_0 = i, X_n = j) = \sum_{i \in I} P(X_0 = i)\, P(X_n = j \mid X_0 = i), \]
where P(Xn = j | X0 = i) = p^(n)_{i,j} and P(X0 = i) = ν_i. Using this notation, we can rewrite the previous equality as follows
\[ P(X_n = j) = \sum_{i \in I} \nu_i\, p^{(n)}_{i,j}. \]
With this we obtain the product between the initial distribution, viewed as a row vector, and the n-th power of the transition matrix, so we get
\[ \nu^{(n)}_j = P(X_n = j) = \sum_{i \in I} \nu_i\, p^{(n)}_{i,j}, \quad \text{where } \nu^{(n)} = \nu \Pi^n \text{ is the law of } X_n. \]
Also, using Observation 3.0.4, for all j ∈ I we can express the equality as follows
\[ \nu^{(n)}_j = P(X_n = j) = \sum_{i \in I} \nu_i\, p^{(n)}_{i,j} = \sum_{i \in I} \sum_{i_1, \dots, i_{n-1} \in I} \nu_i\, p_{i,i_1} \cdots p_{i_{n-1},j}. \]
To continue, we study how to determine the law of the random vector (X_{n_1}, ..., X_{n_m}) with 0 ≤ n_1 ≤ ⋯ ≤ n_m. For this, it is necessary to use the distribution ν and the transition matrix Π. In general, we use the compound probability to find the law of the vector, so ∀i_1, ..., i_m ∈ I we have
\begin{align*}
P(X_{n_1} = i_1, \dots, X_{n_m} = i_m) &= P(X_{n_1} = i_1)\, P(X_{n_2} = i_2 \mid X_{n_1} = i_1) \cdots P(X_{n_m} = i_m \mid X_{n_1} = i_1, \dots, X_{n_{m-1}} = i_{m-1}) \\
&= \sum_{k \in I} \nu_k\, p^{(n_1)}_{k,i_1}\, p^{(n_2 - n_1)}_{i_1,i_2} \cdots p^{(n_m - n_{m-1})}_{i_{m-1},i_m}.
\end{align*}
With this, the result obtained is the product of the initial distribution and entries of the stochastic matrix raised to given powers, so it has the same structure as the law obtained above for the homogeneous Markov chain Xn.
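The computation ν^(n) = νΠ^n is easy to carry out numerically. The following Python sketch (an illustration, not part of the original text; the matrix and the initial distribution are arbitrary) computes the law of Xn for a small homogeneous chain.

import numpy as np

Pi = np.array([[0.7, 0.3],
               [0.6, 0.4]])      # illustrative stochastic matrix
nu = np.array([0.5, 0.5])        # illustrative initial distribution

def law_of_Xn(nu, Pi, n):
    """Return nu^(n) = nu * Pi^n, the law of X_n for a HMC(nu, Pi)."""
    return nu @ np.linalg.matrix_power(Pi, n)

print(law_of_Xn(nu, Pi, 4))      # a probability vector over the state space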
4 Classification of states
In this chapter, we analyze the way in which we can classify the states that compose a Markov chain. This classification is based on the communication relationship between the different states: we study the possibility of moving from one state to another and then coming back to the initial state, or whether, given a specific state, we can never leave it.
In this section, we should take into account a probability space (Ω, F, P ) and a
family of discrete random variables Xn : Ω −→ I where I is a countable set and
{Xn ; n ≥ 0} is a HMC(ν, Π) where the transition matrix is Π = (pi,j ; i, j ∈ I).
Once given this definition, we use it to introduce the following concept, in which
we talk about the possible relationship between two states.
Definition 4.1.2. The state i communicates with j if each is accessible from the other, in other words, i is accessible from j and j is accessible from i, which we write as i ↔ j.
\[ p^{(m_2 + n_2)}_{l,i} = \sum_{k \in I} p^{(m_2)}_{l,k}\, p^{(n_2)}_{k,i} \ge p^{(m_2)}_{l,j}\, p^{(n_2)}_{j,i} > 0. \]
Therefore, we have that the property ↔ is an equivalence relation which gives a
partition of the state space, I, into equivalence classes, which are called communi-
cation classes. Moreover, we have that two states that communicate are in the same
class. The concept we have just studied helps us to define the following property.
Definition 4.1.3. A Markov chain is irreducible if there is a unique equivalence class, in other words, when every state communicates with every other state.
Now, we define the concept of a closed class, which will be useful later for clas-
sifying states.
Definition 4.1.4. A subset C ⊆ I is a closed class if i ∈ C and i → j implies that j ∈ C, that is to say, from a state of C we can never have access to a state of I \ C (hence, once in C we can never leave it).
Equivalently, once given the last definition, a class C is closed if the elements of the stochastic matrix satisfy
\[ \sum_{j \in C} p_{i,j} = 1, \quad \forall i \in C. \]
\[ T_i = \inf\{n \ge 1;\, X_n = i\}, \]
which is the first time at which the chain visits the state i.
Definition 4.2.3. The probability that a chain which starts in state i passes through j, denoted ρ_{i,j}, is defined as ρ_{i,j} = P_i(T_j < ∞). Moreover, we have the particular case of the probability that a chain which starts in state i returns to its starting point i, which is denoted by ρ_{i,i}.
Now, we are going to introduce a new concept, for which we will need to consider the indicator function
\[ \mathbf{1}_{\{X=k\}} = \begin{cases} 1 & \text{if } X = k \\ 0 & \text{if } X \ne k. \end{cases} \]
Definition 4.2.4. The number of visits that the chain makes to the state j, denoted by N(j), is defined as
\[ N(j) = \sum_{n=1}^{\infty} \mathbf{1}_{\{X_n = j\}}. \]
Now, we are going to compute the probability that a Markov chain which starts
at i visits the state j for the first time in the point of time k and, the next time it
comes back to j, it takes n instants of time, then the probability is
Using the expression of ρ_{i,j}, we have that the probability of visiting the state j at least twice is
\begin{align*}
P_i(N(j) \ge 2) &= \sum_{k=1}^{\infty} \sum_{n=1}^{\infty} P_i(T_j = k)\, P_j(T_j = n) \\
&= \left( \sum_{k=1}^{\infty} P_i(T_j = k) \right) \left( \sum_{n=1}^{\infty} P_j(T_j = n) \right) = \rho_{i,j}\, \rho_{j,j}.
\end{align*}
we calculate the probability that a chain which starts at i visits the state j for the
first time in the point of time l, after this, the chain will visit again the state j k − 1
times.
\[ = P_j(T_j = n) \cdots P_j(T_j = m)\, P_i(T_j = l), \]
where the product contains k − 1 factors of the form P_j(T_j = \cdot).
As we have done previously, once the last equality is computed, we have the following probability
\begin{align*}
P_i(N(j) \ge k) &= \sum_{l=1}^{\infty} \sum_{m=1}^{\infty} \cdots \sum_{n=1}^{\infty} P_i(T_j = l)\, P_j(T_j = m) \cdots P_j(T_j = n) \\
&= \left( \sum_{l=1}^{\infty} P_i(T_j = l) \right) \left( \sum_{m=1}^{\infty} P_j(T_j = m) \right) \cdots \left( \sum_{n=1}^{\infty} P_j(T_j = n) \right) \\
&= \rho_{i,j}\, \rho_{j,j} \cdots \rho_{j,j} = \rho_{i,j}\, \rho_{j,j}^{\,k-1}.
\end{align*}
This result is very important, since it will help us to prove some of the theorems related to the classification of states that we will find later.
Considering the concepts previously defined and given a state i ∈ I, if the probability satisfies ρ_{i,i} = 1, then
\[ P_i(N(i) \ge k) = \rho_{i,i}^{\,k} = 1 \quad \forall k \ge 1, \quad \text{so } P_i(N(i) = \infty) = 1, \]
and hence the state i is recurrent, because we visit this state infinitely many times. Moreover, the probability ρ_{i,i} may also be less than 1, ρ_{i,i} < 1, and in this case we have that
\[ P_i(N(i) = \infty) = \lim_{k \to \infty} \rho_{i,i}^{\,k} = 0, \]
and now the state i is transient, because the number of times we visit this state is finite.
Once studied the previous equalities, we focus again on the number of times that a chain visits a state; in this context, we introduce the calculation of its expected value. Since E_i(\mathbf{1}_{\{X_n = j\}}) = P_i(X_n = j) = p^(n)_{i,j}, the expected number of visits to state j, for a chain that begins in i, is
\[ E_i(N(j)) = \sum_{n=1}^{\infty} p^{(n)}_{i,j}. \]
Proof. Firstly, we consider the case in which the probability ρ_{j,j} is less than 1, so ρ_{j,j} < 1, and using the property P_i(N(j) ≥ k) = ρ_{i,j} ρ_{j,j}^{k−1} we have
Secondly, suppose that ρ_{j,j} = 1; using the property P_i(N(j) ≥ k) = ρ_{i,j} ρ_{j,j}^{k−1} we get
Once given the prior theorem, the opposite implication is also true. This means
that given the expected value of the number of times a chain visits one state, we
are able to know if this state is recurrent or transient.
To continue, we are going to study some important properties related with re-
current and transient states, which are necessary, in practice, to classify the states
of all classes.
Proof. We begin by assuming that the property j → i does not hold, so we have j ↛ i. This means that once the chain has reached the state j it will never be able to return to the state i in which it was initially located. Hence, once we have left the state i we cannot come back again, so the state i is not recurrent but transient, which is a contradiction.
By the last proposition, we can check that given two states i, j, the state i is not recurrent if once we have left it we are not able to come back, that is, i → j but j ↛ i.
Proposition 4.2.7. Consider two states i, j; if i is recurrent and i → j, then j is also recurrent.
the last equality is due to the fact that i is a recurrent state, so now we have that
j is also recurrent.
Corollary 4.2.8. Given a communication class C, all its elements are of the same type: either all recurrent or all transient.
Corollary 4.2.9. Given a class, C, which is finite, irreducible and also closed, then
all its states are recurrent.
Proof. Firstly, the class C is finite and closed, which implies that it contains at least one recurrent state. The class C is also irreducible, which means that all its states communicate so, considering the results studied before, all of them have to be recurrent.
Now, we introduce an example in which we analyze all the concepts we have just
studied.
Example 4.2.10. Consider the Markov chain with state space I = {1, 2, 3, 4, 5}
and transition matrix
\[
\Pi = \begin{pmatrix}
\tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 & 0 \\
0 & \tfrac{1}{4} & 0 & \tfrac{3}{4} & 0 \\
0 & 0 & \tfrac{1}{3} & 0 & \tfrac{2}{3} \\
\tfrac{1}{4} & \tfrac{1}{2} & 0 & \tfrac{1}{4} & 0 \\
\tfrac{1}{3} & 0 & \tfrac{1}{3} & 0 & \tfrac{1}{3}
\end{pmatrix}
\]
In this case there are two communication classes, which are C1 = {2, 4} and C2 = {1, 3, 5}.
We begin studying the states of the class C1 . In this case, the state 4 is transient
because if we start in this state, we are able to move to state 1 and once there
we cannot return back to state 4. We know that if a state is transient, the rest
of the states within the same class are transient too. Hence, the whole class C1 is
transient.
Finally, we study the defining characteristics of the class C2. This class is closed: we can move from the state 4 ∈ C1 to the state 1 ∈ C2, but once we reach the class C2 we are not able to leave it. Furthermore, this class is finite and irreducible, since all its states communicate; then, based on Corollary 4.2.9, C2 is a recurrent class.
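The communication classes of a finite chain can also be found mechanically from the transition graph. The following Python sketch (an illustration, not part of the original text; states are relabelled 0..4 instead of 1..5) computes them for the matrix of this example by a reachability (transitive closure) argument.

import numpy as np

def communication_classes(Pi):
    """Group states into communication classes using reachability in the transition graph."""
    n = len(Pi)
    reach = (np.array(Pi) > 0)
    for k in range(n):                     # transitive closure, Floyd-Warshall style
        reach = reach | (reach[:, [k]] & reach[[k], :])
    reach = reach | np.eye(n, dtype=bool)  # every state reaches itself in 0 steps
    classes = []
    for i in range(n):
        cls = {j for j in range(n) if reach[i, j] and reach[j, i]}
        if cls not in classes:
            classes.append(cls)
    return classes

Pi = [[1/2, 0, 1/2, 0, 0],
      [0, 1/4, 0, 3/4, 0],
      [0, 0, 1/3, 0, 2/3],
      [1/4, 1/2, 0, 1/4, 0],
      [1/3, 0, 1/3, 0, 1/3]]
print(communication_classes(Pi))   # [{0, 2, 4}, {1, 3}], i.e. C2 and C1 above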
Once defined the concepts of recurrent state and transient state, and having
studied some of their properties, we can define new ways to classify the states of a
Markov chain.
Definition 4.2.12. An essential state is one from which every exit allows a return; in other words, a state i is essential if for all j ∈ I such that i → j it is also true that j → i.
Given the prior definition, if the state allows us to leave it without being able to return, then it is called inessential; hence the state i is inessential if there exist j ∈ I and n ≥ 1 such that p^(n)_{i,j} > 0 but p^(m)_{j,i} = 0, ∀m ≥ 1.
Hence, we have seen that a state i ∈ I is periodic if the greatest common divisor
of the number of steps to return to the starting point is greater than one.
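For a finite chain, the period of a state can be approximated numerically as the greatest common divisor of the return lengths observed up to some horizon. The following Python sketch (an illustration, not part of the original text; the horizon n_max and the example matrix are arbitrary) implements this idea.

import numpy as np
from math import gcd
from functools import reduce

def period(Pi, i, n_max=50):
    """gcd of the lengths n <= n_max with p^(n)_{i,i} > 0 (an approximation of the period of i)."""
    Pi = np.array(Pi, dtype=float)
    lengths = []
    P_n = np.eye(len(Pi))
    for n in range(1, n_max + 1):
        P_n = P_n @ Pi
        if P_n[i, i] > 0:
            lengths.append(n)
    return reduce(gcd, lengths) if lengths else 0

# a two-state chain that alternates deterministically has period 2
Pi = [[0, 1],
      [1, 0]]
print(period(Pi, 0))   # 2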
To continue, the following proposition studies the fact that if all states are in
the same class, then they have the same period.
Proposition 4.3.2. Consider two states i, j ∈ I which satisfy the following property
i ↔ j, then the period of the states are the same, d(i)=d(j).
Consequently, the period of a class can be defined as the period of its states.
Now, we consider the congruence equivalence relation, so given m, n ∈ Z we have
m ≡ n(mod d) which means that m − n is divisible by d. For the following results,
we suppose that C forms an essential states class and we fix a reference state i ∈ C
with period d.
To continue, given j ∈ C and considering r ≥ 1 such that p^(r)_{j,i} > 0, if the probabilities satisfy p^(m)_{i,j} > 0 and p^(n)_{i,j} > 0, then we have that p^(m+r)_{i,i} ≥ p^(m)_{i,j} p^(r)_{j,i} > 0 and p^(n+r)_{i,i} ≥ p^(n)_{i,j} p^(r)_{j,i} > 0. Since the period of the state i is d, it divides m + r and also n + r. Consequently, m − n is divisible by d and hence we can rewrite this fact as m ≡ n (mod d). Finally, we define s_j as the remainder when n is divided by d, for any n with p^(n)_{i,j} > 0.
To carry on, we are going to present the cyclic classes. In order to study this concept, we will need the results just obtained about the congruence relation, so, considering h ∈ {0, 1, ..., d − 1}, we define
\[ C_h = \{ j \in C;\ p^{(n)}_{i,j} > 0 \text{ for } s_j \equiv h \ (\mathrm{mod}\ d) \} = \{ j \in C;\ p^{(n)}_{i,j} > 0 \text{ for } n \equiv h \ (\mathrm{mod}\ d) \}, \]
where the last equality holds because congruence for a fixed modulus is an equivalence relation; moreover C_0 = C_d. Once given this, we can express C = \bigcup_{h=0}^{d-1} C_h, so the sets C_0, ..., C_{d−1}, which are disjoint, are called the cyclic subclasses of I.
Now, we study a result that will be helpful when classifying the states that
compose a Markov chain into cyclical subclasses. This result takes into account all
the non-null elements of the transition matrix.
To carry on, we introduce an example, of a Markov chain with finite state space,
in which we study the period and the cyclic subclasses.
Example 4.3.5. Consider the Markov chain with state space I = {1, 2, 3, 4, 5, 6, 7}
and stochastic matrix
\[
\Pi = \begin{pmatrix}
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{3} & \tfrac{2}{3} & 0 & 0 & 0 & 0 \\
0 & \tfrac{2}{3} & \tfrac{1}{3} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \tfrac{1}{3} & \tfrac{2}{3} & 0
\end{pmatrix}
\]
In this case, the Markov chain is irreducible, since there is a unique equivalence class. Now, we study the cyclic subclasses of the chain. For this, we note that all states are essential; thus, the period of this class is the period of any of its states. Consider for example the state i = 1; then d(1) = 3. Therefore, there are three cyclic subclasses.
4.4 Example: an analysis of the random walk on Z
Now, to classify the state 0, we use Theorem 4.2.5, so it is only necessary to consider the sum \sum_{m=1}^{\infty} p^{(m)}_{0,0}, which is the expected number of visits to the state 0 for a chain that begins in 0.
Firstly, if we consider m odd we have p^(2n+1)_{0,0} = 0, so we cannot come back to the initial state 0 with an odd number of movements, because the number of steps to the right must be equal to the number of steps to the left. Then, if the number of movements is even we have that
\[ p^{(2n)}_{0,0} = P(X_{2n} = 0) = \binom{2n}{n} p^n (1-p)^n. \]
Therefore we get
\[ \sum_{m=1}^{\infty} p^{(m)}_{0,0} = \sum_{n=1}^{\infty} p^{(2n)}_{0,0} = \sum_{n=1}^{\infty} \binom{2n}{n} p^n (1-p)^n = \sum_{n=1}^{\infty} \frac{(2n)!}{n!\,n!} p^n (1-p)^n, \]
where, by Stirling's formula, \frac{(2n)!}{n!\,n!} \approx \frac{4^n}{\sqrt{\pi n}}.
Hence, we can approximate the probability p^(2n)_{0,0} ≈ (4p(1−p))^n / \sqrt{\pi n}. To continue, we have to consider two cases. The first one is when p = 1/2; then we have p^(2n)_{0,0} ≈ 1/\sqrt{\pi n}, so the sum is
\[ \sum_{m=1}^{\infty} p^{(m)}_{0,0} \approx \sum_{n=1}^{\infty} \frac{1}{\sqrt{\pi n}} = \infty; \]
in this case the state 0 is recurrent, as the last sum is infinite, and hence all states are recurrent.
The second case is when p ≠ 1/2; then p^(2n)_{0,0} ≈ (4p(1−p))^n / \sqrt{\pi n} = r^n / \sqrt{\pi n}, where 0 < r < 1, so the sum is
\[ \sum_{m=1}^{\infty} p^{(m)}_{0,0} \approx \sum_{n=1}^{\infty} \frac{r^n}{\sqrt{\pi n}} \le \sum_{n=1}^{\infty} r^n < \infty, \]
and now the state 0 is transient, as the last sum is finite, and hence all states are transient.
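The contrast between the two cases can be seen numerically. The following Python sketch (an illustration, not part of the original text; the number of terms is arbitrary) evaluates partial sums of the approximation (4p(1−p))^n / \sqrt{\pi n}.

import math

def approx_return_sum(p, n_terms=200000):
    """Partial sum of the approximation (4p(1-p))^n / sqrt(pi*n) to p^(2n)_{0,0}."""
    total = 0.0
    for n in range(1, n_terms + 1):
        total += (4 * p * (1 - p)) ** n / math.sqrt(math.pi * n)
    return total

print(approx_return_sum(0.5))   # keeps growing as n_terms increases (recurrent case)
print(approx_return_sum(0.4))   # stabilises at a finite value (transient case)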
Finally, we are going to study the cyclic subclasses. We observe that all the states are essential, since if we leave any of them we are able to come back at a future point of time.
As we said before, the chain is irreducible, so there exists a unique class, which has period 2, and we just have two cyclic subclasses: the even integers and the odd integers.
5 Hitting times
In this chapter, we should take into account a probability space (Ω, F, P ) and a
family of discrete random variables Xn : Ω −→ I where {Xn ; n ≥ 0} is a HMC(ν, Π)
and I is a countable set called state space. Let A be a subset of the state space I,
so A ⊆ I.
In this chapter, we study the probability that a Markov chain reaches a state within the subset A. In order to analyze this, it is necessary to introduce some new concepts.
The random variable H^A : Ω −→ {0, 1, 2, ...} ∪ {∞} defined by
\[ H^A = \inf\{n \ge 0;\, X_n \in A\} \]
is called the hitting time of A, which is the first time the chain hits the subset A. In addition, we use the convention that the infimum of the empty set is ∞.
To continue, we define the probability that, starting from state i ∈ I, the chain hits A as
\[ h^A_i = P_i(H^A < \infty). \]
Consider the particular case in which the chain is initially in state i with i ∈ A; then the hitting time is H^A = 0 and the probability h^A_i takes the value h^A_i = 1. We also have to consider the case in which the subset A is a closed class; then h^A_i is called the absorption probability.
Now, we study the average time a chain needs to reach the subset A. This is called the mean hitting time and is defined by
\[ \mu^A_i = E_i(H^A) = \sum_{n < \infty} n\, P_i(H^A = n) + \infty \cdot P_i(H^A = \infty). \]
In this case, we also consider the case in which the subset A is a closed class; then µ^A_i is called the absorption time.
In practice there are some cases in which we use the notation
\[ h^A_i = P_i(\text{hit } A), \qquad \mu^A_i = E_i(\text{time to hit } A), \]
because these expressions are easy to compute once we have the stochastic matrix.
To continue, we are going to study the possible values that the probability h^A_i can take depending on whether the state i is in the subset A or not. To make this analysis, we use the notion of minimal solution, which means that if x = (x_i ; i ∈ I) is a minimal solution and y = (y_i ; i ∈ I) is another solution such that y_i ≥ 0, then y_i ≥ x_i ∀i ∈ I.
Theorem 5.0.1. The vector of probabilities h^A = (h^A_i ; i ∈ I) is the minimal non-negative solution of the following system
\[ \begin{cases} h^A_i = 1 & \text{for } i \in A \\ h^A_i = \sum_{j \in I} p_{i,j}\, h^A_j & \text{for } i \notin A. \end{cases} \tag{5.1} \]
Proof. Initially, we show that the vector h^A is a solution of the system. Firstly, if X0 = i ∈ A, then H^A = 0 and in this case the probability is h^A_i = 1. Secondly, if X0 = i ∉ A, then H^A ≥ 1 and, using the Markov property, we have P_i(H^A < ∞ | X_1 = j) = P_j(H^A < ∞) = h^A_j, and hence
\[ h^A_i = P_i(H^A < \infty) = \sum_{j \in I} P_i(H^A < \infty, X_1 = j) = \sum_{j \in I} P_i(H^A < \infty \mid X_1 = j)\, P_i(X_1 = j) = \sum_{j \in I} h^A_j\, p_{i,j}. \]
In general, there may exist multiple solutions of the system. In this context, let us prove that the vector h^A is the minimal solution. Suppose l = (l_i ; i ∈ I) is another non-negative solution to (5.1), so we have h^A_i = l_i = 1 ∀i ∈ A; otherwise, for i ∉ A, we can rewrite the linear equation l_i = \sum_{j \in I} p_{i,j} l_j as follows
\[ l_i = \sum_{j \in A} p_{i,j}\, l_j + \sum_{j \notin A} p_{i,j}\, l_j. \]
Repeating this argument, which consists in substituting the expression of l_j into the final term, we have
Finally, since the solution l_{j_n} is non-negative and the previous terms P_i(X_1 ∈ A), ..., P_i(X_n ∈ A, X_{n−1} ∉ A, ..., X_1 ∉ A) sum to P_i(H^A ≤ n), we obtain l_i ≥ P_i(H^A ≤ n), which implies that
\[ l_i \ge \lim_{n \to +\infty} P_i(H^A \le n) = P_i(H^A < \infty) = h^A_i. \]
Once we have studied this result, we find a similar fact for the vector of mean hitting times µ^A in the following theorem.
Theorem 5.0.2. The vector of mean hitting times µ^A = (µ^A_i ; i ∈ I) is the minimal non-negative solution of the following system
\[ \begin{cases} \mu^A_i = 0 & \text{for } i \in A \\ \mu^A_i = 1 + \sum_{j \in I} p_{i,j}\, \mu^A_j & \text{for } i \notin A. \end{cases} \]
Proof. To begin, we show that the vector of mean hitting times µ^A is a solution of the system. Firstly, if X0 = i ∈ A, then H^A = 0, which implies that µ^A_i = 0. Secondly, if X0 = i ∉ A, then H^A ≥ 1 and, using the Markov property, we have
\[ E_i(H^A \mid X_1 = j) = 1 + E_j(H^A) = 1 + \mu^A_j, \]
and hence
\[ \mu^A_i = E_i(H^A) = \sum_{j \in I} E_i(H^A \mathbf{1}_{\{X_1 = j\}}) = \sum_{j \in I} E_i(H^A \mid X_1 = j)\, P_i(X_1 = j) = 1 + \sum_{j \notin A} p_{i,j}\, \mu^A_j. \]
Repeating this argument, which consists in substituting the expression of r_j into the final term, we have
\[ r_i = P_i(H^A \ge 1) + \cdots + P_i(H^A \ge n) + \sum_{j_1, \dots, j_n \notin A} p_{i,j_1} \cdots p_{j_{n-1},j_n}\, r_{j_n}. \]
Since the last term is non-negative,
\[ r_i \ge \lim_{n \to +\infty} \left( P_i(H^A \ge 1) + \cdots + P_i(H^A \ge n) \right) = E_i(H^A) = \mu^A_i, \]
and with this we get that the vector of mean hitting times µ^A = (µ^A_i ; i ∈ I) is the minimal non-negative solution of the system.
To continue, we introduce two examples in which we analyze the concepts previously studied. In the first one the state space is finite, but in the second one it is not; in this second case the minimality condition is essential.
Example 5.0.3. Consider the symmetric random walk on the integers 1, 2, 3, 4, with absorption at 1 and 4. We want to study the probability of absorption in 1 if we start in state 2 or in state 3. Using the result of the first theorem, that is, the system (5.1), we have to compute h_2 = P_2(hit 1) and h_3 = P_3(hit 1). Starting from state 2 and conditioning on the first step, we have
\[ h_2 = P_2(\text{hit } 1) = \tfrac{1}{2} h_1 + \tfrac{1}{2} h_3. \]
Similarly, the expression obtained for the probability h_3 is
\[ h_3 = P_3(\text{hit } 1) = \tfrac{1}{2} h_2 + \tfrac{1}{2} h_4. \]
The two equations we have just found form a system of linear equations from which it is easy to obtain the values of h_2 and h_3; we have
\[ \begin{cases} h_2 = \tfrac{2}{3} + \tfrac{1}{3} h_4 \\ h_3 = \tfrac{1}{3} + \tfrac{2}{3} h_4. \end{cases} \]
The value of h_4 is not determined by the system (5.1) because from state 4 we cannot reach the state 1. In this case, we use the minimality condition and hence we get h_4 = P_4(hit 1) = 0. Substituting this value in the previous system we obtain h_2 = 2/3 and h_3 = 1/3 respectively. If we had not used the minimality condition the result would be the same because, as the state space is finite, only the prior diagram is necessary to study the probabilities h_i.
To continue, we compute the time it takes until the chain is absorbed in state 1 or 4 if we start in state 2 or in state 3. In this case, we have to compute the mean hitting times µ_2 = E_2(time to hit {1, 4}) and µ_3 = E_3(time to hit {1, 4}) using the system of Theorem 5.0.2.
First we get µ_1 = µ_4 = 0 because we are initially in one of the states we want to reach.
Now, suppose that we start at state 2 and consider the situation after making one step, jumping as defined before. In this case the mean hitting time is
\[ \mu_2 = 1 + \tfrac{1}{2}\mu_1 + \tfrac{1}{2}\mu_3 = 1 + \tfrac{1}{2}\mu_3. \]
Similarly, the expression obtained for the mean hitting time µ_3 is
\[ \mu_3 = 1 + \tfrac{1}{2}\mu_2 + \tfrac{1}{2}\mu_4 = 1 + \tfrac{1}{2}\mu_2. \]
Using the two equations we have just found, starting in state 2 the mean time for the chain to be absorbed by state 1 or 4 is 2. On the other hand, if we are initially in state 3, the mean time to be absorbed by state 1 or 4 is also 2.
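Both small linear systems of this example can also be solved numerically. The following Python sketch (an illustration, not part of the original text) reproduces the values just obtained.

import numpy as np

# Example 5.0.3: symmetric walk on {1, 2, 3, 4} with absorption at 1 and 4.
# Hitting probabilities of state 1: h1 = 1 and, by minimality, h4 = 0.
h1, h4 = 1.0, 0.0
# System:  h2 = 1/2*h1 + 1/2*h3,   h3 = 1/2*h2 + 1/2*h4
A = np.array([[1.0, -0.5],
              [-0.5, 1.0]])
b = np.array([0.5 * h1, 0.5 * h4])
h2, h3 = np.linalg.solve(A, b)
print(h2, h3)                     # 0.666..., 0.333...

# Mean hitting times of {1, 4}:  mu1 = mu4 = 0,
# mu2 = 1 + 1/2*mu3,   mu3 = 1 + 1/2*mu2
mu2, mu3 = np.linalg.solve(np.array([[1.0, -0.5], [-0.5, 1.0]]),
                           np.array([1.0, 1.0]))
print(mu2, mu3)                   # 2.0, 2.0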
Example 5.0.4. In chapter 2 we studied the example of a random walk on Z with absorbing barriers, also called the Gambler's ruin, here on the state space {0, 1, ...}. Let us consider the homogeneous Markov chain with the following diagram.
Suppose that the gambler plays heads or tails and initially has i coins; at each point of time he wins or loses one coin with probability p or q = 1 − p respectively. The game ends when the gambler loses everything. We want to study the probability that the player loses everything; note that the only absorbing state is {0}.
The transition probabilities are
\[ p_{0,0} = 1, \qquad p_{i,i-1} = 1 - p, \quad p_{i,i+1} = p \quad \text{for } i = 1, 2, \dots \]
rλ² + sλ + t = 0. There are two solutions of this quadratic equation, namely λ₊ = 1 and λ₋ = q/p, and then h_n = βλ₊ⁿ + αλ₋ⁿ is a solution.
Firstly, if p ≠ q we have that λ₊ ≠ λ₋, and we can solve the equation h_0 = α + β, which implies that β = h_0 − α = 1 − α. Therefore, the previous recurrence has the general solution
\[ h_i = 1 - \alpha + \alpha \left( \frac{q}{p} \right)^i, \quad \text{for } i \ge 1, \]
where α ∈ [0, 1] and the probability h_i takes different values depending on the values of p and q. We study two cases:
(i) If p < q, then h_i ∈ [0, 1] implies that α = 0 and hence the minimal solution is given by h_i = 1.
(ii) If p > q, since we have to find a minimal solution, the value of α has to be as large as possible; then α = 1 and the minimal solution is given by h_i = (q/p)^i.
Secondly, if p = q the general solution of the recurrence is
\[ h_i = 1 + \alpha i, \quad \text{for } i \ge 1; \]
in this case the restriction h_i ∈ [0, 1] implies that α = 0 and again we have h_i = 1 for all i.
In conclusion, for p ≤ 1/2 the player loses everything with probability 1.
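The conclusion can be checked by simulation. The following Python sketch (an illustration, not part of the original text; the number of runs and the truncation length are arbitrary, so the symmetric case only approaches 1 up to truncation error) estimates the ruin probability by Monte Carlo.

import random

def ruin_probability_estimate(i, p, n_runs=5000, max_steps=2000, seed=0):
    """Monte Carlo estimate of the probability of ever hitting 0 when starting with i coins."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(n_runs):
        x = i
        for _ in range(max_steps):            # truncate very long games
            x += 1 if rng.random() < p else -1
            if x == 0:
                ruined += 1
                break
    return ruined / n_runs

print(ruin_probability_estimate(3, 0.6))      # near (q/p)^3 = (0.4/0.6)^3 ~ 0.296
print(ruin_probability_estimate(3, 0.5))      # near 1 (p <= 1/2 case), up to truncation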
6 Distribution and measure
In this chapter, we study the limiting behaviour of Markov chains as time n approaches infinity. In particular, we focus on the relationship between this behaviour and invariant probability distributions. This concept tells us the fraction of time that the Markov chain spends in each state as n becomes large. Moreover, we study other uses of this distribution in Markov chains.
In this chapter, we should take into account a probability space (Ω, F, P ) and a
family of discrete random variables Xn : Ω −→ I where {Xn ; n ≥ 0} is a HMC(ν, Π)
and I is a countable set called state space.
For a Markov chain {Xn ; n ≥ 0} with transition matrix Π and invariant distri-
bution γ, we can rewrite in matrix form the condition (6.1) as follows
γΠ = γ
where γ is a row vector as we said above.
Given an invariant distribution γ, we will see that the distribution (or law) of the HMC(γ, Π), {Xn ; n ≥ 0}, is independent of n, for all n ≥ 0; therefore, all the random variables Xn have the same law.
For n = 1 we have the equality (6.1), which we have analyzed previously. Now we study the case n = 2, for which we use the Chapman-Kolmogorov equation, so we have
\[ \sum_{i \in I} \gamma_i\, p^{(2)}_{i,j} = \sum_{i \in I} \gamma_i \sum_{k \in I} p_{i,k}\, p_{k,j} = \sum_{k \in I} \left( \sum_{i \in I} \gamma_i\, p_{i,k} \right) p_{k,j} = \sum_{k \in I} \gamma_k\, p_{k,j} = \gamma_j, \]
and hence we get γ^(2)_j = P(X_2 = j) = \sum_{i∈I} γ_i p^(2)_{i,j} = γ_j, which is the same law as in the previous case, where n = 1.
After that, we use mathematical induction over the number of steps needed for the chain to reach the state j. Assuming the result is true for n − 1, we will see that it also holds for n. To continue, we want to compute the distribution of the random variable Xn. For this, we study whether the equality (6.1) holds when n steps are needed to go from state i to state j, so we have
\begin{align*}
\sum_{i \in I} \gamma_i\, p^{(n)}_{i,j} &= \sum_{i \in I} \gamma_i \sum_{i_1, \dots, i_{n-1} \in I} p_{i,i_1} \cdots p_{i_{n-1},j} = \sum_{i \in I} \gamma_i \sum_{i_{n-1} \in I} p^{(n-1)}_{i,i_{n-1}}\, p_{i_{n-1},j} \\
&= \sum_{i_{n-1} \in I} \left( \sum_{i \in I} \gamma_i\, p^{(n-1)}_{i,i_{n-1}} \right) p_{i_{n-1},j} = \sum_{i_{n-1} \in I} \gamma_{i_{n-1}}\, p_{i_{n-1},j} = \gamma_j;
\end{align*}
in this case we get γ^(n)_j = P(X_n = j) = \sum_{i∈I} γ_i p^(n)_{i,j} = γ_j, and this is the law of the random variable Xn.
Therefore, if the distribution of the initial state X0 is γ, then the equality \sum_{i∈I} γ_i p^(n)_{i,j} = γ_j implies that, for all n, P(Xn = j) = γ_j, so all the random variables have the same law. In addition, we note that this distribution is independent of n.
To continue, once the definition of invariant distribution and some of its properties have been given, we study the existence of an invariant distribution for a stochastic matrix. After this, we analyze this fact through some examples.
and on the other hand we get that wn,i ≥ 0 because w ∈ [0, 1] and also pj,i ≥ 0.
Then, both necessary conditions are satisfied in order to consider wn as a probability.
The set of probability distributions over the space I is a closed and bounded subset of [0, 1]^{|I|}. To continue, suppose that there exists a convergent subsequence whose limit is an element of the set defined above. This element has to be a probability distribution over the space I, which we denote by γ. Thus, for the subsequence {w_{n_j} ; j ≥ 1} we have that w_{n_j} converges to γ as j becomes large.
Now, we check that the probability γ is an invariant distribution. First we have
\[ w_{n_j} - w_{n_j} \Pi = \frac{1}{n_j} \sum_{m=0}^{n_j - 1} w \Pi^{(m)} - \frac{1}{n_j} \sum_{m=0}^{n_j - 1} w \Pi^{(m+1)} = \frac{1}{n_j} \left( w - w \Pi^{n_j} \right). \]
Now, to study the previous concepts, we will use two practical examples that differ in the number of invariant distributions of the stochastic matrix.
Example 6.1.3. Consider the Markov chain with state space I = {1, 2} and transition matrix
\[ \Pi = \begin{pmatrix} \tfrac{1}{4} & \tfrac{3}{4} \\ \tfrac{1}{5} & \tfrac{4}{5} \end{pmatrix}. \]
In this example, we want to check that this chain has an invariant distribution γ. To find this distribution, we have to solve the equality γΠ = γ, that is,
\[ \begin{pmatrix} \gamma_1 & \gamma_2 \end{pmatrix} \begin{pmatrix} \tfrac{1}{4} & \tfrac{3}{4} \\ \tfrac{1}{5} & \tfrac{4}{5} \end{pmatrix} = \begin{pmatrix} \gamma_1 & \gamma_2 \end{pmatrix}, \]
which gives the system
\[ \begin{cases} \gamma_1 = \tfrac{1}{4}\gamma_1 + \tfrac{1}{5}\gamma_2, \\ \gamma_2 = \tfrac{3}{4}\gamma_1 + \tfrac{4}{5}\gamma_2. \end{cases} \]
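This system, together with the normalisation γ_1 + γ_2 = 1, can also be solved numerically. The following Python sketch (an illustration, not part of the original text) computes the invariant distribution of this matrix.

import numpy as np

Pi = np.array([[1/4, 3/4],
               [1/5, 4/5]])

# Solve gamma * Pi = gamma together with gamma_1 + gamma_2 = 1:
# equivalently, (Pi^T - I) gamma^T = 0 with the normalisation row appended.
A = np.vstack([Pi.T - np.eye(2), np.ones((1, 2))])
b = np.array([0.0, 0.0, 1.0])
gamma, *_ = np.linalg.lstsq(A, b, rcond=None)
print(gamma)          # approximately [4/19, 15/19] = [0.2105..., 0.7894...]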
With this example, we have seen that this Markov chain with finite state space has a unique invariant distribution. To carry on, we apply the prior result to one of the examples defined in Section 2.2, the random walk with absorbing barriers, here with finite state space. In this case, we will see that there may be more than one invariant distribution; therefore, the result above only assures existence.
Example 6.1.4. In this case, the Markov chain with state space I = {0, 1, ..., M} is given by the stochastic matrix
\[
\Pi = \begin{pmatrix}
1 & 0 & 0 & 0 & \cdots & 0 \\
q & 0 & p & 0 & \cdots & 0 \\
0 & q & 0 & p & \cdots & 0 \\
0 & 0 & q & 0 & \cdots & 0 \\
 & & \ddots & & \ddots & \\
0 & 0 & \cdots & q & 0 & p \\
0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}
\]
Now we want to compute the invariant distributions of the stochastic matrix Π. For this, we have to analyze the equality γΠ = γ, where γ = (γ_0, γ_1, ..., γ_M), so we have the following system
\[ \begin{cases} \gamma_0 = \gamma_0 + \gamma_1 q, \\ \gamma_1 = \gamma_2 q, \\ \gamma_j = \gamma_{j-1}\, p + \gamma_{j+1}\, q & \text{for } j = 2, \dots, M-2, \\ \gamma_{M-1} = \gamma_{M-2}\, p, \\ \gamma_M = \gamma_{M-1}\, p + \gamma_M. \end{cases} \]
In the last example, we observe that sometimes there exists more than one in-
variant distribution, so this distribution is not always unique. Moreover, in the last
two examples, we have seen that there is at least one invariant distribution.
The following theorem, in which we can note the relationship between invariant distributions and n-step transition probabilities, shows that the invariant distribution is an equilibrium distribution, because it is the same at all instants of time.
Theorem 6.1.5. Let the state space I be finite, and suppose that for some i ∈ I it holds that
\[ p^{(n)}_{i,j} \to \gamma_j \ \text{as } n \to \infty, \ \text{for all } j \in I; \]
then γ = (γ_j ; j ∈ I) is an invariant distribution.
Proof. Firstly, we know that 0 ≤ γ_j ≤ 1 for all j ∈ I; this also holds for the probability p^(n)_{i,j}, that is, 0 ≤ p^(n)_{i,j} ≤ 1 for all n ≥ 1 and i, j ∈ I. Now, we analyze the vector γ and see that it is a probability distribution. Interchanging the summation and the limit, which is possible because the state space is finite, we have
\[ \sum_{j \in I} \gamma_j = \sum_{j \in I} \lim_{n \to +\infty} p^{(n)}_{i,j} = \lim_{n \to +\infty} \sum_{j \in I} p^{(n)}_{i,j} = \lim_{n \to +\infty} 1 = 1. \]
with this we get that the vector γ is an invariant distribution because the equality
holds.
Once given this result, we consider an example in which the state space I is not finite: the random walk on Z, which we have studied in Section 4.4. In this case the limit of the probability of going from state i to state j in n steps exists, since p^(n)_{i,j} → 0 =: γ_j as n → ∞ for all i, j ∈ I, but the vector γ is not a probability distribution, because \sum_{i∈I} γ_i ≠ 1, and so it is not an invariant distribution.
To continue, we introduce a practical example in which we apply the result we have just studied.
Example 6.1.6. Consider the Markov chain of Example 6.1.3. Our goal is to compute the transition probabilities p^(n)_{1,1} and p^(n)_{2,2}.
First we have to compute the eigenvalues of the transition matrix Π. For this, we calculate the characteristic equation, that is,
\[ \det(\Pi - \lambda \mathrm{Id}) = 0 \;\Rightarrow\; \left( \frac{1}{4} - \lambda \right)\left( \frac{4}{5} - \lambda \right) - \frac{3}{20} = 0; \]
thus, solving the equation we obtain the eigenvalues 1 and 1/20. Now, we can diagonalize the transition matrix Π, so there exists an invertible matrix A such that
\[ \Pi = A \begin{pmatrix} 1 & 0 \\ 0 & \tfrac{1}{20} \end{pmatrix} A^{-1} \;\Rightarrow\; \Pi^{(n)} = A \begin{pmatrix} 1 & 0 \\ 0 & \left(\tfrac{1}{20}\right)^n \end{pmatrix} A^{-1}. \]
Now, we want to compute the transition probability p^(n)_{1,1}. Using the n-th power of the stochastic matrix we have
\[ p^{(n)}_{1,1} = \alpha \cdot 1^n + \beta \left( \frac{1}{20} \right)^n = \alpha + \beta \left( \frac{1}{20} \right)^n. \]
To continue, we calculate the values of the constants α and β, so we have to solve the following system of linear equations
\[ \begin{cases} 1 = p^{(0)}_{1,1} = \alpha + \beta \\ \tfrac{1}{4} = p^{(1)}_{1,1} = \alpha + \tfrac{1}{20} \beta, \end{cases} \]
so the values of the constants are α = 4/19 and β = 15/19.
Now, we apply the last theorem to the probability p^(n)_{1,1} and we have
\[ \lim_{n \to +\infty} p^{(n)}_{1,1} = \lim_{n \to +\infty} \left( \frac{4}{19} + \frac{15}{19} \left( \frac{1}{20} \right)^n \right) = \frac{4}{19} = \gamma_1. \]
If we had computed the probability p^(n)_{2,1}, the result would have been the same. Now, we repeat the same idea for the probability p^(n)_{2,2}; then we get
\[ p^{(n)}_{2,2} = \mu \cdot 1^n + \lambda \left( \frac{1}{20} \right)^n = \mu + \lambda \left( \frac{1}{20} \right)^n, \]
and now we want to compute the constants µ and λ, so we have to solve the following system of linear equations
\[ \begin{cases} 1 = p^{(0)}_{2,2} = \mu + \lambda \\ \tfrac{4}{5} = p^{(1)}_{2,2} = \mu + \tfrac{1}{20} \lambda. \end{cases} \]
In this case, we obtain the values µ = 15/19 and λ = 4/19, and hence the limit is
\[ \lim_{n \to +\infty} p^{(n)}_{2,2} = \lim_{n \to +\infty} \left( \frac{15}{19} + \frac{4}{19} \left( \frac{1}{20} \right)^n \right) = \frac{15}{19} = \gamma_2. \]
If we had computed the probability p^(n)_{1,2}, the result would have been the same. Therefore, by Theorem 6.1.5 we get that γ = (γ_1, γ_2) = (4/19, 15/19) is an invariant distribution.
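The convergence p^(n)_{i,j} → γ_j can be observed directly by raising Π to a large power. The following Python sketch (an illustration, not part of the original text; n = 50 is arbitrary) shows that both rows of Π^n approach (4/19, 15/19).

import numpy as np

Pi = np.array([[1/4, 3/4],
               [1/5, 4/5]])

Pi_n = np.linalg.matrix_power(Pi, 50)
print(Pi_n)                      # both rows ~ (0.2105..., 0.7894...)
print(np.array([4/19, 15/19]))   # the invariant distribution gamma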
In the examples of this chapter, we can observe that an invariant distribution may not exist, may be unique, or more than one may exist. To continue, we are going to introduce some results that give conditions guaranteeing the existence and uniqueness of the invariant measure. Firstly, we recall the concepts of measure and invariant measure.
Definition 6.1.7. A measure is any row vector µ = (µ_i ; i ∈ I) with µ_i ≥ 0 for all i ∈ I. Moreover, we say a measure µ is invariant for a transition matrix Π = (p_{i,j} ; i, j ∈ I) if it satisfies
\[ \sum_{i \in I} \mu_i\, p_{i,j} = \mu_j, \quad \forall j \in I. \]
Now, in the following result, we analyze the existence of the invariant measure.
Theorem 6.1.8. Let Π = (p_{i,j} ; i, j ∈ I) be the stochastic matrix of the Markov chain {Xn ; n ≥ 0}, which is irreducible and all of whose states are recurrent. Consider
\[ \lambda^r_i = E_r \left( \sum_{n=0}^{T_r - 1} \mathbf{1}_{\{X_n = i\}} \right); \]
then
(i) 0 < λ^r_i < ∞ for all i ∈ I,
(ii) λ^r Π = λ^r, that is, λ^r is an invariant measure.
Proof. Given n ≥ 1, the event {T_r ≥ n} depends only on X_0, ..., X_{n−1}, because it means that none of the random variables X_1, ..., X_{n−1} is in state r. Using the Markov property at time n − 1 we get
\[ P_r(X_{n-1} = i, X_n = j \text{ and } T_r \ge n) = P_r(X_{n-1} = i \text{ and } T_r \ge n)\, p_{i,j}. \]
We also know that the states of the Markov chain are recurrent, so P_r(T_r < ∞) = 1, which is the same as P(X_0 = X_{T_r} = r) = 1. Now, we have
\[ \lambda^r_j = E_r \left( \sum_{n=1}^{T_r} \mathbf{1}_{\{X_n = j\}} \right) = E_r \left( \sum_{n=1}^{\infty} \mathbf{1}_{\{X_n = j \text{ and } T_r \ge n\}} \right) = \sum_{n=1}^{\infty} P_r(X_n = j \text{ and } T_r \ge n). \]
To continue, the chain, before visiting the state j at time n, has been in some state i ∈ I at time n − 1, so we have
\begin{align*}
\lambda^r_j &= \sum_{i \in I} \sum_{n=1}^{\infty} P_r(X_{n-1} = i, X_n = j \text{ and } T_r \ge n) \\
&= \sum_{i \in I} p_{i,j} \sum_{n=1}^{\infty} P_r(X_{n-1} = i \text{ and } T_r \ge n) = \sum_{i \in I} p_{i,j}\, E_r \left( \sum_{l=0}^{\infty} \mathbf{1}_{\{X_l = i \text{ and } T_r - 1 \ge l\}} \right) \\
&= \sum_{i \in I} p_{i,j}\, E_r \left( \sum_{l=0}^{T_r - 1} \mathbf{1}_{\{X_l = i\}} \right) = \sum_{i \in I} \lambda^r_i\, p_{i,j},
\end{align*}
and with this we have that the following equality is satisfied λr Π = λr , which shows
(ii).
To continue, we have to prove (i). For this purpose, we know that all the states of the Markov chain communicate among themselves. Then, for each state i ∈ I there are n, m ≥ 0 such that p^(m)_{i,r} > 0 and p^(n)_{r,i} > 0; with this we have
\[ \lambda^r_i = \sum_{k \in I} \lambda^r_k\, p^{(n)}_{k,i} \ge \lambda^r_r\, p^{(n)}_{r,i} = p^{(n)}_{r,i} > 0, \]
and, in addition, we obtain that 1 = λ^r_r = \sum_{k∈I} λ^r_k p^(m)_{k,r} ≥ λ^r_i p^(m)_{i,r}. With this we get λ^r_i ≤ 1 / p^(m)_{i,r} < ∞. Hence, the vector λ^r satisfies 0 < λ^r_i < ∞ for all i ∈ I, which shows (i).
In the previous theorem, the quantity λ^r_i = E_r\left( \sum_{n=0}^{T_r - 1} \mathbf{1}_{\{X_n = i\}} \right) is interpreted as follows: given a fixed state r, λ^r_i is the mean time spent in the state i before returning to the state r. Moreover, remember that T_r = \inf\{n \ge 1;\, X_n = r\} is the first time the chain visits the state r.
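This interpretation suggests a simple simulation. The following Python sketch (an illustration, not part of the original text; the chain of Example 6.1.3 and the number of excursions are arbitrary choices) estimates λ^r_i as the average time spent in each state during excursions that start and end at r.

import random

def estimate_lambda(Pi, r, n_excursions=20000, seed=0):
    """Monte Carlo estimate of lambda^r_i = E_r( sum_{n=0}^{T_r - 1} 1{X_n = i} )."""
    rng = random.Random(seed)
    n_states = len(Pi)
    totals = [0.0] * n_states
    for _ in range(n_excursions):
        x = r
        while True:
            totals[x] += 1                                 # time spent in x before returning to r
            x = rng.choices(range(n_states), weights=Pi[x])[0]
            if x == r:
                break
    return [t / n_excursions for t in totals]

Pi = [[1/4, 3/4],
      [1/5, 4/5]]
lam = estimate_lambda(Pi, r=0)
print(lam)   # lambda^0_0 = 1 and lambda^0_1 ~ 15/4 = 3.75; normalised, ~ (4/19, 15/19)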
To continue, we will analyze a fact which we have used in the proof of the result above.
Observation 6.1.9. Consider a Markov chain that starts in the state r; then we have that \sum_{n=0}^{T_r - 1} \mathbf{1}_{\{X_n = r\}} = 1, and this implies that
\[ \lambda^r_r = E_r \left( \sum_{n=0}^{T_r - 1} \mathbf{1}_{\{X_n = r\}} \right) = E_r(1) = 1. \]
Now, having analyzed the existence of the invariant measure, we study its uniqueness. In the following result we see that uniqueness holds up to multiplicative constants.
Theorem 6.1.10. Given a stochastic matrix Π which is irreducible and all of whose states are recurrent, the invariant measure µ for Π is unique up to multiplication by a constant.
To continue, applying the same procedure to µ_{i_0} we obtain the following equality
\begin{align*}
\mu_j &= \sum_{i_0 \ne r} \left( \sum_{i_1 \ne r} \mu_{i_1}\, p_{i_1,i_0} + \mu_r\, p_{r,i_0} \right) p_{i_0,j} + \mu_r\, p_{r,j} \\
&= \sum_{i_0, i_1 \ne r} \mu_{i_1}\, p_{i_1,i_0}\, p_{i_0,j} + \mu_r \left( \sum_{i_0 \ne r} p_{r,i_0}\, p_{i_0,j} + p_{r,j} \right),
\end{align*}
and now, once the previous equalities are established, we repeat this argument n − 1 times, and then for all n ∈ N we obtain that
\[ \mu_j = \sum_{i_0, \dots, i_n \ne r} \mu_{i_n}\, p_{i_n,i_{n-1}} \cdots p_{i_0,j} + \mu_r \left( \sum_{i_0, \dots, i_{n-1} \ne r} p_{r,i_{n-1}} \cdots p_{i_0,j} + \cdots + \sum_{i_0 \ne r} p_{r,i_0}\, p_{i_0,j} + p_{r,j} \right).
\]
In this case, for each j ∈ I, we get the relation µ_j ≥ µ_r λ^r_j, which implies that µ ≥ µ_r λ^r.
In addition, if the irreducible Markov chain has all its states recurrent, then, by Theorem 6.1.8, λ^r is invariant, and hence α = µ − µ_r λ^r is also invariant and α ≥ 0. Since Π is irreducible, given i ∈ I we can go from state i to state r in some number n of steps, thus p^(n)_{i,r} > 0. Using this we have 0 = µ_r − µ_r λ^r_r = α_r = \sum_{j∈I} α_j p^(n)_{j,r} ≥ α_i p^(n)_{i,r}, and hence α_i = 0. This implies that the vector α = (α_l ; l ∈ I) is null, because all its components are null, so we get that 0 = α = µ − µ_r λ^r, which is the same as µ = µ_r λ^r, where µ_r is a constant; therefore, the invariant measure is unique up to a constant.
With the results we have analyzed, we have studied the existence and uniqueness of the invariant measure. Now, we introduce a practical example in which we analyze the concepts studied in the last two theorems.
Example 6.1.11. Consider the example of the random walk on Z which we studied in Section 4.4. In this case, we consider the symmetric random walk, that is, p = 1/2 = q, so the stochastic matrix is
\[
\Pi = \begin{pmatrix}
 & \ddots & \ddots & \ddots & & & \\
\cdots & \tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 & 0 & \cdots \\
\cdots & 0 & \tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 & \cdots \\
\cdots & 0 & 0 & \tfrac{1}{2} & 0 & \tfrac{1}{2} & \cdots \\
 & & & \ddots & \ddots & \ddots &
\end{pmatrix}
\]
In this case, the Markov chain is irreducible and all its states are recurrent. Now, we have to check that the measure µ = (µ_i ; i ∈ I) is invariant, so it has to satisfy the equality µΠ = µ, which is equivalent to µ_i = \tfrac{1}{2}µ_{i−1} + \tfrac{1}{2}µ_{i+1} for all i ∈ I. This equality holds if µ_i = 1 for all i, and hence the measure µ is invariant.
In the last definition, the variable mi is called mean recurrence time at state i,
which is the expected return time to this state.
To continue, we analyze the relationship between recurrent states and null re-
current or positive recurrent states.
Corollary 6.2.2. Positive recurrent states and null recurrent states are both recur-
rent.
Proof. For a positive recurrent state, we have m_i < ∞, and this means that T_i cannot be ∞ with strictly positive probability. Hence, state i is recurrent. On the other hand, null recurrent states are recurrent by definition.
Now, we show that for a Markov chain with stochastic matrix Π, saying that the chain has positive recurrent states is equivalent to saying that the stochastic matrix Π has an invariant distribution. In our case, the study focuses on irreducible Markov chains.
Theorem 6.2.3. Given an irreducible Markov chain with stochastic matrix Π = (p_{i,j} ; i, j ∈ I), the following properties are equivalent:
(i) some state i ∈ I is positive recurrent;
(ii) all states are positive recurrent;
(iii) Π has an invariant distribution γ.
Moreover, when (iii) holds, we have m_r = 1/γ_r for all r ∈ I.
Proof. Firstly, we prove that (ii) implies (i). In this case, all states are positive recurrent; hence, in particular, some state i is positive recurrent, and this shows (i).
Secondly, we prove that (i) implies (iii). Given r ∈ I a positive recurrent state, then r is recurrent. We know that the chain is irreducible, hence there exists a unique equivalence class and it contains one recurrent state, so all states are recurrent. Theorem 6.1.8 tells us that λ^r is an invariant measure; using this, we construct an invariant distribution. First note that
\begin{align*}
\sum_{i \in I} \lambda^r_i = \sum_{i \in I} E_r \left( \sum_{n=0}^{T_r - 1} \mathbf{1}_{\{X_n = i\}} \right) &= E_r \left( \sum_{n=0}^{T_r - 1} \sum_{i \in I} \mathbf{1}_{\{X_n = i\}} \right) \\
&= E_r \left( \sum_{n=0}^{T_r - 1} 1 \right) = E_r(T_r) = m_r < \infty.
\end{align*}
From this equality we can define a new quantity α_j as α_j = λ^r_j / m_r, where j ∈ I. In this case we have \sum_{j∈I} α_j = 1, and hence α = (α_j ; j ∈ I) is an invariant distribution.
To continue, we prove that (iii) implies (ii). Now, we know the Markov chain is irreducible and the stochastic matrix Π has an invariant distribution γ, so \sum_{i∈I} γ_i = 1. Given a state r ∈ I, we have that γ_r = \sum_{i∈I} γ_i p^(n)_{i,r} > 0 for some n ≥ 1. To continue, we can define an invariant measure µ = (µ_i ; i ∈ I) as µ_i = γ_i / γ_r, and hence we have µ_r = 1. Besides, as we know that the Markov chain is irreducible, by Theorem 6.1.10 we obtain µ ≥ µ_r λ^r = λ^r. Therefore, we have
\[ m_r = \sum_{i \in I} \lambda^r_i \le \sum_{i \in I} \frac{\gamma_i}{\gamma_r} = \frac{1}{\gamma_r} < \infty; \]
with this we obtain that the state r is positive recurrent, which shows (ii).
Finally, we have to prove the equality m_r = 1/γ_r. For this we assume that properties (i), (ii) and (iii) hold, so all states are recurrent. By Theorem 6.1.10 we have µ = µ_r λ^r and, using this, we obtain the equality
\[ m_r = \sum_{i \in I} \lambda^r_i = \sum_{i \in I} \frac{\gamma_i}{\gamma_r} = \frac{1}{\gamma_r}, \]
Now, once we have studied the positive recurrent and null recurrent states, which are both recurrent, we analyze in an example the result we have just studied.
Example 6.2.4. Consider the example of the random walk on Z for the case p = 1/2 = q; we know that it is an irreducible Markov chain. In Example 6.1.11 we saw that there is an invariant measure µ and, by Theorem 6.1.10, any invariant measure is a scalar multiple of µ. Now, we analyze whether the symmetric random walk is null recurrent or positive recurrent. We have \sum_{i∈I} µ_i = ∞, so there can be no invariant distribution and, by Theorem 6.2.3, all states of the walk are null recurrent.
Proof. First, we show the recurrence. For this we assume that the Markov chain is transient, so, for all i, j ∈ I, we get
\[ \sum_{n=1}^{\infty} p^{(n)}_{i,j} = \sum_{n=1}^{\infty} P_i(X_n = j) = \sum_{n=1}^{\infty} E_i(\mathbf{1}_{\{X_n = j\}}) = E_i(N(j)) < \infty, \]
and hence, as the state space is finite, we have \sum_{j∈I} \sum_{n=1}^{\infty} p^(n)_{i,j} < ∞; but the previous sum is equal to
\[ \sum_{n=1}^{\infty} \sum_{j \in I} p^{(n)}_{i,j} = \sum_{n=1}^{\infty} 1 = \infty, \]
7 Ergodic theory
In the previous chapter we introduced the concept of invariant distribution and, among other things, studied its existence. In this chapter, we focus on analyzing the uniqueness of the invariant distribution, which is related to a new concept: ergodicity.
In this chapter, we should take into account a probability space (Ω, F, P ) and a
family of discrete random variables Xn : Ω −→ I where {Xn ; n ≥ 0} is a HMC(ν, Π)
and I is a countable set called state space.
Proof. Firstly, let T be a stopping time and consider an event A ⊆ Ω determined by the random variables X0, ..., XT; then, for all k ≥ 0, A ∩ {T = k} is determined by X0, ..., Xk. With this, we have the following
P ({XT = i0 , XT +1 = i1 , ..., XT +n = in } ∩ A ∩ {T = k} ∩ {XT = j})
= P ({Xk = i0 , Xk+1 = i1 , ..., Xk+n = in } ∩ A ∩ {T = k} ∩ {Xk = j}).
Now, remember the fact that A ∩ {XT = j} is determined by X0 , ..., Xk so, using
Markov property at time k and also considering the conditional probability we get
P (Xk = i0 , ..., Xk+n = in |A ∩ {T = k} ∩ {Xk = j})P (A ∩ {T = k} ∩ {Xk = j})
= P (Xk = i0 , Xk+1 = i1 , ..., Xk+n = in |Xk = j)P (A ∩ {T = k} ∩ {Xk = j})
= pj,i0 pi0 ,i1 · · · pin−1 ,in P (A ∩ {T = k} ∩ {Xk = j}).
The last equality is caused by the fact that {Xn ; n ≥ 0} is a homogeneous Markov
chain with transition probabilities pi,j . To continue, we sum over k = 0, 1, ... in the
equalities we have previously studied, and we obtain
Finally, we divide by P ({T < ∞} ∩ {XT = i}) the equality which we have just
calculated and we get
and this proves that {XT +n ; n ≥ 0} is a homogeneous Markov chain and also it is
independent of the random variables X0 , ..., XT , as we want.
Now, we use the concept analyzed above to introduce a new one: the length of the r-th walk between two passage times T_i. We define it as follows
\[ S^{(r)}_i = \begin{cases} T^{(r)}_i - T^{(r-1)}_i & \text{if } T^{(r-1)}_i < \infty \\ 0 & \text{if } T^{(r-1)}_i = \infty. \end{cases} \]
To carry on, we will study a result which will be useful to prove the ergodic theorem. Before that, we note that, for all r ≥ 1, the random variables T^(r)_i and S^(r)_i are stopping times. This is because {T^(r)_i = m} is equivalent to saying that X_m = i and that previously we have passed r − 1 times through the state i. With this, we note that T^(r)_i depends only on X_0, ..., X_m. Now, we introduce a new result which links the probability of the first passage time with the r-th walk between two passage times.
Proposition 7.1.3. Given r = 2, 3, ..., conditional on T^(r−1)_i < ∞, the r-th walk S^(r)_i is independent of the random variables {X_k ; k ≤ T^(r−1)_i} and we have
\[ P(S^{(r)}_i = m \mid T^{(r-1)}_i < \infty) = P_i(T_i = m). \]
Proof. Firstly, note that to prove this result we use the strong Markov property. In this case, we consider the stopping time T = T^(r−1)_i, on the event {T < ∞}, of the homogeneous Markov chain {Xm ; m ≥ 0}. Then XT = i and, conditional on T < ∞, {XT+m ; m ≥ 0} is a HMC(δ_i, Π) independent of X_0, ..., X_T. Given these properties, S^(r)_i can be seen as the first passage time to state i for the chain {XT+m ; m ≥ 0}, that is, S^(r)_i = inf{m ≥ 1; XT+m = i}.
Therefore, using the properties we have just analyzed, we obtain the equality P(S^(r)_i = m | T^(r−1)_i < ∞) = P_i(T_i = m).
Using the concepts studied before, we can conclude that the non-negative random variables S^(1)_i, S^(2)_i, ... are independent and identically distributed. Now, we continue considering the passage time and introduce a new result which links these random variables with Markov chains whose states are all recurrent. Remember that a recurrent state i is one which satisfies P_i(T_i < ∞) = 1.
Proposition 7.1.4. Consider an irreducible Markov chain whose states are all recurrent. Then, for every state i ∈ I and every initial distribution, P(T_i < ∞) = 1.

Proof. Firstly, conditioning on the initial position of the Markov chain and using the total probability formula, we can rewrite the probability P(T_i < ∞) as follows:

P(T_i < ∞) = Σ_{j∈I} P(X_0 = j) P_j(T_i < ∞).

Hence, it suffices to prove that P_j(T_i < ∞) = 1 for all j ∈ I. Given n ≥ 1 such that p_{i,j}^{(n)} > 0 and using Section 4.2, as all states are recurrent, we have

1 = P_i(X_m = i for some m ≥ n + 1) = Σ_{k∈I} p_{i,k}^{(n)} P_k(T_i < ∞).

We know that Σ_{k∈I} p_{i,k}^{(n)} = 1, so the prior equation forces P_k(T_i < ∞) = 1 for every k with p_{i,k}^{(n)} > 0; in particular, P_j(T_i < ∞) = 1.
Finally, substituting this value in the first equality, we obtain

P(T_i < ∞) = Σ_{j∈I} P(X_0 = j) P_j(T_i < ∞) = Σ_{j∈I} P(X_0 = j) = 1,

as we wanted to prove.
Theorem 7.1.5. Consider {Yn ; n ≥ 1} a sequence of independent and identically distributed random variables, and also non-negative, with E(Y_1) = m. Then

P( lim_{n→+∞} (Y_1 + · · · + Y_n)/n = m ) = 1.
Recall that in chapter 4 we introduced the number of visits of the chain to the state i, denoted by N(i). We now define the number of visits to the state i before the instant n as

N_n(i) = Σ_{r=0}^{n−1} 1_{{X_r = i}},

so the ratio N_n(i)/n is the proportion of time that the chain spends in the state i before the instant n. The following theorem examines the long-run proportion of time spent by a Markov chain in each state; moreover, it is related to the uniqueness of the invariant distribution.
Theorem 7.1.6. (Ergodic theorem) Consider {Xn ; n ≥ 0} an irreducible Markov chain with stochastic matrix Π and initial distribution ν. Then

P( lim_{n→+∞} N_n(i)/n = 1/m_i ) = 1,

where m_i = E_i(T_i) is the expected return time to the state i. In addition, if the Markov chain is positive recurrent, then, given a bounded function f : I → R, we have

P( lim_{n→+∞} (1/n) Σ_{r=0}^{n−1} f(X_r) = Σ_{i∈I} γ_i f(i) ) = 1,

where γ = (γ_i ; i ∈ I) is the invariant distribution of the chain.
Proof. First suppose that the Markov chain is transient. Then, for all i ∈ I, the total number of visits to the state i is finite, so lim_{n→+∞} N_n(i)/n = 0 = 1/m_i with probability 1, since in this case m_i = ∞.
Now, suppose the Markov chain is recurrent and fix a state i. Considering T = T_i, by Proposition 7.1.4 we get P(T < ∞) = 1. Now, by the strong Markov property, {X_{T+n} ; n ≥ 0} is a Markov chain with initial distribution δ_i = (δ_{i,j} ; j ∈ I) and stochastic matrix Π, which is independent of X_0 , ..., X_T . Moreover, the chains {X_{T+n}} and {X_n} spend, in the long run, the same proportion of time at the state i; for this reason, it suffices to consider the case ν = δ_i.
Previously, we defined the length of the r-th walk to the state i, denoted by S_i^{(r)}. We now use this concept to locate the visits to the state i with respect to the instant n. Firstly, we know that the moment of the last visit to the state i before the instant n is T_i^{(N_n(i)−1)} = S_i^{(1)} + · · · + S_i^{(N_n(i)−1)}, and hence we have

S_i^{(1)} + · · · + S_i^{(N_n(i)−1)} ≤ n − 1.
Secondly, we also know that the moment of the first visit to the state i after the instant n − 1 is T_i^{(N_n(i))} = S_i^{(1)} + · · · + S_i^{(N_n(i))}, and now we have

S_i^{(1)} + · · · + S_i^{(N_n(i))} ≥ n.

Finally, using the inequalities which we have just studied, we obtain the following:

(S_i^{(1)} + · · · + S_i^{(N_n(i)−1)}) / N_n(i) ≤ n / N_n(i) ≤ (S_i^{(1)} + · · · + S_i^{(N_n(i))}) / N_n(i).
Now, by Proposition 7.1.3 we get that E_i(S_i^{(r)}) = m_i and, using the strong law of large numbers (Theorem 7.1.5), we have

P( lim_{k→+∞} (S_i^{(1)} + · · · + S_i^{(k)})/k = m_i ) = 1.

In addition, we have considered that the Markov chain is recurrent; therefore, by chapter 4 we have the following equality:

P( lim_{n→+∞} N_n(i) = ∞ ) = 1.
Combining the sandwich inequalities above with these two facts, we obtain

P( lim_{n→+∞} N_n(i)/n = 1/m_i ) = 1,

which proves the first statement. Suppose now that the chain is positive recurrent and let γ = (γ_i ; i ∈ I) be its invariant distribution. We know that the long-run proportion of time spent in the state i is 1/m_i, and this must coincide with the probability γ_i that the chain is in the state i; so γ_i = 1/m_i and hence P( lim_{n→+∞} N_n(i)/n = γ_i ) = 1 for all i ∈ I. To prove the second statement, assume without loss of generality that |f| ≤ 1 and note that (1/n) Σ_{r=0}^{n−1} f(X_r) = Σ_{i∈I} (N_n(i)/n) f(i). Therefore, for any finite subset J ⊆ I,

| (1/n) Σ_{r=0}^{n−1} f(X_r) − Σ_{i∈I} γ_i f(i) | ≤ Σ_{i∈I} | N_n(i)/n − γ_i |
≤ Σ_{i∈J} | N_n(i)/n − γ_i | + Σ_{i∉J} N_n(i)/n + Σ_{i∉J} γ_i
≤ 2 Σ_{i∈J} | N_n(i)/n − γ_i | + 2 Σ_{i∉J} γ_i,

where the last step uses that Σ_{i∈I} N_n(i)/n = 1 = Σ_{i∈I} γ_i.
Now, let ε > 0. Since Σ_{i∈I} γ_i = 1, we can choose a finite subset J ⊆ I such that Σ_{i∉J} γ_i < ε/4. Secondly, since J is finite and N_n(i)/n → γ_i for every i, with probability 1 there exists N ≥ 1 such that, for all n ≥ N, Σ_{i∈J} | N_n(i)/n − γ_i | < ε/4. Then, for n ≥ N and using the chain of inequalities we have just established, we obtain

| (1/n) Σ_{r=0}^{n−1} f(X_r) − Σ_{i∈I} γ_i f(i) | < ε,

and this shows the equality we want to prove, because the required convergence is satisfied with probability 1.
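As an informal numerical illustration of the ergodic theorem (a minimal simulation sketch, external to the proof, which assumes the numpy library and an arbitrary irreducible test matrix), one can check that the occupation frequencies N_n(i)/n approach the invariant distribution:

import numpy as np

rng = np.random.default_rng(1)
# Arbitrary irreducible stochastic matrix on I = {0, 1, 2}, used only as a test case.
P = np.array([[0.0, 0.7, 0.3],
              [0.2, 0.3, 0.5],
              [0.5, 0.25, 0.25]])

n, x = 200_000, 0
counts = np.zeros(3)                     # N_n(i) for each state i
for _ in range(n):
    counts[x] += 1
    x = rng.choice(3, p=P[x])

# The invariant distribution gamma solves gamma P = gamma with sum(gamma) = 1.
A = np.vstack([P.T - np.eye(3), np.ones(3)])
b = np.concatenate([np.zeros(3), [1.0]])
gamma, *_ = np.linalg.lstsq(A, b, rcond=None)

print("N_n(i)/n:", counts / n)           # close to gamma = (1/m_i ; i in I)
print("gamma   :", gamma)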
7.2 Finite state space
In the previous definition, we observe that the limit is independent of the state i ∈ I. To continue, ergodicity helps us to study the problem of the uniqueness of the invariant distribution; in the following result we analyze this fact.
Theorem 7.2.2. Let {Xn ; n ≥ 0} be a homogeneous Markov chain with finite state space I and stochastic matrix Π, and suppose that there exists k ≥ 1 such that min_{i,j∈I} p_{i,j}^{(k)} > 0. Then there exists a probability distribution π = (π_j ; j ∈ I) such that, for all i ∈ I,

π_j = lim_{n→+∞} p_{i,j}^{(n)},

and π is an invariant distribution of the chain.
Proof. Firstly, for all n ≥ 1, we denote by α_j^{(n)} and β_j^{(n)} the values min_{i∈I} p_{i,j}^{(n)} and max_{i∈I} p_{i,j}^{(n)} respectively. Now, using the Chapman-Kolmogorov equation and knowing that Σ_{l∈I} p_{i,l} = 1, we get

α_j^{(n+1)} = min_{i∈I} p_{i,j}^{(n+1)} = min_{i∈I} Σ_{l∈I} p_{i,l} p_{l,j}^{(n)} ≥ min_{i∈I} Σ_{l∈I} p_{i,l} ( min_{l∈I} p_{l,j}^{(n)} ) = min_{i∈I} ( Σ_{l∈I} p_{i,l} ) α_j^{(n)} = α_j^{(n)},

and with this we get that {α_j^{(n)}}_{n≥1} is an increasing sequence. Then, as 0 ≤ p_{i,j}^{(n)} ≤ 1, this sequence is bounded; therefore it has a limit, which we denote by α_j.
Repeating the same argument for the maximum, we obtain

β_j^{(n+1)} = max_{i∈I} Σ_{l∈I} p_{i,l} p_{l,j}^{(n)} ≤ max_{i∈I} Σ_{l∈I} p_{i,l} ( max_{l∈I} p_{l,j}^{(n)} ) = max_{i∈I} ( Σ_{l∈I} p_{i,l} ) β_j^{(n)} = β_j^{(n)},

and now the sequence {β_j^{(n)}}_{n≥1} is decreasing. Then, as 0 ≤ p_{i,j}^{(n)} ≤ 1, this sequence is bounded; therefore it has a limit, which we denote by β_j.
For the moment, by definition we know that α_j^{(n)} ≤ p_{i,j}^{(n)} ≤ β_j^{(n)}, so if α_j = β_j then, for all j = 1, ..., N, the equality lim_{n→+∞} p_{i,j}^{(n)} = π_j is satisfied, where π_j is equal to α_j and also to β_j. Therefore, to show the desired result, it is necessary to prove, for all j = 1, ..., N, that

lim_{n→+∞} ( β_j^{(n)} − α_j^{(n)} ) = 0,

and then we have that β_j = α_j. In the following lines, we analyze this limit.
Firstly, we define θ as θ = min_{i,j∈I} p_{i,j}^{(k)} > 0. Then we have

p_{i,j}^{(k+n)} = Σ_{l∈I} p_{i,l}^{(k)} p_{l,j}^{(n)} = Σ_{l∈I} ( p_{i,l}^{(k)} − θ p_{j,l}^{(n)} ) p_{l,j}^{(n)} + θ Σ_{l∈I} p_{j,l}^{(n)} p_{l,j}^{(n)} = Σ_{l∈I} ( p_{i,l}^{(k)} − θ p_{j,l}^{(n)} ) p_{l,j}^{(n)} + θ p_{j,j}^{(2n)}.

To continue, as p_{i,l}^{(k)} ≥ θ and we know that p_{j,l}^{(n)} ≤ 1, we have the relationship p_{i,l}^{(k)} − θ p_{j,l}^{(n)} ≥ 0. Therefore, using this, we obtain

p_{i,j}^{(k+n)} ≥ Σ_{l∈I} ( p_{i,l}^{(k)} − θ p_{j,l}^{(n)} ) ( min_{l∈I} p_{l,j}^{(n)} ) + θ p_{j,j}^{(2n)} = Σ_{l∈I} p_{i,l}^{(k)} α_j^{(n)} − θ Σ_{l∈I} p_{j,l}^{(n)} α_j^{(n)} + θ p_{j,j}^{(2n)} = (1 − θ) α_j^{(n)} + θ p_{j,j}^{(2n)},

where in these equalities we use that Σ_{l∈I} p_{i,l}^{(k)} = 1 and Σ_{l∈I} p_{j,l}^{(n)} = 1. Hence, taking the minimum over i in what we have just calculated, we have α_j^{(k+n)} ≥ (1 − θ) α_j^{(n)} + θ p_{j,j}^{(2n)}.
Using the same argument, we get β_j^{(k+n)} ≤ (1 − θ) β_j^{(n)} + θ p_{j,j}^{(2n)}.
Now, combining the two inequalities which we have just calculated, we have β_j^{(k+n)} − α_j^{(k+n)} ≤ ( β_j^{(n)} − α_j^{(n)} )(1 − θ). Then, by induction we obtain, for all r ≥ 1, the following:

0 ≤ β_j^{(rk+n)} − α_j^{(rk+n)} ≤ ( β_j^{(n)} − α_j^{(n)} )(1 − θ)^r,

and with this we get that lim_{r→+∞} ( β_j^{(n)} − α_j^{(n)} )(1 − θ)^r = 0, because θ > 0. Therefore, there exists a sequence {n_r}_{r≥1} which satisfies

lim_{r→+∞} ( β_j^{(n_r)} − α_j^{(n_r)} ) = 0.

Hence, since in the sequence {β_j^{(n)} − α_j^{(n)}} each term is less than or equal to the previous one, we get lim_{n→+∞} ( β_j^{(n)} − α_j^{(n)} ) = 0, as we want, and this implies that β_j = α_j, which is the desired equality.
Finally, we check that π is an invariant distribution. Firstly, we need to see that π is a non-degenerate probability distribution; for this, suppose that n ≥ k, so that α_j^{(n)} ≥ α_j^{(k)} ≥ min_{i,j∈I} p_{i,j}^{(k)} = θ > 0. Then we get

π_j = α_j = lim_{n→+∞} α_j^{(n)} ≥ θ > 0   for all j ∈ I.

Moreover, since I is finite, Σ_{j∈I} π_j = lim_{n→+∞} Σ_{j∈I} p_{i,j}^{(n)} = 1, so π is a probability distribution. Finally, letting n → +∞ in the Chapman-Kolmogorov identity p_{i,j}^{(n+1)} = Σ_{l∈I} p_{i,l}^{(n)} p_{l,j}, we obtain π_j = Σ_{l∈I} π_l p_{l,j}, that is, πΠ = π, so π is indeed an invariant distribution.
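The convergence used in this proof can be observed numerically; the following minimal sketch (an informal check, assuming the numpy library and an arbitrary positive test matrix, so that the hypothesis holds with k = 1) computes α_j^{(n)} and β_j^{(n)} for increasing n and shows them approaching a common limit:

import numpy as np

# Arbitrary stochastic matrix with all entries positive (hypothesis with k = 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])

Pn = np.eye(3)
for n in range(1, 21):
    Pn = Pn @ P                          # Pn = P^(n)
    alpha = Pn.min(axis=0)               # alpha_j^(n) = min_i p_ij^(n), increasing in n
    beta = Pn.max(axis=0)                # beta_j^(n)  = max_i p_ij^(n), decreasing in n
    if n % 5 == 0:
        print(n, alpha.round(6), beta.round(6))
# alpha and beta squeeze together, so every row of P^(n) converges to the same vector pi.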
To continue, once studied the ergodic theorem for finite state space, we show that the converse implication also holds: if, for all i, j ∈ I, the limit lim_{n→+∞} p_{i,j}^{(n)} = π_j exists with π_j > 0, then there exists k ≥ 1 such that min_{i,j∈I} p_{i,j}^{(k)} > 0.

Proof. Firstly, given the equality lim_{n→+∞} p_{i,j}^{(n)} = π_j > 0 and since I is finite, for each j = 1, ..., N there exists n_j ≥ 1 such that p_{i,j}^{(n)} > 0 for all n ≥ n_j and all i ∈ I. Therefore, for all j = 1, ..., N and all n ≥ n_j, min_{i∈I} p_{i,j}^{(n)} > 0. Now, we define k = max(n_1, ..., n_N) and then, for all j = 1, ..., N, we get min_{i∈I} p_{i,j}^{(k)} > 0, hence min_{i,j∈I} p_{i,j}^{(k)} > 0, and this is what we wanted to prove.
The ergodicity of Markov chains on an arbitrary finite state space can be char-
acterized by the following notion.
Definition 7.2.5. A Markov chain with finite state space is regular if there exists
a power of the transition matrix, Π, with only positive entries.
Example 7.2.6. Consider the Markov chain with state space I = {1, 2, 3} and transition matrix

Π =
( 1/2  1/2   0  )
( 1/5  1/5  3/5 )
(  1    0    0  )

To continue, we will study if the Markov chain is regular or not. For this, we are going to compute the successive powers of the stochastic matrix in order to check if there exists a matrix where all of its entries are non-null. Firstly, we calculate Π^(2) and we get

Π^(2) =
( 7/20   7/20   3/10 )
( 37/50  7/50   3/25 )
( 1/2    1/2     0   )

with this we have an entry which is null, therefore, we have to calculate Π^(3), which is the following

Π^(3) =
( 109/200  49/200   21/100 )
( 259/500  199/500  21/250 )
( 7/20     7/20     3/10   )

Now, all entries of the matrix are positive, therefore, this implies that the Markov chain is regular.
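The same check can be carried out numerically; this small sketch (an informal illustration assuming the numpy library, with the matrix of the example entered in decimal form) computes successive powers of Π and stops at the first one with strictly positive entries:

import numpy as np

# Transition matrix of Example 7.2.6 in decimal form.
Pi = np.array([[0.5, 0.5, 0.0],
               [0.2, 0.2, 0.6],
               [1.0, 0.0, 0.0]])

power = Pi.copy()
for n in range(2, 6):
    power = power @ Pi                       # power = Pi^(n)
    if (power > 0).all():
        print(f"Pi^{n} has only positive entries, so the chain is regular")
        break
    print(f"Pi^{n} still contains a zero entry")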
To continue, using the concept we have just introduced, regularity, we prove that the invariant distribution π of Theorem 7.2.2 is unique.
Proof. Firstly, since the stochastic matrix is regular, there exists k ≥ 1 such that p_{i,j}^{(k)} > 0 for all i, j ∈ I, and hence, by Theorem 7.2.2, we have the equality lim_{n→+∞} p_{i,j}^{(n)} = π_j. To continue, let α = (α_k ; k ∈ I) be the column vector with all its components equal to 1; then we can define the matrix Q = (q_{k,j} ; k, j ∈ I) as Q = απ, in other words, each row of Q is the same and it is the non-degenerate probability distribution π. Once this is defined, we have that lim_{n→+∞} Π^(n) = Q.
Now, we show that the invariant distribution π is unique. For this, we suppose that there exists another invariant distribution β = (β_i ; i ∈ I); therefore, it has to satisfy the equalities βΠ = β and βΠ^(n) = β for all n ≥ 1. But, as β is an invariant distribution, βα = Σ_{i∈I} β_i = 1, which implies that

β = lim_{n→+∞} βΠ^(n) = βQ = β(απ) = (βα)π = π,

and hence β = π, and this shows that the invariant distribution π = (π_j ; j ∈ I) is unique.
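As a numerical companion to this uniqueness argument (a minimal sketch, assuming numpy and using the matrix of Example 7.2.6 as test data), one can compare the rows of a high power of Π with the normalized left eigenvector of eigenvalue 1:

import numpy as np

# Transition matrix of Example 7.2.6 (a regular chain).
Pi = np.array([[0.5, 0.5, 0.0],
               [0.2, 0.2, 0.6],
               [1.0, 0.0, 0.0]])

# Every row of Pi^n converges to the same vector pi.
Q = np.linalg.matrix_power(Pi, 60)
print("rows of Pi^60:")
print(Q.round(6))

# Independently, pi is the normalized left eigenvector associated with eigenvalue 1.
w, v = np.linalg.eig(Pi.T)
k = np.argmin(np.abs(w - 1.0))
pi = np.real(v[:, k])
pi = pi / pi.sum()
print("invariant distribution pi:", pi.round(6))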
8 Conclusions
This project has a theoretical approach, so it is difficult to draw specific conclusions or results. However, the research and analytical work carried out constitute a good framework for developing future studies that look for practical conclusions using the basics of Markov chains.
From my point of view, this thesis has been useful as a way to apply several of the probability concepts we have learnt along the degree. For example, we have used the concepts of conditional probability and independence in the study of the Chapman-Kolmogorov equation.
Furthermore, I have been able to learn the basics of Markov chains, as well as their different properties and characteristics, an aspect which was unknown to me until I carried out this project. However, I have not only learnt specific mathematical aspects: this project has been a great challenge for me and has allowed me to improve my skills related to researching information and presenting ideas in a clear and accurate manner.
Finally, I want to point out that the project has focused only on discrete time Markov chains. Nevertheless, there exists another type of chains, continuous time Markov chains, which have not been included in the scope of this thesis because I preferred to focus the analysis on a specific topic. In this regard, it could be interesting to expand the scope of this study and to carry out a similar analysis focused on continuous time chains.
A Basic probability concepts
In this section we will present the most relevant concepts of probability theory, which will be very useful for the study of Markov chains.
In this appendix we focus on probability on countable sample spaces. For this, we will study the concepts of conditional probability and independence. Moreover, we will focus on random variables, specifically discrete random variables, since they are the ones that allow us to define the concept of Markov chain.
Once given the last definition, we have the following result.
Corollary A.1.3. Assume that Ω = {w_i ; i ∈ I} where I is finite, so I = {1, ..., n}, and consider that the sample space is equiprobable. Then the probabilities {p_1, ..., p_n} take the values p_i = 1/n for all i = 1, ..., n. Moreover, the probability of any event A ⊆ Ω is given by

P(A) = Σ_{i : w_i ∈ A} p_i = Σ_{i : w_i ∈ A} 1/n = (1/#Ω) Σ_{i : w_i ∈ A} 1 = #A / #Ω.
1. P(∅) = 0, P(Ω) = 1.
2. For all A ∈ F, P(A) ∈ [0, 1].
3. Assume that A, B ∈ F are disjoint; then P(A ∪ B) = P(A) + P(B). This property can be stated more generally: if we consider A_1, A_2, ..., A_n mutually disjoint, in this case we have

P( ∪_{i=1}^{n} A_i ) = Σ_{i=1}^{n} P(A_i).

The following results are three important formulas related to the probability defined above.
2. Theorem (Total probabilities) Given the partition B_1, ..., B_n of Ω ({B_i ; 1 ≤ i ≤ n} ⊆ F) with P(B_i) > 0, ∀i = 1, ..., n. Then, for all A ∈ F (event) we have

P(A) = Σ_{i=1}^{n} P(A ∩ B_i) = Σ_{i=1}^{n} P(A|B_i) P(B_i).
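A quick numerical check of this formula (a toy illustration with made-up probabilities, not taken from the text) can be written in a few lines of Python:

# Made-up partition B1, B2, B3 with P(Bi) and conditional probabilities P(A|Bi).
P_B = [0.5, 0.3, 0.2]
P_A_given_B = [0.1, 0.4, 0.7]

# Total probability: P(A) = sum_i P(A|Bi) P(Bi).
P_A = sum(pa * pb for pa, pb in zip(P_A_given_B, P_B))
print(P_A)          # 0.05 + 0.12 + 0.14 = 0.31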
After that, we study the concept of independence and some of its properties.
Definition A.2.2. Two events A, B ∈ F are independent if
P(A ∩ B) = P(A)P(B).
Once given this specific definition of independence, we can consider the general
one.
Definition A.2.3. The events A_1, ..., A_n ∈ F are independent if every finite subset {A_{i_1}, ..., A_{i_k}} ⊆ {A_1, ..., A_n} satisfies

P(A_{i_1} ∩ ... ∩ A_{i_k}) = P(A_{i_1}) · · · P(A_{i_k}).
Now we want to present a result that links the two concepts we have studied so far.
Proposition A.2.4. Let A, B ∈ F be events whose probabilities are strictly positive; then
Definition A.2.5. Consider the events A1 , A2 , A3 ∈ F such that P (A3 ) > 0, then
A1 and A2 are conditionally independent given A3 if
P (A1 ∩ A2 |A3 ) = P (A1 |A3 )P (A2 |A3 ).
Definition A.2.6. Given the events {A1 , A2 , A3 }, these form a Markov family if
P (A3 |A1 ∩ A2 ) = P (A3 |A2 ).
Observation A.2.7. If we consider A_1, A_2, A_3 as a chronological sequence where A_1 is the past, A_2 is the present and A_3 is the future, then P(A_3|A_1 ∩ A_2) = P(A_3|A_2) shows that the future only depends on the present (not on the past).
Proposition A.2.8. Let A_1, A_2, A_3 ∈ F be events. Then A_1 and A_3 are conditionally independent given A_2 if, and only if, {A_1, A_2, A_3} is a Markov family (in other words, P(A_3|A_1 ∩ A_2) = P(A_3|A_2)).
Observation A.2.9. The last proposition can be summarized with the following
P (A1 ∩ A3 |A2 ) = P (A1 |A2 )P (A3 |A2 ) ⇔ P (A3 |A1 ∩ A2 ) = P (A3 |A2 ).
Proof. Firstly, we show the right implication. Given A_1, A_2, A_3 ∈ F and using the equality P(A_1 ∩ A_3|A_2) = P(A_1|A_2)P(A_3|A_2), we have

P(A_3|A_1 ∩ A_2) = P(A_1 ∩ A_2 ∩ A_3) / P(A_1 ∩ A_2) = P(A_1 ∩ A_3|A_2)P(A_2) / P(A_1 ∩ A_2)
= P(A_1|A_2)P(A_3|A_2)P(A_2) / ( P(A_1|A_2)P(A_2) ) = P(A_3|A_2).

Now we prove the left implication. Let A_1, A_2, A_3 ∈ F and assume P(A_3|A_1 ∩ A_2) = P(A_3|A_2); then

P(A_1 ∩ A_3|A_2) = P(A_1 ∩ A_2 ∩ A_3) / P(A_2) = P(A_3|A_1 ∩ A_2)P(A_1 ∩ A_2) / P(A_2)
= P(A_3|A_2)P(A_1|A_2)P(A_2) / P(A_2) = P(A_3|A_2)P(A_1|A_2).
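This equivalence can also be checked numerically on a toy example; the following sketch (an informal illustration with a made-up joint law built so that the three events form a Markov family) computes both conditional probabilities and shows they coincide:

import itertools

# Joint law of three binary variables built as p(x1) p(x2|x1) p(x3|x2),
# so the events A_k = {x_k = 1} form a Markov family (made-up numbers).
p1 = {0: 0.6, 1: 0.4}
p2 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # p2[x1][x2]
p3 = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.1, 1: 0.9}}   # p3[x2][x3]

joint = {(a, b, c): p1[a] * p2[a][b] * p3[b][c]
         for a, b, c in itertools.product((0, 1), repeat=3)}

def prob(pred):
    return sum(p for x, p in joint.items() if pred(x))

P_A3_given_A1A2 = prob(lambda x: x[0] == 1 and x[1] == 1 and x[2] == 1) / \
                  prob(lambda x: x[0] == 1 and x[1] == 1)
P_A3_given_A2 = prob(lambda x: x[1] == 1 and x[2] == 1) / prob(lambda x: x[1] == 1)
print(P_A3_given_A1A2, P_A3_given_A2)   # equal: the Markov family property holds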
Observation A.3.2. The Borel σ-algebra is the σ-algebra generated by the collection of open (equivalently, closed) sets of R.
Definition A.3.3. The law (or distribution) of a random variable X is the probability P ◦ X^{-1} such that (P ◦ X^{-1})(B) = P(X^{-1}(B)) = P(X ∈ B) for every Borel set B ⊆ R.
Definition A.3.5. A random variable X is discrete if the set X(Ω) is finite or countable; in this case we can write X : Ω −→ I where I is finite or countable, and we can express the values of the discrete random variable as {a_i ; i ∈ I}.
Definition A.3.6. The law (or distribution) of a discrete random variable is called probability mass function and it is defined by

p(a_i) = P(X = a_i), for i ∈ I,   with   Σ_{i∈I} p(a_i) = Σ_{i∈I} P(X = a_i) = 1.
To conclude this section, we talk about the expected value focusing on discrete
random variables.
B Examples
Example B.0.1. Consider a coin repeatedly tossed (p is the probability of heads and q = 1 − p that of tails). Let H_n and T_n be the number of heads and tails in the first n tosses; prove that X_n = (−1)^{2T_n + H_n} is a homogeneous Markov chain.
We define, for every k = 1, ..., n, the random variables

ξ_k = 1 if the k-th toss is heads,   ξ_k = 0 if the k-th toss is tails,

which are independent, so we can express the number of heads and tails in the following manner:

H_n = Σ_{k=1}^{n} ξ_k   and   T_n = n − Σ_{k=1}^{n} ξ_k.

This implies that

X_n = (−1)^{2T_n + H_n} = (−1)^{2(n − Σ_{k=1}^{n} ξ_k) + Σ_{k=1}^{n} ξ_k} = (−1)^{2n − Σ_{k=1}^{n} ξ_k}

and X_{n+1} = (−1)^{2(n+1) − Σ_{k=1}^{n+1} ξ_k} = X_n (−1)^{2 − ξ_{n+1}} = X_n (−1)^{−ξ_{n+1}}. Now, using conditional probability and independence, we see that X_n is a homogeneous Markov chain: for all i, j ∈ I we have

P(X_{n+1} = j | X_n = i) = P(X_n (−1)^{−ξ_{n+1}} = j | X_n = i) = P({(−1)^{−ξ_{n+1}} = j/i} ∩ {X_n = i}) / P(X_n = i)
= P((−1)^{−ξ_{n+1}} = j/i) P(X_n = i) / P(X_n = i) = P((−1)^{−ξ_{n+1}} = j/i) := p_{i,j}.
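As an informal check of this example (a minimal simulation sketch assuming the numpy library, with an arbitrary value of p), one can simulate the tosses and estimate the transition probabilities of X_n directly from a trajectory:

import numpy as np

rng = np.random.default_rng(2)
p, n = 0.3, 200_000                   # arbitrary probability of heads and sample size
heads = rng.random(n) < p             # xi_k = 1 for heads, 0 for tails
H = np.cumsum(heads)                  # H_n, number of heads so far
T = np.arange(1, n + 1) - H           # T_n, number of tails so far
X = (-1) ** (2 * T + H)               # the chain X_n

# Estimate P(X_{n+1} = j | X_n = i) from the trajectory.
for i in (1, -1):
    idx = np.where(X[:-1] == i)[0]
    for j in (1, -1):
        print(f"P(X_n+1 = {j:2d} | X_n = {i:2d}) ~", round((X[idx + 1] == j).mean(), 4))
# The sign flips exactly when a head occurs, so the estimates are close to p and 1 - p.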
Example B.0.4. A fisherman and his boats. A fisherman owns 4 boats that he rents to tourists. Each rented boat can be damaged with probability p, independently of the condition of the other boats. The fisherman can repair his boats at night, but only one each night. Because of municipal rules, he cannot rent any boat if there are fewer than 2 boats available. If one day he can rent his boats, he will offer all the boats available. Study the Markov chain associated to this problem.
Given p, the probability that a boat is not damaged is q = 1 − p, so the stochastic matrix is

Π =
(  0      1       0        0          )
( p^2    2pq     q^2       0          )
( p^3   3p^2q   3pq^2     q^3         )
( p^4   4p^3q   6p^2q^2   4pq^3 + q^4 )
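A short sketch can rebuild this matrix from binomial probabilities; it is only an illustration under one reading of the rules that reproduces the matrix above (the state is taken to be the number of boats in working order, 1 to 4, and exactly one damaged boat is repaired each night whenever there is one):

from math import comb

def boat_matrix(p):
    """4x4 matrix; state k = boats in working order at the start of the day (1..4)."""
    q = 1 - p
    P = [[0.0] * 4 for _ in range(4)]
    P[0][1] = 1.0                        # with 1 boat he cannot rent; one repair -> 2 boats
    for k in (2, 3, 4):                  # k boats rented, D ~ Binomial(k, p) get damaged
        for d in range(k + 1):
            damaged_total = (4 - k) + d
            nxt = k - d + (1 if damaged_total >= 1 else 0)   # one repair at night
            P[k - 1][nxt - 1] += comb(k, d) * p ** d * q ** (k - d)
    return P

for row in boat_matrix(0.3):
    print([round(x, 4) for x in row])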
Example B.0.5. Consider the Markov chain with state space I = {1, 2} and transition matrix

Π =
( 1 − a     a   )
(   b     1 − b )

where a, b > 0 and a + b > 0. Now, we want to compute the m-step transition probability between every pair of states. For this, it is necessary to compute the m-th power of the stochastic matrix, Π^(m). Firstly, we have to calculate the eigenvalues of Π; for this, we solve the characteristic equation

det(Π − λ Id) = 0  ⇒  (1 − a − λ)(1 − b − λ) − ab = 0  ⇒  λ^2 − (2 − a − b)λ + (1 − a − b) = 0,

thus, solving the equation we obtain the eigenvalues 1 and 1 − a − b, whose associated eigenvectors are (1, 1) and (a, −b), respectively. With this, we can define the following matrices:

Q =
( 1    a )
( 1   −b )

Q^{-1} = (1/(a + b)) ·
( b    a )
( 1   −1 )

D =
( 1         0      )
( 0     1 − a − b  )

Now, we can rewrite the transition matrix Π as follows: Π = Q D Q^{-1}, and hence this implies that Π^(m) = Q D^(m) Q^{-1}, so we get

Π^(m) = (1/(a + b)) ·
( b   a )
( b   a )
+ ((1 − a − b)^m / (a + b)) ·
(  a   −a )
( −b    b )

with this we have that each matrix entry is the m-step transition probability of going from one state to another.
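The closed form just obtained can be verified numerically; the following minimal sketch (assuming numpy, with arbitrary test values of a and b) compares it with a directly computed matrix power:

import numpy as np

a, b = 0.3, 0.1                              # arbitrary test values with a, b > 0
Pi = np.array([[1 - a, a],
               [b, 1 - b]])

m = 7
closed_form = (1 / (a + b)) * np.array([[b, a], [b, a]]) \
    + ((1 - a - b) ** m / (a + b)) * np.array([[a, -a], [-b, b]])

print(np.linalg.matrix_power(Pi, m).round(6))
print(closed_form.round(6))
print(np.allclose(np.linalg.matrix_power(Pi, m), closed_form))   # True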
Example B.0.6. Guardian of the tower. A guardian watches from the four corners of a tower in the following manner: after staying 5 minutes in a corner he tosses a coin to decide whether he moves to the left (heads) or to the right (tails). This process is repeated indefinitely. Study the Markov chain associated to the process. If p = 1/2 (p is the probability of obtaining heads), compute Π^n.
Given p = 1/2, the probability of obtaining tails is q = 1 − p = 1/2, so the stochastic matrix is

Π =
(  0    1/2    0    1/2 )
( 1/2    0    1/2    0  )
(  0    1/2    0    1/2 )
( 1/2    0    1/2    0  )
In this case, to obtain the matrix Π^(n) we have to consider two cases. Firstly, if n is an even number, then

Π^(n) =
( 1/2    0    1/2    0  )
(  0    1/2    0    1/2 )
( 1/2    0    1/2    0  )
(  0    1/2    0    1/2 )

Secondly, if n is an odd number, then Π^(n) = Π, that is,

Π^(n) =
(  0    1/2    0    1/2 )
( 1/2    0    1/2    0  )
(  0    1/2    0    1/2 )
( 1/2    0    1/2    0  )
(a) Suppose that the starting corner is selected randomly. If X_n is the random variable which indicates where the guard is located after 5n minutes, compute the law of X_n.
In this case, as the starting corner is selected randomly, we have P(X_0 = i) = 1/4 for every i ∈ {1, 2, 3, 4}, so the initial distribution is the vector

ν = (1/4, 1/4, 1/4, 1/4).

Firstly, if n is an even number, the law of X_n is

ν Π^(n) = (1/4, 1/4, 1/4, 1/4) Π^(n) = (1/4, 1/4, 1/4, 1/4).
Secondly, if n is an odd number, then the law of X_n is

ν Π^(n) = (1/4, 1/4, 1/4, 1/4) Π = (1/4, 1/4, 1/4, 1/4),

so for every n the law of X_n is the uniform distribution on the four corners.
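The alternating powers and the invariance of the uniform law can be checked with a few lines of code (an informal sketch assuming numpy):

import numpy as np

Pi = 0.5 * np.array([[0, 1, 0, 1],
                     [1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [1, 0, 1, 0]])

print(np.linalg.matrix_power(Pi, 2))         # the matrix obtained for even n
print(np.linalg.matrix_power(Pi, 3))         # odd powers give back Pi itself
nu = np.full(4, 0.25)
print(nu @ np.linalg.matrix_power(Pi, 5))    # the law of X_n stays uniform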
Example B.0.7. Consider the Markov chain with the state space I = {1, 2, 3, 4, 5} and transition matrix

Π =
( 1/2   1/2    0     0     0  )
( 1/3   1/3   1/3    0     0  )
(  0     0     1     0     0  )
(  0     0    1/2   1/2    0  )
( 1/2    0     0     0    1/2 )
The communicating classes are C1 = {1, 2}, C2 = {3}, C3 = {4} and C4 = {5}. The class C1 is recurrent because once we reach it we cannot leave it, and it is finite and irreducible. The state of the class C2 is recurrent, in particular it is an absorbing state, because once the chain is in state 3 it stays there forever. The state of the class C3 is transient because, if we start in state 4, we are able to move to state 3 and once there we cannot return back to state 4. The state of the class C4 is transient because, if we start in state 5, we are able to move to state 1 and once there we cannot return back to state 5.
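These classes can also be obtained mechanically; the sketch below (an informal illustration assuming numpy) computes mutual reachability from the matrix of the example and groups the states accordingly:

import numpy as np

# Transition matrix of Example B.0.7 (states 1..5 stored as indices 0..4).
P = np.array([[1/2, 1/2, 0,   0,   0  ],
              [1/3, 1/3, 1/3, 0,   0  ],
              [0,   0,   1,   0,   0  ],
              [0,   0,   1/2, 1/2, 0  ],
              [1/2, 0,   0,   0,   1/2]])

# i leads to j iff (I + P)^m has a positive (i, j) entry for m large enough.
n = len(P)
reach = (np.eye(n) + P) > 0
for _ in range(n):
    reach = (reach.astype(float) @ reach.astype(float)) > 0   # double the path length

classes = {}
for i in range(n):
    key = tuple(j for j in range(n) if reach[i, j] and reach[j, i])
    classes.setdefault(key, []).append(i + 1)
print(list(classes.values()))            # [[1, 2], [3], [4], [5]]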
Example B.0.8. Consider the Markov chain with the state space I = {1, 2, 3, 4, 5} and transition matrix

Π =
(  0     0     0     1     0  )
(  0    1/2    0    1/2    0  )
( 1/3   1/3    0     0    1/3 )
(  0     0     1     0     0  )
(  0     0    1/2    0    1/2 )
C1 = {1, 2, 3, 4, 5}.
In this case we have a unique class, C1, which contains all the states of the Markov chain. Moreover, from the transition matrix it is easy to see that all the states communicate, so the chain is irreducible; in particular, the class C1 is recurrent because we can visit every state infinitely many times.
Example B.0.9. Consider the Markov chain with the state space I = {1, 2, 3, 4} and transition matrix

Π =
(  0    1/2   1/2    0  )
(  0     0    1/2   1/2 )
(  0    1/2    0    1/2 )
(  0    1/2   1/2    0  )
In this case, we have to study whether the states of the Markov chain are essential or inessential. First, we note that from state 1 we can leave but we cannot return to it, so this state is inessential. Now, from state 2 we can go to state 3 and once there we can return to state 2 or go to state 4; if we go to state 4, from there we can return to state 2. With this we have that state 2 is essential. The same argument is valid for states 3 and 4, which are also essential.