
Markov Chains

Kazi Saeed Alam


Lecturer, Dept. of CSE, KUET
Courtesy: Prof. Dr. Sk. Md. Masudul Ahsan
Stochastic Process
 Many real-world systems contain uncertainty and evolve over time.
 Weather, stock market, etc.

 A stochastic process is simply a random process through time.


 A good way to think about it: it is the opposite of a deterministic process.

 In a deterministic process, given the initial conditions and the


parameters of the system, we can define the exact "position" of the
system at any time. In a stochastic process, we don't know where the
process will be, even if we know the initial conditions and parameters.
 Stochastic processes have played a significant role in various
engineering disciplines like power systems, robotics, automotive
technology, signal processing, manufacturing systems, semiconductor
manufacturing, communication networks, wireless networks, etc.
https://www.quora.com/What-is-a-stochastic-process-What-are-some-real-life-examples
2
Stochastic Process
 Stochastic processes (and Markov chains) are probability
models for such systems.
 A ~ or Random process is an indexed collection of random
variables {Xt}, where t runs through a given set T.
 Xt = state of the system (some measurable characteristic) at time t

 ~ is a family of random variables { Xt | t ∈ T }, defined on a
given probability space, indexed by t, which varies over the index
set T.
 A discrete-time stochastic process is a sequence of random
variables
 X0, X1, X2, . . . typically denoted by { Xt }.
3
Stochastic Process
 Values assumed by Xt are called states; the set of all
possible values of the states constitutes the state space
(S)
 If state space is
 Discrete – called discrete-state process or chain
 Continuous - called continuous-state process
 If index set T is
 Discrete – called discrete-time process or sequence
 Continuous - called continuous-time process
Check these books
• Intro to Probability_2nd ed_ DP Bertsekas
• Prob. Stat. and Stoch. Process – P Olofsson

4
Markov Chain
 Markov Property: a stochastic process is said to
have ~ if probability distribution of future state
depends only on present state and not on how the
process arrived in that state.
 Formally, the state of the system at time t+1 depends
only on the state of the system at time t:

P(Xt+1 = xt+1 | Xt = xt, Xt-1 = xt-1, …, X0 = x0) = P(Xt+1 = xt+1 | Xt = xt)

 A stochastic process {Xt} having Markov


property is called Markov Process
 Markov chain -If the state space of a Markov
process is discrete
5
 If index set is also discrete – Discrete-time
Markov Chain

6
Stationary Markov Chain
 Stationary Assumption: For all states i, j and for all t, P(Xt+1 = j |Xt
= i) is independent of time (t)
P(Xt+1 = j |Xt = i) = P(X1 = j |X0 = i) = pij
; i,j = 0,1, …, s; t = 0,1, …,T

 This means that if the system is in state i, the probability that the
system will transition to state j is pij, no matter what the value of t is
 pij = probability that the system will be in state j at time t+1 given
that it is in state i at time t
 Called one step transition probability
 Stationary Markov Chain – having stationary transition probabilities
 What we are interested in

7
Markov Chain
 We generally represent the transition probabilities of a
Markov Chain by an s×s Transition Probability
Matrix P

  P = [ p11 p12 .. .. p1s
        p21 p22 .. .. p2s
         :   :         :
        ps1 ps2 .. .. pss ]      with pij ≥ 0 and Σj=1..s pij = 1

 It can also be represented by a stochastic Finite State
Machine

8
Markov Chain
Simple Example
Weather:
• raining today → 40% rain tomorrow
                  60% no rain tomorrow
• not raining today → 20% rain tomorrow
                      80% no rain tomorrow

Stochastic FSM (State 1: Rain, State 2: No Rain):
rain→rain 0.4, rain→no rain 0.6, no rain→rain 0.2, no rain→no rain 0.8

Transition Prob. Matrix:
P = [ 0.4 0.6 ; 0.2 0.8 ]
9
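
The following is a small illustrative sketch (not part of the original slides) that encodes this weather chain in Python/NumPy and simulates it. Only the probabilities 0.4/0.6/0.2/0.8 come from the example above; the `simulate` helper, the state encoding, and the seed are my own choices.

```python
import numpy as np

# States: 0 = Rain, 1 = No Rain (from the slide's transition matrix)
P = np.array([[0.4, 0.6],
              [0.2, 0.8]])

def simulate(P, start_state, n_steps, rng=np.random.default_rng(0)):
    """Simulate a Markov chain: at each step, draw the next state
    from the row of P corresponding to the current state."""
    states = [start_state]
    for _ in range(n_steps):
        states.append(rng.choice(len(P), p=P[states[-1]]))
    return states

# a list like [0, 1, 1, 0, ...] depending on the seed
print(simulate(P, start_state=0, n_steps=7))
```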
Transforming a process to a Markov chain
 Whether or not it rains today depends on previous
weather conditions through last two days
 If it rained for past two days, it will rain tomorrow with
prob. 0.7
 If it rained today but not yesterday, it will rain tomorrow
with prob. 0.5
 If it rained yesterday but not today, it will rain tomorrow
with prob. 0.4
 If it has not rained for past two days, it will rain
tomorrow with prob. 0.2
 Let the state at time n depend only on a single
day
 Not a Markov chain
 Convert – let the state at time n describe the weather on both day n and day n-1
10
Transforming a process to a Markov chain
 State 0 - If it rained today and yesterday (RR)
 State 1 - If it rained today but not yesterday (NR)
 State 2 - If it rained yesterday but not today (RN)
 State 3 - If it did not rain either today or yesterday (NN)

                       (Today, Tomorrow)
(Yesterday, Today)     RR    NR    RN    NN
        RR          [ 0.7    0    0.3    0
        NR            0.5    0    0.5    0
P =     RN             0    0.4    0    0.6
        NN             0    0.2    0    0.8 ]
11
Markov Chain
Gambler’s Example

– Gambler starts with $4


- At each play we have one of the following:
• Gambler wins $1 with probability p
• Gambler loses $1 with probability 1-p
– Game ends when gambler goes broke, or gains a fortune
of $10
(Both 0 and 10 are absorbing states)

Chain: states 0, 1, 2, …, 9, 10; from state i (1 ≤ i ≤ 9) the chain moves to
i+1 with probability p and to i-1 with probability 1-p; states 0 and 10
return to themselves with probability 1.

Start: $4
12
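
Below is a rough Monte-Carlo sketch of the gambler's chain, added for illustration; the start of $4 and the target of $10 follow the slide, while p, the trial count, and the function name are assumptions of mine.

```python
import random

def gamblers_ruin(start=4, target=10, p=0.5, rng=random.Random(1)):
    """Play until the gambler hits $0 (ruin) or $target (fortune);
    return the absorbing state reached."""
    x = start
    while 0 < x < target:
        x += 1 if rng.random() < p else -1
    return x

# Estimate the probability of reaching $10 before going broke
trials = 100_000
wins = sum(gamblers_ruin() == 10 for _ in range(trials))
print(wins / trials)  # close to 4/10 = 0.4 for p = 0.5 (fair game)
```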
Markov Chain
Coke vs. Pepsi Example

• Given that a person’s last cola purchase was


Coke, there is a 90% chance that his next cola
purchase will also be Coke.
• If a person’s last cola purchase was Pepsi, there
is an 80% chance that his next cola purchase
will also be Pepsi.

transition matrix (State 1: Coke, State 2: Pepsi):

P = [ 0.9 0.1 ; 0.2 0.8 ]

Stochastic FSM: coke→coke 0.9, coke→pepsi 0.1, pepsi→coke 0.2, pepsi→pepsi 0.8
13
Characteristics of a Markov chain
 What do we need to know to describe a Markov
chain?
 Next state depends on previous state only,
therefore, it is sufficient to know
 the distribution of its initial state X0
 initial distribution P0 – pmf of X0
 the mechanism of transitions from one state to another.
 one-step transition probabilities pij.

14
Characteristics of a Markov chain
 Based on this data, we would like to compute:
 n-step transition probabilities pij(n);
 Qn the distribution of states at time n, which is our
forecast for Xn;
 The limits of pij(n) and Qn as n → ∞, which give our long-term
forecast.
(nearly constant)

15
n-step Transition Probabilities
(FSM: state 1 = sunny, state 2 = rainy; sunny→sunny 0.7, sunny→rainy 0.3,
rainy→sunny 0.4, rainy→rainy 0.6)
 We may be interested in
 It rains on Monday. Make forecasts for Wednesday and
Thursday. (For the weather forecast example with the above
FSM)
 Mathematically
 p21(2) = P { Wednesday is sunny | Monday is rainy }

 More generally
 If a Markov chain in state i at time m, what is the
probability that n periods later the Markov chain will be
in state j
 ie. P(Xm+n = j |Xm = i) = ?

16
n-step Transition Probabilities
 Since we are dealing with stationary Markov chain,
we can write
 P(Xm+n = j |Xm = i) = P(Xn = j |X0 = i) =pij(n)
 pij(n) = n-step probability of transition from state i to
state j
 If P is the one-step transition probability matrix

  P = [ p11 p12 .. .. p1s
        p21 p22 .. .. p2s
         :   :         :
        ps1 ps2 .. .. pss ]      with pij ≥ 0 and Σj=1..s pij = 1

 Clearly pij(1) = pij

 The matrix Pn represents the n-step transition probabilities
from any state i to state j
How will you find Pn?
17
n-step Transition Probabilities
 pij(2) = probability that the system will be in state j
two periods from now, given that it is now in state
i

18
n-step Transition Probabilities
 To compute pij(2), we must go from state i to some
intermediate state k, and then from state k to state j:

  pij(2) = Σk=1..s pik pkj      for all states i, j

(the slide's figure shows the paths i → k → j over two time steps)

 Clearly, the right side of the equation is the
scalar product of the ith row of matrix P
with the jth column of matrix P

 Hence, pij(2) is the ijth element of the matrix P·P = P2

 i.e. the matrix P2 represents the 2-step transition probabilities
for all states i, j

19
n-step Transition Probabilities

pij(3) = P(X3 = j | X0 = i)
       = Σk=1..s P(X3 = j | X2 = k) · P(X2 = k | X0 = i)
       = Σk=1..s pkj pik(2) = Σk=1..s pik(2) pkj      (ith row of P2 · jth column of P)
       = (P3)ij

20
n-step Transition Probabilities
 Extending the previous reasoning, we find that
 P(n) = matrix of n-step transition probabilities for all states i, j
      = P(n-1)·P = Pn
or
      = P(n-m)·P(m) = P(m)·P(n-m)
      = Pn-m·Pm = Pm·Pn-m = Pn

 Of course, for n = 0, pij(0) = P(X0 = j | X0 = i), so we
write
  pij(0) = 1 if j = i
           0 if j ≠ i
21
Markov chain
Coke vs. Pepsi Example (cont)

Given that a person is currently a Pepsi


purchaser, what is the probability that he will
purchase Coke two purchases from now?
Pr[ Pepsi → ? → Coke ]
= Pr[ Pepsi → Coke → Coke ] + Pr[ Pepsi → Pepsi → Coke ]
= 0.2 * 0.9 + 0.8 * 0.2 = 0.34

P2 = [ 0.9 0.1 ; 0.2 0.8 ] · [ 0.9 0.1 ; 0.2 0.8 ] = [ 0.83 0.17 ; 0.34 0.66 ]

(Pepsi → ? → Coke is the (2,1) entry of P2, i.e. 0.34)
22
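
As an added check (not in the original slides), the same two-step probability can be obtained by squaring P with NumPy; the state ordering is the one used above.

```python
import numpy as np

P = np.array([[0.9, 0.1],   # row/col 0 = Coke
              [0.2, 0.8]])  # row/col 1 = Pepsi

P2 = np.linalg.matrix_power(P, 2)
print(P2)        # [[0.83 0.17], [0.34 0.66]]
print(P2[1, 0])  # P(Coke two purchases from now | Pepsi now) = 0.34
```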
Markov chain
Weather example (cont)

p21(2) = P { Wednesday is sunny | Monday is rainy }
       = p21 p11 + p22 p21
       = (0.4)(0.7) + (0.6)(0.4)
       = 0.52

(state 1 = "sunny", state 2 = "rainy"; FSM: sunny→sunny 0.7, sunny→rainy 0.3,
rainy→sunny 0.4, rainy→rainy 0.6)

23
Markov chain
Coke vs. Pepsi Example (cont)
 Given that a person is currently a Coke purchaser,
what is the probability that he will purchase Pepsi
three purchases from now?

P3 = [ 0.9 0.1 ; 0.2 0.8 ] · [ 0.83 0.17 ; 0.34 0.66 ] = [ 0.781 0.219 ; 0.438 0.562 ]

So the answer is the (Coke, Pepsi) entry of P3, i.e. 0.219.

24
Unconditional State Probabilities
 One or n-step transition probabilities are conditional
prob.
 For ex. P(Xn = j |X0 = i) =pij(n)
 Sometimes, we may not know the state of the
Markov chain at time 0, but we are interested to
determine the prob. that the system is in state j at
time n
 That is P(Xn = j ) = ?
 Weather example
 Suppose now that it has not rained yet, but
meteorologists predict an 80% chance of rain on
Monday. How does this affect our forecasts?

25
Unconditional State Probabilities
 For, P(Xn = j ) = ?
 it is necessary to specify prob. distribution of initial state
 i.e. P(X0 = i ) for all states i,
 let it be vector Q0, where qi0= P(X0 = i ) for all states i

 Then,

  P(Xn = j) = Σi=1..s qi0 · pij(n) = Q0 · (jth column of Pn)

(the slide's figure shows all paths from a random initial state at time 0
to state j at time n)

26
Markov chain
Coke vs. Pepsi Example (cont)

• Assume each person makes one cola purchase per week


• Suppose 60% of all people now drink Coke, and 40% drink
Pepsi
• What fraction of people will be drinking Coke three weeks from
now?

P = [ 0.9 0.1 ; 0.2 0.8 ]      P3 = [ 0.781 0.219 ; 0.438 0.562 ]

Qi - the distribution in week i
Q0 = (0.6, 0.4) - initial distribution
P[X3 = Coke] = Q0 * (1st column of P3) = [0.6 0.4] * [0.781 0.438]T
             = 0.6 * 0.781 + 0.4 * 0.438 = 0.6438
Q3 = Q0 * P3 = (0.6438, 0.3562)

27
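
A short numeric check of this calculation, added here as a sketch; the matrix and Q0 are the ones given above.

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
Q0 = np.array([0.6, 0.4])   # initial distribution (Coke, Pepsi)

# Distribution three weeks from now: Q3 = Q0 * P^3
Q3 = Q0 @ np.linalg.matrix_power(P, 3)
print(Q3)                   # [0.6438 0.3562]
```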
Classification of States in Markov Chain
 We now know probabilities associated with states
 We can classify the states of the system
 Whether you can get from one state to another
 Whether you can return to a state
 To help in classifying states, we use a state diagram
from the weather example:

(rain→rain 0.4, rain→no rain 0.6, no rain→rain 0.2, no rain→no rain 0.8)

28
Classification of States - Definitions
 Path - a sequence of transitions from state i to
state j that has positive probability, i.e.,
pij(n) > 0 for some n.
 State j is Reachable from state i if there is a path
from i to j
 Two states, i and j, Communicate (i ↔ j) if j is
reachable from i, and i is reachable from j.
 It is easy to check that this is an equivalence relation:
1. i ↔ i; since pii(0)=1
2. i ↔ j implies j ↔ i; and
3. i ↔ j and j ↔ k together imply i ↔ k.

29
Classification of States – Definitions (cont)
 A set of states S in a Markov Chain is a closed set
if
 All the states of S communicate with each other, and
 No state outside of S is reachable.
      [ .4  .6   0   0   0
        .5  .5   0   0   0
P =      0   0  .3  .7   0
         0   0  .5  .4  .1
         0   0   0  .8  .2 ]

(States 1–5; S1 = {1, 2} and S2 = {3, 4, 5} are closed sets)

30
Classification of States – Definitions (cont)
 Irreducible Markov Chain - if there is only one
Closed set
 Eg. Weather, Coke vs Pepsi
 A state i is an Absorbing state if the process
never will leave the state
 i.e. the state returns to itself with certainty in one
transition
 pii = 1 (closed set with 1 member)
 Example of Absorbing State - The Gambler’s Ruin
 At each play we have the following:
 Gambler wins $1 with probability p, or loses $1 with

probability 1-p
 Game ends when gambler goes broke, or gains a fortune of $N
 Then both $0 and $N are absorbing states

31
Classification of States – Definitions (cont)
 A state i is a Transient state if the process may
never return to the state again.
 i.e. there exists a state j that is reachable from i, but i is
not reachable from j.
 Mathematically, limn→∞ pji(n) = 0, for all j
 A state is Recurrent if, upon entering the state,
the process will definitely return to the state again.
 if and only if it is not transient.

32
Classification of States – Definitions (cont)
 Example – Gardener Problem
 Chemical test to check soil condition
 New season productivity
 State 1 – Good; State 2 – Fair; State 3 – Poor
 The gardener observed that last year's soil condition
impacts the current year's productivity

  P = [ .2 .5 .3 ; 0 .5 .5 ; 0 0 1 ]

 Ex - Gardener Problem
 States 1, 2 transient
 Can reach state 3, but can never be reached back from it
 State 3 absorbing - p33 = 1

  P100 = [ 0 0 1 ; 0 0 1 ; 0 0 1 ]

33
Classification of States

34
Classification of States

35
Classification of States

• States i and j are in the same


communicating class if i ↔ j:
i.e. if each state is accessible
from the other.

• Every state is a member of


exactly one communicating
class

Don't confuse with recurrent/transient classes
(the slide's diagram shows a transient class and a recurrent class)

36
Classification of States – Definitions (cont)
 Ex - Gardener Problem
 States 1, 2 transient
 Can reach state 3, but can never be reached back from it
 State 3 absorbing - p33 = 1

  P100 = [ 0 0 1 ; 0 0 1 ; 0 0 1 ]

 Ex - Gambler's Ruin (simple case – quits when $0 or $4)
 State 2 – Transient or Recurrent?
 Ans. Transient

      [  1    0    0    0    0
       1-p    0    p    0    0
P =      0   1-p   0    p    0
         0    0   1-p   0    p
         0    0    0    0    1 ]

(states 0 1 2 3 4; 0 and 4 are absorbing)
37
Classification of States – Definitions (cont)
 State i is periodic with period t > 1 if t is the
smallest number such that all paths leading from
state i back to state i have a length which is a
multiple of t
 i.e. a return is possible only in t, 2t, 3t, … steps
 Mathematically, pii(n) = 0 whenever n is not divisible by t
 A recurrent state that is not periodic is called
aperiodic

38
Classification of States – Definitions (cont)
  P  = [ 0 .6 .4 ; 0 1 0 ; .6 .4 0 ]
  P2 = [ .24 .76 0 ; 0 1 0 ; 0 .76 .24 ]
  P3 = [ 0 .904 .096 ; 0 1 0 ; .144 .856 0 ]
  P4 = [ .0576 .9424 0 ; 0 1 0 ; 0 .9424 .0576 ]
  P5 = [ 0 .97696 .02304 ; 0 1 0 ; .03456 .96544 0 ]

• Continuing with n = 6, 7, … Pn shows that p11 and p33 are positive for
even n and 0 otherwise,
i.e. states 1 and 3 have period 2

40
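
To make the period-2 pattern visible numerically, one could print a few powers of this P, as in the sketch added below (the matrix is the one from the slide; the loop bound is arbitrary).

```python
import numpy as np

P = np.array([[0.0, 0.6, 0.4],
              [0.0, 1.0, 0.0],
              [0.6, 0.4, 0.0]])

for n in range(1, 7):
    Pn = np.linalg.matrix_power(P, n)
    # p11(n) and p33(n) are positive only when n is even -> states 1 and 3 have period 2
    print(n, round(Pn[0, 0], 5), round(Pn[2, 2], 5))
```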
Classification of States – Definitions (cont)
 If all states in a Markov Chain are recurrent,
aperiodic, and communicate with one another (a
“nice” chain), then the Markov Chain is said to be
Ergodic
 Example –
 Gambler Ruin
 Not Ergodic
 Weather
 Ergodic
 Coke vs Pepsi
 Ergodic
 Gardener
 Not Ergodic

41
How do we check that a Markov chain is
aperiodic?
 Remember that two numbers l and m are said to be co-
prime if their greatest common divisor (gcd) is 1
 Find two co-prime numbers l and m such that pii(l) > 0
and pii(m) > 0
 that is, we can go from state i to itself in l steps, and also
in m steps.
 Then, we can conclude that state i is aperiodic.
 If the Markov chain is also irreducible, this means
that the whole chain is aperiodic (all states in a communicating
class share the same period).
 Since the number 1 is co-prime to every integer, any
state with a self-transition is aperiodic.

42
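
A rough numeric way to estimate a state's period, added here as a sketch and not taken from the slides: take the gcd of all step counts n (up to some horizon) for which pii(n) > 0. The `period` helper and the horizon of 50 steps are my own choices.

```python
import numpy as np
from math import gcd
from functools import reduce

def period(P, i, max_n=50):
    """Approximate the period of state i: gcd of all n <= max_n with p_ii(n) > 0."""
    ns = [n for n in range(1, max_n + 1)
          if np.linalg.matrix_power(P, n)[i, i] > 1e-12]
    return reduce(gcd, ns) if ns else 0

# The periodic 3-state example from the previous slide
P = np.array([[0.0, 0.6, 0.4],
              [0.0, 1.0, 0.0],
              [0.6, 0.4, 0.0]])
print(period(P, 0), period(P, 1), period(P, 2))  # 2 1 2
```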
How do we check that a Markov chain is
aperiodic?

43
How do we check that a Markov chain is
aperiodic?

44
How do we check that a Markov chain is
aperiodic?

45
Long Run Property of Markov Chain
Steady State Probabilities
 n-step transition probabilities for Cola drinkers
 0.9 0.1  0.83 0.17  3  0.781 0.219  0.72 0.28
P 
2
P  P   P 5  
   0.56 0.44
 0.2 0.8  0.34 0.66  0.438 0.562

 0.68 0.32  0.67 0.33  0.67 0.33 40 0.67 0.33


10
P  
20
P  
30
P  P  
 0. 67 0. 33
 0.65 0.35  0.67 0.33  0. 67 0 . 33  

 After a long time, the probability of a person's next cola
purchase doesn't depend on
 whether (s)he was initially a Coke or Pepsi drinker

46
Markov Chain
Coke vs. Pepsi Example (cont)

Simulation:

  [2/3 1/3] · [ 0.9 0.1 ; 0.2 0.8 ] = [2/3 1/3]      (stationary distribution)

(The slide's plot shows Pr[Xi = Coke] against week i converging to 2/3;
FSM: coke→coke 0.9, coke→pepsi 0.1, pepsi→coke 0.2, pepsi→pepsi 0.8)
47
Long Run Property of Markov Chain
Steady State Probabilities
 THEOREM: Let P be the transition matrix for an s-state ergodic
chain. Then there exists a vector π = [ π1 π2 π3 …. πs ] such
that
               [ π1 π2 .. .. πs
  lim Pn =       π1 π2 .. .. πs
  n→∞             :  :        :
                 π1 π2 .. .. πs ]

 Recall that the ijth element of Pn is pij(n), so the theorem tells
that limn→∞ pij(n) = πj for any initial state i

 The πj are called the steady state probabilities
 The vector π is called the steady state / equilibrium distribution
 Steady state
 doesn't mean that the process settles down to one state
 Means - the prob. of the process being in state j, after a long time, tends to πj, and is independent of the initial state
48
Long Run Property of Markov Chain
Steady State Probabilities
 How can we find π ?
 From the theorem, for large n and all i:  pij(n+1) ≈ pij(n) ≈ πj      …… (1)
 Since pij(n+1) = (ith row of Pn) · (jth column of P),
  pij(n+1) = Σk=1..s pik(n) pkj      ………. (2)
 If n is large, substituting eq. (1) into (2):
  πj = Σk=1..s πk pkj
 In matrix form,  π = π P      ………………… (3)
 Unfortunately, eq. (3) has an infinite number of solutions
 To have a unique solution, along with eq. (3), use Σj=1..s πj = 1
 That is, solve the system of equations
  π = π P,   Σj=1..s πj = 1
49
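
One convenient numeric approach, sketched below as an addition of mine rather than the slides' hand calculation: stack the equations π(I − P) = 0 with the normalization Σ πj = 1 and solve the resulting (consistent, over-determined) linear system by least squares.

```python
import numpy as np

def steady_state(P):
    """Solve pi = pi P together with sum(pi) = 1."""
    s = P.shape[0]
    A = np.vstack([(np.eye(s) - P).T,    # (I - P)^T pi^T = 0
                   np.ones(s)])          # 1^T pi = 1
    b = np.zeros(s + 1)
    b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
print(steady_state(P))   # [0.6667 0.3333] -> pi = (2/3, 1/3)
```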
Long Run Property of Markov Chain
Steady State Probabilities
 Coke vs Pepsi Example

  [π1 π2] = [π1 π2] · [ .9 .1 ; .2 .8 ]
  π1 = .9 π1 + .2 π2
  π2 = .1 π1 + .8 π2
  π1 + π2 = 1
  Solving: π1 = 2/3, π2 = 1/3

 Suppose there are 100 million cola customers, and each person
purchases 1 cola during any week
 Each sale earns $1 profit
 For $500 million per year, an ad firm guarantees to
decrease from 10% to 5% the fraction of Coke customers who switch to
Pepsi after a purchase
 Should the Coke company hire the ad firm?

50
Long Run Property of Markov Chain
Steady State Probabilities
 Coke vs Pepsi Example
 Total cola purchase in a year (52 week)
= 100 * 106 * 52= 5.2 billion
 Each purchase earns $1 profit for a company
 Current profit in a year for Coke company
= 2/3 (5.2 billion * $1) [ since steady state prob. of buying coke
1= 2/3]
=$3,466,666,667  0.95 0.05
P  
 What the add firm offers to Coke company 0
 is. 2 0 .8 
for $500 million per year

51
Long Run Property of Markov Chain
Steady State Probabilities
 Then what will be the long run / steady state
probabilities?
 Let's find out.

  P = [ 0.95 0.05 ; 0.2 0.8 ]

  [π1 π2] = [π1 π2] · [ .95 .05 ; .2 .8 ]
  π1 = .95 π1 + .2 π2
  π2 = .05 π1 + .8 π2
  π1 + π2 = 1
  Solving: π1 = .8, π2 = .2

 New yearly profit for the Coke company will be
= 0.8 * (5.2 billion * $1) - $500 million
= $3,660,000,000
 So, they should hire the ad firm

52
Long Run Property of Markov Chain
Steady State Probabilities
 Gardener Problem (with fertilizer)

  P = [ .3 .6 .1 ; .1 .6 .3 ; .05 .4 .55 ]

 The system of equations π = π P, Σj=1..s πj = 1 yields the following set of
equations:
  π1 = .3 π1 + .1 π2 + .05 π3
  π2 = .6 π1 + .6 π2 + .4 π3
  π3 = .1 π1 + .3 π2 + .55 π3
  π1 + π2 + π3 = 1
 Substituting π1 = 1 - (π2 + π3):
  .7 (1 - (π2 + π3)) = .1 π2 + .05 π3   =>   .8 π2 + .75 π3 = .7
  .4 π2 = .6 (1 - (π2 + π3)) + .4 π3    =>   π2 + .2 π3 = .6
 Then  .8 (.6 - .2 π3) + .75 π3 = .7   =>   .59 π3 = .22   =>   π3 = .3729
 so,  π2 = .6 - .2 (.3729) = .5254,   π1 = 1 - (.3729 + .5254) = .1017

 Therefore, π = [ .1017 .5254 .3729 ]
53
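
Plugging the fertilizer matrix into the same kind of numeric solver (again an added check, not part of the slides) reproduces these values.

```python
import numpy as np

P = np.array([[0.30, 0.60, 0.10],
              [0.10, 0.60, 0.30],
              [0.05, 0.40, 0.55]])

s = P.shape[0]
# Stack pi (I - P) = 0 with the normalization sum(pi) = 1 and solve by least squares
A = np.vstack([(np.eye(s) - P).T, np.ones(s)])
b = np.zeros(s + 1)
b[-1] = 1.0
pi = np.linalg.lstsq(A, b, rcond=None)[0]
print(pi.round(4))   # [0.1017 0.5254 0.3729]
```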
Mean First Passage Times
 For an ergodic chain, let mij = expected number of
transitions before we first reach state j, given that we are
currently in state i; mij is called the mean first passage
time from state i to state j.
 Mean/expected hitting time
 Assume we are currently in state i. Then with probability
pij, it will take one transition to go from state i to state j.
 For k ≠ j, we next go with probability pik to state k. In this
case, it will take an average of 1 + mkj transitions to go
from i to j.

 Since
  mij = 1 + Σk≠j pik mkj
54
Mean First Passage Times
 By solving the linear equations above, we find all the
mean first passage times.

  mij = pi1 m1j + pi2 m2j + ... + pis msj + 1      (terms with k = j omitted)
  m1j = p11 m1j + p12 m2j + ... + p1s msj + 1
  m2j = p21 m1j + p22 m2j + ... + p2s msj + 1
   :
  msj = ps1 m1j + ps2 m2j + ... + pss msj + 1

 In matrix form
  [ 1-p11   -p12  ...   -p1s ] [ m1j ]   [ 1 ]
  [  -p21  1-p22  ...   -p2s ] [ m2j ] = [ 1 ]
  [    :      :           :  ] [  :  ]   [ 1 ]
  [  -ps1   -ps2  ...  1-pss ] [ msj ]   [ 1 ]

  mij = (I - Nj)-1 1 ;  j ≠ i
55
Mean First Passage Times
 By solving the linear equations above, we find all the
mean first passage times.
 In matrix form:  mij = (I - Nj)-1 1 ;  j ≠ i

 It can be shown that the mean recurrence time is  mii = 1 / πi
56
Mean First Passage Times
 For the cola example, π1=2/3 and π2 = 1/3
 Hence, m11 = 1.5 and m22 = 3
 m12 = 1 + p11m12 = 1 + .9m12
 m21 = 1 + p22m21 = 1 + .8m21
 Solving these two equations yields,
 m12 = 10 and m21 = 5
(FSM: coke→coke 0.9, coke→pepsi 0.1, pepsi→coke 0.2, pepsi→pepsi 0.8)

57
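
The slide's numbers can be reproduced with the (I − Nj)⁻¹·1 formula; the sketch below is my own addition for the cola chain, where Nj is P with the jth row and column removed.

```python
import numpy as np

P = np.array([[0.9, 0.1],   # 0 = Coke, 1 = Pepsi
              [0.2, 0.8]])

def mean_first_passage(P, j):
    """m_ij for all i != j: delete row/column j from P, then solve (I - N_j) m = 1."""
    idx = [k for k in range(len(P)) if k != j]
    N = P[np.ix_(idx, idx)]
    m = np.linalg.solve(np.eye(len(idx)) - N, np.ones(len(idx)))
    return dict(zip(idx, m))

pi = np.array([2 / 3, 1 / 3])
print(mean_first_passage(P, 0))   # state 1 -> 5.0,  i.e. m21 = 5
print(mean_first_passage(P, 1))   # state 0 -> 10.0, i.e. m12 = 10
print(1 / pi)                     # [1.5 3.0] -> mean recurrence times m11, m22
```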
Mean First Passage Times
 Gardener Problem (with fertilizer)

58
References
 Operations Research: Applications and Algorithms
Wayne L. Winston
 Operations Research: An Introduction
Hamdy A. Taha

59
