Module_2_Class
Markov Chain
Topic 1: Joint Probability Distribution of Two Discrete
Random Variables, Expectation, Covariance, and
Correlation in Joint Probability Distribution
P(X = x, Y = y):

X \ Y    0      1      2
0        0.10   0.05   0.02
1        0.15   0.20   0.05
2        0.05   0.10   0.08
Example:
If the number of ad clicks and the number of products purchased
by a user are independent, then the probability that a user clicks
on ads 3 times and purchases 2 products is simply the product of
the individual probabilities: P(X = 3) · P(Y = 2).
Expectation of Two Discrete Random Variables
▶ The Expectation (or Expected Value) of a discrete random
variable is a measure of its central tendency.
▶ For two discrete random variables X and Y , their individual
expectations are given by:
E[X] = Σ_x x · P(X = x),   E[Y] = Σ_y y · P(Y = y)
▶ Covariance: Cov(X, Y) = E[XY] − E[X]E[Y]
(i) Cov(X , X ) = Var(X ), the variance of X .
(ii) Cov(X , Y ) = Cov(Y , X ), symmetry property.
(iii) Cov(X , Y ) = 0 if X and Y are independent.
▶ Correlation: ρXY = Cov(X, Y)/(σX σY)
(i) ρXY is dimensionless and provides a normalized measure of
linear dependence.
(ii) ρXY = 1 or ρXY = −1 indicates perfect linear dependence.
(iii) ρXY = 0 indicates no linear relationship, but X and Y may still
be dependent in a non-linear way.
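As a concrete illustration, the formulas above can be evaluated directly on a small joint p.m.f. This is a minimal sketch using a hypothetical two-value table (the values below are invented for illustration and chosen to sum to 1):

```python
# Hypothetical joint p.m.f. over X in {0, 1} and Y in {0, 1} (illustrative values).
joint = {
    (0, 0): 0.4, (0, 1): 0.1,
    (1, 0): 0.1, (1, 1): 0.4,
}

def expectation(f):
    """E[f(X, Y)] = sum of f(x, y) * P(X = x, Y = y) over all (x, y)."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

E_X, E_Y = expectation(lambda x, y: x), expectation(lambda x, y: y)
var_X = expectation(lambda x, y: x * x) - E_X ** 2
var_Y = expectation(lambda x, y: y * y) - E_Y ** 2
cov = expectation(lambda x, y: x * y) - E_X * E_Y   # Cov(X,Y) = E[XY] - E[X]E[Y]
rho = cov / (var_X ** 0.5 * var_Y ** 0.5)           # dimensionless, in [-1, 1]
print(cov, rho)
```

Here ρ ≈ 0.6: this table puts most mass on the diagonal, so X and Y are positively, but not perfectly, linearly related.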
Applications in AI and Data Science
Problem 1
The joint probability distribution of X and Y is given by:
X \ Y    −4     2      7
1        1/8    1/4    1/8
5        1/4    1/8    1/8
Find E(X), E(Y), Cov(X, Y), and the correlation coefficient ρ(X, Y).
Solution:
The marginal density of X is obtained by summing over the
probabilities of Y :
P(X = 1) = P(1, −4) + P(1, 2) + P(1, 7) = 1/8 + 1/4 + 1/8 = 1/2
P(X = 5) = P(5, −4) + P(5, 2) + P(5, 7) = 1/4 + 1/8 + 1/8 = 1/2
The marginal density of Y is obtained by summing over the
probabilities of X :
P(Y = −4) = P(1, −4) + P(5, −4) = 1/8 + 1/4 = 3/8
P(Y = 2) = P(1, 2) + P(5, 2) = 1/4 + 1/8 = 3/8
P(Y = 7) = P(1, 7) + P(5, 7) = 1/8 + 1/8 = 1/4
∴ E(X) = Σ_x x · P(X = x) = 1 · 1/2 + 5 · 1/2 = 3
and,
E(Y) = Σ_y y · P(Y = y) = −4 · 3/8 + 2 · 3/8 + 7 · 1/4 = 1
Also,
E(XY) = Σ_x Σ_y x · y · P(X = x, Y = y)
= (1 · (−4) · 1/8) + (1 · 2 · 1/4) + (1 · 7 · 1/8)
+ (5 · (−4) · 1/4) + (5 · 2 · 1/8) + (5 · 7 · 1/8) = 3/2
E(X²) = Σ_x x² · P(X = x) = 1² · 1/2 + 5² · 1/2 = 13
σX² = E(X²) − (E(X))² = 13 − 3² = 4
σX = √4 = 2
E(Y²) = Σ_y y² · P(Y = y) = (−4)² · 3/8 + 2² · 3/8 + 7² · 1/4 = 19.75
σY² = E(Y²) − (E(Y))² = 19.75 − 1² = 18.75
σY = √18.75 ≈ 4.33
Finally, we calculate the correlation coefficient ρ(X , Y ):
Cov(X, Y) = E(XY) − E(X)E(Y)
∴ Cov(X, Y) = 3/2 − (3 · 1) = −3/2
ρ(X, Y) = Cov(X, Y)/(σX σY)
∴ ρ(X, Y) = (−3/2)/(2 × 4.33) ≈ −0.173
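The arithmetic above can be double-checked with exact fractions; this sketch recomputes every quantity of Problem 1 from the joint table:

```python
from fractions import Fraction as F
from math import sqrt

# Joint p.m.f. of Problem 1: X in {1, 5}, Y in {-4, 2, 7}.
joint = {
    (1, -4): F(1, 8), (1, 2): F(1, 4), (1, 7): F(1, 8),
    (5, -4): F(1, 4), (5, 2): F(1, 8), (5, 7): F(1, 8),
}
assert sum(joint.values()) == 1                       # a valid distribution

E = lambda f: sum(f(x, y) * p for (x, y), p in joint.items())
E_X, E_Y = E(lambda x, y: x), E(lambda x, y: y)       # 3 and 1
var_X = E(lambda x, y: x * x) - E_X ** 2              # 4
var_Y = E(lambda x, y: y * y) - E_Y ** 2              # 18.75
cov = E(lambda x, y: x * y) - E_X * E_Y               # -3/2
rho = float(cov) / sqrt(float(var_X) * float(var_Y))  # about -0.173
print(E_X, E_Y, cov, round(rho, 3))
```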
Problem 2
Determine:
(i) the marginal distributions of X and Y;
(ii) the covariance between the discrete random variables X and Y.
The joint probability distribution is given as:

X \ Y    3      4      5
2        1/6    1/6    1/6
5        1/12   1/12   1/12
7        1/12   1/12   1/12
(i) Marginal Distributions: The marginal distribution of X is
calculated by summing over the probabilities of all values of Y for
each X.
P(X = 2) = 1/6 + 1/6 + 1/6 = 3/6 = 1/2
P(X = 5) = 1/12 + 1/12 + 1/12 = 3/12 = 1/4
P(X = 7) = 1/12 + 1/12 + 1/12 = 3/12 = 1/4
The marginal distribution of Y is calculated by summing over the
probabilities of all values of X for each Y.
P(Y = 3) = 1/6 + 1/12 + 1/12 = 4/12 = 1/3
P(Y = 4) = 1/6 + 1/12 + 1/12 = 4/12 = 1/3
P(Y = 5) = 1/6 + 1/12 + 1/12 = 4/12 = 1/3
(ii) The covariance Cov(X , Y ) is calculated as:
Cov(X , Y ) = E (XY ) − E (X )E (Y )
where
E(XY) = Σ_i Σ_j x_i y_j P(X = x_i, Y = y_j)
We compute:
E(X) = 2 × 1/2 + 5 × 1/4 + 7 × 1/4 = 2 × 0.5 + 5 × 0.25 + 7 × 0.25 = 4
E(Y) = 3 × 1/3 + 4 × 1/3 + 5 × 1/3 = 4
Next, calculate E (XY ):
E(XY) = (2 × 3 × 1/6) + (2 × 4 × 1/6) + (2 × 5 × 1/6) + (5 × 3 × 1/12) + (5 × 4 × 1/12)
+ (5 × 5 × 1/12) + (7 × 3 × 1/12) + (7 × 4 × 1/12) + (7 × 5 × 1/12) = 16
Thus,
Cov(X , Y ) = E (XY ) − E (X )E (Y ) = 16 − 4 × 4 = 0
Problem 3
X and Y are independent random variables. X takes the values 2,
5, and 7 with probabilities 1/2, 1/4, and 1/4, respectively. Y takes the
values 3, 4, and 5 with probabilities 1/3, 1/3, and 1/3, respectively.
(i). Find the Joint Probability Distribution of X and Y.
(ii). Show that COV(X, Y) = 0.
X \ Y    3      4      5
2        1/6    1/6    1/6
5        1/12   1/12   1/12
7        1/12   1/12   1/12
Each entry in the table is computed by multiplying the marginal
probabilities, for example:
P(X = 2, Y = 3) = P(X = 2) × P(Y = 3) = 1/2 × 1/3 = 1/6
Solution (b): To show that COV (X , Y ) = 0, we first need to
calculate E (X ), E (Y ), and E (XY ).
E(X) = 2 × 1/2 + 5 × 1/4 + 7 × 1/4 = 1 + 5/4 + 7/4 = 4
E(Y) = 3 × 1/3 + 4 × 1/3 + 5 × 1/3 = 3/3 + 4/3 + 5/3 = 4
Using the joint probability distribution, we can compute E (XY ):
E(XY) = (2 × 3 × 1/6) + (2 × 4 × 1/6) + (2 × 5 × 1/6) + (5 × 3 × 1/12) + (5 × 4 × 1/12)
+ (5 × 5 × 1/12) + (7 × 3 × 1/12) + (7 × 4 × 1/12) + (7 × 5 × 1/12) = 16
The covariance formula is:
COV (X , Y ) = E (XY ) − E (X )E (Y )
COV (X , Y ) = 16 − 4 × 4 = 16 − 16 = 0
Hence, COV (X , Y ) = 0.
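Since the joint table here is built from independent marginals, the zero covariance can be confirmed mechanically. A small sketch with exact fractions:

```python
from fractions import Fraction as F

pX = {2: F(1, 2), 5: F(1, 4), 7: F(1, 4)}
pY = {3: F(1, 3), 4: F(1, 3), 5: F(1, 3)}

# Independence: each joint entry is the product of the two marginals.
joint = {(x, y): px * py for x, px in pX.items() for y, py in pY.items()}
assert joint[(2, 3)] == F(1, 6)          # matches the table entry

E_X = sum(x * p for x, p in pX.items())                # 4
E_Y = sum(y * p for y, p in pY.items())                # 4
E_XY = sum(x * y * p for (x, y), p in joint.items())   # 16
cov = E_XY - E_X * E_Y                                 # 0, as expected
print(E_X, E_Y, E_XY, cov)
```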
Problem 4 Determine the value of k so that the function
f (x, y ) = k|x − y |, for x = −2, 0, 2 and y = −2, 3, represents the
joint probability distribution of the random variables X and Y .
Also, determine Cov(X , Y ).
Solution:
For f (x, y ) to represent a valid joint probability distribution, the
sum of all probabilities must be 1, i.e.,
Σ_{x,y} f(x, y) = 1
Evaluating k|x − y| at the six points (x, y) gives:
0 + 5k + 2k + 3k + 4k + k = 15k
Thus, to satisfy Σ f(x, y) = 1, we have:
15k = 1 ⟹ k = 1/15
The marginal density of X , fX (x), is found by summing over y :
fX(x) = Σ_y f(x, y)
fX(−2) = f(−2, −2) + f(−2, 3) = 0 + 5/15 = 1/3,
fX(0) = f(0, −2) + f(0, 3) = 2/15 + 3/15 = 1/3,
fX(2) = f(2, −2) + f(2, 3) = 4/15 + 1/15 = 1/3.
Similarly, the marginal density of Y , fY (y ), is found by summing
over x:
fY(y) = Σ_x f(x, y)
fY(−2) = f(−2, −2) + f(0, −2) + f(2, −2) = 0 + 2/15 + 4/15 = 2/5,
fY(3) = f(−2, 3) + f(0, 3) + f(2, 3) = 5/15 + 3/15 + 1/15 = 3/5.
Covariance is given by:
Cov(X , Y ) = E (XY ) − E (X )E (Y )
Expected value of X :
E(X) = Σ_{x,y} x f(x, y) = (−2) × 1/3 + 0 + 2 × 1/3 = 0
Expected value of Y:
E(Y) = (−2) × 6/15 + 3 × 9/15 = 1
Expected value of XY:
E(XY) = (−2)(3) × 5/15 + (2)(−2) × 4/15 + (2)(3) × 1/15 = −8/3
Using the results from previous steps:
Cov(X, Y) = E(XY) − E(X)E(Y) = −8/3 − (0)(1) = −8/3
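The normalization of k and the covariance can be verified exactly; a sketch for Problem 4:

```python
from fractions import Fraction as F

xs, ys = [-2, 0, 2], [-2, 3]

# Sum of |x - y| over the six points is 15, so k = 1/15 normalizes f.
k = F(1, sum(abs(x - y) for x in xs for y in ys))
joint = {(x, y): k * abs(x - y) for x in xs for y in ys}
assert sum(joint.values()) == 1

E_X = sum(x * p for (x, y), p in joint.items())       # 0
E_Y = sum(y * p for (x, y), p in joint.items())       # 1
E_XY = sum(x * y * p for (x, y), p in joint.items())  # -8/3
cov = E_XY - E_X * E_Y
print(k, cov)
```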
Problem 5: Given the joint probability distribution f(x, y) = (x + y)/30 for
x = 0, 1, 2, 3 and y = 0, 1, 2, find P[X ≤ 2, Y = 1].
Solution: The joint probability table is shown below:
x \ y    0      1      2
0        0      1/30   2/30
1        1/30   2/30   3/30
2        2/30   3/30   4/30
3        3/30   4/30   5/30
P[X ≤ 2, Y = 1] = f (0, 1) + f (1, 1) + f (2, 1)
From the table:
P[X ≤ 2, Y = 1] = 1/30 + 2/30 + 3/30 = 6/30 = 1/5
Dr. P. Rajendra, Professor, CMRIT Stochastic Processes, Probability Vectors, and Stochastic Matrices 1 / 17
Stochastic Processes
A stochastic process is a family of random variables {X(t), t ∈ T} indexed by a
parameter t (often time), describing a system whose state evolves randomly.
Example
A recommendation system might treat a user’s interactions as a stochastic
process. The system has probabilities associated with which product a user
will click on next, based on past interactions.
Probability Vectors
A probability vector is a row vector whose entries are non-negative and sum to 1.
Examples:
V1 = [0.1, 0.6, 0.3],   V2 = [1/3, 1/3, 1/3]
Stochastic Matrices
A Stochastic Matrix is a square matrix P in which all entries are
non-negative and the sum of the entries in each row equals 1. Each
row of the matrix represents a probability vector:
    [ p11  p12  · · ·  p1n ]
P = [ p21  p22  · · ·  p2n ]
    [  .    .    .      .  ]
    [ pn1  pn2  · · ·  pnn ]
Example:
P = [ 1/2  1/2 ]    or    P = [ 0    1   ]
    [ 2/3  1/3 ]              [ 1/2  1/2 ]
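The defining check (non-negative entries, rows summing to 1) is easy to automate; a minimal sketch applied to the two example matrices:

```python
def is_stochastic(P, tol=1e-12):
    """True if every entry is non-negative and every row sums to 1."""
    return all(
        min(row) >= 0 and abs(sum(row) - 1) <= tol
        for row in P
    )

P1 = [[1 / 2, 1 / 2], [2 / 3, 1 / 3]]
P2 = [[0, 1], [1 / 2, 1 / 2]]
print(is_stochastic(P1), is_stochastic(P2))  # True True
```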
Regular Stochastic Matrices
A stochastic matrix P is said to be regular if some power P^k has all strictly
positive entries.
Example:
P = [ 1/2  1/2 ]
    [ 1/4  3/4 ]
Transition Matrices
Example:
P = [ 0    1    0   ]
    [ 1/2  1/4  1/4 ]
    [ 1/3  1/3  1/3 ]
Problem 1 Given the matrices (with 0 ≤ a ≤ 1 and 0 ≤ b ≤ 1)
P1 = [ 1 − a    a   ]      P2 = [ 1 − b    b   ]
     [   b    1 − b ]           [   a    1 − a ]
Show that: (i). P1 is a stochastic matrix. (ii). P2 is a stochastic matrix.
(iii). P1 P2 is a stochastic matrix.
Solution:
For P1 :
Row 1 sum = (1 − a) + a = 1
Row 2 sum = b + (1 − b) = 1
Since all entries are non-negative and each row sums to 1, P1 is a
stochastic matrix.
For P2 :
Row 1 sum = (1 − b) + b = 1
Row 2 sum = a + (1 − a) = 1
Since all entries are non-negative and each row sums to 1, P2 is a
stochastic matrix.
Now, calculate the product P1 P2:
P1 P2 = [ 1 − a    a   ] [ 1 − b    b   ]
        [   b    1 − b ] [   a    1 − a ]
Perform the matrix multiplication:
P1 P2 = [ (1 − a)(1 − b) + a · a     (1 − a)b + a(1 − a)    ]
        [ b(1 − b) + (1 − b)a        b · b + (1 − b)(1 − a) ]
Simplifying the terms:
P1 P2 = [ 1 − a − b + ab + a²    b − ab + a − a²      ]
        [ b − b² + a − ab        b² + 1 − b − a + ab  ]
To show P1 P2 is stochastic, we check the row sums:
Row 1: (1 − a − b + ab + a²) + (b − ab + a − a²) = 1
Row 2: (b − b² + a − ab) + (b² + 1 − b − a + ab) = 1
Thus, P1 P2 is a stochastic matrix.
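The row-sum identities can also be spot-checked numerically for sample values of a and b (the particular values below are arbitrary):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

for a, b in [(0.2, 0.7), (0.5, 0.5), (0.0, 1.0)]:   # arbitrary test values
    P1 = [[1 - a, a], [b, 1 - b]]
    P2 = [[1 - b, b], [a, 1 - a]]
    prod = matmul(P1, P2)
    # each row of P1 P2 must be non-negative and sum to 1
    assert all(min(row) >= 0 and abs(sum(row) - 1) < 1e-12 for row in prod)
print("P1 P2 is stochastic for all sampled (a, b)")
```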
Problem 2
Verify that the matrix
P = [ 0    0    1   ]
    [ 1/2  1/4  1/4 ]
    [ 0    1    0   ]
is a regular stochastic matrix.
Solution:
A matrix is stochastic if all of its entries are non-negative and the
sum of the entries in each row equals 1.
A stochastic matrix is regular if some power of the matrix has all
positive entries.
Row 1: 0 + 0 + 1 = 1,
Row 2: 1/2 + 1/4 + 1/4 = 1,
Row 3: 0 + 1 + 0 = 1.
Since all the row sums are 1, P is a stochastic matrix.
P² = P × P = [ 0    0    1   ] [ 0    0    1   ]   [ 0    1     0    ]
             [ 1/2  1/4  1/4 ] [ 1/2  1/4  1/4 ] = [ 1/8  5/16  9/16 ]
             [ 0    1    0   ] [ 0    1    0   ]   [ 1/2  1/4   1/4  ]
Since P² still contains zero entries, we go one power further:
P³ = P² × P = [ 1/2   1/4    1/4   ]
              [ 5/32  41/64  13/64 ]
              [ 1/8   5/16   9/16  ]
All entries of P³ are strictly positive, so P is a regular stochastic matrix.
Problem 3: Find the fixed probability vector for the regular stochastic
matrix
A = [ 1/3  2/3 ]
    [ 1/4  3/4 ]
Solution: Given:
A = [ 1/3  2/3 ]
    [ 1/4  3/4 ]
Since the matrix A is of second order, let the fixed probability vector
Q = [x  y], where x ≥ 0, y ≥ 0, and x + y = 1.
Now,
QA = [x  y] [ 1/3  2/3 ] = [ (1/3)x + (1/4)y   (2/3)x + (3/4)y ]
            [ 1/4  3/4 ]
Since QA = Q, we have:
(1/3)x + (1/4)y = x,   (2/3)x + (3/4)y = y
The first equation gives (1/4)y = (2/3)x, so y = (8/3)x. With x + y = 1,
x(1 + 8/3) = 1 ⟹ x = 3/11, y = 8/11, and Q = [3/11  8/11].
Problem 4: Find the fixed probability vector Q = [x  y  z] for the regular
stochastic matrix
P = [ 0    1    0 ]
    [ 0    0    1 ]
    [ 1/2  1/2  0 ]
Solution: Let x ≥ 0, y ≥ 0, z ≥ 0 with x + y + z = 1, and impose QP = Q.
That is:
[x  y  z] [ 0    1    0 ] = [ z/2   x + z/2   y ]
          [ 0    0    1 ]
          [ 1/2  1/2  0 ]
This gives the system of equations:
z/2 = x        (1)
x + z/2 = y    (2)
y = z          (3)
We know that x + y + z = 1, therefore:
x + 2x + 2x = 1 ⟹ x = 1/5
This simplifies to:
x = 1/5,  y = 2/5,  z = 2/5
Thus, the required fixed probability vector is:
Q = [1/5  2/5  2/5]
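The fixed vector can also be found numerically by power iteration: start from any probability vector and repeatedly apply q ← qP. A sketch (the iteration count is arbitrary but more than sufficient for convergence here):

```python
P = [[0, 1, 0],
     [0, 0, 1],
     [0.5, 0.5, 0]]

def step(q, P):
    """One step of the chain: q <- qP (row vector times matrix)."""
    n = len(P)
    return [sum(q[i] * P[i][j] for i in range(n)) for j in range(n)]

q = [1.0, 0.0, 0.0]        # arbitrary starting distribution
for _ in range(200):       # q converges to the fixed probability vector
    q = step(q, P)
print([round(v, 6) for v in q])  # [0.2, 0.4, 0.4]
```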
Problem 5: Find the fixed probability vector of the regular stochastic
matrix:
P = [ 0    2/3  1/3 ]
    [ 1/2  0    1/2 ]
    [ 1/2  1/2  0   ]
Solution:
Since the given matrix P is of order 3 × 3, the required fixed
probability vector Q must also be of order 1 × 3. Let Q = [x  y  z],
where x ≥ 0, y ≥ 0, z ≥ 0, and x + y + z = 1. Also, QP = Q.
QP = [x  y  z] [ 0    2/3  1/3 ]
               [ 1/2  0    1/2 ]
               [ 1/2  1/2  0   ]
   = [ (1/2)y + (1/2)z   (2/3)x + (1/2)z   (1/3)x + (1/2)y ]
We know that QP = Q, so:
(1/2)y + (1/2)z = x,   (2/3)x + (1/2)z = y,   (1/3)x + (1/2)y = z
Solving with x + y + z = 1: the first equation gives x = (1/2)(1 − x), so
x = 1/3; substituting into the others gives y = 10/27 and z = 8/27.
Thus, Q = [1/3  10/27  8/27].
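The balance equations from QP = Q, together with the normalization x + y + z = 1, can be solved exactly by Gaussian elimination; a self-contained sketch:

```python
from fractions import Fraction as F

# Balance equations from QP = Q (the third one is redundant), plus
# normalization, as augmented rows [coeff_x, coeff_y, coeff_z, rhs]:
#   -x + (1/2)y + (1/2)z = 0
#   (2/3)x - y + (1/2)z = 0
#   x + y + z = 1
A = [[F(-1), F(1, 2), F(1, 2), F(0)],
     [F(2, 3), F(-1), F(1, 2), F(0)],
     [F(1), F(1), F(1), F(1)]]

for col in range(3):                      # Gauss-Jordan elimination
    piv = next(r for r in range(col, 3) if A[r][col] != 0)
    A[col], A[piv] = A[piv], A[col]
    A[col] = [v / A[col][col] for v in A[col]]
    for r in range(3):
        if r != col and A[r][col] != 0:
            A[r] = [u - A[r][col] * v for u, v in zip(A[r], A[col])]

x, y, z = A[0][3], A[1][3], A[2][3]
print(x, y, z)  # 1/3 10/27 8/27
```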
Assignment Problems
Markov Chains and Transition Probabilities
Example:
Consider a Markov chain with the following transition matrix:
P = [ 1    0    0   ]
    [ 0.4  0    0.6 ]
    [ 0    0    1   ]
Here, states 1 and 3 are absorbing states, since p11 = 1 and p33 = 1.
Problem: Show that the Markov chain with transition matrix
A = [ 0    2/3  1/3 ]
    [ 1/2  0    1/2 ]
    [ 1/2  1/2  0   ]
is irreducible.
Solution:
The given matrix A, being a transition matrix, is a stochastic matrix.
This means that:
All the elements of the matrix are non-negative.
The sum of the elements in each row is equal to 1.
We now calculate A² to check whether the chain is regular.
All the entries of A² are strictly positive, and the sum of each row
equals 1.
Hence, the matrix A is a regular stochastic matrix.
Since the matrix A is regular, every state can be reached from every
other state, and it follows that the given Markov chain is irreducible.
Problem 1
A Markov chain has the transition probability matrix
P = [ 1/2  1/2 ]
    [ 3/4  1/4 ]
with initial probability distribution p(0) = [1/4  3/4].
Define and find the following:
(i) p21^(2)
(ii) p12^(2)
(iii) p^(2)
(iv) p1^(2)
(v) The vector p(0) P^n as n → ∞
(vi) The matrix P^n as n → ∞
⟹ [ x/2 + 3y/4    x/2 + y/4 ] = [x  y]
∴ x/2 + 3y/4 = x and x/2 + y/4 = y, which give x = 3/5, y = 2/5
Q = [3/5  2/5]
P^n → [ 3/5  2/5 ]
      [ 3/5  2/5 ]
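The claim that P^n converges to a matrix whose rows all equal Q can be checked numerically. Here the 2-state transition matrix is taken as P = [1/2 1/2; 3/4 1/4], which reproduces the fixed-point equations x/2 + 3y/4 = x and x/2 + y/4 = y above:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P = [[0.5, 0.5], [0.75, 0.25]]
Pn = P
for _ in range(50):        # P^51: far more than enough for convergence
    Pn = matmul(Pn, P)
print([[round(v, 6) for v in row] for row in Pn])  # [[0.6, 0.4], [0.6, 0.4]]
```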
Dr. P. Rajendra, Professor, CMRIT Topic 4 : Problems on Markov Chains 4 / 20
Problem 2
A Markov chain has the transition probability matrix
P = [ 1/2  0    1/2 ]
    [ 1    0    0   ]
    [ 1/4  1/2  1/4 ]
and the initial probability distribution is p(0) = [1/2  1/2  0].
Find:
(i) p13^(2)
(ii) p23^(2)
(iii) p^(2)
(iv) p1^(2)
Solution: Squaring P gives the two-step transition probabilities
p13^(2) = 3/8,   p23^(2) = 1/2
Now, we compute the probability distribution after two steps,
p(2) = p(0) P²:
p(2) = [1/2  1/2  0] [ 3/8    1/4  3/8  ] = [7/16  1/8  7/16]
                     [ 1/2    0    1/2  ]
                     [ 11/16  1/8  3/16 ]
Thus, p(2) = [7/16  1/8  7/16]. From the result of p(2), we have:
p1^(2) = 7/16
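The two-step distribution can be verified exactly from the P² displayed on the slide:

```python
from fractions import Fraction as F

# Two-step transition matrix P^2 as displayed on the slide.
P2 = [[F(3, 8), F(1, 4), F(3, 8)],
      [F(1, 2), F(0), F(1, 2)],
      [F(11, 16), F(1, 8), F(3, 16)]]
p0 = [F(1, 2), F(1, 2), F(0)]

# p(2) = p(0) P^2  (row vector times matrix)
p2 = [sum(p0[i] * P2[i][j] for i in range(3)) for j in range(3)]
print([str(v) for v in p2])  # ['7/16', '1/8', '7/16']
```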
Next, we calculate P⁴ by squaring P²:
P⁴ = P² · P² = [ 1/2  1/2 ] · [ 1/2  1/2 ] = [ 3/8   5/8   ]
               [ 1/4  3/4 ]   [ 1/4  3/4 ]   [ 5/16  11/16 ]
Now, we calculate the probability distribution on the fourth day:
p(4) = p(0) · P⁴ = [0  1] · [ 3/8   5/8   ] = [5/16  11/16]
                            [ 5/16  11/16 ]
Therefore, on the fifth day:
The probability that the engineer uses a bike is p1^(4) = 5/16.
The probability that the engineer uses a car is p2^(4) = 11/16.
Problem 5
Every year, a man trades his car for a new car. If he has a Maruti, he
trades it for an Ambassador. If he has an Ambassador, he trades it for a
Santro. However, if he has a Santro, he is just as likely to trade it for a
new Santro as to trade it for a Maruti or an Ambassador. In 2020, he
bought his first car, which was a Santro.
(i) Find the probability that he has:
(a) A Santro in 2022,
(b) A Maruti in 2022,
(c) An Ambassador in 2023,
(d) A Santro in 2023.
(ii) In the long run, how often will he have a Santro?
since ”C” initially has the ball. The probability distribution after three
throws is given by:
p(3) = p(0) · P³ = [0  0  1] [ 1/2  1/2  0   ] = [1/4  1/4  1/2]
                            [ 0    1/2  1/2 ]
                            [ 1/4  1/4  1/2 ]
Therefore, after three throws, the probabilities are:
pA^(3) = 1/4,   pB^(3) = 1/4,   pC^(3) = 1/2
Assignment Problems