S2 Vol1: Joint Distributions
By : Nikita Kumari
About the author
Nikita Kumari is a course instructor for the Statistics and Machine Learning Foundation
course in the IITM BS degree program. With a Masters degree in Mathematics from the
Indian Institute of Technology, Madras, she has taught Statistics to students with diverse
backgrounds.
She is an enthusiastic and dedicated instructor who is committed to helping her students
develop a strong understanding of Statistics. With her extensive knowledge, enthusiasm,
and effective teaching skills, she greatly enhances the quality of this Statistics textbook.
Contents
1 Joint, Marginal and Conditional of multiple discrete random variables
1.1 Examples: Toss a coin thrice
1.2 Example: Random 2-digit number
1.3 Example: IPL powerplay over
1.4 Two random variables: Joint, marginal, conditional PMFs
1.4.1 Two discrete random variables: Joint PMF
1.4.2 Two random variables: Marginal PMF
1.4.3 Marginal from Joint PMF
1.4.4 Conditional distribution of a random variable given an event
1.4.5 Conditional distribution of one random variable given another
1.5 More than two random variables: Joint, marginal, conditional PMFs
1.5.1 Multiple discrete random variables: Joint PMF
1.5.2 Multiple discrete random variables: Marginal PMF
1.5.3 Marginalisation
1.5.4 Multiple discrete random variables: Marginal PMF (general)
1.5.5 Conditioning with multiple discrete random variables
1.5.6 Conditioning and factors of the joint PMF
1.6 Problems
Chapter 1: Joint, Marginal and Conditional of multiple discrete random variables
The three random variables are defined on the same probability space.
Observations:
• X1, X2 and X3 are independent of each other. Events defined using X1 alone will be independent of events defined using X2 alone, and independent of events defined using X3 alone. Even though all three random variables live in the same probability space, if we define events with each of the random variables separately, those events will be independent.
probability of a two-digit number being 01, 11, . . . , 91. Since the total number of outcomes is 100, P(X = 1) = 10/100 = 1/10, and so on. Hence,
X ∼ Uniform{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Now, P(Y = 0) is the same as the probability of the two-digit number being 00, 04, 08, . . . , 96. The total number of favourable outcomes is 25. Therefore, P(Y = 0) = 25/100 = 1/4. Similarly, for Y = 1, the favourable outcomes are 01, 05, 09, . . . , 97, which again count up to 25. So, P(Y = 1) = 1/4. The same argument works for Y = 2 and Y = 3. Hence,
Y ∼ Uniform{0, 1, 2, 3}
So, we have defined two different random variables on a common probability space.
Now, suppose the event X = 1 has occurred. It means the last digit of the number is 1. Then, what about the event Y = 0? Can these two events happen simultaneously? The answer is NO: any number that ends with 1 cannot be a multiple of 4. So, the events X = 1 and Y = 0 are not independent.
So dependence between random variables can arise, unlike in the coin toss example, where any event defined using one random variable was independent of any event defined using the other random variable. In the next example, we will see the role this can play in modelling and other things in practice.
discrete random variables defined on the same probability space, many PMFs can be defined: the joint PMF, the marginal PMFs and the conditional PMFs. We will see their properties, structure, conditions and the relationships between them.
• Each entry in the joint PMF table takes a value between 0 and 1.
In the next section, we will look at some of the examples from the joint distribution
perspective.
Suppose a fair coin is tossed twice independently and we define the random variable Xi as
Xi = 1 if the i-th toss is a head, and Xi = 0 if the i-th toss is a tail,
for i = 1, 2.
(i) Find fX1X2(0, 0).
Solution:
X1 = 0, X2 = 0 implies that the first toss is a tail and the second toss is a tail. Also, the events X1 = 0 and X2 = 0 are independent, so
fX1X2(0, 0) = P(X1 = 0, X2 = 0) = P(X1 = 0) P(X2 = 0) = 1/2 × 1/2 = 1/4
(ii) Find fX1 X2 (0, 1).
Solution:
X1 = 0, X2 = 1 implies that the first toss is a tail and the second toss is a head. Also,
the events X1 = 0 and X2 = 1 are independent.
fX1X2(0, 1) = P(X1 = 0, X2 = 1) = 1/2 × 1/2 = 1/4
The joint PMF table of X1 and X2 is

X2 \ X1 |  0  |  1
   0    | 1/4 | 1/4
   1    | 1/4 | 1/4
To generalize this, suppose you have two random variables X1 and X2. You put all the values taken by X1 in the first row of the table, and all the values taken by X2 in the first column of the table. The ij-th position in the table gives the probability that X1 takes the value i and X2 takes the value j.
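As a small sketch in Python (variable names are our own), the joint PMF table above can be built by enumerating the four equally likely outcomes:

```python
from fractions import Fraction
from itertools import product

# Two independent fair coin tosses: each outcome (t1, t2) has probability 1/4.
# X1 and X2 are the indicators of heads on toss 1 and toss 2.
joint = {}
for t1, t2 in product([0, 1], repeat=2):
    joint[(t1, t2)] = Fraction(1, 4)

print(joint[(0, 1)])  # f_{X1 X2}(0, 1) = 1/4
```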
A 2-digit number from 00 to 99 is selected at random. Let X be the digit in units place. Let
Y be the remainder obtained when the number is divided by 4.
(i) Find fXY (0, 0).
Solution:
fXY(0, 0) is the same as P(X = 0, Y = 0).
X = 0, Y = 0 implies that the units digit is 0 and the number is divisible by 4. The favourable outcomes are
00, 20, 40, 60 and 80.
fXY(0, 0) = P(X = 0, Y = 0) = 5/100 = 1/20
(ii) Find fXY (1, 0).
Solution:
fXY(1, 0) is the same as P(X = 1, Y = 0).
X = 1, Y = 0 implies that the units digit is 1 and the number is divisible by 4, which is not possible. So, the number of favourable outcomes is zero, and
fXY(1, 0) = 0
(iii) Find fXY (4, 2).
Solution:
fXY(4, 2) is the same as P(X = 4, Y = 2).
X = 4, Y = 2 implies that the units digit is 4 and the number is 2 modulo 4. The favourable outcomes are
14, 34, 54, 74 and 94.
fXY(4, 2) = 5/100 = 1/20
Observations:
• If you write down all the possibilities, each probability is either 0 or 1/20.

Y \ X |   0  |   1  |   2  |   3  |   4  |   5  |   6  |   7  |   8  |   9
  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0
  1   |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20
  2   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0
  3   |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20
• Every entry in the joint PMF is between 0 and 1, and if we add all the entries in the
table, it will give 1. So, it is a reasonable joint PMF for this problem.
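The 1/20-or-0 pattern above can be checked by brute force. A minimal sketch in Python (enumeration only, names are ours):

```python
from fractions import Fraction
from collections import defaultdict

# Enumerate all 100 equally likely two-digit numbers 00..99;
# X is the units digit and Y is the number modulo 4.
joint = defaultdict(Fraction)  # missing keys behave as probability 0
for n in range(100):
    joint[(n % 10, n % 4)] += Fraction(1, 100)

print(joint[(0, 0)], joint[(1, 0)], joint[(4, 2)])  # 1/20 0 1/20
```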
Things are getting slightly more complex: as the outcomes become more involved, this kind of dependency becomes crucial and increasingly complicated to write down. We will stop here with the first part on the joint PMF, and in the next section we will look at the marginal PMFs.
1.4.2 Two random variables: Marginal PMF
Definition: Suppose X and Y are jointly distributed discrete random variables with joint PMF fXY. The PMFs of the individual random variables X and Y are called the marginal PMFs:

fX(t) = P(X = t) = Σ_{t′ ∈ TY} fXY(t, t′)        (1)

fY(t) = P(Y = t) = Σ_{t′ ∈ TX} fXY(t′, t)        (2)

Here, (1) and (2) are the marginalisation equations. There is a unique way of obtaining the marginal PMFs from the joint PMF.
Proof:
Let TY = {y1, y2, . . . , yk}. The event X = t is the disjoint union of the events (X = t, Y = y1), . . . , (X = t, Y = yk). Therefore,
P(X = t) = P(X = t, Y = y1) + . . . + P(X = t, Y = yk)
This process is called marginalisation. We will see a few examples of how marginalisation works.
Suppose a fair coin is tossed twice independently and we define the random variable Xi as
Xi = 1 if the i-th toss is a head, and Xi = 0 if the i-th toss is a tail,
for i = 1, 2.
It is always easy to work with the table. We will make a joint PMF table for X1 and X2 and see how marginalisation works here.
• Marginal PMF of X1: add over the columns of the joint PMF table.
fX1(0) = fX1X2(0, 0) + fX1X2(0, 1)
fX1(1) = fX1X2(1, 0) + fX1X2(1, 1)
• Marginal PMF of X2: add over the rows of the joint PMF table.
fX2(0) = fX1X2(0, 0) + fX1X2(1, 0)
fX2(1) = fX1X2(0, 1) + fX1X2(1, 1)
X2 \ X1  |  0  |  1  | fX2(t2)
    0    | 1/4 | 1/4 |   1/2
    1    | 1/4 | 1/4 |   1/2
 fX1(t1) | 1/2 | 1/2 |    1
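Marginalisation is just a row or column sum. A short sketch (Python, names are ours) using the coin-toss joint PMF:

```python
from fractions import Fraction

# Joint PMF of (X1, X2) for two independent fair tosses.
joint = {(t1, t2): Fraction(1, 4) for t1 in (0, 1) for t2 in (0, 1)}

# Sum out the variable you do not want.
f_X1 = {t1: sum(joint[(t1, t2)] for t2 in (0, 1)) for t1 in (0, 1)}
f_X2 = {t2: sum(joint[(t1, t2)] for t1 in (0, 1)) for t2 in (0, 1)}

print(f_X1[0], f_X2[1])  # 1/2 1/2
```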
X2 \ X1 |  0   |  1
   0    | 0.05 | 0.35
   1    | 0.25 | 0.35

Adding over the rows and the columns gives the marginals:

X2 \ X1  |  0   |  1   | fX2(t2)
    0    | 0.05 | 0.35 |  0.40
    1    | 0.25 | 0.35 |  0.60
 fX1(t1) | 0.30 | 0.70 |   1
Often, given the marginal PMFs, we tend to think that the joint PMF will be the product of the marginals. But the product is not the only joint PMF with those marginals. We will see here how the same marginal PMFs can result from different joint PMFs.
Case I:
X2 \ X1  |  0  |  1  | fX2(t2)
    0    | 1/4 | 1/4 |   1/2
    1    | 1/4 | 1/4 |   1/2
 fX1(t1) | 1/2 | 1/2 |    1
Case II:
For every x between 0 and 1/2, the table below is a joint PMF that results in the same marginals. Therefore, we can go from the joint to the marginals in a unique way, but we cannot go from the marginals to the joint in a unique way.
X2 \ X1  |    0    |    1    | fX2(t2)
    0    |    x    | 1/2 − x |   1/2
    1    | 1/2 − x |    x    |   1/2
 fX1(t1) |   1/2   |   1/2   |    1
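The Case II table can be checked programmatically: every choice of x gives the same marginals. A sketch (Python, the helper name is ours):

```python
from fractions import Fraction

def case2_joint(x):
    """Joint PMF with diagonal entries x and off-diagonal entries 1/2 - x."""
    return {(0, 0): x, (1, 1): x,
            (0, 1): Fraction(1, 2) - x, (1, 0): Fraction(1, 2) - x}

for x in (Fraction(0), Fraction(1, 8), Fraction(1, 4), Fraction(1, 2)):
    j = case2_joint(x)
    # Marginal f_X1(0) is the same for every x, even though the joints differ.
    assert j[(0, 0)] + j[(0, 1)] == Fraction(1, 2)
```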
A 2-digit number from 00 to 99 is selected at random. Let X be the digit in units place. Let
Y be the remainder obtained when the number is divided by 4.
We have already seen the joint PMF of X and Y before; here we will find the marginals.
Y \ X |  0   |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   |  9   | fY(y)
  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   |  1/4
  1   |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  1/4
  2   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   |  1/4
  3   |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  0   | 1/20 |  1/4
fX(x) | 1/10 | 1/10 | 1/10 | 1/10 | 1/10 | 1/10 | 1/10 | 1/10 | 1/10 | 1/10 |   1
Observations:
• From the table, you can observe that X ∼ Uniform{0, 1, . . . , 9} and Y ∼ Uniform{0, 1, 2, 3}.
• fXY(0, 0) = 1/20 ≠ fX(0)fY(0) = 1/10 × 1/4 = 1/40. So, the events X = 0 and Y = 0 are not independent. Hence, X and Y are not independent.
1.4.4 Conditional distribution of a random variable given an event
Definition: Suppose X is a discrete random variable with range TX , and A is an event in
the same probability space. The conditional PMF of X given A is defined as the PMF
Q(t) = P (X = t | A), t ∈ TX
For the above conditional PMF, we will use the notation fX|A(t), and call (X | A) the conditional random variable. By the definition of conditional probability,
fX|A(t) = P((X = t) ∩ A) / P(A)
Note: The range of (X | A) can be different from the range of X and will depend on A.
Y \ X |  0  |  1  |  2  | fY(t2)
  0   | 1/4 | 1/8 | 1/8 |  1/2
Range X = {0, 1, 2}
Range Y = {0, 1}
Solution:
fY|X=0(0) = fXY(0, 0) / fX(0) = (1/4) / (3/8) = 2/3
Solution:
fY|X=0(1) = fXY(0, 1) / fX(0) = (1/8) / (3/8) = 1/3
You can observe here that fY|X=0(0) and fY|X=0(1) sum to 1.
Solution:
fX|Y=1(0) = fXY(0, 1) / fY(1) = (1/8) / (1/2) = 1/4
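Conditioning is a divide-and-renormalise step. A sketch with the entries used above (only the entries quoted in the text are included; this is not the full joint table):

```python
from fractions import Fraction

# Known entries of f_XY from the example, plus the marginals used above.
f_XY = {(0, 0): Fraction(1, 4), (0, 1): Fraction(1, 8)}
f_X0 = Fraction(3, 8)   # f_X(0)
f_Y1 = Fraction(1, 2)   # f_Y(1)

f_Y_given_X0 = {y: f_XY[(0, y)] / f_X0 for y in (0, 1)}
print(f_Y_given_X0[0], f_Y_given_X0[1])   # 2/3 1/3
print(f_XY[(0, 1)] / f_Y1)                # f_{X|Y=1}(0) = 1/4
```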
Throw a die and toss a coin as many times as the number shown on the die. Let X be the number shown on the die. Let Y be the number of heads. What is the joint PMF of X and Y?
Solution:
fX(t) = 1/6, for 1 ≤ t ≤ 6
Therefore,
X ∼ Uniform{1, 2, 3, 4, 5, 6}
The random variable Y represents the number of heads obtained in X tosses. Therefore,
(Y | X = t) ∼ Binomial(t, 1/2)
Range (Y | X = t) = {0, 1, . . . , t}
fY|X=t(t′) = C(t, t′) (1/2)^t′ (1/2)^(t−t′) = C(t, t′) (1/2)^t, for t′ = 0, 1, 2, . . . , t,
where C(t, t′) denotes the binomial coefficient. Now the joint distribution of X and Y is
fXY(t, t′) = fX(t) fY|X=t(t′) = (1/6) C(t, t′) (1/2)^t
t′ \ t |   1  |   2  |   3  |   4  |   5    |   6
   0   | 1/12 | 1/24 | 1/48 | 1/96 | 1/192  | 1/384
   1   | 1/12 | 1/12 | 3/48 | 4/96 | 5/192  | 6/384
   2   |  0   | 1/24 | 3/48 | 6/96 | 10/192 | 15/384
   3   |  0   |  0   | 1/48 | 4/96 | 10/192 | 20/384
   4   |  0   |  0   |  0   | 1/96 | 5/192  | 15/384
   5   |  0   |  0   |  0   |  0   | 1/192  | 6/384
   6   |  0   |  0   |  0   |  0   |  0     | 1/384
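The joint PMF fXY(t, t′) = (1/6) C(t, t′) (1/2)^t can be tabulated exactly; a sketch in Python (the function name is ours):

```python
from fractions import Fraction
from math import comb

def f_XY(t, k):
    """Joint PMF of (die value t, number of heads k): (1/6) * C(t, k) / 2^t."""
    if not (1 <= t <= 6 and 0 <= k <= t):
        return Fraction(0)
    return Fraction(comb(t, k), 6 * 2**t)

print(f_XY(1, 0))  # 1/12
# All entries add up to 1, so this is a valid joint PMF.
print(sum(f_XY(t, k) for t in range(1, 7) for k in range(0, 7)))  # 1
```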
Let N ∼ Poisson(λ). Given N = n, toss a fair coin n times and denote the number of heads
obtained by X. What is the distribution of X?
fN(n) = e^(−λ) λ^n / n!, for n = 0, 1, 2, . . .
(X | N = n) ∼ Binomial(n, 1/2)
fX|N=n(k) = C(n, k) (1/2)^n, for k = 0, 1, . . . , n
fNX(n, k) = fN(n) fX|N=n(k) = (e^(−λ) λ^n / n!) C(n, k) (1/2)^n, for n = 0, 1, . . . ; k = 0, 1, . . . , n
fX(k) = Σ_{n=0}^∞ fNX(n, k)
      = Σ_{n=k}^∞ (e^(−λ) λ^n / n!) C(n, k) (1/2)^n        (the terms with n < k are zero)
      = Σ_{n=k}^∞ (e^(−λ) λ^n / (k! (n − k)!)) (1/2)^n
      = (e^(−λ) λ^k / (k! 2^k)) Σ_{n=k}^∞ λ^(n−k) / ((n − k)! 2^(n−k))
      = (e^(−λ) λ^k / (k! 2^k)) e^(λ/2)
      = e^(−λ/2) (λ/2)^k / k!, for k = 0, 1, 2, . . .
Therefore,
X ∼ Poisson(λ/2)
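The Poisson(λ/2) conclusion can be sanity-checked numerically by truncating the series; a sketch (Python, λ = 3 chosen arbitrarily):

```python
from math import exp, comb, factorial

lam = 3.0

def f_X(k, n_max=100):
    """Marginal of X: sum the joint f_NX(n, k) over n >= k (truncated series)."""
    return sum(exp(-lam) * lam**n / factorial(n) * comb(n, k) * 0.5**n
               for n in range(k, n_max))

def poisson_pmf(k, mu):
    return exp(-mu) * mu**k / factorial(k)

# The marginal matches Poisson(lam / 2) up to truncation error.
for k in range(6):
    assert abs(f_X(k) - poisson_pmf(k, lam / 2)) < 1e-12
```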
Example: IPL Powerplay over
Let X = number of runs in the over. Let Y = number of wickets in the over. Assume the
following:
Y ∈ {0, 1, 2} and
  y   |   0   |  1  |  2
fY(y) | 13/16 | 1/8 | 1/16
Distribution of X
Solution:
Example: Toss a fair coin thrice
Suppose a fair coin is tossed three times. Let us define the random variable Xi as
Xi = 1 if the i-th toss is a head, and Xi = 0 if the i-th toss is a tail.
The joint PMF of X1 , X2 and X3 is given by
t1 t2 t3 fX1 X2 X3 (t1 , t2 , t3 )
0 0 0 1/8
0 0 1 1/8
0 1 0 1/8
0 1 1 1/8
1 0 0 1/8
1 0 1 1/8
1 1 0 1/8
1 1 1 1/8
A 3-digit number from 000 to 999 is selected at random. Let X denote the first digit from
the left. Let Y denote the number modulo 2. Let Z denote the first digit from the right.
Solution:
X ∈ {0, 1, 2, . . . , 9}
Y ∈ {0, 1}
Z ∈ {0, 1, 2, . . . , 9}
fXY Z(0, 0, 0) = P(the number starts with zero, is even, and ends with zero) = 10/1000 = 1/100
fXY Z(1, 1, 1) = 1/100
fXY Z (1, 0, 1) = 0
fXY Z(8, 0, 6) = 10/1000 = 1/100
fXY Z (8, 1, 6) = 0
In general,
fXY Z(t1, t2, t3) = 0 if t2 = 0 and t3 is odd, or if t2 = 1 and t3 is even;
fXY Z(t1, t2, t3) = 1/100 otherwise.
Example: IPL powerplay over
We will look here at a slightly complicated situation. Consider an over from the IPL pow-
erplay. Suppose this over has six deliveries. Let Xi denote the number of runs scored in the
i-th delivery and Xi ∈ {0, 1, . . . , 8}. What will be the joint PMF of X1 , X2 , . . . , X6 ?
It is very difficult to calculate these probabilities. The number of possibilities is 9^6 in this case. We need some tools to tackle such problems. This is where marginalisation and conditioning will help, which we will see in a later section.
fX1(t) = P(X1 = t) = Σ_{t′2 ∈ TX2, t′3 ∈ TX3, . . . , t′n ∈ TXn} fX1...Xn(t, t′2, t′3, . . . , t′n)

fX2(t) = P(X2 = t) = Σ_{t′1 ∈ TX1, t′3 ∈ TX3, . . . , t′n ∈ TXn} fX1...Xn(t′1, t, t′3, . . . , t′n)

. . .

fXn(t) = P(Xn = t) = Σ_{t′1 ∈ TX1, . . . , t′n−1 ∈ TXn−1} fX1...Xn(t′1, t′2, . . . , t′n−1, t)
Suppose a fair coin is tossed three times. Let us define the random variable Xi as
Xi = 1 if the i-th toss is a head, and Xi = 0 if the i-th toss is a tail.
The joint PMF of X1 , X2 and X3 is given by
t1 t2 t3 fX1 X2 X3 (t1 , t2 , t3 )
0 0 0 1/8
0 0 1 1/8
0 1 0 1/8
0 1 1 1/8
1 0 0 1/8
1 0 1 1/8
1 1 0 1/8
1 1 1 1/8
fX1(0) = fX1X2X3(0, 0, 0) + fX1X2X3(0, 0, 1) + fX1X2X3(0, 1, 0) + fX1X2X3(0, 1, 1) = 1/8 + 1/8 + 1/8 + 1/8 = 1/2
fX1(1) = fX1X2X3(1, 0, 0) + fX1X2X3(1, 0, 1) + fX1X2X3(1, 1, 0) + fX1X2X3(1, 1, 1) = 1/8 + 1/8 + 1/8 + 1/8 = 1/2
A 3-digit number from 000 to 999 is selected at random. Let X denote the first digit from
the left. Let Y denote the number modulo 2. Let Z denote the first digit from the right.
Instead of looking at the joint PMF, we can directly look at the marginals.
• There are a total of 100 favourable outcomes out of 1000, for every first digit from the
left.
fX(x) = 100/1000 = 1/10, for all x ∈ {0, 1, . . . , 9}
• The possible values that Y can take are {0, 1}. Y is 0 if the number is even and 1 if the number is odd.
fY(y) = 500/1000 = 1/2, for y ∈ {0, 1}.
• There are a total of 100 favourable outcomes out of 1000, for every first digit from the
right.
fZ(z) = 100/1000 = 1/10, for all z ∈ {0, 1, . . . , 9}
Let us consider the first over of an IPL powerplay. Assume that this over has six deliveries.
Let Xi denote the number of runs scored in the i-th delivery.
Xi ∈ {0, 1, 2, . . . , 8}. It is very unlikely that more than eight runs will be scored in a
particular delivery, so we will neglect that part. Now, the question here is how to assign the
probabilities. The method that we will see here may not be the best one, but it is reasonable
to do in this way.
We can collect data from past occurrences. There are 1598 matches in the IPL where the first over has been bowled so far. Let us look at ball 1: zero runs were scored 957 times, one run 429 times, two runs 57 times, three runs in 5 matches, four runs in 138 matches, five runs in 8 matches, and six runs in 4 matches. Seven or eight runs have never been scored on the first ball so far.
P(X1 = 0) = 957/1598
P(X1 = 1) = 429/1598
P(X1 = 2) = 57/1598
P(X1 = 3) = 5/1598
P(X1 = 4) = 138/1598
P(X1 = 5) = 8/1598
P(X1 = 6) = 4/1598
Similarly, we can check for the other deliveries of the first over. We will get the following
probabilities:
Xi 0 1 2 3 4 5 6
fX1 0.5989 0.2685 0.0357 0.0031 0.0864 0.0050 0.0025
fX2 0.5551 0.2791 0.0438 0.0031 0.1083 0.0044 0.0063
fX3 0.5338 0.2847 0.0444 0.0044 0.1139 0.0025 0.0163
fX4 0.5344 0.2516 0.0394 0.0031 0.1489 0.0038 0.0188
fX5 0.5313 0.2672 0.0407 0.0056 0.1358 0.0025 0.0169
fX6 0.5056 0.2954 0.0394 0.0050 0.1414 0.0013 0.0119
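Rows like fX1 above are just relative frequencies; a sketch recomputing the ball-1 row from the quoted counts:

```python
# Counts for runs scored on ball 1, as quoted in the text (1598 first overs).
counts = {0: 957, 1: 429, 2: 57, 3: 5, 4: 138, 5: 8, 6: 4}
n = sum(counts.values())  # 1598

pmf = {runs: c / n for runs, c in counts.items()}
print(round(pmf[0], 4))  # 0.5989, the first entry of the f_X1 row
```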
1.5.3 Marginalisation
Suppose X1 , X2 , X3 ∼ fX1 X2 X3 and Xi ∈ TXi . We have already discussed the marginal
PMF of the individual random variables. How do we find fX1 X2 , fX2 X3 , fX3 X1 ? This can be
found using the same principle of sum over everything that you do not want, which is called
marginalisation. Therefore,
fX1X2(t1, t2) = P(X1 = t1 and X2 = t2) = Σ_{t′3 ∈ TX3} fX1X2X3(t1, t2, t′3)

fX1X3(t1, t3) = P(X1 = t1 and X3 = t3) = Σ_{t′2 ∈ TX2} fX1X2X3(t1, t′2, t3)

fX2X3(t2, t3) = P(X2 = t2 and X3 = t3) = Σ_{t′1 ∈ TX1} fX1X2X3(t′1, t2, t3)
Example: X1 , X2 , X3 ∼ fX1 X2 X3
t1 t2 t3 fX1 X2 X3 (t1 , t2 , t3 )
0 0 0 1/9
0 0 1 1/9
0 0 2 1/9
0 1 1 1/9
0 1 2 1/9
1 0 0 1/9
1 0 2 1/9
1 1 0 1/9
1 1 1 1/9
Solution:
X1, X2 ∈ {0, 1} and X3 ∈ {0, 1, 2}
• Marginal PMF of (X1, X2):

X2 \ X1 |  0  |  1
   0    | 1/3 | 2/9
   1    | 2/9 | 2/9

• Marginal PMF of (X2, X3):

X3 \ X2 |  0  |  1
   0    | 2/9 | 1/9
   1    | 1/9 | 2/9
   2    | 2/9 | 1/9

• Marginal PMF of (X1, X3):

X3 \ X1 |  0  |  1
   0    | 1/9 | 2/9
   1    | 2/9 | 1/9
   2    | 2/9 | 1/9
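These pairwise marginals follow from one generic "sum out what you do not want" routine; a sketch (Python, names are ours):

```python
from fractions import Fraction
from collections import defaultdict

# The nine equally likely triples (t1, t2, t3) from the joint PMF table.
support = [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 1), (0, 1, 2),
           (1, 0, 0), (1, 0, 2), (1, 1, 0), (1, 1, 1)]
joint = {t: Fraction(1, 9) for t in support}

def marginal(keep):
    """Marginal over the coordinates listed in `keep`; the rest are summed out."""
    out = defaultdict(Fraction)
    for t, p in joint.items():
        out[tuple(t[i] for i in keep)] += p
    return dict(out)

print(marginal((0, 1))[(0, 0)])  # f_{X1 X2}(0, 0) = 1/3
```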
We have already seen the marginalisation for two random variables and three random
variables. Next, we will see how the same principle can be applied for four random variables.
fX1X3X4(t1, t3, t4) = P(X1 = t1 and X3 = t3 and X4 = t4) = Σ_{t′2 ∈ TX2} fX1X2X3X4(t1, t′2, t3, t4)

In general, for indices i1 < i2 < . . . < ik, the marginal fXi1...Xik(ti1, . . . , tik) is obtained from fX1...Xn by fixing the arguments ti1, . . . , tik and summing over t′j ∈ TXj for every index j not in {i1, . . . , ik}.
1.5.5 Conditioning with multiple discrete random variables
In the previous section, we saw how marginalisation reduces a bigger problem to a smaller one. Conditioning acts like a bridge between the marginal and the joint, giving a picture of the whole in a sense. With multiple random variables, a wide variety of conditioning is possible. Let's understand this with the help of an example.
1. (X1 | X2 = t2) ∼ fX1|X2=t2(t1) = fX1X2(t1, t2) / fX2(t2)
2. (X1, X2 | X3 = t3) ∼ fX1X2|X3=t3(t1, t2) = fX1X2X3(t1, t2, t3) / fX3(t3)
3. (X1 | X2 = t2, X3 = t3) ∼ fX1|X2=t2,X3=t3(t1) = fX1X2X3(t1, t2, t3) / fX2X3(t2, t3)
4. (X1, X3 | X2 = t2, X4 = t4) ∼ fX1X3|X2=t2,X4=t4(t1, t3) = fX1X2X3X4(t1, t2, t3, t4) / fX2X4(t2, t4)
t1 t2 t3 t4 fX1X2X3X4(t1, t2, t3, t4)
0 0 0 0 1/12
0 0 0 1 1/12
0 0 1 1 1/12
0 0 2 0 1/12
0 1 1 0 1/12
0 1 1 1 1/12
0 1 2 0 1/12
1 0 0 1 1/12
1 0 2 0 1/12
1 0 2 1 1/12
1 1 0 1 1/12
1 1 1 0 1/12
Solution:
Step 1: Identify the range.
Range of (X1 | X2 = 0) = {0, 1}
Step 2: Calculate the conditional probabilities.
P(X1 = 0 | X2 = 0) = P(X1 = 0, X2 = 0) / P(X2 = 0) = (4/12) / (7/12) = 4/7
P(X1 = 1 | X2 = 0) = P(X1 = 1, X2 = 0) / P(X2 = 0) = (3/12) / (7/12) = 3/7
Solution:
Step 1: Identify the range.
Range of (X1 | X3 = 0, X4 = 1) = {0, 1}
Step 2: Calculate the conditional probabilities.
P(X1 = 0 | X3 = 0, X4 = 1) = P(X1 = 0, X3 = 0, X4 = 1) / P(X3 = 0, X4 = 1) = (1/12) / (3/12) = 1/3
P(X1 = 1 | X3 = 0, X4 = 1) = P(X1 = 1, X3 = 0, X4 = 1) / P(X3 = 0, X4 = 1) = (2/12) / (3/12) = 2/3
Solution:
Step 1: Identify the range.
Range of (X3 | X1 = 0) = {0, 1, 2}
Range of (X4 | X1 = 0) = {0, 1}
Step 2: Calculate the conditional probabilities.
P(X3 = 0, X4 = 0 | X1 = 0) = P(X3 = 0, X4 = 0, X1 = 0) / P(X1 = 0) = (1/12) / (7/12) = 1/7
P(X3 = 0, X4 = 1 | X1 = 0) = P(X3 = 0, X4 = 1, X1 = 0) / P(X1 = 0) = (1/12) / (7/12) = 1/7
Similarly, find the conditional probabilities for the other cases, i.e., P(X3 = 1, X4 = 0 | X1 = 0), P(X3 = 1, X4 = 1 | X1 = 0), P(X3 = 2, X4 = 0 | X1 = 0) and P(X3 = 2, X4 = 1 | X1 = 0).
X4 \ X3 |  0  |  1  |  2
   0    | 1/7 | 1/7 | 2/7
   1    | 1/7 | 2/7 |  0
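The conditional table can be reproduced by restricting the joint PMF to the rows with X1 = 0 and renormalising; a sketch (Python):

```python
from fractions import Fraction
from collections import defaultdict

# The twelve equally likely quadruples (t1, t2, t3, t4) from the table.
support = [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 1), (0, 0, 2, 0),
           (0, 1, 1, 0), (0, 1, 1, 1), (0, 1, 2, 0), (1, 0, 0, 1),
           (1, 0, 2, 0), (1, 0, 2, 1), (1, 1, 0, 1), (1, 1, 1, 0)]
joint = {t: Fraction(1, 12) for t in support}

rows = {t: p for t, p in joint.items() if t[0] == 0}  # condition on X1 = 0
p_X1_0 = sum(rows.values())                           # P(X1 = 0) = 7/12

cond = defaultdict(Fraction)                          # f_{X3 X4 | X1 = 0}
for t, p in rows.items():
    cond[(t[2], t[3])] += p / p_X1_0

print(cond[(1, 1)])  # 2/7
```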
So, we want to see if we can factor the joint PMF, and marginalisation and conditioning turn out to be very crucial in factoring. Factoring can be done in many different ways, but the fundamental idea is very simple, as we have seen so far. Let's understand this with the help of an example.
Example: X1, X2, X3 ∼ fX1X2X3. Here fX1X2X3 can be written as fX3 f(X2|X3) f(X1|X2,X3). Therefore,
t1 t2 t3 fX1 X2 X3 (t1 , t2 , t3 )
0 0 0 1/9
0 0 1 1/9
0 0 2 1/9
0 1 1 1/9
0 1 2 1/9
1 0 0 1/9
1 0 2 1/9
1 1 0 1/9
1 1 1 1/9
In the previous section, we saw the factoring of the joint PMF. It turns out that this can be done in any sequence. We will look at two other forms of it here.
Example: X1 , X2 , X3 ∼ fX1 X2 X3
t1 t2 t3 fX1 X2 X3 (t1 , t2 , t3 )
0 0 0 1/9
0 0 1 1/9
0 0 2 1/9
0 1 1 1/9
0 1 2 1/9
1 0 0 1/9
1 0 2 1/9
1 1 0 1/9
1 1 1 1/9
fX1 X2 X3 can also be written as fX1 f(X2 |X1 ) f(X3 |X1 ,X2 ) . Therefore,
1.6 Problems
1. Let X and Y be two random variables with joint distribution given in Table 1.1.P, where
a and b are two unknown values.
Y \ X |  0   |  1   |  2
  0   | 1/12 |  a   | 3/12
  1   | 2/12 |  b   | 1/12
  2   | 3/12 | 1/12 | 1/12
i) Find P(Y = 1).
a) 4/12
b) 3/12
c) 5/12
d) 1/12
ii) Find P(Y = 1 | X = 2).
a) 1/12
b) 1/4
c) 1/3
d) 1/2
iii) Find P(X = 0, Y ≥ 1).
a) 4/12
b) 3/12
c) 5/12
d) 1/12
2. Let X ∼ Uniform({1, 2, 3, 4, 5, 6}) and let Y be the number of times 2 occurs in X throws
of a fair die. Choose the incorrect option(s) among the following.
a) P(Y = 2 | X = 2) = 1/6
b) P(Y = 2 | X = 4) = 5^2/6^3
c) P(Y = 5 | X = 6) = 5/6^5
d) P(Y = 6 | X = 5) = 5/6^6
3. Let the random variables X and Y each have range {1, 2, 3}. The following formula
gives the joint PMF
P(X = i, Y = j) = (i + 2j)/c,
where c is an unknown value. Find P(1 ≤ X ≤ 3, 1 < Y ≤ 3).
a) 5/9
b) 7/9
c) 2/9
d) 4/9
4. The joint PMF of the random variables X and Y is given in Table 1.2.P.
X
1 2 3
Y
1 k k 2k
2 2k 0 4k
3 3k k 6k
a) {1, 4, 9}
b) {4, 8, 18}
c) {1, 9}
d) {2, 18}
e) {2, 8, 18}
ii) Find the value of P (Z = 18 | Y = 2).
a) 1/3
b) 2/3
c) 3/4
d) 1/4
5. From a sack of fruits containing 3 mangoes, 2 kiwis, and 3 guavas, a random sample of
4 pieces of fruit is selected. If X is the number of mangoes and Y is the number of kiwis
in the sample, then find the joint probability distribution of X and Y .
Y \ X |  0   |   1   |  2   |   3
  0   |  0   | 3/70  | 9/70 | 3/70
  1   | 2/70 | 18/70 | 2/70 | 18/70
  2   | 3/70 | 9/70  | 3/70 |  0
a)
Y \ X |  0   |   1   |   2   |  3
  0   |  0   | 3/70  | 9/70  | 3/70
  1   | 2/70 | 18/70 | 18/70 | 2/70
  2   | 3/70 | 9/70  | 3/70  |  0
b)
Y \ X |  0   |   1   |   2   |  3
  0   |  0   | 3/70  | 9/70  | 3/70
  1   | 2/70 | 18/70 | 18/70 | 2/70
  2   | 9/70 | 3/70  | 3/70  |  0
c)
Y \ X |  0   |   1   |   2   |  3
  0   |  0   | 3/70  | 3/70  | 9/70
  1   | 2/70 | 18/70 | 18/70 | 2/70
  2   | 3/70 | 9/70  | 3/70  |  0
d)
6. Suppose you flip a fair coin. If the coin lands heads, you roll a fair six-sided die 50 times.
If the coin lands tails, you roll the die 51 times. Let X be 1 if the coin lands heads and
0 if the coin lands tails. Let Y be the total number of times you get the number 5 while
throwing the dice. Find P (X = 1|Y = 10).
a) 85/157
b) 82/167
c) 72/157
d) 85/167
7. Three balls are selected at random from a box containing five red, four blue, three yellow
and six green coloured balls. If X, Y and Z are the number of red balls, blue balls and
green balls respectively, choose the correct option(s) among the following.
a) P(X = 1, Y = 0, Z = 2) = 25/272
b) P(X = 1, Y = 1, Z = 1) = 5/34
c) P(X = 1, Y = 0 | Z = 2) = 1/4
d) P(X = 0, Y = 0, Z = 3) = 5/204
8. A computer system receives messages over three communications lines. Let Xi be the
number of messages received on line i in one hour. Suppose that the joint pmf of X1 , X2 ,
and X3 is given by
fX1X2X3(x1, x2, x3) = (1 − p1)(1 − p2)(1 − p3) p1^x1 p2^x2 p3^x3, for x1 ≥ 0, x2 ≥ 0, x3 ≥ 0 and 0 < pi < 1.
i) Find fX1 X2 (x1 , x2 ).
a) (1 − p1)(1 − p2)
b) (1 − p1)(1 − p2)(1 − p3) p1^x1 p2^x2
c) (1 − p1)(1 − p2) p1^x1 p2^x2
d) p1^x1 p2^x2
ii) Find fX2 (x2 ).
a) (1 − p1)
b) (1 − p1)(1 − p2) p2^x2
c) (1 − p1) p1^x1
d) (1 − p2) p2^x2
iii) Find P (X1 = 2, X3 = 5).
a) p1^2 p3^5
b) (1 − p1)(1 − p3) p1^2 p3^5
c) (1 − p1)(1 − p2) p1^2 p2^5
d) (1 − p1)(1 − p3)
9. A coin is tossed twice. Let X denote the number of heads on the first toss and Y denote
the total number of heads on the 2 tosses. If the coin is biased and a head has a 40%
chance of occurring,
i) Find P (Y = 2). Enter your answer correct to two decimals accuracy.
ii) Find P (X = 1). Enter your answer correct to two decimals accuracy.
iii) Find P (X = 1, Y = 1). Enter your answer correct to two decimals accuracy.
10. Let X1, X2, X3 ∼ fX1X2X3 where Xi ∈ {−1, 1} for each i. If fX1X2X3(−1, −1, 1) = 1/8, fX2(1) = 1/6, fX3|X2=−1(1) = 1/5, find fX1|X2=−1,X3=1(−1). Enter your answer correct to two decimals accuracy.
11. Suppose that the number of people who visit a yoga academy each day is a Poisson
random variable with mean 30. Suppose further that each person who visits is, indepen-
dently, a girl with probability 0.5 or a boy with probability 0.5. Find the joint probability
that exactly 10 boys and 15 girls visit the yoga academy on any given day.
a) e^(−30) 30^25 / (15! 10!)
b) e^(−15) 30^25 / (15! 10!)
c) e^(−8) 15^25 / (15! 10!)
d) e^(−30) 15^25 / (15! 10!)
Chapter 2: Independence of random variables
Definition: Let X and Y be two random variables defined on a probability space with
ranges TX and TY , respectively. X and Y are said to be independent if any event defined
using X alone is independent of any event defined using Y alone. Equivalently, if the joint
PMF of X and Y is fXY , X and Y are independent if
fXY (t1 , t2 ) = fX (t1 )fY (t2 )
for t1 ∈ TX and t2 ∈ TY
2.1 Examples
1. To check for independence given the joint PMF.
The joint PMF of two random variables X and Y, taking values t1 and t2 respectively, is given below:

t2 \ t1 |  0  |  1  | fY
   0    | 1/4 | 1/4 | 1/2
   1    | 1/4 | 1/4 | 1/2
  fX    | 1/2 | 1/2 |  1
We can see that fXY (t1 , t2 ) = fX (t1 )fY (t2 ), for all t1 , t2 .
t2 \ t1 |  0  |  1  | fY
   0    | 1/2 | 1/4 | 3/4
   1    | 1/8 | 1/8 | 1/4
  fX    | 5/8 | 3/8 |  1

We can observe here that fXY(0, 0) = 1/2 ≠ fX(0)fY(0) = 5/8 × 3/4. So, X and Y are not independent.
t2 \ t1 |    0    |    1    | fY
   0    |    x    | 1/2 − x | 1/2
   1    | 1/2 − x |    x    | 1/2
  fX    |   1/2   |   1/2   |  1

For x = 1/4, X and Y are independent.
For x ≠ 1/4, X and Y are not independent.
4. A 2-digit number from 00 to 99 is selected at random. Partial information is available
about the number as two random variables. Let X be the digit in the units place.
Let Y be the remainder obtained when the number is divided by 4. Are X and Y
independent?
Solution:
X ∼ Uniform{0, 1, 2, . . . , 9}
Y ∼ Uniform{0, 1, 2, 3}
5. Let X = number of runs in the over. Let Y = number of wickets in the over. Are X and Y independent? No. Questions like this will appear very often when you model a real-life scenario.
X 0 1 2
Y
0 1/9 1/9 1/9
1.
X 0 1 2
Y
0 1/6 1/12 1/12
2.
X 0 1 2
Y
0 0 1/8 1/8
3.
X 0 1 2
Y
0 1/6 1/12 1/8
4.
• To show X and Y are independent, verify that the joint PMF is the product of the marginals for all values of X and Y:
fXY(t1, t2) = fX(t1)fY(t2)
for all t1 ∈ TX, t2 ∈ TY.
• To show X and Y are dependent, it is enough to verify the inequality for some particular
values of X and Y .
fXY (t1 , t2 ) ̸= fX (t1 )fY (t2 )
for some t1 ∈ TX , t2 ∈ TY .
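The two bullet points above translate directly into a check; a sketch (Python, the function name is ours). Note the product condition must also hold at pairs where the joint PMF is zero:

```python
from fractions import Fraction
from collections import defaultdict
from itertools import product

def is_independent(joint):
    """True iff f_XY(t1, t2) == f_X(t1) f_Y(t2) for every pair of values."""
    fX, fY = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint.items():
        fX[x] += p
        fY[y] += p
    return all(joint.get((x, y), Fraction(0)) == fX[x] * fY[y]
               for x, y in product(fX, fY))

uniform = {(x, y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}
skewed = {(0, 0): Fraction(1, 2), (0, 1): Fraction(1, 4), (1, 1): Fraction(1, 4)}
print(is_independent(uniform), is_independent(skewed))  # True False
```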
2.2 Independence of multiple random variables
Definition: Let X1, X2, . . . , Xn be random variables defined on a probability space, with the range of Xi denoted TXi. X1, X2, . . . , Xn are said to be independent if events defined using different Xi are mutually independent. Equivalently, X1, X2, . . . , Xn are independent iff
fX1...Xn(t1, . . . , tn) = fX1(t1) fX2(t2) · · · fXn(tn) for all ti ∈ TXi.
Examples:
The joint PMF of X1, X2 and X3 is given by fX1X2X3(t1, t2, t3) = 1/8 for ti ∈ {0, 1}. Also, Xi ∼ Uniform{0, 1}. Therefore, fX1X2X3(t1, t2, t3) = fX1(t1)fX2(t2)fX3(t3), so X1, X2 and X3 are independent.
X = first digit from the left, Y = number modulo 2, Z = first digit from right. Are
X, Y and Z independent?
Solution:
X ∼ Uniform{0, 1, 2, . . . , 9}
Y ∼ Uniform{0, 1}
Z ∼ Uniform{0, 1, 2, . . . , 9}
• fXZ(t1, t3) = 1/100, for all t1 ∈ TX, t3 ∈ TZ
fXZ(t1, t3) = fX(t1)fZ(t3) = 1/10 × 1/10 = 1/100
Therefore, X and Z are independent.
• fXY(t1, t2) = 1/20, for all t1 ∈ TX, t2 ∈ TY
fXY(t1, t2) = fX(t1)fY(t2) = 1/10 × 1/2 = 1/20
Therefore, X and Y are independent.
• fY Z(1, 2) = 0 ≠ fY(1)fZ(2). Therefore, Y and Z are not independent, and hence X, Y and Z are not independent.
3. Even parity
t1 t2 t3 fX1 X2 X3 (t1 , t2 , t3 )
0 0 0 1/4
0 1 1 1/4
1 0 1 1/4
1 1 0 1/4
The above joint PMF is called even parity because in every point of the support, the number of 1s is even.
t2 \ t1 |  0  |  1  | fX2
   0    | 1/4 | 1/4 | 1/2
   1    | 1/4 | 1/4 | 1/2
  fX1   | 1/2 | 1/2 |  1
t3 \ t1 |  0  |  1  | fX3
   0    | 1/4 | 1/4 | 1/2
   1    | 1/4 | 1/4 | 1/2
  fX1   | 1/2 | 1/2 |  1
t3 \ t2 |  0  |  1  | fX3
   0    | 1/4 | 1/4 | 1/2
   1    | 1/4 | 1/4 | 1/2
  fX2   | 1/2 | 1/2 |  1
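The even-parity example is worth verifying mechanically: every pair is independent, yet the triple is not. A sketch (Python):

```python
from fractions import Fraction
from itertools import product

# Even parity: the four triples with an even number of 1s, each with probability 1/4.
joint = {t: Fraction(1, 4) for t in [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]}

def marginal(keep):
    out = {}
    for t, p in joint.items():
        k = tuple(t[i] for i in keep)
        out[k] = out.get(k, Fraction(0)) + p
    return out

# Each pairwise marginal is uniform (1/4 everywhere), so each pair is independent...
assert all(marginal((0, 1))[k] == Fraction(1, 4) for k in product((0, 1), repeat=2))
# ...but mutual independence fails: f(0,0,0) = 1/4 while the product of marginals is 1/8.
print(joint[(0, 0, 0)], Fraction(1, 2) ** 3)  # 1/4 1/8
```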
(a) Let X ∼ Uniform{0, 1, 2, . . . , 10}. Find the plot of fX (x).
Solution:
PMF of X is P(X = x) = 1/11, for x ∈ {0, 1, 2, . . . , 10}.
The height of the plot is 1/11 for each of the xi's. Such plots are called stem plots. They are very useful for visualising the PMF of a random variable. From this plot, it is quite clear that the random variable takes the values from 0 to 10, each with probability 1/11.
(b) Let Y ∼ Binomial(10, 0.5). Find the plot of fY(y).
Solution:
PMF of Y is given by
P(Y = k) = C(10, k) (0.5)^10
In this case, the probabilities are different for different values of Y. You can observe that the value Y = 5 is the most likely, followed by Y = 4 and Y = 6. The values Y = 0 and Y = 10 are the least likely.
So, given a PMF, you can draw the stem plot and see how the distribution behaves. Later we will see how the distribution looks for functions of a random variable.
Solution:
Step I: Given any distribution, we can always make a simple table like below,
which we refer to as the table method and it is a very powerful method to compute
the distributions.
For X ∼ Uniform{0, 1, . . . , 10}:

 x | P(X = x) | y = x − 5
 0 |   1/11   |    −5
 1 |   1/11   |    −4
 2 |   1/11   |    −3
 3 |   1/11   |    −2
 4 |   1/11   |    −1
 5 |   1/11   |     0
 6 |   1/11   |     1
 7 |   1/11   |     2
 8 |   1/11   |     3
 9 |   1/11   |     4
10 |   1/11   |     5

For X ∼ Binomial(10, 0.5):

 x | P(X = x) | y = x − 5
 0 |  1/1024  |    −5
 1 | 10/1024  |    −4
 2 | 45/1024  |    −3
 3 | 120/1024 |    −2
 4 | 210/1024 |    −1
 5 | 252/1024 |     0
 6 | 210/1024 |     1
 7 | 120/1024 |     2
 8 | 45/1024  |     3
 9 | 10/1024  |     4
10 |  1/1024  |     5
From the table, we can see the list of values that Y takes, and it is in one-to-one correspondence with the values of X: different values of X go to different values of Y. Since there is no repetition in the values, the distribution of Y is easy to compute here. In the case of the Uniform distribution, P(Y = y) = 1/11 for each y in the range of Y, while in the case of the Binomial distribution, the probabilities vary.
Plots of Y
i. [Stem plot: X ∼ Uniform, Y = X − 5]
ii. [Stem plot: X ∼ Binomial(10, 0.5), Y = X − 5]
(b) X ∼ Binomial(10, 0.5)
Solution:
Step I:
For X ∼ Uniform{0, 1, . . . , 10}:

 x | P(X = x) | y = 2^x
 0 |   1/11   |    1
 1 |   1/11   |    2
 2 |   1/11   |    4
 3 |   1/11   |    8
 4 |   1/11   |   16
 5 |   1/11   |   32
 6 |   1/11   |   64
 7 |   1/11   |  128
 8 |   1/11   |  256
 9 |   1/11   |  512
10 |   1/11   | 1024

For X ∼ Binomial(10, 0.5):

 x | P(X = x) | y = 2^x
 0 |  1/1024  |    1
 1 | 10/1024  |    2
 2 | 45/1024  |    4
 3 | 120/1024 |    8
 4 | 210/1024 |   16
 5 | 252/1024 |   32
 6 | 210/1024 |   64
 7 | 120/1024 |  128
 8 | 45/1024  |  256
 9 | 10/1024  |  512
10 |  1/1024  | 1024
Step II: Range of Y = {1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024}
In the case of the Uniform distribution, P(Y = y) = 1/11 for each y in the range of Y, while in the case of the Binomial distribution, the probabilities vary for different values of X.
Plots of Y
i. [Stem plot: X ∼ Uniform, Y = 2^X]
ii. [Stem plot: X ∼ Binomial(10, 0.5), Y = 2^X]
(b) X ∼ Binomial(10, 0.5)
Solution:
Step I: Given any distribution, we can always make a simple table like below,
which we refer to as the table method and it is a very powerful method to compute
the distributions.
For X ∼ Uniform{0, 1, . . . , 10}:

 x | P(X = x) | y = (x − 5)^2
 0 |   1/11   |   25
 1 |   1/11   |   16
 2 |   1/11   |    9
 3 |   1/11   |    4
 4 |   1/11   |    1
 5 |   1/11   |    0
 6 |   1/11   |    1
 7 |   1/11   |    4
 8 |   1/11   |    9
 9 |   1/11   |   16
10 |   1/11   |   25

For X ∼ Binomial(10, 0.5):

 x | P(X = x) | y = (x − 5)^2
 0 |  1/1024  |   25
 1 | 10/1024  |   16
 2 | 45/1024  |    9
 3 | 120/1024 |    4
 4 | 210/1024 |    1
 5 | 252/1024 |    0
 6 | 210/1024 |    1
 7 | 120/1024 |    4
 8 | 45/1024  |    9
 9 | 10/1024  |   16
10 |  1/1024  |   25
From the table, we can see that the values of Y are repeating. There is no one-
to-one correspondence here.
Plots of Y
(a) In case of X ∼ Uniform{0, 1, 2, . . . , 10}, the distribution of Y is
y 0 1 4 9 16 25
f (y) 1/11 2/11 2/11 2/11 2/11 2/11
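The table method is a grouping operation: collect the probability of every x that maps to the same y. A sketch (Python, the function name is ours):

```python
from fractions import Fraction

def pushforward(pmf_X, f):
    """Distribution of Y = f(X): group probabilities of x values with equal f(x)."""
    pmf_Y = {}
    for x, p in pmf_X.items():
        y = f(x)
        pmf_Y[y] = pmf_Y.get(y, Fraction(0)) + p
    return pmf_Y

uniform = {x: Fraction(1, 11) for x in range(11)}
pmf_Y = pushforward(uniform, lambda x: (x - 5) ** 2)
print(pmf_Y[0], pmf_Y[25])  # 1/11 2/11
```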
[Stem plot: X ∼ Uniform{0, 1, . . . , 10}, Y = (X − 5)^2]
Notice that in the case of X ∼ Uniform{0, 1, 2, . . . , 10}, the graph of fX was flat: the probabilities were 1/11 for each x. Now, for Y = (X − 5)^2, the PMF has changed. So, for many-to-one functions, this sort of thing can happen.
(b) In case of X ∼ Binomial(10, 0.5), the distribution of Y is
y 0 1 4 9 16 25
f (y) 252/1024 420/1024 240/1024 90/1024 20/1024 2/1024
Plot of Y: X ∼ Binomial(10, 0.5), Y = (X − 5)^2.
Problems
1. Let X ∼ Uniform{−5, −4, . . . , 5}. Let
f(x) = x for x > 0, and f(x) = 0 for x ≤ 0.
Find the distribution of Y = f (X).
x P (X = x) y = f (x)
−5 1/11 0
−4 1/11 0
−3 1/11 0
−2 1/11 0
−1 1/11 0
0 1/11 0
1 1/11 1
2 1/11 2
3 1/11 3
4 1/11 4
5 1/11 5
y 0 1 2 3 4 5
P (Y = y) 6/11 1/11 1/11 1/11 1/11 1/11
2. Let X ∼ Uniform{−500, −499, . . . , 500}. Let f (x) = max(x, 5). Find the distribution
of Y = f (X).
x P (X = x) y = f (x)
−500 1/1001 5
⋮ ⋮ ⋮
4 1/1001 5
5 1/1001 5
6 1/1001 6
⋮ ⋮ ⋮
500 1/1001 500
y 5 6 7 ... ... 500
P (Y = y) 506/1001 1/1001 1/1001 ... ... 1/1001
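The count 506/1001 can be verified with a short Python sketch of the same table:

```python
from fractions import Fraction

# X ~ Uniform{-500, ..., 500}, Y = max(X, 5): table method row by row
pmf_y = {}
for x in range(-500, 501):
    y = max(x, 5)
    pmf_y[y] = pmf_y.get(y, Fraction(0)) + Fraction(1, 1001)

# Every x <= 5 (506 values) maps to y = 5
print(pmf_y[5] == Fraction(506, 1001))   # True
print(pmf_y[6] == Fraction(1, 1001))     # True
```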
When the random variables take very few values, the table method that we saw in the one-random-variable case continues to work for functions of two random variables too.
1. Sum
Let Z = X + Y .
x y fXY (x, y) z
0 0 1/4 0
0 1 1/4 1
1 0 1/4 1
1 1 1/4 2
z 0 1 2
P (Z = z) 1/4 1/2 1/4
2. Maximum
Y 0 1 2
X
0 1/2 1/4 1/8
1 1/16 1/32 1/32
Let Z = max(X, Y ).
x y fXY (x, y) z
0 0 1/2 0
0 1 1/4 1
0 2 1/8 2
1 0 1/16 1
1 1 1/32 1
1 2 1/32 2
z 0 1 2
P (Z = z) 1/2 11/32 5/32
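The same computation in Python, reading the joint PMF straight from the table:

```python
from fractions import Fraction

# Joint PMF f_XY from the table above
f_xy = {
    (0, 0): Fraction(1, 2),  (0, 1): Fraction(1, 4),  (0, 2): Fraction(1, 8),
    (1, 0): Fraction(1, 16), (1, 1): Fraction(1, 32), (1, 2): Fraction(1, 32),
}

# Z = max(X, Y): group the joint probabilities by the value of the maximum
pmf_z = {}
for (x, y), p in f_xy.items():
    z = max(x, y)
    pmf_z[z] = pmf_z.get(z, Fraction(0)) + p

print(pmf_z[0], pmf_z[1], pmf_z[2])   # 1/2 11/32 5/32
```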
Let's say we have two random variables, each taking values from 1 to 100. Then there are a total of 10^4 possibilities, and the table method is not going to be very efficient in this case.
So, when the ranges of the two random variables become very large, we will need something different, and that is what we are going to see in the next section.
When we have a function of two random variables g(x, y), we can make a 3D plot. For example, suppose g(x, y) = x + y for every (x, y). For (x1, y1) we have g(x1, y1), and for (x2, y2) we have g(x2, y2). We need a third axis to denote the function value, which can be done using computers, but such graphs do not help us solve problems in an exam.
In reality, it is very difficult to visualize such functions, but contours can be of great use in problem solving.
• Contours: all the values of (x, y) that result in g(x, y) = c. Make a plot of those (x, y) for different c.
• Regions: all the values of (x, y) that result in g(x, y) ≤ c. Make a plot of those (x, y) for different c.
Contours and regions are very useful in visualizing g(x, y). Let us start with a very simple function, i.e., the sum of two random variables.
Sum function, g(x, y) = x + y
Contours of g(x, y) are the sets of all values of (x, y) for which x + y equals some constant c.
Notice that the contours here are a family of straight lines, all with the same slope, −1.
Examples
1. Throw a die twice
A fair die is thrown twice. What is the probability that the sum of the two numbers
seen is 6? What is the PMF of the sum?
Solution:
Let X1 be the first number observed and X2 be the second number observed.
Let S = X1 + X2
(a) The sum of the two numbers is 6 for (1, 5), (2, 4), (3, 3), (4, 2), (5, 1).
Total number of possible outcomes = 36
Therefore, the required probability is 5/36.
(b) PMF of S
• S ∈ {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
• S = 2 for (1, 1), so P(S = 2) = 1/36.
S = 3 for (1, 2), (2, 1), so P(S = 3) = 2/36.
S = 4 for (1, 3), (2, 2), (3, 1), so P(S = 4) = 3/36.
⋮
S = 12 for (6, 6), so P(S = 12) = 1/36.
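A short Python check of the die-sum PMF, by brute force over all 36 equally likely pairs:

```python
from fractions import Fraction

# Two independent fair dice: each pair (x1, x2) has probability 1/36
pmf_s = {}
for x1 in range(1, 7):
    for x2 in range(1, 7):
        s = x1 + x2
        pmf_s[s] = pmf_s.get(s, Fraction(0)) + Fraction(1, 36)

print(pmf_s[6])    # 5/36
print(pmf_s[12])   # 1/36
```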
2. Area of a random rectangle
Solution:
It is given that L ∼ Uniform{5, 7, 9, 11}. Therefore, fL(l) = 1/4.
Given L = l, the breadth satisfies (B | L = l) ∼ Uniform{2, 3, 4}, as in the table below. Therefore, fB|L(b | l) = 1/3.
fXY = fX fY |X
Therefore, for every (l, b), fLB(l, b) = fL(l) fB|L(b | l) = 1/4 × 1/3 = 1/12.
l b LB fLB (l, b)
5 4 20 1/12
5 3 15 1/12
5 2 10 1/12
7 4 28 1/12
7 3 21 1/12
7 2 14 1/12
9 4 36 1/12
9 3 27 1/12
9 2 18 1/12
11 4 44 1/12
11 3 33 1/12
11 2 22 1/12
Therefore, Area ∼ Uniform{10, 14, 15, 18, 20, 21, 22, 27, 28, 33, 36, 44}.
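A small Python sketch of this calculation, taking the breadth values 2, 3, 4 from the table:

```python
from fractions import Fraction

# f_LB(l, b) = f_L(l) f_{B|L}(b | l) = (1/4)(1/3) = 1/12 for every pair
pmf_area = {}
for l in (5, 7, 9, 11):      # L ~ Uniform{5, 7, 9, 11}
    for b in (2, 3, 4):      # B | L = l ~ Uniform{2, 3, 4}, as in the table
        a = l * b
        pmf_area[a] = pmf_area.get(a, Fraction(0)) + Fraction(1, 4) * Fraction(1, 3)

print(sorted(pmf_area))   # the 12 equally likely areas
print(pmf_area[20])       # 1/12
```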
PMF of g(X1 , . . . , Xn )
Definition: Suppose X1 , . . . , Xn have the joint PMF fX1 ,...,Xn with TXi denoting the range of
Xi . Let g : TX1 ×. . .×TXn → R be a function with range Tg . The PMF of X = g(X1 , . . . , Xn )
is given by
fX(t) = P(g(X1, . . . , Xn) = t) = Σ_{(t1, . . . , tn) : g(t1, . . . , tn) = t} fX1,...,Xn(t1, . . . , tn)
Examples
Let the joint PMF of three random variables X1 , X2 and X3 be given in the following
table.
t1 t2 t3 fX1 X2 X3 (t1 , t2 , t3 )
0 0 0 1/9
0 0 1 1/9
0 0 2 1/9
0 1 1 1/9
0 1 2 1/9
1 0 0 1/9
1 0 2 1/9
1 1 0 1/9
1 1 1 1/9
Find the PMF of X = X1 + X2 + X3.
Solution:
t1 t2 t3 fX1 X2 X3 g = t1 + t2 + t3
0 0 0 1/9 0
0 0 1 1/9 1
0 0 2 1/9 2
0 1 1 1/9 2
0 1 2 1/9 3
1 0 0 1/9 1
1 0 2 1/9 3
1 1 0 1/9 2
1 1 1 1/9 3
X 0 1 2 3
P (X = x) 1/9 2/9 3/9 3/9
Find the PMF of Y = X2 X3.
Solution:
t1 t2 t3 fX1 X2 X3 h = t2 t3
0 0 0 1/9 0
0 0 1 1/9 0
0 0 2 1/9 0
0 1 1 1/9 1
0 1 2 1/9 2
1 0 0 1/9 0
1 0 2 1/9 0
1 1 0 1/9 0
1 1 1 1/9 1
Y 0 1 2
P (Y = y) 6/9 2/9 1/9
Find the joint PMF of X = X1 + X2 + X3 and Y = X2 X3.
Solution:
t1 t2 t3 fX1 X2 X3 g = t1 + t2 + t3 h = t2 t3
0 0 0 1/9 0 0
0 0 1 1/9 1 0
0 0 2 1/9 2 0
0 1 1 1/9 2 1
0 1 2 1/9 3 2
1 0 0 1/9 1 0
1 0 2 1/9 3 0
1 1 0 1/9 2 0
1 1 1 1/9 3 1
X 0 1 2 3
Y
0 1/9 2/9 2/9 1/9
1 0 0 1/9 1/9
2 0 0 0 1/9
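The three tables above can be reproduced with one pass over the nine triples; here we spot-check two entries of the joint PMF of (X, Y):

```python
from fractions import Fraction

# Joint PMF of (X1, X2, X3): nine equally likely triples from the table
support = [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 1), (0, 1, 2),
           (1, 0, 0), (1, 0, 2), (1, 1, 0), (1, 1, 1)]

# Evaluate g = t1 + t2 + t3 and h = t2 * t3 on each triple together
joint_gh = {}
for (t1, t2, t3) in support:
    key = (t1 + t2 + t3, t2 * t3)
    joint_gh[key] = joint_gh.get(key, Fraction(0)) + Fraction(1, 9)

print(joint_gh[(2, 1)])   # 1/9, only the triple (0, 1, 1)
print(joint_gh[(3, 0)])   # 1/9, only the triple (1, 0, 2)
```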
2. Sum of two uniforms
Let X and Y be i.i.d. Uniform{0, 1, 2, 3} and let Z = X + Y.
Solution:
P(Z = 0) = P(X = 0, Y = 0) = 1/4 × 1/4 = 1/16
P(Z = 1) = P(X = 0, Y = 1) + P(X = 1, Y = 0) = 1/4 × 1/4 + 1/4 × 1/4 = 2/16
⋮
P(Z = 5) = P(X = 2, Y = 3) + P(X = 3, Y = 2) = 1/4 × 1/4 + 1/4 × 1/4 = 2/16
P(Z = 6) = P(X = 3, Y = 3) = 1/4 × 1/4 = 1/16
Therefore, the distribution of Z is
z 0 1 2 3 4 5 6
fZ (z) 1/16 2/16 3/16 4/16 3/16 2/16 1/16
Result: Let X1 , . . . , Xn be the results of n i.i.d. Bernoulli(p) trials. The sum of the n
random variables X1 + . . . + Xn is Binomial(n, p).
Solution:
Let z be an integer.
P(Z = z) = P(X + Y = z)
= Σ_{x=−∞}^{∞} P(X = x, Y = z − x)
= Σ_{x=−∞}^{∞} fXY(x, z − x)
= Σ_{y=−∞}^{∞} fXY(z − y, y)
Convolution: If X and Y are independent, fX+Y(z) = Σ_{x=−∞}^{∞} fX(x) fY(z − x).
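A minimal Python sketch of the convolution formula for finitely supported PMFs (the function name convolve is ours); it reproduces the Uniform{0, 1, 2, 3} sum computed above:

```python
from fractions import Fraction

def convolve(pmf_x, pmf_y):
    """fZ(z) = sum over x of fX(x) fY(z - x), for independent X and Y."""
    pmf_z = {}
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            pmf_z[x + y] = pmf_z.get(x + y, Fraction(0)) + px * py
    return pmf_z

# X, Y ~ i.i.d. Uniform{0, 1, 2, 3}
u = {k: Fraction(1, 4) for k in range(4)}
pmf_z = convolve(u, u)

print(pmf_z[0] == Fraction(1, 16))   # True
print(pmf_z[3] == Fraction(4, 16))   # True, the mode of the triangular PMF
```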
Sum of two independent Poisson random variables: let X ∼ Poisson(λ1) and Y ∼ Poisson(λ2) be independent.
Solution:
X ∈ {0, 1, 2, . . .}
Y ∈ {0, 1, 2, . . .}
Z = X + Y, Z ∈ {0, 1, 2, . . .}
fZ(z) = Σ_{x=0}^{∞} fX(x) fY(z − x)
= Σ_{x=0}^{∞} [e^{−λ1} λ1^x / x!] · [e^{−λ2} λ2^{z−x} / (z − x)!]
The summand is 0 for x < 0 (where fX(x) = 0) and for x > z (where fY(z − x) = 0).
Therefore,
fZ(z) = Σ_{x=0}^{z} [e^{−λ1} λ1^x / x!] · [e^{−λ2} λ2^{z−x} / (z − x)!]
= [e^{−λ1} e^{−λ2} / z!] Σ_{x=0}^{z} [z! / (x! (z − x)!)] λ1^x λ2^{z−x}
Now, we know that Σ_{x=0}^{z} [z! / (x! (z − x)!)] λ1^x λ2^{z−x} = (λ1 + λ2)^z by the binomial theorem.
Therefore,
fZ(z) = e^{−(λ1 + λ2)} (λ1 + λ2)^z / z!
Z ∼ Poisson(λ1 + λ2)
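We can check this identity numerically; the rates λ1 = 2 and λ2 = 3 below are hypothetical values chosen only for illustration:

```python
from math import exp, factorial

def poisson_pmf(lam, k):
    return exp(-lam) * lam**k / factorial(k)

# Hypothetical rates for illustration
lam1, lam2 = 2.0, 3.0
z = 4

# Convolution sum over the non-zero terms 0 <= x <= z
conv = sum(poisson_pmf(lam1, x) * poisson_pmf(lam2, z - x) for x in range(z + 1))
direct = poisson_pmf(lam1 + lam2, z)

print(abs(conv - direct) < 1e-12)   # True
```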
Solution:
P(X = k | Z = n) = P(X = k, Z = n) / P(Z = n)
= P(X = k) P(Z = n | X = k) / P(Z = n)
= P(X = k) P(Y = n − k) / P(Z = n)
= [e^{−λ1} λ1^k / k!] · [e^{−λ2} λ2^{n−k} / (n − k)!] / [e^{−(λ1 + λ2)} (λ1 + λ2)^n / n!]
= [n! / (k! (n − k)!)] (λ1 / (λ1 + λ2))^k (λ2 / (λ1 + λ2))^{n−k}
Therefore, (X | Z = n) ∼ Binomial(n, λ1 / (λ1 + λ2)).
Similarly, (Y | Z = n) ∼ Binomial(n, λ2 / (λ1 + λ2)).
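A numerical spot-check of the Binomial conditional, again with hypothetical rates λ1 = 2 and λ2 = 3:

```python
from math import exp, factorial, comb

def poisson_pmf(lam, j):
    return exp(-lam) * lam**j / factorial(j)

# Hypothetical rates and values for illustration
lam1, lam2 = 2.0, 3.0
n, k = 5, 2

# Bayes: P(X = k | Z = n) = P(X = k) P(Y = n - k) / P(Z = n)
cond = poisson_pmf(lam1, k) * poisson_pmf(lam2, n - k) / poisson_pmf(lam1 + lam2, n)

# Binomial(n, lam1 / (lam1 + lam2)) PMF at k
p = lam1 / (lam1 + lam2)
binom = comb(n, k) * p**k * (1 - p)**(n - k)

print(abs(cond - binom) < 1e-12)   # True
```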
Try it Yourself
1. Sum of n independent Bernoulli(p) trials is Binomial(n, p).
2. Sum of 2 independent Uniform random variables is not Uniform.
3. Sum of independent Binomial(n, p) and Binomial(m, p) is Binomial(n + m, p).
4. Sum of r i.i.d. Geometric(p) is Negative-Binomial(r, p).
5. Sum of independent Negative-Binomial(r, p) and Negative-Binomial(s, p) is Negative-Binomial(r + s, p).
6. If X and Y are independent, then g(X) and h(Y ) are also independent.
2.3.5 Minimum and Maximum of two random variables
We have seen general functions of random variables, and how sums of random variables result in many interesting relationships. Other functions which occur quite often are the maximum and minimum. For example, suppose you throw a die twice and you are interested in the maximum or minimum of the two numbers seen. There are many situations where you want to track how low or how high some value can go, and this shows up quite often in practice.
Given the joint PMF, the distribution of the minimum or maximum can be written down quite easily. Let's look at how to do it.
Solution:
fZ (z) = P (Z = z) = P (min{X, Y } = z)
= P (X = z, Y = z) + P (X = z, Y > z) + P (X > z, Y = z)
= fXY(z, z) + Σ_{t2 > z} fXY(z, t2) + Σ_{t1 > z} fXY(t1, z)
Solution:
FZ (z) = P (Z ≤ z) = P (min{X, Y } ≤ z)
= 1 − P (min{X, Y } > z)
= 1 − [P (X > z, Y > z)]
Two fair dice are tossed. Define Z as the minimum of the two numbers seen. Find the PMF of Z.
Y 1 2 3 4 5 6
X
1 1/36 1/36 1/36 1/36 1/36 1/36
2 1/36 1/36 1/36 1/36 1/36 1/36
3 1/36 1/36 1/36 1/36 1/36 1/36
4 1/36 1/36 1/36 1/36 1/36 1/36
5 1/36 1/36 1/36 1/36 1/36 1/36
6 1/36 1/36 1/36 1/36 1/36 1/36
P(Z = 1) = P(min(X, Y) = 1) = 11/36
P(Z = 2) = P(min(X, Y) = 2) = 9/36
P(Z = 3) = P(min(X, Y) = 3) = 7/36
P(Z = 4) = P(min(X, Y) = 4) = 5/36
P(Z = 5) = P(min(X, Y) = 5) = 3/36
P(Z = 6) = P(min(X, Y) = 6) = 1/36
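The same PMF by brute force in Python:

```python
from fractions import Fraction

# Two fair dice: 36 equally likely pairs, Z = min(X, Y)
pmf_z = {}
for x in range(1, 7):
    for y in range(1, 7):
        z = min(x, y)
        pmf_z[z] = pmf_z.get(z, Fraction(0)) + Fraction(1, 36)

print(pmf_z[1])   # 11/36
print(pmf_z[3])   # 7/36
```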
Independent case: Cumulative distribution function (CDF) of maximum and minimum
Now, let Z = max(X, Y).
FZ (z) =P (max(X, Y ) ≤ z)
=P ((X ≤ z) and (Y ≤ z))
=P (X ≤ z)P (Y ≤ z) Since X and Y are independent
=FX (z)FY (z)
Therefore, the CDF of the maximum of two independent random variables is the product of the CDFs of X and Y.
What about the CDF of the minimum of two random variables? Here, instead of working with "less than or equal to", we work with the complement, "greater than"; the probability P(X > z) is called the complementary CDF. The events [(X > z) and (Y > z)] and [min(X, Y) > z] are the same for every z.
1. min(X1 , . . . , Xn )
Solution:
P(min(X1 , . . . , Xn ) ≤ z) = 1 − P(min(X1 , . . . , Xn ) > z)
= 1 − P(X1 > z, X2 > z, . . . , Xn > z)
= 1 − [P(X1 > z) · · · P(Xn > z)]   (Xi's are independent)
= 1 − [P(X > z) · · · P(X > z)]   (Xi's are identically distributed)
= 1 − (P(X > z))^n
2. max(X1 , . . . , Xn )
Solution:
P(max(X1 , . . . , Xn ) ≤ z) = P(X1 ≤ z, X2 ≤ z, . . . , Xn ≤ z)
= P(X1 ≤ z) · · · P(Xn ≤ z)   (Xi's are independent)
= P(X ≤ z) · · · P(X ≤ z)   (Xi's are identically distributed)
= (P(X ≤ z))^n
= [FX(z)]^n
Minimum of two i.i.d. Geometric random variables
Solution:
X, Y ∼ i.i.d. Geometric(p)
P(min(X, Y) ≥ k) = P(X ≥ k) P(Y ≥ k)
For a Geometric(p) random variable X,
P(X ≥ k) = Σ_{x=k}^{∞} P(X = x)
= P(X = k) + P(X = k + 1) + P(X = k + 2) + . . .
= p(1 − p)^{k−1} + p(1 − p)^k + p(1 − p)^{k+1} + . . .
= p(1 − p)^{k−1} [1 + (1 − p) + (1 − p)^2 + . . .]
= p(1 − p)^{k−1} × (1/p)
= (1 − p)^{k−1}
Let q = (1 − p)^2. Then P(min(X, Y) ≥ k) = ((1 − p)^{k−1})^2 = q^{k−1}, so min(X, Y) ∼ Geometric(1 − q).
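A quick numerical check of this conclusion (p = 0.3 is a hypothetical choice for illustration):

```python
# min of two i.i.d. Geometric(p) variables is Geometric(1 - q), q = (1 - p)^2
p = 0.3                      # hypothetical value for illustration
q = (1 - p) ** 2

def geom_pmf(p, k):
    # Geometric PMF on support k = 1, 2, 3, ...
    return p * (1 - p) ** (k - 1)

for k in range(1, 10):
    # P(min = k) = P(min >= k) - P(min >= k + 1) = q^(k-1) - q^k
    direct = q ** (k - 1) - q ** k
    assert abs(direct - geom_pmf(1 - q, k)) < 1e-12
print("min(X, Y) ~ Geometric(1 - (1 - p)^2)")
```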
2.4 Problems
1. The joint distribution of random variables X1 , X2 and X3 each taking values in {0, 1}
is uniform with joint PMF denoted fX1 X2 X3 .
Define g(X1 , X2 , X3 ) = X1 X2 + 2X3 and h(X1 , X2 , X3 ) = 2X1 + X2 . Find the joint
distribution of g and h.
(b) Calculate P (Z1 ≤ 2).
4. Suppose that the random variables X, Y and Z are independent and are equally likely
to be either 0 or 1.
(a) Find the probability mass function of X + Y + Z.
(b) Find the probability mass function of U = XY + Y Z + ZX.
5. Let X and Y be two independent Geometric(p) random variables. Assume that Z =
X − Y . Find the value of fZ (k) where k is a natural number.
6. Suppose that the number of people who visit a dance academy each day is a Poisson
random variable with mean λ. Suppose further that each person who visits is, inde-
pendently, female with probability p or male with probability 1 − p. Find the joint
probability that exactly m men and w women visit the dance academy on any particular
day.
7. A random experiment consists of rolling a fair die until a six appears. Let X denote the number of times the die is rolled. Given that a six does not appear in the first six throws, find the probability that a six will appear after the eighth throw of the die.
A. (1/6)^2
B. (5/6)^2
C. (1/6)^8
D. (5/6)^8
8. Let X1 , X2 , · · · , X10 ∼ i.i.d. Geometric(1/5). Find the probability that (X1 > 10, X2 > 10, · · · , X10 > 10).
A. (0.8)10
B. (0.8)100
C. (0.2)10
D. (0.8)20
9. Let the random variables X and Y, which represent the number of people visiting shopping malls in city 1 and city 2 in a one-hour interval, respectively, follow the Poisson distribution. The average number of people visiting the shopping malls in city 1 and city 2 is 10 per hour and 20 per hour, respectively. Assume that X and Y are independent.
(a) Let Z denote the total number of people visiting shopping malls in city 1 and city
2. Find the pmf of Z, fZ (z).
A. e^{−1/30} (1/30)^z / z!
B. e^{−30} (30)^z / z!
C. e^{−1/20} (1/20)^z / z!
D. e^{−20} (20)^z / z!
(b) Find the conditional distribution of Y given that the total number of people visiting
shopping malls in city 1 and city 2 is 30.
A. (Y | Z = 30) ∼ Binomial(30, 2/3)
B. (Y | Z = 30) ∼ Binomial(30, 1/3)
C. (Y | Z = 30) ∼ Poisson(2/3)
D. (Y | Z = 30) ∼ Poisson(30)