Stats Book
PROBABILITY
Copyright © 2023 Amit Goyal
Contents
1 Probability
2 Conditional Probability
3 Discrete Random Variables
6 Sampling
7 Estimation
List of Figures
4.1 xy > u
1 | Probability
\[ \Pr(E) = \frac{\text{Number of outcomes in } E}{\text{Number of outcomes in } S} \]
Note that this definition assumes all outcomes are equally
likely and the sample space is a finite set.
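As a quick illustration (not from the text), this classical definition can be checked by direct enumeration in Python; the two-dice sample space and the "sum equals 7" event below are illustrative choices.

from itertools import product

# Sample space: all ordered outcomes of rolling two fair dice
S = list(product(range(1, 7), repeat=2))

# Event E: the two faces sum to 7 (an illustrative choice)
E = [s for s in S if sum(s) == 7]

# Classical probability: favourable outcomes / total outcomes
print(len(E) / len(S))  # 6/36, about 0.1667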
1. Order Matters
(a) With Replacement: $n^k$
(b) Without Replacement: $^{n}P_k = \frac{n!}{(n-k)!}$
2. Order Does Not Matter, With Replacement: $^{n+k-1}C_k = \binom{n+k-1}{k} = \frac{(n+k-1)!}{k!\,(n-1)!}$
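A minimal sketch checking these counting formulas numerically with Python's math module; the values n = 5 and k = 3 are arbitrary.

from math import comb, factorial, perm

n, k = 5, 3  # illustrative values

print(n**k)                # ordered, with replacement: n^k = 125
print(perm(n, k))          # ordered, without replacement: n!/(n-k)! = 60
print(comb(n + k - 1, k))  # unordered, with replacement: 35
print(factorial(n + k - 1) // (factorial(k) * factorial(n - 1)))  # same value from the factorial form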
Theorem 1.5
The following equalities hold:
1. $\binom{n}{k} = \binom{n}{n-k}$
2. $n\binom{n-1}{k-1} = k\binom{n}{k}$
3. $\binom{m+n}{k} = \sum_{j=0}^{k} \binom{m}{j}\binom{n}{k-j}$
4. $\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}$
5. $(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$
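These identities are easy to spot-check numerically; the sketch below (not part of the text) uses arbitrary small values of n, m, k, a, b.

from math import comb

n, m, k, a, b = 7, 4, 3, 2.0, 3.0  # arbitrary test values

assert comb(n, k) == comb(n, n - k)                              # identity 1
assert n * comb(n - 1, k - 1) == k * comb(n, k)                  # identity 2
assert comb(m + n, k) == sum(comb(m, j) * comb(n, k - j) for j in range(k + 1))  # identity 3
assert comb(n, k) == comb(n - 1, k) + comb(n - 1, k - 1)         # identity 4
lhs = (a + b) ** n
rhs = sum(comb(n, kk) * a**kk * b**(n - kk) for kk in range(n + 1))  # identity 5 (binomial theorem)
assert abs(lhs - rhs) < 1e-9
print("all identities hold for these values")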
Theorem 1.6
1. $\Pr(A^c) = 1 - \Pr(A)$
2. If $A \subset B$, then $\Pr(A) \le \Pr(B)$.
3. $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$
4. $\Pr(A \cup B \cup C) = \Pr(A) + \Pr(B) + \Pr(C) - \Pr(A \cap B) - \Pr(A \cap C) - \Pr(B \cap C) + \Pr(A \cap B \cap C)$
5. $\Pr\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} \Pr(A_i) - \sum_{i<j} \Pr(A_i \cap A_j) + \sum_{i<j<k} \Pr(A_i \cap A_j \cap A_k) - \sum_{i<j<k<l} \Pr(A_i \cap A_j \cap A_k \cap A_l) + \cdots + (-1)^{n+1} \Pr\left(\bigcap_{i=1}^{n} A_i\right)$
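As a sanity check (not part of the text), items 3 and 4 can be verified by enumeration on a small sample space; the dice events A, B, C below are arbitrary choices.

from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))    # two fair dice, all outcomes equally likely

def pr(event):
    # classical probability of an event (a subset of S)
    return Fraction(len(event), len(S))

A = {s for s in S if s[0] == 6}            # first die shows 6
B = {s for s in S if sum(s) >= 10}         # total is at least 10
C = {s for s in S if s[1] % 2 == 0}        # second die is even

# item 3: two events
assert pr(A | B) == pr(A) + pr(B) - pr(A & B)
# item 4: three events
assert pr(A | B | C) == (pr(A) + pr(B) + pr(C)
                         - pr(A & B) - pr(A & C) - pr(B & C)
                         + pr(A & B & C))
print("inclusion-exclusion checks out on this example")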
Solved Problems
Solution 1.1
\[ \binom{7}{3} - \binom{5}{3} \]
Solution 1.2
Let $A_r$ be the event that there is at least one red ball drawn in the first seven balls. Likewise, let $A_b$ be the event that there is at least one black ball drawn in the first seven balls, and $A_w$ be the event that there is at least one white ball in the seven draws. We want to find the probability of the event that at least one ball of each color is drawn in the seven draws, which is
\[ \Pr(A_r \cap A_b \cap A_w) = 1 - \Pr(A_r^c \cup A_b^c \cup A_w^c). \]
So to find $\Pr(A_r \cap A_b \cap A_w)$, we just need to find $\Pr(A_r^c \cup A_b^c \cup A_w^c)$, which can be computed by the inclusion-exclusion principle.
Solution 1.3
Let P be the event that the contractor gets the plumbing contract, and E be the event that he gets the electricity contract. We are given
\[ \Pr(P) = \tfrac{2}{3}, \quad \Pr(E) = \tfrac{5}{9}, \quad \Pr(P \cup E) = \tfrac{4}{5}. \]
To find $\Pr(P \cap E)$, we will use the following equality:
\[ \Pr(P \cap E) = \Pr(P) + \Pr(E) - \Pr(P \cup E) = \tfrac{2}{3} + \tfrac{5}{9} - \tfrac{4}{5} = \tfrac{19}{45}. \]
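The arithmetic can be double-checked with exact fractions, e.g.:

from fractions import Fraction

p_P, p_E, p_union = Fraction(2, 3), Fraction(5, 9), Fraction(4, 5)
p_intersection = p_P + p_E - p_union   # Pr(P) + Pr(E) - Pr(P u E)
print(p_intersection)  # 19/45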
Solution 1.4
Let $E_j$ denote the event that player j gets three aces, where $j \in \{1, 2, 3, 4, 5\}$. We want to find the probability $\Pr(E_1 \cup E_2 \cup E_3 \cup E_4 \cup E_5)$. Since $E_1, E_2, E_3, E_4, E_5$ are mutually disjoint,
\[ \Pr(E_1 \cup E_2 \cup E_3 \cup E_4 \cup E_5) = \Pr(E_1) + \Pr(E_2) + \Pr(E_3) + \Pr(E_4) + \Pr(E_5) \]
By symmetry, $\Pr(E_1) = \Pr(E_2) = \Pr(E_3) = \Pr(E_4) = \Pr(E_5)$ holds. Therefore,
\[ \Pr(E_1 \cup E_2 \cup E_3 \cup E_4 \cup E_5) = 5\Pr(E_1) = 5 \times \frac{\binom{4}{3}}{\binom{52}{3}} = \frac{1}{1105} \]
What is the probability that no two have the same face value in
a poker hand of 5 cards?
Solution 1.5
\[ \frac{\binom{13}{5}\, 4^5}{\binom{52}{5}} \approx 0.50708 \]
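The same value can be reproduced directly with Python's math.comb: choose 5 of the 13 face values and one of the 4 suits for each.

from math import comb

# 5 distinct face values, one of 4 suits for each, over all 5-card hands
p = comb(13, 5) * 4**5 / comb(52, 5)
print(p)  # about 0.50708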
Solution 1.6
26!
Solution 1.7
Solution 1.8
Notice that the number of ways to put 15 identical balls in 3 distinct boxes such that each box contains at least 1 and at most 10 balls is equal to the number of positive integer solutions to the following system of equations/inequalities:
\[ x_1 + x_2 + x_3 = 15, \quad 1 \le x_1 \le 10, \quad 1 \le x_2 \le 10, \quad 1 \le x_3 \le 10 \]
Substituting $y_i = x_i - 1$ for each i, this equals the number of non-negative integer solutions to
\[ y_1 + y_2 + y_3 = 12, \quad 0 \le y_1 \le 9, \quad 0 \le y_2 \le 9, \quad 0 \le y_3 \le 9 \]
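A brute-force enumeration (a check, not part of the original solution) confirms the number of such bounded solutions:

# Count non-negative integer solutions of y1 + y2 + y3 = 12 with each yi <= 9,
# which equals the number of ways to place the 15 balls as described above.
count = sum(1
            for y1 in range(10)
            for y2 in range(10)
            if 0 <= 12 - y1 - y2 <= 9)
print(count)  # 73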
Does it imply that two events that are not independent must be mutually exclusive?
Solution 1.9
No. Consider any event A with the property that 0 < Pr(A) < 1. A is not independent of itself, nor are A and A mutually exclusive. However, two mutually exclusive events A and B with the property that 0 < Pr(A) ≤ 1 and 0 < Pr(B) ≤ 1 can't be independent. This is because Pr(A ∩ B) = 0 ≠ Pr(A) Pr(B).
Solution 1.10
\[ \frac{\binom{5}{2} + \binom{4}{2}}{\binom{9}{2}} \]
Solution 1.11
Solution 1.12
2 | Conditional Probability

Theorem 2.1
1. $\Pr(A \cap B) = \Pr(B)\Pr(A|B) = \Pr(A)\Pr(B|A)$
2. $\Pr\left(\bigcap_{i=1}^{n} A_i\right) = \Pr(A_1)\Pr(A_2|A_1)\Pr(A_3|A_1 \cap A_2)\cdots\Pr(A_n|A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_{n-1})$
3. (Bayes' Rule) $\Pr(A|B) = \dfrac{\Pr(B|A)\Pr(A)}{\Pr(B)}$
4. (Law of Total Probability) Given a partition $A_1, A_2, A_3, \ldots, A_n$ of S,
\[ \Pr(E) = \sum_{i=1}^{n} \Pr(E \cap A_i) = \sum_{i=1}^{n} \Pr(E|A_i)\Pr(A_i) \]
5. (Bayes' Rule) $\Pr(A|B) = \dfrac{\Pr(B|A)\Pr(A)}{\Pr(B|A)\Pr(A) + \Pr(B|A^c)\Pr(A^c)}$
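A small numeric illustration of items 4 and 5; the probabilities below (a hypothetical condition A and test result B) are made-up values, not from the text.

# Hypothetical two-state example: A = "has the condition", B = "test is positive"
pr_A = 0.01                    # Pr(A), assumed
pr_B_given_A = 0.95            # Pr(B|A), assumed
pr_B_given_Ac = 0.05           # Pr(B|A^c), assumed

# Law of total probability for Pr(B)
pr_B = pr_B_given_A * pr_A + pr_B_given_Ac * (1 - pr_A)
# Bayes' rule for Pr(A|B)
pr_A_given_B = pr_B_given_A * pr_A / pr_B
print(pr_B, pr_A_given_B)      # 0.059, about 0.161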
Solved Problems
A bag contains 10 white and 3 red balls while another bag contains 3 white and 5 red balls. Two balls are drawn at random from the first bag and put in the second bag. Then a ball is drawn at random from the second bag. What is the probability that it is a white ball?
Solution 2.1
Let $E_{WW}$ be the event that the two balls drawn from the first bag are both white. Likewise, let $E_{WR}$ be the event that one white ball and one red ball are drawn from the first bag, and $E_{RR}$ be the event that the two balls drawn from the first bag are both red. Let W be the event that a white ball is drawn from the second bag. By the law of total probability,
\[ \Pr(W) = \Pr(W|E_{WW})\Pr(E_{WW}) + \Pr(W|E_{WR})\Pr(E_{WR}) + \Pr(W|E_{RR})\Pr(E_{RR}). \]
Solution 2.2
Three dice are thrown simultaneously. Let A be the event that 4 appears on two dice, and B be the event that 5 occurs on (exactly) one die. We want to find the conditional probability of A given B.
\[ \Pr(A|B) = \frac{\Pr(A \cap B)}{\Pr(B)} = \frac{\binom{3}{1}\frac{1}{6^3}}{\binom{3}{1}\frac{5^2}{6^3}} = \frac{1}{25} \]
Solution 2.3
\[ \Pr(A|D) = \frac{\Pr(D|A)\Pr(A)}{\Pr(D|A)\Pr(A) + \Pr(D|B)\Pr(B) + \Pr(D|C)\Pr(C)} = \frac{0.008}{0.008 + 0.014 + 0.0075} \approx 0.2712 \]
\[ \Pr(A|D) = \frac{\Pr(D|A)\Pr(A)}{\Pr(D|A)\Pr(A) + \Pr(D|B)\Pr(B) + \Pr(D|C)\Pr(C)} = \frac{2}{9} \]
Similarly, $\Pr(B|D) = \frac{4}{9}$ and $\Pr(C|D) = \frac{1}{3}$.
3 | Discrete Random Variables
1. FX is monotonically non-decreasing.
2. $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$
3. FX is right-continuous.
X ∼ Bern(p) if
\[ p_X(1) = \Pr(X = 1) = p \quad \text{and} \quad p_X(0) = \Pr(X = 0) = 1 - p. \]
X ∼ Geom(p) if
\[ p_X(x) = \Pr(X = x) = p(1 - p)^x \]
X ∼ Pois(λ) if
\[ p_X(x) = \Pr(X = x) = \frac{e^{-\lambda}\lambda^x}{x!} \]
where x ∈ {0, 1, 2, 3, . . .}.
\[ E(X) = \sum_{x \in X(S)} x\, p_X(x) = \sum_{s \in S} X(s) \Pr(\{s\}) \]
Theorem 3.2
Theorem 3.3
E ( X + Y ) = E ( X ) + E (Y )
E(cX ) = cE( X )
\[ p_Y(y) = \sum_{\{x : g(x) = y\}} p_X(x) \]
\[ E(g(X)) = \sum_{x \in X(S)} g(x)\, p_X(x) \]
The square root of the variance is the standard deviation of X, denoted by $\sigma_X$.
Theorem 3.7
X ∼ NBin(r, p) if
\[ p_X(x) = \Pr(X = x) = \binom{r+x-1}{r-1} p^r (1-p)^x \]
where x ∈ {0, 1, 2, 3, . . .}
X ∼ DUnif(a, a + n) if
\[ p_X(x) = \Pr(X = x) = \frac{1}{n+1} \]
where x ∈ {a, a + 1, a + 2, . . . , a + n}.
\[ p_X(x) = \sum_{y \in Y(S)} p_{X,Y}(x, y) \]
\[ p_{X|A}(x) = \Pr(X = x \mid A) = \frac{\Pr(\{X = x\} \cap A)}{\Pr(A)} \]
\[ p_{X|Y}(x|y) = \Pr(X = x \mid Y = y) = \frac{\Pr(X = x, Y = y)}{\Pr(Y = y)}, \quad \text{or simply} \quad \frac{p_{X,Y}(x, y)}{p_Y(y)} \]
p X,Y ( x, y) = p X ( x ) pY (y)
for all x, y.
\[ E(g(X, Y)) = \sum_{x \in X(S)} \sum_{y \in Y(S)} g(x, y)\, p_{X,Y}(x, y) \]
Theorem 3.8
Theorem 3.9
1. For X ∼ Hyper(N, n, m), $E(X) = \frac{nm}{N}$ and $V(X) = \frac{nm(N-m)(N-n)}{N^2(N-1)}$.
2. For X ∼ NBin(r, p), $E(X) = \frac{r(1-p)}{p}$ and $V(X) = \frac{r(1-p)}{p^2}$.
Theorem 3.10
−1 ≤ ρ( X, Y ) ≤ 1
Theorem 3.11
\[ E(X \mid Y = y) = \sum_{x \in X(S)} x\, p_{X|Y}(x|y) \]
Note: E(X|Y) is a random variable.
Theorem 3.12
E( X ) = E(E( X |Y ))
Theorem 3.13
V( X ) = V(E( X |Y )) + E(V( X |Y ))
A median $m_X$ of X satisfies
\[ \Pr(X \le m_X) \ge \frac{1}{2} \quad \text{and} \quad \Pr(X \ge m_X) \ge \frac{1}{2} \]
Solved Problems
Solution 3.1
\[ V(XY) = E(X^2 Y^2) - (E(XY))^2 = E(X^2)E(Y^2) - (E(X)E(Y))^2 = 4E(X^2)E(X^2) - 4(E(X))^4 = 4\left[\left(\frac{(n+1)(2n+1)}{6}\right)^2 - \left(\frac{n+1}{2}\right)^4\right] \]
Solution 3.2
Therefore, Cov( X, Y ) = 3 − 4 = −1
Solution 3.3
Solution 3.4
Solution 3.5
Solution 3.6
Solution 3.7
Solution 3.8
Solution 3.9
Solution 3.10
Solution 3.11
Solution 3.12
Given that
X = number of objects that end up in the first box
Also let
Y = number of objects that end up in the second box
Z = number of objects that end up in the third box
Since four identical objects are distributed randomly into these
3 distinct boxes, we have
X+Y+Z = 4
Therefore,
E ( X ) + E (Y ) + E ( Z ) = 4
By symmetry, E( X ) = E(Y ) = E( Z )
So we get E(X) = 4/3.
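A quick Monte Carlo check, assuming "distributed randomly" means each object independently lands in one of the 3 boxes with equal probability:

import random

trials = 100_000
total_first_box = 0
for _ in range(trials):
    # place 4 identical objects independently and uniformly into 3 boxes,
    # count how many land in the first box
    total_first_box += sum(1 for _ in range(4) if random.randrange(3) == 0)
print(total_first_box / trials)  # should be close to 4/3, about 1.333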
Solution 3.13
Solution 3.14
Solution 3.15
Solution 3.16
N = I1 + I2 + · · · + I9 + I10
By linearity of expectation,
E( N ) = E( I1 ) + E( I2 ) + · · · + E( I9 ) + E( I10 )
E( I1 ) = E( I10 ) and E( I2 ) = · · · = E( I9 )
\[ E(I_1) = \Pr(I_1 = 1) = \frac{\binom{5}{1}\binom{5}{1}\,8!}{10!} = \frac{5}{18} \]
E(I_2) = Pr(I_2 = 1)
= 1 − Pr(I_2 = 0)
= 1 − [Pr(seat number 2 is occupied by a man) + Pr(seat numbers 1, 2 and 3 are occupied by women)]
\[ = 1 - \left[\frac{\binom{5}{1}\,9!}{10!} + \frac{\binom{5}{3}\,3!\,7!}{10!}\right] = 1 - \left[\frac{1}{2} + \frac{1}{12}\right] = \frac{5}{12} \]
Therefore,
\[ E(N) = E(I_1) + E(I_2) + \cdots + E(I_9) + E(I_{10}) = 2E(I_1) + 8E(I_2) = \frac{5}{9} + \frac{10}{3} = \frac{35}{9} \]
Solution 3.17
n0 = 1 + 0.5n0 + 0.5n1
n1 = 1 + 0.5n2 + 0.5n0
n2 = 1 + 0.5n0
n0 = 14, n1 = 12, n2 = 8
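The linear system can also be solved mechanically, for example with NumPy; this is only a check of the algebra above.

import numpy as np

# Rewrite the three equations as A @ [n0, n1, n2] = b
A = np.array([[0.5, -0.5, 0.0],    # n0 - 0.5*n0 - 0.5*n1 = 1
              [-0.5, 1.0, -0.5],   # n1 - 0.5*n2 - 0.5*n0 = 1
              [-0.5, 0.0, 1.0]])   # n2 - 0.5*n0 = 1
b = np.array([1.0, 1.0, 1.0])
print(np.linalg.solve(A, b))       # [14. 12.  8.]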
Solution 3.18
Solution 3.19
Solution 3.20
np = 4 and np − np² = np(1 − p) = 2.4, which gives n = 10 and p = 0.4.
4 | Continuous Random Variables

\[ E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx \]
\[ V(X) = E(X - E(X))^2 = E(X^2) - (E(X))^2 \]
Table of comparison between Discrete
and Continuous RVs
X ∼ U[a, b] if
\[ f_X(x) = \begin{cases} \frac{1}{b-a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases} \]
\[ f_Y(y) = f_X(x) \left|\frac{dx}{dy}\right| \]
where y = g(x).
X ∼ N(µ, σ²) if
\[ f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \]
68%-95%-99.7% Rule
The moment-generating function of X is
\[ M_X(t) = E(e^{tX}), \]
if this is finite for some interval (−a, a), a > 0. Not only can a moment-generating function be used to find moments of a random variable, it can also be used to identify which distribution a random variable follows.
Theorem 4.3
Theorem 4.4
Theorem 4.5
MGF of X ∼ N(µ, σ²) is $M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}$.
Its CDF is
\[ F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt = \begin{cases} 1 - e^{-\lambda x}, & \text{if } x > 0 \\ 0, & \text{otherwise} \end{cases} \]
Theorem 4.6
Theorem 4.7
MGF of X ∼ Expo(λ) is $M_X(t) = \frac{\lambda}{\lambda - t}$ where λ > t.
Pr( X ≥ s + t| X ≥ s) = Pr( X ≥ t)
for all s, t ∈ R+ .
Theorem 4.8
We say T ∼ $t_n$ if
\[ T = \frac{Z}{\sqrt{X/n}} \]
where Z ∼ N(0, 1) and X ∼ χ²(n). Also, Z and X are independent.
\[ F_{X,Y}(x, y) = \Pr(X \le x, Y \le y) \]
\[ f_{X|Y}(x|y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} \]
Theorem 4.9
We say Y ∼ Gamma(a, λ) if Y = X/λ for X ∼ Gamma(a, 1). The density of Y is
\[ f_Y(y) = \begin{cases} \frac{\lambda(\lambda y)^{a-1} e^{-\lambda y}}{\Gamma(a)}, & \text{if } y > 0 \\ 0, & \text{otherwise} \end{cases} \]
where (y1 , y2 ) = g( x1 , x2 ).
Solved Problems
Solution 4.1
\[ \min(2X - Y, X + Y) = \begin{cases} 2X - Y & \text{if } Y > \frac{X}{2} \\ X + Y & \text{if } Y \le \frac{X}{2} \end{cases} \]
Therefore,
\[ E[\min(2X - Y, X + Y)] = \int_0^1 \int_{x/2}^1 (2x - y)\,dy\,dx + \int_0^1 \int_0^{x/2} (x + y)\,dy\,dx = \frac{5}{12} \]
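A Monte Carlo check of the value 5/12, assuming (as the integration limits suggest) that X and Y are independent Uniform(0, 1):

import random

trials = 200_000
total = 0.0
for _ in range(trials):
    x, y = random.random(), random.random()   # assumed X, Y ~ Unif(0,1), independent
    total += min(2 * x - y, x + y)
print(total / trials)  # should be close to 5/12, about 0.4167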
Solution 4.2
Now we can obtain the CDF from the PDF in this way:
\[ F_X(t) = \Pr(X \le t) = \int_{-\infty}^{t} f_X(x)\,dx = \begin{cases} 0 & \text{for } t \le 0 \\ \frac{t}{2} & \text{for } 0 < t \le 1 \\ \frac{1}{2} & \text{for } 1 < t < 3 \\ \frac{1}{2} + \frac{t-3}{2} & \text{for } 3 < t < 4 \\ 1 & \text{for } t \ge 4 \end{cases} \]
Solution 4.3
\[ E(Y) = \int_0^{\theta} y \cdot \frac{1}{\theta}\,dy = \frac{\theta}{2} \]
and
\[ E(Y^2) = \int_0^{\theta} y^2 \cdot \frac{1}{\theta}\,dy = \frac{\theta^2}{3} \]
Consequently, the variance is
\[ V(Y) = E(Y^2) - (E(Y))^2 = \frac{\theta^2}{12} \]
Solution 4.4
\[ \Pr(Y^2 > X > t) = \int_0^1 \Pr(Y^2 > X > t \mid Y = y)\, f_Y(y)\,dy = \int_0^1 \Pr(y^2 > X > t \mid Y = y)\,dy \]
\[ = \int_{\sqrt{t}}^1 \Pr(y^2 > X > t)\,dy = \int_{\sqrt{t}}^1 (y^2 - t)\,dy = \frac{1}{3} - t + \frac{2}{3}\,t^{3/2} \]
Solution 4.5
Solution 4.6
Solution 4.7
Solution 4.8
Given that Y ∼ N (µY , σY2 ), and X = aeY for a > 0, what is the
density function of X?
Solution 4.9
Solution 4.10
Therefore, every point in the interval [0, 1] solves (1). Hence all
values in the interval [0, 1] are modes of the distribution of X.
Solution 4.11
Solution 4.12
Given that X and Y are i.i.d. Unif(0, 1), the joint density of X and Y is
\[ f_{X,Y}(x, y) = \begin{cases} 1 & \text{if } 0 < x < 1 \text{ and } 0 < y < 1 \\ 0 & \text{elsewhere} \end{cases} \]
Solution 4.13
\[ E(S) = \sum_{i=1}^{\infty} E(A_i) \]
where
\[ E(A_1) = \tfrac{1}{2} \]
\[ E(A_2) = E(E(A_2 \mid A_1)) = E\!\left(\tfrac{A_1}{2}\right) = \tfrac{1}{2}E(A_1) = \tfrac{1}{4} \]
\[ E(A_3) = E(E(A_3 \mid A_2)) = E\!\left(\tfrac{A_2}{2}\right) = \tfrac{1}{2}E(A_2) = \tfrac{1}{8} \]
Likewise, by the induction step, if $E(A_n) = \frac{1}{2^n}$ then $E(A_{n+1}) = \frac{1}{2^{n+1}}$. Here is the proof:
\[ E(A_{n+1}) = E(E(A_{n+1} \mid A_n)) = E\!\left(\tfrac{A_n}{2}\right) = \tfrac{1}{2}E(A_n) = \frac{1}{2^{n+1}} \]
Therefore,
\[ E(S) = \sum_{i=1}^{\infty} E(A_i) = \sum_{i=1}^{\infty} \frac{1}{2^i} = 1. \]
Solution 4.14
E(Y | X = x) = x
V(Y | X = x) = 1/3
So,
\[ E(Y) = E(E(Y|X)) = E(X) = 1 \]
and
\[ V(Y) = V(E(Y|X)) + E(V(Y|X)) = V(X) + E\!\left(\tfrac{1}{3}\right) = \tfrac{5}{3} + \tfrac{1}{3} = 2. \]
Solution 4.15
\[ \Pr(\max(X, 1 - X) \ge 3\min(X, 1 - X)) = \Pr(X \ge 3(1 - X)) + \Pr(1 - X \ge 3X) \]
\[ = \Pr\!\left(X \ge \tfrac{3}{4}\right) + \Pr\!\left(X \le \tfrac{1}{4}\right) = \tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2} \]
What is E( X )?
Solution 4.16
Density of X is
\[ f_X(x) = \begin{cases} \int_0^{1-x} 24xy\,dy = 12x(1-x)^2 & \text{if } 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases} \]
Solution 4.17
Therefore,
\[ F_U(u) = 1 - \Pr(U > u) = \begin{cases} 0 & \text{if } u \le 0 \\ u - u\ln u & \text{if } u \in (0, 1) \\ 1 & \text{if } u \ge 1 \end{cases} \]
Figure 4.1: xy > u
5 | Topics in Random Variables
(Jensen's Inequality) For convex g,
\[ E(g(X)) \ge g(E(X)) \]
(Markov's Inequality) For any a > 0,
\[ \Pr(|X| \ge a) \le \frac{E(|X|)}{a} \]
(Chebyshev's Inequality) For any a > 0,
\[ \Pr(|X - \mu| \ge a) \le \frac{V(X)}{a^2} \]
where µ = E(X).
We say $Y_n$ converges in probability to a if, for every ϵ > 0,
\[ \lim_{n \to \infty} \Pr(|Y_n - a| \ge \epsilon) = 0 \]
6 | Sampling
\[ M_n = \frac{X_1 + X_2 + \cdots + X_n}{n} \]
Theorem 6.1
Let $X_1, X_2, \ldots$ be i.i.d. random variables with mean µ and variance σ². Then $M_n$ converges to µ in probability.
Since the $I_j$'s are i.i.d. with mean p, it follows from the law of large numbers that the estimate $\frac{1}{n}\sum_{j=1}^{n} I_j$ converges to p in probability as the number of points approaches infinity.
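A minimal sketch of this idea, with an assumed illustrative target p = Pr(U² + V² ≤ 1) = π/4 for U, V independent Uniform(0, 1):

import random

n = 100_000
# each term of the sum is an indicator I_j of the event U^2 + V^2 <= 1
hits = sum(1 for _ in range(n)
           if random.random()**2 + random.random()**2 <= 1)
print(hits / n)   # estimate of p; by the LLN it approaches pi/4, about 0.7854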
\[ \hat{F}_n(x) = \frac{R_n(x)}{n} \]
for every z.
Y is approximately distributed as N(np, np(1 − p)).
Y is approximately distributed as N(n, n).
Theorem 6.9
Show that
\[ \frac{(n-1)S_n^2}{\sigma^2} \sim \chi^2_{n-1} \]
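A simulation sketch of this result, assuming $S_n^2$ is the usual sample variance with divisor n − 1 and the $X_i$ are i.i.d. normal; the simulated mean and variance of $(n-1)S_n^2/\sigma^2$ should be close to n − 1 and 2(n − 1), the $\chi^2_{n-1}$ moments.

import numpy as np

rng = np.random.default_rng(0)
n, mu, sigma, reps = 8, 2.0, 3.0, 50_000      # illustrative choices
samples = rng.normal(mu, sigma, size=(reps, n))
s2 = samples.var(axis=1, ddof=1)              # sample variance with divisor n-1
stat = (n - 1) * s2 / sigma**2
print(stat.mean(), stat.var())                # should be close to 7 and 14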
Solved Problems
Solution 6.1
Given the data above, we can find the expectation and variance of $X_i$ and $\overline{X}_n$ as follows:
\[ E(X_i) = \tfrac{7}{4} = 1.75, \quad V(X_i) = \tfrac{3}{16}, \quad E(\overline{X}_n) = \tfrac{7}{4} = 1.75, \quad V(\overline{X}_n) = \tfrac{3}{16n} \]
By Chebyshev's Inequality,
\[ \Pr(|\overline{X}_n - 1.75| > 0.05) \le \frac{3/(16n)}{(0.05)^2} = \frac{75}{n} \]
Therefore, $\lim_{n \to \infty} \Pr(|\overline{X}_n - 1.75| > 0.05) = 0$.
Working on the desired quantity $\Pr(\overline{X}_n \le 1.8)$, we get
\[ \Pr(\overline{X}_n \le 1.8) \ge \Pr(|\overline{X}_n - 1.75| \le 0.05) = 1 - \Pr(|\overline{X}_n - 1.75| > 0.05). \]
Since $\lim_{n \to \infty} \Pr(|\overline{X}_n - 1.75| > 0.05) = 0$, the above implies that $\lim_{n \to \infty} \Pr(\overline{X}_n \le 1.8) = 1$.
Alternatively, we can also apply the weak law of large numbers, according to which $\overline{X}_n$ converges in probability to 1.75, and consequently $\lim_{n \to \infty} \Pr(\overline{X}_n \le 1.8) = 1$ holds.
7 | Estimation
Solution 7.1
x <- 1:1000
y <- rep(0, length(x))
# for loop: keep x[i] if it is divisible by 3 or greater than 50, otherwise negate it
for (i in 1:length(x)){
  if (x[i] %% 3 == 0 | x[i] > 50){
    y[i] <- x[i]
  } else {
    y[i] <- -x[i]
  }
}
sum(y)

def cost(X, y, theta):
    # squared-error cost (1/(2m)) * sum((X @ theta - y)^2); X, y, theta are NumPy arrays
    m = len(y)
    J = (1/(2*m)) * (((X @ theta - y)**2).sum())
    return J
Solution 7.2