
MTL390-Statistical Methods

Lecture 1
Discrete Distributions

Bernoulli distribution: ℙ(𝑋 = 0) = 1 − 𝑝, ℙ(𝑋 = 1) = 𝑝, where 0 < 𝑝 < 1 and 𝑞 = 1 − 𝑝


We have 𝔼(𝑋) = 𝑝, 𝑀(𝑡) = 𝔼(𝑒^(𝑡𝑋)) = 𝑝𝑒^𝑡 + 𝑞, 𝑣𝑎𝑟(𝑋) = 𝔼(𝑋^2) − 𝔼(𝑋)^2 = 𝑝 − 𝑝^2 = 𝑝𝑞

Finally, we have 𝜙(𝑡) = 𝔼(𝑒^(𝑖𝑡𝑋)) = 𝑝𝑒^(𝑖𝑡) + 𝑞


Fisher information: 𝔼((𝜕 log 𝑓(𝑥)/𝜕𝑝)^2)

We have 𝑓(𝑥) = 𝑝^𝑥 (1 − 𝑝)^(1−𝑥) ⇒ [𝜕 log 𝑓(𝑥)/𝜕𝑝]^2 = (1/𝑞^2)(𝑥^2/𝑝^2 + 1 − 2𝑥/𝑝)

Hence, we have 𝔼((𝜕 log 𝑓(𝑥)/𝜕𝑝)^2) = (1/𝑞^2)(𝑝/𝑝^2 + 1 − 2𝑝/𝑝) = (1/𝑞^2)(1/𝑝 − 1) = 1/(𝑝𝑞)
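A quick numerical check of this value (a minimal Python sketch; p = 0.3 is an arbitrary example value):

import numpy as np

p = 0.3
q = 1 - p

# Score function d/dp log f(x) for f(x) = p^x (1-p)^(1-x), x in {0, 1}
def score(x):
    return x / p - (1 - x) / q

# Expectation over the two support points, weighted by their probabilities
fisher = q * score(0) ** 2 + p * score(1) ** 2
print(fisher, 1 / (p * q))  # both equal 1/(pq) = 4.7619...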

Binomial Distribution
The probability mass function of the Binomial distribution is given by 𝑓(𝑥) = 𝐶(𝑛, 𝑥) 𝑝^𝑥 (1 − 𝑝)^(𝑛−𝑥), 𝑥 = 0, 1, …, 𝑛

X can be considered as the sum of n independent Bernoulli(p) trials, 𝑋 = 𝑋_1 + ⋯ + 𝑋_𝑛

Hence, we have 𝔼(𝑋) = 𝑝 + 𝑝 + ⋯ (𝑛 times) = 𝑛𝑝, 𝑣𝑎𝑟(𝑋) = 𝑛 𝑣𝑎𝑟(𝑋_1) = 𝑛𝑝𝑞, 𝑀(𝑡) = (𝑞 + 𝑝𝑒^𝑡)^𝑛 and 𝜙(𝑡) = (𝑞 + 𝑝𝑒^(𝑖𝑡))^𝑛
Now, we have [𝜕 log 𝑓(𝑥)/𝜕𝑝]^2 = (1/𝑞^2)(𝑥^2/𝑝^2 + 𝑛^2 − 2𝑛𝑥/𝑝)

Hence, we have 𝔼((𝜕 log 𝑓(𝑥)/𝜕𝑝)^2) = (1/𝑞^2)((𝑛𝑝𝑞 + 𝑛^2𝑝^2)/𝑝^2 + 𝑛^2 − 2𝑛·𝑛𝑝/𝑝) = (1/𝑞^2)(𝑛(𝑞 + 𝑛𝑝)/𝑝 + 𝑛^2 − 2𝑛^2) = (𝑛/(𝑝𝑞^2))(𝑞 + 𝑛𝑝 − 𝑛𝑝) = 𝑛/(𝑝𝑞)

Result: If X and Y are independent and X~B(n,p) and Y~B(m,p), then X+Y~B(m+n,p)

Proof:

Method 1: Using MGF

For the random variable X + Y, independence gives

M(t) = 𝔼(𝑒^(𝑡(𝑋+𝑌))) = 𝔼(𝑒^(𝑡𝑋))𝔼(𝑒^(𝑡𝑌)) = (𝑞 + 𝑝𝑒^𝑡)^𝑛 (𝑞 + 𝑝𝑒^𝑡)^𝑚 = (𝑞 + 𝑝𝑒^𝑡)^(𝑚+𝑛)

Since the MGF determines the distribution uniquely, we have 𝑋 + 𝑌 ~ 𝐵(𝑚 + 𝑛, 𝑝)

Method 2:

We have ℙ(𝑋 + 𝑌 = 𝑘) = ∑_{𝑖=0}^{𝑘} ℙ(𝑋 = 𝑖)ℙ(𝑌 = 𝑘 − 𝑖) = ∑_{𝑖=0}^{𝑘} 𝐶(𝑛, 𝑖)𝑝^𝑖 𝑞^(𝑛−𝑖) 𝐶(𝑚, 𝑘 − 𝑖)𝑝^(𝑘−𝑖) 𝑞^(𝑚−𝑘+𝑖)

= ∑_{𝑖=0}^{𝑘} 𝐶(𝑛, 𝑖)𝐶(𝑚, 𝑘 − 𝑖) 𝑝^𝑘 𝑞^(𝑚+𝑛−𝑘)


Combinatorial argument: the number of ways of selecting k objects from a pile of m + n objects is the same as first dividing the pile into two groups of n and m objects, and then choosing i = 0, 1, 2, …, k objects from the first group and the remaining k − i objects from the other group.

Hence, we have ∑_{𝑖=0}^{𝑘} 𝐶(𝑛, 𝑖)𝐶(𝑚, 𝑘 − 𝑖) = 𝐶(𝑚 + 𝑛, 𝑘)

Hence, we have ℙ(𝑋 + 𝑌 = 𝑘) = 𝐶(𝑚 + 𝑛, 𝑘)𝑝^𝑘 𝑞^(𝑚+𝑛−𝑘)

Hence, we have 𝑋 + 𝑌~𝐵(𝑚 + 𝑛, 𝑝)
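A quick numerical check of this result (a minimal Python sketch using scipy.stats; n = 5, m = 7, p = 0.4 are arbitrary example values):

import numpy as np
from scipy.stats import binom

n, m, p = 5, 7, 0.4
# pmf of X + Y for independent X ~ B(n, p), Y ~ B(m, p) is the convolution of the two pmfs
pmf_x = binom.pmf(np.arange(n + 1), n, p)
pmf_y = binom.pmf(np.arange(m + 1), m, p)
pmf_sum = np.convolve(pmf_x, pmf_y)
print(np.allclose(pmf_sum, binom.pmf(np.arange(n + m + 1), n + m, p)))  # True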

Hypergeometric distribution:
It models the probability of m successes in M draws without replacement from a finite population
of size N in which n objects are associated with success

Denoted as 𝐻𝐺(𝑁, 𝑛, 𝑀, 𝑚)
We have ℙ(𝑋 = 𝑚) = 𝐶(𝑛, 𝑚) 𝐶(𝑁 − 𝑛, 𝑀 − 𝑚) / 𝐶(𝑁, 𝑀)
Hence, we have 𝔼(𝑋) = 𝑀𝑛/𝑁 and 𝑣𝑎𝑟(𝑋) = 𝑛 (𝑀/𝑁)((𝑁 − 𝑀)/𝑁)((𝑁 − 𝑛)/(𝑁 − 1))
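These formulas can be checked against scipy.stats.hypergeom (a minimal sketch; N = 50, n = 20, M = 10 are arbitrary example values, and note that scipy's argument order is population size, number of success objects, number of draws):

from scipy.stats import hypergeom

N, n, M = 50, 20, 10   # population size, success objects, draws (notation of these notes)
rv = hypergeom(N, n, M)
print(rv.mean(), M * n / N)                                          # 4.0  4.0
print(rv.var(), n * (M / N) * ((N - M) / N) * ((N - n) / (N - 1)))   # both ~ 1.959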

Poisson Distribution (Poi(𝜆))
We have 𝑓(𝑥) = 𝑒^(−𝜆) 𝜆^𝑥 / 𝑥!, 𝑥 = 0, 1, 2, …
Hence, we have 𝔼(𝑋) = 𝜆, 𝑣𝑎𝑟(𝑋) = 𝜆, 𝑀(𝑡) = exp(𝜆(𝑒^𝑡 − 1)), 𝜙(𝑡) = exp(𝜆(𝑒^(𝑖𝑡) − 1))

Fisher information: log 𝑓(𝑥) = −𝜆 + 𝑥 log 𝜆 − log(𝑥!) ⇒ 𝜕 log 𝑓(𝑥)/𝜕𝜆 = −1 + 𝑥/𝜆

Hence, we have 𝔼((𝜕 log 𝑓(𝑥)/𝜕𝜆)^2) = 𝔼[1 + 𝑥^2/𝜆^2 − 2𝑥/𝜆] = 1 + (𝜆^2 + 𝜆)/𝜆^2 − 2𝜆/𝜆 = 1/𝜆

Thus, for the Poisson distribution, we have 𝔼((𝜕 log 𝑓(𝑥)/𝜕𝜆)^2) = 1/𝜆

Theorem: If 𝑋 ~ 𝑃𝑜𝑖(𝜆) and 𝑌 ~ 𝑃𝑜𝑖(𝛾) are independent, then 𝑋 + 𝑌 ~ 𝑃𝑜𝑖(𝜆 + 𝛾)

Proof:

We have 𝜙(𝑡) = exp(𝜆(𝑒^(𝑖𝑡) − 1)) × exp(𝛾(𝑒^(𝑖𝑡) − 1)) = exp((𝜆 + 𝛾)(𝑒^(𝑖𝑡) − 1))

Hence, we have 𝑋 + 𝑌~𝑃𝑜𝑖(𝜆 + 𝛾)
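A quick numerical check (a minimal Python sketch; λ = 2, γ = 3.5 are arbitrary example values, and the support is truncated at 40 for the convolution):

import numpy as np
from scipy.stats import poisson

lam, gam = 2.0, 3.5
k = np.arange(40)
# convolve the two pmfs; the first 40 terms are exact since all contributing terms are included
pmf_sum = np.convolve(poisson.pmf(k, lam), poisson.pmf(k, gam))[: k.size]
print(np.allclose(pmf_sum, poisson.pmf(k, lam + gam)))  # True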

Geometric Distribution:
We have 𝑓(𝑥) = 𝑝𝑞^(𝑥−1), 𝑥 = 1, 2, …

Hence, we have 𝔼(𝑋) = 1/𝑝, 𝑣𝑎𝑟(𝑋) = 𝑞/𝑝^2

The geometric distribution possesses the memoryless property, i.e. ℙ(𝑋 > 𝑡 + 𝑠 | 𝑋 > 𝑡) = ℙ(𝑋 > 𝑠)

Result: If a discrete distribution on {1, 2, …} has the memoryless property, then it must be geometric
Proof:

We have ℙ(𝑋 > 2 | 𝑋 > 1) = ℙ(𝑋 > 1) ⇒ ℙ(𝑋 > 2) = ℙ(𝑋 > 1)^2

Let ℙ(𝑋 > 1) = 𝑞

Hence, we have ℙ(𝑋 > 2) = 𝑞^2, ℙ(𝑋 > 3) = ℙ(𝑋 > 3 | 𝑋 > 2)ℙ(𝑋 > 2) = 𝑞(𝑞^2) = 𝑞^3

Hence, in general, we have ℙ(𝑋 > 𝑛) = 𝑞^𝑛

Hence, we have ℙ(𝑋 = 𝑛) = 1 − ℙ(𝑋 ≤ 𝑛 − 1) − ℙ(𝑋 > 𝑛) = 1 − (1 − 𝑞^(𝑛−1)) − 𝑞^𝑛 = 𝑞^(𝑛−1)(1 − 𝑞) = 𝑝𝑞^(𝑛−1)

Hence, we have 𝑋 ~ 𝐺𝑒𝑜(𝑝), with 𝑝 = 1 − 𝑞

For the geometric distribution, we have ∑_{𝑖=0}^{∞} (1 − 𝐹(𝑖)) = 𝔼(𝑋)
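A quick check of this tail-sum identity (a minimal Python sketch; p = 0.3 is an arbitrary example value, and the infinite sum is truncated at 200 terms, which is more than enough here):

import numpy as np
from scipy.stats import geom

p = 0.3
i = np.arange(200)
# geom.sf(i, p) = P(X > i) = 1 - F(i) for the geometric distribution on {1, 2, ...}
print(np.sum(geom.sf(i, p)), 1 / p)  # both ~ 3.3333 = E(X)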

Continuous Distributions
Uniform Distribution (U(a, b))
We have 𝑓(𝑥) = 1/(𝑏 − 𝑎), 𝑎 ≤ 𝑥 ≤ 𝑏, and 𝐹(𝑥) = 0 for 𝑥 ≤ 𝑎, 𝐹(𝑥) = (𝑥 − 𝑎)/(𝑏 − 𝑎) for 𝑎 ≤ 𝑥 ≤ 𝑏, 𝐹(𝑥) = 1 for 𝑥 ≥ 𝑏

Hence, we have 𝔼(𝑋) = (𝑎 + 𝑏)/2, 𝑣𝑎𝑟(𝑋) = (𝑏 − 𝑎)^2/12, 𝑀(𝑡) = (𝑒^(𝑡𝑏) − 𝑒^(𝑡𝑎))/(𝑡(𝑏 − 𝑎)), 𝑡 ≠ 0

Exponential Distribution (𝐸𝑥𝑝(𝜆))
We have 𝑓(𝑥) = 𝜆𝑒^(−𝜆𝑥), 𝑥 > 0, 𝐹(𝑥) = 1 − 𝑒^(−𝜆𝑥), 𝑥 > 0

Hence, we have 𝔼(𝑋) = 1/𝜆, 𝑣𝑎𝑟(𝑋) = 1/𝜆^2, 𝑀(𝑡) = 𝜆/(𝜆 − 𝑡)

Fisher information: We have log 𝑓(𝑥) = log 𝜆 − 𝜆𝑥

Hence, we have 𝜕 log 𝑓(𝑥)/𝜕𝜆 = 1/𝜆 − 𝑥 ⇒ (𝜕 log 𝑓(𝑥)/𝜕𝜆)^2 = 1/𝜆^2 + 𝑥^2 − 2𝑥/𝜆

Hence, we have 𝔼[(𝜕 log 𝑓(𝑥)/𝜕𝜆)^2] = 1/𝜆^2 + (1/𝜆^2 + 1/𝜆^2) − 2/𝜆^2 = 1/𝜆^2

The exponential distribution has the memoryless property

Proof:
We have ℙ(𝑋 > 𝑡 + 𝑠 | 𝑋 > 𝑡) = ℙ(𝑋 > 𝑡 + 𝑠)/ℙ(𝑋 > 𝑡) = 𝑒^(−𝜆(𝑡+𝑠))/𝑒^(−𝜆𝑡) = 𝑒^(−𝜆𝑠) = ℙ(𝑋 > 𝑠) ∎
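The memoryless property can also be seen in simulation (a minimal Python sketch; λ = 1.5, t = 0.7, s = 0.4 are arbitrary example values):

import numpy as np

rng = np.random.default_rng(0)
lam, t, s = 1.5, 0.7, 0.4
x = rng.exponential(scale=1 / lam, size=1_000_000)
lhs = np.mean(x[x > t] > t + s)    # estimate of P(X > t + s | X > t)
rhs = np.mean(x > s)               # estimate of P(X > s)
print(lhs, rhs, np.exp(-lam * s))  # all close to 0.5488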

Gamma Distribution (Γ(𝜆, 𝛼))


The pdf is 𝑓(𝑥) = (𝜆^𝛼/Γ(𝛼)) 𝑒^(−𝜆𝑥) 𝑥^(𝛼−1); 𝛼 > 0, 𝜆 > 0, 𝑥 > 0

Hence, we have 𝔼(𝑋) = (𝜆^𝛼/Γ(𝛼)) ∫_0^∞ 𝑒^(−𝜆𝑥) 𝑥^(𝛼+1−1) 𝑑𝑥 = (𝜆^𝛼/(𝜆^(𝛼+1) Γ(𝛼))) Γ(𝛼 + 1) = 𝛼/𝜆

Similarly, we have 𝔼(𝑋^2) = 𝛼(𝛼 + 1)/𝜆^2

Hence, we have 𝑣𝑎𝑟(𝑋) = 𝛼/𝜆^2

Finally, we have 𝑀(𝑡) = 𝔼(𝑒^(𝑡𝑋)) = (𝜆^𝛼/Γ(𝛼)) ∫_0^∞ 𝑒^(−𝜆𝑥) 𝑥^(𝛼−1) 𝑒^(𝑡𝑥) 𝑑𝑥 = (𝜆^𝛼/Γ(𝛼)) × Γ(𝛼)/(𝜆 − 𝑡)^𝛼 = (1 − 𝑡/𝜆)^(−𝛼)

Fisher information: log 𝑓(𝑥) = 𝛼 log 𝜆 − log Γ(𝛼) − 𝜆𝑥 + (𝛼 − 1) log 𝑥


Hence, we have 𝜕 log 𝑓(𝑥)/𝜕𝜆 = 𝛼/𝜆 − 𝑥 ⇒ (𝜕 log 𝑓(𝑥)/𝜕𝜆)^2 = 𝛼^2/𝜆^2 + 𝑥^2 − 2𝛼𝑥/𝜆

Hence, we have 𝔼[(𝜕 log 𝑓(𝑥)/𝜕𝜆)^2] = 𝛼^2/𝜆^2 + 𝛼(𝛼 + 1)/𝜆^2 − 2𝛼^2/𝜆^2 = 𝛼/𝜆^2

If 𝑋 ~ Γ(𝜆, 𝛼) and 𝑌 ~ Γ(𝜆, 𝛽) are independent, then 𝑋 + 𝑌 ~ Γ(𝜆, 𝛼 + 𝛽)

Proof:

Let 𝑍 = 𝑋 + 𝑌
We have 𝑓_𝑍(𝑧) = ∫_0^𝑧 (𝜆^𝛼/Γ(𝛼)) 𝑒^(−𝜆𝑥) 𝑥^(𝛼−1) (𝜆^𝛽/Γ(𝛽)) 𝑒^(−𝜆(𝑧−𝑥)) (𝑧 − 𝑥)^(𝛽−1) 𝑑𝑥 = (𝜆^(𝛼+𝛽)/(Γ(𝛼)Γ(𝛽))) 𝑒^(−𝜆𝑧) ∫_0^𝑧 𝑥^(𝛼−1) (𝑧 − 𝑥)^(𝛽−1) 𝑑𝑥

Hence, we have 𝑓_𝑍(𝑧) = (𝜆^(𝛼+𝛽)/(Γ(𝛼)Γ(𝛽))) 𝑒^(−𝜆𝑧) 𝑧^(𝛼+𝛽−2) ∫_0^𝑧 (𝑥/𝑧)^(𝛼−1) (1 − 𝑥/𝑧)^(𝛽−1) 𝑑𝑥

Put 𝑥/𝑧 = 𝑦, we get 𝑓_𝑍(𝑧) = (𝜆^(𝛼+𝛽)/(Γ(𝛼)Γ(𝛽))) 𝑒^(−𝜆𝑧) 𝑧^(𝛼+𝛽−1) ∫_0^1 𝑦^(𝛼−1) (1 − 𝑦)^(𝛽−1) 𝑑𝑦 = (𝜆^(𝛼+𝛽)/(Γ(𝛼)Γ(𝛽))) 𝑒^(−𝜆𝑧) 𝑧^(𝛼+𝛽−1) 𝐵(𝛼, 𝛽)

We have 𝐵(𝛼, 𝛽) = Γ(𝛼)Γ(𝛽)/Γ(𝛼 + 𝛽)

Hence, we have 𝑓_𝑍(𝑧) = (𝜆^(𝛼+𝛽)/Γ(𝛼 + 𝛽)) 𝑒^(−𝜆𝑧) 𝑧^(𝛼+𝛽−1)

Hence, we have 𝑍~Γ(𝜆, 𝛼 + 𝛽)
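A quick simulation check of this result (a minimal Python sketch; λ = 2, α = 1.5, β = 3 are arbitrary example values):

import numpy as np
from scipy.stats import gamma, kstest

rng = np.random.default_rng(1)
lam, a, b = 2.0, 1.5, 3.0
x = rng.gamma(shape=a, scale=1 / lam, size=200_000)
y = rng.gamma(shape=b, scale=1 / lam, size=200_000)
# Kolmogorov-Smirnov test of X + Y against Gamma(lam, a + b); a large p-value means no evidence of mismatch
print(kstest(x + y, gamma(a + b, scale=1 / lam).cdf))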

Normal Distribution (𝑁(𝜇, 𝜎^2))


We have 𝑓(𝑥) = (1/√(2𝜋𝜎^2)) exp(−(𝑥 − 𝜇)^2/(2𝜎^2)), 𝑀(𝑡) = exp(𝜇𝑡 + 𝜎^2𝑡^2/2)

If 𝑋 ~ 𝑁(𝜇, 𝜎^2), then 𝑎𝑋 + 𝑏 ~ 𝑁(𝑎𝜇 + 𝑏, 𝑎^2𝜎^2)

If 𝑋 ~ 𝑁(𝜇, 𝜎^2), 𝑌 ~ 𝑁(𝛼, 𝛾^2) are independent, then 𝑋 + 𝑌 ~ 𝑁(𝜇 + 𝛼, 𝜎^2 + 𝛾^2)
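A quick simulation check of these two facts (a minimal Python sketch; the parameter values are arbitrary examples):

import numpy as np

rng = np.random.default_rng(2)
mu, sig, alpha, gam = 1.0, 2.0, -0.5, 1.5
a, b = 3.0, -1.0
x = rng.normal(mu, sig, 500_000)
y = rng.normal(alpha, gam, 500_000)
print((a * x + b).mean(), a * mu + b)        # ~ 2.0
print((a * x + b).var(), a ** 2 * sig ** 2)  # ~ 36.0
print((x + y).mean(), mu + alpha)            # ~ 0.5
print((x + y).var(), sig ** 2 + gam ** 2)    # ~ 6.25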

Tutorial 1
Q1: Solution: Let 𝑋 ~ 𝑈(0, 1) and 𝑌 = −𝜆 log(1 − 𝑋)

Since 0 ≤ 𝑋 ≤ 1, we have 0 < 𝑌 < ∞


Hence, we have ℙ(𝑌 ≤ 𝑦) = ℙ(−𝜆 log(1 − 𝑋) ≤ 𝑦) = ℙ(1 − 𝑋 ≥ exp(−𝑦/𝜆)) = ℙ(𝑋 ≤ 1 − exp(−𝑦/𝜆)) = 1 − exp(−𝑦/𝜆)

Hence, we have 𝐹(𝑦) = 1 − exp(−𝑦/𝜆) ⇒ 𝑓(𝑦) = (1/𝜆) exp(−𝑦/𝜆)

Hence, we have 𝑌 ~ 𝐸𝑥𝑝(1/𝜆)
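A quick simulation check of Q1 (a minimal Python sketch; λ = 2 is an arbitrary example value, and Exp(1/λ) here means rate 1/λ, i.e. mean λ):

import numpy as np

rng = np.random.default_rng(3)
lam = 2.0
x = rng.uniform(size=500_000)  # X ~ U(0, 1)
y = -lam * np.log(1 - x)       # the transform from Q1
print(y.mean(), y.var())       # ~ lam and lam**2, matching Exp with rate 1/lam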
Q2: Solution: We have 𝑌 = 𝐹_𝑋(𝑋), where 𝐹_𝑋 is the continuous, strictly increasing CDF of 𝑋

Hence, we have 𝐹_𝑌(𝑦) = ℙ(𝑌 ≤ 𝑦) = ℙ(𝐹_𝑋(𝑋) ≤ 𝑦) = ℙ(𝑋 ≤ 𝐹_𝑋^(−1)(𝑦)) = 𝐹_𝑋(𝐹_𝑋^(−1)(𝑦)) = 𝑦 for 0 ≤ 𝑦 ≤ 1

Hence, we have 𝑌~𝑈(0,1)

Q3: Solution: We have 𝑌𝑘 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑘

Here 𝔼( 𝑋𝑖 ) = 𝑝 − 𝑞

Hence, we have 𝔼(𝑌_𝑘) = 𝑘(𝑝 − 𝑞)

Q4: Solution: We have ℙ(𝑁_𝑡 = 𝑘) = ℙ(exactly 𝑘 coolers have failed till time 𝑡) = 𝐶(𝑛, 𝑘) ℙ(𝑋 ≤ 𝑡)^𝑘 ℙ(𝑋 > 𝑡)^(𝑛−𝑘) = 𝐶(𝑛, 𝑘)(1 − 𝑒^(−𝜆𝑡))^𝑘 (𝑒^(−𝜆𝑡))^(𝑛−𝑘)

Hence, we have 𝑁_𝑡 ~ 𝐵𝑖𝑛(𝑛, 1 − 𝑒^(−𝜆𝑡))

Q5: Done

Q6: Done

Q7: Solution: For 𝑠 ≤ 𝑟 we have |𝑥|^𝑠 ≤ 1 + |𝑥|^𝑟, hence 𝔼(𝑋^𝑠) ≤ 𝔼(|𝑋^𝑠|) ≤ 𝔼(1 + |𝑋^𝑟|) < ∞ (B)

Q8: Solution: Max number of viruses in 1st gen = 1

Max number of viruses in 2nd gen = 2

Max number of viruses in 3rd gen = 4

To infect at least 7 humans, at least 4 viruses are needed

Hence, we have 𝑃 = (0.4) × (0.4)^2 × [0.8^3 × 0.2 × 4 + 0.8^4]

Q9: Easy
Q10: Solution: We have 𝑓(𝑥) = (𝜆^𝛼/Γ(𝛼)) 𝑥^(𝛼−1) 𝑒^(−𝜆𝑥)

For maxima/minima we have 𝑑𝑓(𝑥)/𝑑𝑥 = 0 ⇒ −𝜆𝑥^(𝛼−1)𝑒^(−𝜆𝑥) + (𝛼 − 1)𝑥^(𝛼−2)𝑒^(−𝜆𝑥) = 0 ⇒ 𝑥 = 0 or 𝜆𝑥 = 𝛼 − 1 ⇒ 𝑥 = (𝛼 − 1)/𝜆
By the first derivative test, 𝑥 = (𝛼 − 1)/𝜆 is the mode (for 𝛼 > 1).
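A quick numerical check of the mode (a minimal Python sketch; λ = 2, α = 3.5 are arbitrary example values with α > 1):

from scipy.optimize import minimize_scalar
from scipy.stats import gamma

lam, a = 2.0, 3.5
# maximize the Gamma(lam, a) pdf by minimizing its negative
res = minimize_scalar(lambda x: -gamma.pdf(x, a, scale=1 / lam),
                      bounds=(0, 20), method="bounded")
print(res.x, (a - 1) / lam)  # both ~ 1.25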

Q11: Solution: The first endpoint of the chord can be chosen anywhere on the circle (with probability 1)

The second endpoint can only be chosen on the arc subtended by the far side of the equilateral triangle drawn with one vertex at the first point; this arc is one third of the circle

Hence, 𝑝 = 1/3
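Assuming the question asks (as the solution suggests) for the probability that a chord whose two endpoints are chosen uniformly on a circle is longer than the side of the inscribed equilateral triangle, a quick simulation sketch:

import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000
# fix one endpoint; the angle to the second endpoint is uniform on (0, 2*pi)
theta = rng.uniform(0, 2 * np.pi, n)
chord = 2 * np.sin(theta / 2)  # chord length on the unit circle
side = np.sqrt(3)              # side length of the inscribed equilateral triangle
print(np.mean(chord > side))   # ~ 1/3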
