Probability Distributions: 4.1. Some Special Discrete Random Variables 4.1.1. The Bernoulli and Binomial Random Variables


Chapter 6 Probability Distributions

4.1. Some Special Discrete Random Variables

4.1.1. The Bernoulli and Binomial random variables

▪ Suppose that a trial, or an experiment, whose outcome can be classified as either a
success or a failure is performed. If we let X = 1 when the outcome is a success and X
= 0 when it is a failure, then the probability mass function of X is given by

$$p(0) = P(X = 0) = 1 - p, \qquad p(1) = P(X = 1) = p,$$

where p, 0 ≤ p ≤ 1, is the probability that the trial is a success.
A random variable X is said to be a Bernoulli random variable if its probability mass
function is given by the equations above for some p ∈ (0, 1).
The expectation and variance of X can be calculated from the pmf:
E(X) = p and Var(X) = p(1 − p)

Bernoulli distribution:

$$P[X = x] = \begin{cases} p & \text{for } x = 1 \\ q = 1 - p & \text{for } x = 0 \end{cases}$$
E(X) = p and Var(X) = pq

Example:
Let X = 1 if a randomly selected person is female and X = 0 otherwise, in a population
that is 23% male and 77% female.
• P(Male) = 0.23; P(Female) = 0.77
• E(X) = p = 0.77
• Var(X) = p × q = 0.77 × 0.23 = 0.1771
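
The example can be checked directly from the pmf. The following is a minimal sketch that computes E(X) and Var(X) from their definitions; the variable names are illustrative and not part of the course notes.

```python
# Sketch: mean and variance of a Bernoulli variable computed from its pmf.
p = 0.77                      # P(X = 1), the probability of "female" in the example
pmf = {0: 1 - p, 1: p}        # probability mass function of X

mean = sum(x * px for x, px in pmf.items())                    # E(X)
variance = sum((x - mean) ** 2 * px for x, px in pmf.items())  # Var(X)

print(round(mean, 4), round(variance, 4))
```

The two sums reduce algebraically to p and p(1 − p), which is why the closed-form results hold for any p.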

▪ Suppose now that n independent trials, each of which results in a success with
probability p and in a failure with probability 1-p, are to be performed. If X represents
the number of successes that occur in the n trials, then X is said to be a binomial
random variable with parameters (n, p). Thus, a Bernoulli random variable is just a
binomial random variable with parameters (1, p).
The probability mass function of a binomial random variable having parameters (n, p)
is given by
$$P(X = k) = C_n^k \, p^k (1 - p)^{n-k}, \qquad k = 0, 1, \dots, n, \qquad \text{where } C_n^k = \frac{n!}{k!\,(n-k)!}$$
The expectation and variance of X can be calculated from the pmf:
E(X) = np and Var(X) = np(1 − p)

Binomial distribution:
$$P[X = k] = C_n^k \, p^k q^{n-k}$$
E(X) = np and Var(X) = npq

If X ~ Binomial(n, p) and Y ~ Binomial(m, p) are two independent random variables

=> X + Y ~ Binomial(n + m, p)
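
The additivity property can be illustrated numerically. The sketch below builds the distribution of X + Y by convolving the two pmfs and compares it with the Binomial(n + m, p) pmf; the parameters n = 4, m = 6, p = 0.3 are hypothetical values chosen only for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, m, p = 4, 6, 0.3   # hypothetical parameters, not taken from the notes

pmf_x = [binom_pmf(k, n, p) for k in range(n + 1)]
pmf_y = [binom_pmf(k, m, p) for k in range(m + 1)]

# Distribution of X + Y for independent X and Y: convolution of the two pmfs.
pmf_sum = [0.0] * (n + m + 1)
for i, px in enumerate(pmf_x):
    for j, py in enumerate(pmf_y):
        pmf_sum[i + j] += px * py

# Compare with the pmf of Binomial(n + m, p).
pmf_direct = [binom_pmf(k, n + m, p) for k in range(n + m + 1)]
max_gap = max(abs(a - b) for a, b in zip(pmf_sum, pmf_direct))

mean_x = sum(k * pk for k, pk in enumerate(pmf_x))   # should equal n * p
print(max_gap, mean_x)
```

The property requires the same success probability p in both variables; with different values of p the sum is not binomial.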

4.1.2. The Poisson distribution

A random variable X that takes on one of the values 0, 1, 2, … is said to be a Poisson
random variable with parameter λ if, for some λ > 0,
$$p(k) = P(X = k) = e^{-\lambda} \, \frac{\lambda^k}{k!}, \qquad k = 0, 1, 2, \dots$$
The Poisson probability distribution was introduced by Siméon Denis Poisson in a book he
wrote regarding the application of probability theory to lawsuits, criminal trials, and the
like. This book, published in 1837, was entitled « Recherches sur la probabilité des
jugements en matière criminelle et en matière civile » (Investigations into the Probability
of Verdicts in Criminal and Civil Matters).

- Poisson Approximation of Binomial Probabilities

The Poisson random variable has a wide range of applications in diverse areas because
it may be used as an approximation for a binomial random variable with parameters (n, p)
when n is large and p is small enough that np is of moderate size. Specifically, if X is
a binomial random variable with parameters (n, p), where n is sufficiently large (n > 30)
and p is sufficiently small that np < 5, the binomial distribution can be replaced by a
Poisson distribution with parameter λ = np. If instead q = 1 − p is small (nq < 5), the
same approximation applies to the number of failures n − X, with λ = nq.
Example

A factory produces nails and packs them in boxes of 200. If the probability that a nail is
substandard is 0.006, find the probability that a box selected at random contains at most
two substandard nails.
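
The notes do not include a solution for this example. As a hedged sketch, one way to work it is to take X ~ Binomial(200, 0.006), note that n > 30 and np = 1.2 < 5, and approximate X by a Poisson variable with λ = np = 1.2. The code compares the exact binomial value of P(X ≤ 2) with the Poisson approximation.

```python
from math import comb, exp, factorial

n, p = 200, 0.006
lam = n * p   # Poisson parameter lambda = np = 1.2

# Exact binomial probability P(X <= 2).
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(3))

# Poisson approximation: e^{-lambda} * (1 + lambda + lambda^2 / 2).
approx = sum(exp(-lam) * lam**k / factorial(k) for k in range(3))

print(f"binomial: {exact:.4f}  Poisson: {approx:.4f}")
```

Both values come out close to 0.88, which illustrates how little is lost by the approximation when n is large and np is small.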
Chapter 6-b Continuous Probability Distributions
5.2. Some special Continuous Probability Distributions

5.2.1 Normal distribution


We say that X is a normal random variable, or simply that X is normally distributed, with
parameters μ and σ² if the density of X is given by:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

This density function is a bell-shaped curve that is symmetric about μ.

The normal distribution was introduced by the French mathematician Abraham de Moivre
in 1733, who used it to approximate probabilities associated with binomial random
variables when the binomial parameter n is large. This result was later extended by
Laplace and others and is now encompassed in a probability theorem known as the central
limit theorem.

Probability and Normal Distribution:

If a random variable, X, is normally distributed you can find the probability that X will
fall in a given interval by calculating the area under the normal curve for that interval.

Example: Find the area under the standard normal curve to the right of z = 1.45, that is
P(Z > 1.45).

P(Z > 1.45) = 0.5000 − 0.4265 = 0.0735, where 0.4265 is the area between 0 and 1.45 read
from the normal table.


Example: A bottling machine is adjusted to fill bottles with a mean of 32.0 oz of soda and
a standard deviation of 0.02 oz. Assume the amount of fill is normally distributed and a
bottle is selected at random:
1) Find the probability that the bottle contains between 32.0 and 32.025 oz.
2) Find the probability that the bottle contains more than 31.97 oz.
Solutions:
1) We standardize with $z = \frac{X - \mu}{\sigma}$. When X = 32.00, z = 0; when
X = 32.025, $z = \frac{32.025 - 32}{0.02} = 1.25$. So

$$P(32.0 < X < 32.025) = P\left(\frac{32 - 32}{0.02} < \frac{X - 32}{0.02} < \frac{32.025 - 32}{0.02}\right) = P(0 < Z < 1.25) = F(1.25) - F(0) = 0.3944$$

2) $$P(X > 31.97) = P\left(\frac{X - 32}{0.02} > \frac{31.97 - 32}{0.02}\right) = P(Z > -1.5) = 1 - P(Z < -1.5) = 1 - \bigl(1 - P(Z < 1.5)\bigr) = P(Z < 1.5) = 0.9332$$
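
The two table look-ups can be cross-checked in software. This is a small sketch using `statistics.NormalDist` from the Python standard library, with the fill modelled as N(32.0, 0.02) and the thresholds 32.025 and 31.97 taken from the problem statement.

```python
from statistics import NormalDist

# Fill volume in oz: mean 32.0, standard deviation 0.02.
fill = NormalDist(mu=32.0, sigma=0.02)

# 1) P(32.0 < X < 32.025) = F(1.25) - F(0) after standardizing.
p_between = fill.cdf(32.025) - fill.cdf(32.0)

# 2) P(X > 31.97) = P(Z > -1.5).
p_above = 1 - fill.cdf(31.97)

print(round(p_between, 4), round(p_above, 4))
```

The printed values should agree with the table-based results to about four decimal places.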

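Normal Distribution:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-m}{\sigma}\right)^2} \qquad \forall x \in \mathbb{R},\; m \in \mathbb{R},\; \text{and } \sigma \in \mathbb{R}_+^*$$

m is the mean and σ is the standard deviation of X

$$\int_{-\infty}^{+\infty} f(x)\,dx = 1$$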

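If X ~ 𝒩(m1, σ1) and Y ~ 𝒩(m2, σ2) are two independent random variables

$$\Rightarrow X + Y \sim \mathcal{N}\left(m_1 + m_2,\ \sqrt{\sigma_1^2 + \sigma_2^2}\right)$$

The variances add, not the standard deviations.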
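
A quick way to see how the parameters combine is to add two `statistics.NormalDist` objects, an operation the standard library defines for independent normal variables. The parameter values below are hypothetical and chosen so that the result is easy to read.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical independent normal variables, parameterised by (mean, standard deviation).
x = NormalDist(mu=1.0, sigma=3.0)
y = NormalDist(mu=2.0, sigma=4.0)

total = x + y   # distribution of X + Y for independent X and Y

# Means add; standard deviations combine as sqrt(sigma1^2 + sigma2^2).
print(total.mean, total.stdev, sqrt(3.0**2 + 4.0**2))
```

Here the standard deviation of the sum is 5, not 3 + 4 = 7, because it is the variances (9 and 16) that add.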

Standard normal distribution:

The standard normal distribution is a normal distribution with a mean of zero and a
standard deviation of 1. If X ~ 𝒩(m, σ), then

$$U = \frac{X - m}{\sigma} \sim \mathcal{N}(0, 1), \qquad f(u) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2}u^2}$$
▪ ∀ u ∈ ℝ, F(u) = P[U ≤ u]
▪ ∀ u ∈ ℝ, F(u) + F(−u) = 1
▪ ∀ u ∈ ℝ, F(u) − F(−u) = 2F(u) − 1
▪ ∀ u ∈ ℝ, P[U ≥ u] = P[U ≤ −u] = 1 − P[U < u] = 1 − F(u)
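
The identities above can be spot-checked numerically. The sketch below evaluates them at a few arbitrary values of u using the standard normal cdf from `statistics.NormalDist`; the chosen values of u are illustrative only.

```python
from statistics import NormalDist

F = NormalDist().cdf   # cdf of the standard normal U ~ N(0, 1)

for u in (0.5, 1.45, 2.0):
    symmetry = F(u) + F(-u)    # should equal 1
    central = F(u) - F(-u)     # P(-u < U < u)
    print(u, round(symmetry, 6), round(central, 6), round(2 * F(u) - 1, 6))

right_tail = 1 - F(1.45)       # P(U >= 1.45) = 1 - F(1.45)
print(round(right_tail, 4))
```

These identities are what make a one-sided table sufficient: any probability for negative u can be rewritten in terms of F at a positive argument.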

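5.2.2. Approximating a discrete distribution by a continuous distribution (continuity correction)

When an integer-valued random variable X is approximated by a continuous random variable
X*, each event on X is translated as follows:

→ X = a : a − 1/2 < X* < a + 1/2
→ X < a : X* < a − 1/2
→ X > a : X* > a + 1/2
→ X ≤ a : X* < a + 1/2
→ X ≥ a : X* > a − 1/2
→ a < X < b : a + 1/2 < X* < b − 1/2
→ a ≤ X ≤ b : a − 1/2 < X* < b + 1/2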
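5.2.3. The normal approximation to the binomial distribution

If X ~ Binomial B(n, p) with n ≥ 30, np ≥ 5 and nq ≥ 5

=> X is approximately normally distributed, X ~ N(m, σ), where
$$m = np, \qquad \sigma = \sqrt{npq}$$

Example: if X ~ B(1000; 0.04) => n > 30, np = 40 ≥ 5 and nq = 960 ≥ 5, with
npq = 40 × 0.96 = 38.4

=> X ~ N(40, √38.4), where √38.4 ≈ 6.20

$$P(X = k) = P\left[k - \tfrac{1}{2} < X^* < k + \tfrac{1}{2}\right] = P\left[\frac{k - \frac{1}{2} - 40}{\sqrt{38.4}} < Z < \frac{k + \frac{1}{2} - 40}{\sqrt{38.4}}\right] = F\left(\frac{k + \frac{1}{2} - 40}{\sqrt{38.4}}\right) - F\left(\frac{k - \frac{1}{2} - 40}{\sqrt{38.4}}\right)$$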
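
To see how well the approximation works for B(1000; 0.04), the sketch below compares the exact binomial pmf with the continuity-corrected normal value, using m = np = 40 and σ = √(npq) = √38.4. The values of k are arbitrary illustrations.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 1000, 0.04
q = 1 - p

# Normal approximation with m = np and sigma = sqrt(npq).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * q))

def exact_pmf(k):
    """Exact binomial probability P(X = k)."""
    return comb(n, k) * p**k * q ** (n - k)

def approx_pmf(k):
    """Continuity-corrected normal value P(k - 1/2 < X* < k + 1/2)."""
    return approx.cdf(k + 0.5) - approx.cdf(k - 0.5)

for k in (30, 40, 50):
    print(k, round(exact_pmf(k), 4), round(approx_pmf(k), 4))
```

The agreement is best near the mean; in the tails the slight skewness of the binomial distribution makes the two values drift apart.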

5.2.4. The normal approximation to the Poisson distribution

If X ~ P(λ) and λ ≥ 20 => X is approximately normally distributed, X ~ N(m, σ), where
$$m = \lambda, \qquad \sigma = \sqrt{\lambda}$$
Worksheet 4&5

Exercise 1

Consider a population in which 0.1% of individuals have problems paying back their
debts.
Let X be the random variable counting the people who have this problem in a city of 4000
people.
1) What is the probability distribution of X? Determine its expected value E(X) and its
variance Var(X).
2) Calculate P(X > 2).
3) Show that it is legitimate to approximate the distribution of X by a Poisson
distribution, and determine its parameter.
4) Calculate P(X ≤ 1).

Exercise 2
The proportion of defective tubes produced by a company is 2%.
1) What is the probability distribution of X, the number of defective tubes in a sample of
200 tubes? Determine its expectation E(X) and variance Var(X).
2) Show that it is legitimate to approximate the distribution of X by a Poisson
distribution, and determine its parameter.

Exercise 3

1% of the telephone bills mailed to households each month are incorrect. A sample of 300
bills is selected for verification.
Let X be the random variable counting the number of incorrect bills in the sample.

1) What is the probability distribution of X? Determine its expected value E(X) and its
variance Var(X).

2) Calculate P(X > 2).

3) Show that it is legitimate to approximate the distribution of X by a Poisson
distribution, and determine its parameter.

4) Calculate P(X ≤ 1).

Exercise 4
X is the variable "number of kilometers before the first accident"; X follows a normal
distribution 𝓝(3; 1).
The associated standardized variable $X' = \frac{X - 3}{1}$ then follows the standard
normal distribution 𝓝(0; 1).

1) Calculate P(X > 4)
2) Calculate P(2 < X < 5)
3) Determine a if P(X > a) = 0.3

Exercise 5
In a garage containing 200 cars, we denote by X the number of cars that break down. The
probability that a car breaks down is 2%.
1) What is the probability distribution of X? Determine E(X) and Var(X).
2) Show that it is legitimate to approximate the distribution of X by a Poisson
distribution, and determine its parameter.
3) Calculate P(X > 1)
Exercise 6
The lifetime of a car is modeled by a random variable X following a normal distribution
with mean 20 and standard deviation 6: X ~ 𝓝(20; 6).
1) Calculate p(X < 10)
2) Calculate p(X>30)
3) Determine a if p(X < a) = 0.75

Exercise 7: Part I, II and III are independent

Part I:

A company pays its employees an average wage of μ $ an hour with a standard deviation
of σ $. If the wages are approximately normally distributed, find μ and σ given that:

59.87% of workers get paid less than $16 per hour and 16.85% of workers get paid more
than $18.84 per hour.

Part II:

Suppose in this part that the average wage is $3.25 an hour and the standard deviation is
60 cents. Determine:

1. The proportion of workers getting wages between $2.75 and $3.69 an hour
2. The minimum wage of the highest 4.95%

Part III:
5% of workers at a factory say they use public transportation to get to work. You randomly
select 250 factory workers and ask them if they use public transportation to get to work.

3. Find the probability that exactly 16 workers will say yes.

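Solution

Part I:
$$P(X \le 16) = P\left(Z < \frac{16 - \mu}{\sigma}\right) = 0.5987 = P(Z \le 0.25) \implies \frac{16 - \mu}{\sigma} = 0.25 \implies 0.25\sigma + \mu = 16$$
$$P(X > 18.84) = 1 - P\left(Z < \frac{18.84 - \mu}{\sigma}\right) = 0.1685 \implies P\left(Z < \frac{18.84 - \mu}{\sigma}\right) = 1 - 0.1685 = 0.8315$$
$$\implies P\left(Z < \frac{18.84 - \mu}{\sigma}\right) = F(0.96) \implies \frac{18.84 - \mu}{\sigma} = 0.96 \implies 0.96\sigma + \mu = 18.84$$
$$\begin{cases} 0.25\sigma + \mu = 16.00 \\ 0.96\sigma + \mu = 18.84 \end{cases} \implies \mu = 15 \text{ and } \sigma = 4$$
Part II:
$$P(2.75 \le X \le 3.69) = P\left(\frac{2.75 - 3.25}{0.6} < Z < \frac{3.69 - 3.25}{0.6}\right) = P(-0.83 \le Z \le 0.73) = F(0.73) - F(-0.83) = 0.7673 - 0.2033 = 0.564 \text{ or } 56.4\%$$
$$P(X > a) = 0.0495 \implies 1 - P\left(Z < \frac{a - 3.25}{0.6}\right) = 0.0495 \implies P\left(Z < \frac{a - 3.25}{0.6}\right) = 0.9505 = P(Z < 1.65) \implies \frac{a - 3.25}{0.6} = 1.65 \implies a = \$4.24$$
Part III:
X follows a binomial distribution B(n, p) with n = 250 and p = 0.05, so X ~ B(250, 0.05)
$$P(X = 16) = C_{250}^{16} \, 0.05^{16} \, 0.95^{234} = 0.0637$$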
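
The three parts of this solution can be reproduced without a table. The sketch below is one possible check, using `NormalDist.inv_cdf` for the quantiles and `math.comb` for the binomial coefficient; because it uses unrounded z-values, the Part II proportion comes out near 0.566 rather than the table-based 0.564.

```python
from math import comb
from statistics import NormalDist

std = NormalDist()   # standard normal Z ~ N(0, 1)

# Part I: two quantile equations, z1*sigma + mu = 16 and z2*sigma + mu = 18.84.
z1 = std.inv_cdf(0.5987)        # close to 0.25
z2 = std.inv_cdf(1 - 0.1685)    # close to 0.96
sigma = (18.84 - 16) / (z2 - z1)
mu = 16 - z1 * sigma

# Part II: wages ~ N(3.25, 0.60).
wage = NormalDist(mu=3.25, sigma=0.60)
share = wage.cdf(3.69) - wage.cdf(2.75)   # proportion between $2.75 and $3.69
cutoff = wage.inv_cdf(1 - 0.0495)         # minimum wage of the highest 4.95%

# Part III: exact binomial probability P(X = 16) for X ~ B(250, 0.05).
p16 = comb(250, 16) * 0.05**16 * 0.95**234

print(round(mu, 2), round(sigma, 2), round(share, 3), round(cutoff, 2), round(p16, 4))
```

Small differences from the hand computation come only from rounding z to two decimals when reading the table.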

Exercise 8-
A reservation office receives, between 10 a.m. and 12 p.m., on average 1.2 telephone
calls per minute.
1) Determine the probability that between 11 a.m. and 11:01 a.m. we have:
a) no phone calls;
b) one call;
c) two calls
2) Determine the probability of receiving 4 calls between 11 a.m. and 11:02 a.m.

1) Average = 1.2 calls per minute.

▪ X = discrete random variable representing the number of telephone calls per
minute
▪ X ~ Poisson(λ) with λ = 1.2 per minute
Between 11 a.m. and 11:01 a.m. => one minute
✓ $P(X = 0) = e^{-1.2} \, \frac{1.2^0}{0!} = 0.30119421$
✓ $P(X = 1) = e^{-1.2} \, \frac{1.2^1}{1!} = 0.36143305$
✓ $P(X = 2) = e^{-1.2} \, \frac{1.2^2}{2!} = 0.21685983$
2) Between 11 a.m. and 11:02 a.m. => two minutes
λ = 2 × 1.2 = 2.4 per two minutes
✓ $P(X = 4) = e^{-2.4} \, \frac{2.4^4}{4!} \approx 0.1254$
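
The four Poisson probabilities can be recomputed with a few lines of code. This is a minimal sketch; `poisson_pmf` is an illustrative helper, not a library function.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

rate = 1.2   # average number of calls per minute

# 1) One-minute window: lambda = 1.2.
one_minute = [poisson_pmf(k, rate) for k in (0, 1, 2)]

# 2) Two-minute window: the rate scales with the length of the interval, lambda = 2.4.
two_minutes = poisson_pmf(4, 2 * rate)

print([round(v, 8) for v in one_minute], round(two_minutes, 4))
```

The key step in part 2 is rescaling λ to the length of the window before applying the pmf.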

Exercise 2-
3% of water bottles manufactured by a factory are defective.
1) Let X be the number of defective bottles in a batch of 10 bottles
a) What is the probability distribution of X
b) Calculate the probability of the following events:
A «no defective bottles»
B « at least one 2 defective bottles »
2) Let Y be the number of defective bottles in a batch of 1000 bottles
a) Can a normal distribution be used to approximate the distribution of Y?
Determine its parameters.
b) Calculate the probability of the following events:
A «a batch has at least 40 defective bottles»
B « a batch has at most 45 defective bottles»
C « between 10 and 50 defective bottles »
3) Calculate: E(2Y − 3); V(−3Y + 1); E(Y²) and E(2X − 3Y + 4).

Solution
▪ 2 possible outcomes = {D = Defective; ND = Non-defective}
▪ P(D) = 0.03
1)
a) X ~ Bin(n = 10, p = 0.03) => $P(X = k) = C_n^k \, p^k (1 - p)^{n-k}$
b) $P(X = 0) = C_{10}^{0} \, (0.03)^0 (1 - 0.03)^{10-0} = 0.7374$
2)
a) Y ~ Bin(n = 1000, p = 0.03)
Normal approximation:
▪ n = 1000 ≥ 30
▪ np = 30 ≥ 5
▪ n(1 − p) = 970 ≥ 5
▪ then Y ~ N(m = np = 30, σ = √(np(1 − p)) ≈ 5.4)
b)
A) $P(Y \ge 40) = P(Y^* > 39.5) = P\left(Z > \frac{39.5 - 30}{5.4}\right) = P(Z > 1.76) = 1 - P(Z < 1.76) = 1 - 0.96080 = 0.0392$

B) $P(Y \le 45) = P(Y^* < 45.5) = P\left(Z < \frac{45.5 - 30}{5.4}\right) = P(Z < 2.87) = 0.99795$

C) $P(10 < Y \le 50) = P(10.5 < Y^* \le 50.5) = P\left(\frac{10.5 - 30}{5.4} \le Z \le \frac{50.5 - 30}{5.4}\right) = P(-3.61 \le Z \le 3.79)$

$= P(Z < 3.79) - P(Z < -3.61) = P(Z < 3.79) - 1 + P(Z < 3.61) = 0.99992 - 1 + 0.99985 = 0.99977$
3) E(2Y − 3) = 2 × E(Y) − 3 = 2 × 30 − 3 = 57
V(−3Y + 1) = 9 × V(Y) = 9 (σ_Y)² = 9 × (5.4)² = 262.44
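
The continuity-corrected probabilities for Y can be cross-checked with the normal cdf. In this sketch Y* is taken as N(30, √(np(1 − p))) without rounding σ to 5.4, so the results differ from the table-based values only in the later decimals.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1000, 0.03

# Normal approximation Y* with m = np = 30 and sigma = sqrt(np(1 - p)).
y_star = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

p_a = 1 - y_star.cdf(39.5)                   # P(Y >= 40)      -> P(Y* > 39.5)
p_b = y_star.cdf(45.5)                       # P(Y <= 45)      -> P(Y* < 45.5)
p_c = y_star.cdf(50.5) - y_star.cdf(10.5)    # P(10 < Y <= 50) -> P(10.5 < Y* < 50.5)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Each bound is shifted by one half in the direction that keeps the same integers inside the event, following the continuity-correction rules of section 5.2.2.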
Exercise 3-
The average number of phone calls arriving per minute at a switchboard between
10 a.m. and 12 p.m. is 3.
1) Let X be the number of telephone calls arriving per minute between 10 a.m.
and 12 p.m.
a) What is the probability distribution of X
b) Calculate the probability of the following events
A « X=2 »
B « at least 2 calls per minute»
C « at most 3 calls per minute »
2) Let Y be the number of telephone calls arriving in a quarter of an hour
between 10 am and 12 p.m.
a) Can a normal distribution be used to approximate the distribution of Y?
Determine its parameters.
b) Calculate the probability of the following events
A «at least 45 calls»
B « at most 48 calls »
D « at least 42 calls »
E « between 30 and 60 calls »
3) Calculate E(X + 4Y + 1) and E(Y² − X² − 4)

Solution

1) X: the number of telephone calls arriving per minute between 10 a.m. and 12 p.m.

a) X ~ Poisson(λ) with λ = 3 calls per minute


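b)
✓ $P(X = 2) = e^{-3} \, \frac{3^2}{2!} \approx 0.2240$
✓ $P(X \ge 2) = 1 - P(X < 2) = 1 - [P(X = 0) + P(X = 1)] = 1 - \left[e^{-3} \frac{3^0}{0!} + e^{-3} \frac{3^1}{1!}\right] \approx 0.8009$
✓ $P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = e^{-3} \frac{3^0}{0!} + e^{-3} \frac{3^1}{1!} + e^{-3} \frac{3^2}{2!} + e^{-3} \frac{3^3}{3!} \approx 0.6472$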

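2) Y: the number of telephone calls arriving in a quarter of an hour between 10 a.m.
and 12 p.m.
a) Y ~ Poisson(λ) with λ = 3 × 15 = 45 ≥ 20

=> Y ~ Poisson(45) ≈ N(45, √45)

Let $Z = \frac{Y^* - 45}{\sqrt{45}}$

b)
▪ $P(Y \ge 45) = P(Y^* > 45 - \tfrac{1}{2}) = P(Y^* > 44.5) = P\left(\frac{Y^* - 45}{\sqrt{45}} > \frac{44.5 - 45}{\sqrt{45}}\right) = P(Z > -0.07) = P(Z < 0.07) \approx 0.5279$

▪ $P(Y \le 48) = P(Y^* < 48 + \tfrac{1}{2}) = P\left(Z < \frac{48.5 - 45}{\sqrt{45}}\right) = P(Z < 0.52) \approx 0.6985$

▪ $P(Y \ge 42) = P(Y^* > 42 - \tfrac{1}{2}) = P\left(Z > \frac{41.5 - 45}{\sqrt{45}}\right) = P(Z > -0.52) = P(Z < 0.52) \approx 0.6985$

▪ $P(30 < Y < 60) = P(30.5 < Y^* < 59.5) = P(-2.16 < Z < 2.16) = 2F(2.16) - 1 \approx 0.9692$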
3)
▪ E(X + 4Y + 1) = E(X) + 4E(Y) + 1 = 3 + 4 × 45 + 1 = 184

▪ E(Y² − X² − 4) = E(Y²) − E(X²) − 4

V(Y) = E(Y²) − (E(Y))² => E(Y²) = V(Y) + (E(Y))² = 45 + 45² = 2070
Likewise, E(X²) = V(X) + (E(X))² = 3 + 3² = 12
▪ => E(Y² − X² − 4) = E(Y²) − E(X²) − 4 = 2070 − 12 − 4 = 2054
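
The whole exercise can be checked numerically. The sketch below assumes X ~ Poisson(3) and Y ~ Poisson(45), approximates Y by N(45, √45) with the continuity correction, and uses E(W²) = V(W) + (E(W))² with V(W) = E(W) = λ for a Poisson variable; `poisson_pmf` is an illustrative helper, not a library function.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam_x, lam_y = 3, 45   # calls per minute, calls per quarter of an hour

# 1) Exact Poisson probabilities for X.
p_x2 = poisson_pmf(2, lam_x)
p_x_ge2 = 1 - sum(poisson_pmf(k, lam_x) for k in range(2))
p_x_le3 = sum(poisson_pmf(k, lam_x) for k in range(4))

# 2) Normal approximation for Y with continuity correction.
y_star = NormalDist(mu=lam_y, sigma=sqrt(lam_y))
p_y_ge45 = 1 - y_star.cdf(44.5)
p_y_le48 = y_star.cdf(48.5)
p_y_ge42 = 1 - y_star.cdf(41.5)
p_y_30_60 = y_star.cdf(59.5) - y_star.cdf(30.5)

# 3) Expectations, using E(W^2) = V(W) + E(W)^2 for each Poisson variable.
e_linear = lam_x + 4 * lam_y + 1
e_squares = (lam_y + lam_y**2) - (lam_x + lam_x**2) - 4

print(round(p_x2, 4), round(p_x_ge2, 4), round(p_x_le3, 4))
print(round(p_y_ge45, 4), round(p_y_le48, 4), round(p_y_ge42, 4), round(p_y_30_60, 4))
print(e_linear, e_squares)
```

The normal-approximation values differ slightly from table look-ups because the code does not round z to two decimals.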
