
Continuous Probability Distributions

The document discusses key concepts related to the normal distribution, including:
• The normal distribution is the most important continuous probability distribution in statistics, with a bell-shaped curve.
• It is characterized by its mean and variance.
• The area under the normal curve between two values represents the probability that a random variable falls within that range.
• Tables of the standard normal distribution are used to find these areas by transforming the random variable into a standard normal variable.
• The normal distribution can also be used to approximate the binomial distribution for large values of n.

Uploaded by

raachelong

Lecture 6

Some continuous probability distributions

CH2010 Engineering Statistics PL AY2023v6 1


The quincunx machine
https://www.mathsisfun.com/data/quincunx.html



6.1 Normal distribution
The normal distribution, also known as the Gaussian distribution, is the most important
continuous probability distribution in the entire field of statistics.
It has a “bell-shaped” curve called the normal curve.

[Figure: the normal curve, with a point of inflection marked.]



6.1 Normal distribution
Naturally occurring variations



6.1 Normal distribution
Student scores



6.1 Normal distribution
• A continuous random variable X having the normal distribution is a
normal random variable.
• It has mean μ and variance σ².
• We denote the density of X by n(x; μ, σ).

$$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}, \quad -\infty < x < \infty$$



6.1 Normal distribution
• The mode occurs at x = μ.
• The curve is symmetric about x = μ.
• The points of inflection are at x = μ ± σ. The curve is concave downward for
μ – σ < x < μ + σ and concave upward otherwise.
• The normal curve approaches the horizontal axis asymptotically as we
proceed in either direction away from the mean.
• The total area under the curve and above the horizontal axis equals 1.
• The standard deviation is σ.
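
The properties above can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper names `normal_pdf` and `area_under_curve` and the values μ = 40, σ = 6 are assumptions chosen only for illustration, and the area is estimated with a simple midpoint rule.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density n(x; mu, sigma) of the normal distribution."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def area_under_curve(mu, sigma, half_width=10.0, steps=100_000):
    """Midpoint-rule estimate of the area over mu +/- half_width * sigma."""
    a = mu - half_width * sigma
    h = 2.0 * half_width * sigma / steps
    return h * sum(normal_pdf(a + (i + 0.5) * h, mu, sigma) for i in range(steps))

mu, sigma = 40.0, 6.0  # illustrative values (assumption)

total_area = area_under_curve(mu, sigma)
# Symmetry about x = mu: equal density at equal distances either side.
is_symmetric = math.isclose(normal_pdf(mu - 3.0, mu, sigma),
                            normal_pdf(mu + 3.0, mu, sigma))
# Mode at x = mu: the density at mu exceeds the density just beside it.
mode_is_mean = normal_pdf(mu, mu, sigma) > max(normal_pdf(mu - 0.1, mu, sigma),
                                               normal_pdf(mu + 0.1, mu, sigma))
print(total_area, is_symmetric, mode_is_mean)
```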



6.2 Areas under the normal curve
• The area under the curve for x1 ≤ x ≤ x2 is the probability that the random variable
X assumes a value between x1 and x2:
$$P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} n(x; \mu, \sigma)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\,dx$$

This is represented by the area of the shaded region.



6.2 Area under the curve
• The integral
$$\frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\,dx$$
cannot be evaluated analytically.

• One would usually use tabulated values of the integral to solve problems
associated with P(x1 ≤ X ≤ x2).
• This means we would need a separate table for each conceivable pair of
values of μ and σ, which is practically impossible!
• Alternatively, we can “normalise” the normal distributions by introducing a
standardised normal random variable Z:
$$Z = \frac{X - \mu}{\sigma}$$

6.2 Areas under the curve
• For X = x, Z = z = (x – μ)/σ.
• Therefore, x1 ≤ x ≤ x2 can be transformed into z1 ≤ z ≤ z2, where z1 = (x1 – μ)/σ and
z2 = (x2 – μ)/σ.
• Thus,
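$$P(x_1 \le X \le x_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-\frac{1}{2}z^2}\,dz$$
$$= \int_{z_1}^{z_2} n(z; 0, 1)\,dz = P(z_1 \le Z \le z_2)$$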

where Z is a normal random variable with mean 0 and variance 1.
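
The equality between the two integrals can be illustrated numerically. This is a sketch under stated assumptions rather than part of the lecture: the function names and the parameters (μ = 50, σ = 10, x1 = 45, x2 = 62) are hypothetical, and the cumulative standard normal distribution is evaluated with the error function from Python's standard library instead of a table.

```python
import math

def phi(z):
    """Cumulative standard normal distribution P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_interval_probability(x1, x2, mu, sigma):
    """P(x1 <= X <= x2) for X ~ n(x; mu, sigma), computed by standardising."""
    z1 = (x1 - mu) / sigma
    z2 = (x2 - mu) / sigma
    return phi(z2) - phi(z1)

def direct_integral(x1, x2, mu, sigma, steps=20_000):
    """Midpoint-rule integral of the normal density over [x1, x2], for comparison."""
    h = (x2 - x1) / steps
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return h * sum(
        coeff * math.exp(-((x1 + (i + 0.5) * h - mu) ** 2) / (2.0 * sigma ** 2))
        for i in range(steps)
    )

# Hypothetical parameters chosen only for illustration.
p_standardised = normal_interval_probability(45.0, 62.0, mu=50.0, sigma=10.0)
p_direct = direct_integral(45.0, 62.0, mu=50.0, sigma=10.0)
print(round(p_standardised, 4), round(p_direct, 4))
```

Both routes should agree to several decimal places, which is why a single table for Z suffices for every μ and σ.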



6.2 Areas under the curve
The distribution of Z is the standard normal distribution.
The values of the cumulative standard normal distribution,
$$P(Z < z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{1}{2}t^2}\,dt,$$
are tabulated in Table A.3.
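
Where the table is not at hand, the same cumulative values can be computed. The sketch below assumes the standard identity Φ(z) = ½[1 + erf(z/√2)]; the helper name `phi` and the choice of z values (those used in the worked examples of this lecture) are illustrative and not taken from Table A.3 itself.

```python
import math

def phi(z):
    """Cumulative standard normal distribution P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# z values that appear in the worked examples of this lecture.
table = {z: round(phi(z), 4) for z in (0.85, 0.86, 1.08, 1.84, 1.97)}
for z, p in table.items():
    print(f"P(Z < {z:.2f}) = {p:.4f}")
```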



Example 6.1
Given the standard normal distribution, find the area under the curve that lies
(a) to the right of z = 1.84, and
(b) between z = −1.97 and z = 0.86.

Answer
(a) The area to the right of z = 1.84 equals 1 minus the
area to the left of z = 1.84, which can be found in
Table A.3. From Table A.3, P(Z ≤ 1.84) = 0.9671.
Therefore P(Z > 1.84) = 1 – 0.9671 = 0.0329.



Example 6.1 (cont.)
(b) The area between z = −1.97 and z = 0.86 equals the area
to the left of z = 0.86 minus the area to the left of z = −1.97.
Table A.3 only tabulates positive values of z. However, the
standard normal distribution is symmetric about z = 0.
Therefore, P(Z ≤ −1.97) = P(Z ≥ 1.97) = 1 – P(Z ≤ 1.97).
From Table A.3, P(Z ≤ 1.97) = 0.97558.
P(Z ≤ −1.97) = 1 – 0.97558 = 0.0244.
P(Z ≤ 0.86) = 0.8051.
Therefore P(−1.97 ≤ Z ≤ 0.86) = 0.8051 – 0.0244 = 0.7807.
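
As a cross-check on the table look-ups in Example 6.1, the two areas can be computed directly. This is only a sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

area_right = 1.0 - Z.cdf(1.84)             # part (a): area to the right of 1.84
area_between = Z.cdf(0.86) - Z.cdf(-1.97)  # part (b): area between -1.97 and 0.86
print(round(area_right, 4), round(area_between, 4))
```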
Example 6.2
Given a normal distribution with µ = 40 and σ = 6, find the value of x that has:
(a) 45% of the area to the left and
(b) 14% of the area to the right.

Answer
(a) Table A.3 only tabulates positive z with
cumulative probability greater than 50%.
However, we can exploit the fact that the
standard normal distribution is symmetric
about z = 0.



Example 6.2 (cont.)
Answer
(a) To find the z value with 45% of the area to its left, we first find the
value of –z with 55% of the area to its left.
From Table A.3, –z = 0.13 for P = 0.55. Thus z = –0.13.
Since z = (x – µ)/σ,
x = σz + µ = (6)(–0.13) + 40 = 39.22.



Example 6.2 (cont.)
Answer
(b) Table A.3 tabulates probabilities for Z < z.
Therefore we need to look for the area to the
left of the z value, that is 1 – 0.14 = 0.86.
From Table A.3, z = 1.08.
Therefore, x = σz + µ = (6)(1.08) + 40 =
46.48.
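
The reverse look-up in Example 6.2 (from an area to an x value) can be sketched with the inverse cumulative distribution function. This assumes `statistics.NormalDist.inv_cdf` from the Python standard library; the variable names are illustrative. Because the code does not round z to two decimal places as the table does, its results differ slightly from 39.22 and 46.48.

```python
from statistics import NormalDist

X = NormalDist(mu=40, sigma=6)

x_a = X.inv_cdf(0.45)        # (a) 45% of the area to the left
x_b = X.inv_cdf(1.0 - 0.14)  # (b) 14% to the right means 86% to the left
print(round(x_a, 2), round(x_b, 2))
```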



Example 6.3
Light bulbs manufactured on a production line have a life, before burn-out, that is
normally distributed with a mean of 800 hours and a standard deviation of 40 hours.
Find the probability that a bulb burns out between 778 and 834 hours.



Example 6.3 (cont.)
Answer
We first need to transform the problem into a standard normal distribution problem.
The z values corresponding to x1 = 778 and x2 = 834 are:
z1 = (778 – 800)/40 = –0.55 and z2 = (834 – 800)/40 = 0.85.
Using Table A.3, with the technique demonstrated in the previous examples, the P values
corresponding to Z < –0.55 and Z < 0.85 are estimated to be 0.2912 and 0.8023,
respectively.
Therefore,
P(778 < X < 834) = P(–0.55 < Z < 0.85)
= 0.8023 – 0.2912 = 0.5111
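
The calculation in Example 6.3 can be reproduced in a few lines. This is a sketch assuming `statistics.NormalDist` from the Python standard library; the names `life`, `z1`, `z2` and `p_burn_out` are illustrative.

```python
from statistics import NormalDist

life = NormalDist(mu=800, sigma=40)  # bulb life in hours

# Standardised limits, as computed by hand in the example.
z1 = (778 - life.mean) / life.stdev
z2 = (834 - life.mean) / life.stdev

# P(778 < X < 834), evaluated directly on the original scale.
p_burn_out = life.cdf(834) - life.cdf(778)
print(z1, z2, round(p_burn_out, 4))
```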
6.3 Normal approximation to the binomial
• The normal distribution is often a good approximation to a discrete distribution, if
the discrete distribution is symmetric and bell-shaped.
• Last week, we covered the binomial distribution, which looks symmetric and
bell-shaped for large n (e.g. n > 15).
• Therefore, the binomial distribution for large n can be approximated by the normal
distribution.
• If X is a binomial random variable with mean μ = np and variance σ² = npq, then
the limiting form of the distribution of
$$Z = \frac{X - np}{\sqrt{npq}}$$
as n → ∞ is the standard normal distribution n(z; 0, 1).
Example 6.4
For b(x;15,0.4), μ = np = (15)(0.4) = 6 and variance σ² = npq = (15)(0.4)(0.6) = 3.6.
Overlaying the histogram of b(x;15,0.4) and n(x;6,√3.6), we get:



Example 6.4
Now let’s examine the accuracy of the approximation.
For the binomial distribution b(x;15,0.4):
$$P(X = 4) = \binom{15}{4}\, 0.4^{4}\, (1 - 0.4)^{15-4} = 0.1268$$
For the normal distribution approximation n(x;6,√3.6) = n(x;6,1.897), we need to
convert the density function into a probability, which is the area under the curve
between x1 = 3.5 and x2 = 4.5. Then, transform X to Z:
$$z_1 = \frac{3.5 - 6}{1.897} = -1.32 \quad\text{and}\quad z_2 = \frac{4.5 - 6}{1.897} = -0.79$$
$$\therefore P(X = 4) \approx P(-1.32 \le Z \le -0.79)$$
Example 6.4 (cont.)
Looking up Table A.3, P(Z < – 0.79) = 0.2148, P(Z < –1.32) = 0.0934.
Therefore, P(X = 4) ≈ 0.2148 – 0.0934 = 0.1214.
The approximation is fairly close to the exact value of 0.1268.
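
The comparison in Example 6.4 can be sketched in code. The snippet below is illustrative rather than the lecturer's method: it uses `math.comb` and `statistics.NormalDist` from the Python standard library, and it keeps the unrounded z values, so the approximate probability differs slightly from the table-based 0.1214.

```python
import math
from statistics import NormalDist

n, p, x = 15, 0.4, 4
mu = n * p                          # np = 6
sigma = math.sqrt(n * p * (1 - p))  # sqrt(npq) = sqrt(3.6)

# Exact binomial probability P(X = 4).
exact = math.comb(n, x) * p**x * (1 - p) ** (n - x)

# Normal approximation: area under n(x; 6, sqrt(3.6)) between 3.5 and 4.5.
approx_dist = NormalDist(mu, sigma)
approx = approx_dist.cdf(x + 0.5) - approx_dist.cdf(x - 0.5)

print(round(exact, 4), round(approx, 4))
```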



6.3 Normal approximation to the binomial
Why are so many (naturally occurring) variations normally distributed, such as
the height of a specific population?
• Many factors affect one’s height, e.g. sleep, nutrition, genetic factors,
exercise, etc.
• When the randomness of so many “binomial processes” (large n) stacks up, it
naturally results in a normal distribution.



6.3 Normal approximation to the binomial
For small n, the approximation is less accurate, as the distribution no longer assumes
a symmetrical bell shape.

[Figure: histograms for b(x; 6, 0.2) and b(x; 15, 0.2).]





6.4 Lognormal distribution
• The lognormal distribution has a variety of applications and takes the form:
$$f(x; \mu, \sigma) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-\frac{1}{2\sigma^2}[\ln(x) - \mu]^2}, & x > 0 \\ 0, & x \le 0 \end{cases}$$

• ln(X) is normally distributed with mean μ and standard deviation σ.
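• The random variable X is lognormally distributed with mean and variance
$$E(X) = e^{\mu + \sigma^2/2} \quad\text{and}\quad \mathrm{Var}(X) = e^{2\mu + \sigma^2}\,(e^{\sigma^2} - 1)$$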



Example 6.5
Concentration of pollutants from chemical plants exhibits behaviour that resembles
a lognormal distribution. This is important when one considers compliance with
government regulations. Suppose it is assumed that the concentration of a certain
pollutant, in parts per million (ppm), has a lognormal distribution with μ = 3.2 and σ
= 1. What is the probability that the concentration exceeds 8 ppm?
Solution
Let the random variable X be the pollutant concentration. Then
P(X > 8) = 1 – P(X ≤ 8)



Example 6.5 (cont.)
Since X has a lognormal distribution, ln(X) follows a normal distribution with μ =
3.2 and σ = 1. Therefore, we can introduce a new random variable Z:
$$Z = \frac{\ln(X) - \mu}{\sigma} = \frac{\ln(X) - 3.2}{1} = \ln(X) - 3.2$$
Thus, Z follows a standard normal distribution.
$$\therefore P(X \le 8) = P(Z \le \ln(8) - 3.2) = \Phi(-1.12) = 0.1314,$$
where Φ denotes the cumulative probability distribution function of the standard
normal distribution.
Hence, the probability that the pollutant concentration exceeds 8 ppm is 1 – 0.1314
= 0.8686.
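
Example 6.5 can be checked with a short calculation. This is a sketch assuming `statistics.NormalDist` from the Python standard library for Φ; the variable names are illustrative, and z is not rounded to −1.12, so the result agrees with 0.8686 only to about three decimal places.

```python
import math
from statistics import NormalDist

mu, sigma = 3.2, 1.0  # parameters of ln(X), taken from the example

# Standardise ln(8) and evaluate the upper-tail probability P(X > 8).
z = (math.log(8.0) - mu) / sigma
p_exceeds = 1.0 - NormalDist().cdf(z)
print(round(z, 2), round(p_exceeds, 4))
```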
6.5 Exponential distribution
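• The exponential distribution has the following density function:

$$f(x; \beta) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$

where β > 0. The mean and variance are:

$$\mu = \beta \quad\text{and}\quad \sigma^2 = \beta^2.$$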



Relationship to the Poisson process
Recall the Poisson process (Poisson experiment):
• A Poisson process gives the probability of observing x successes (or failures)
in time t, where the average number of successes per unit time is λ.
• For a Poisson process, the probability of getting no success (x = 0) in time t is:

$$p(0; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^0}{0!} = e^{-\lambda t}$$
• Therefore, the probability of getting one or more successes in time t is:
$$P(x \ge 1) = 1 - e^{-\lambda t}$$



Relationship to the Poisson process
• Let τ be the random variable for the time taken for the first success to occur. Its
cumulative distribution function is:
$$P(0 \le \tau \le t) = F(t) = 1 - e^{-\lambda t}$$
• Then τ has probability density function:
$$f(t) = \frac{dF}{dt} = \lambda e^{-\lambda t}$$
• Renaming the variable from t to x:
$$f(x) = \lambda e^{-\lambda x}$$
• This is the density function of the exponential distribution with λ = 1/β.
• The random variable x is the time required for a Poisson process to yield its
first success.
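
The step from the cumulative function to the density can be checked numerically. The sketch below is illustrative only: the rate λ = 0.2 (so β = 5) and the evaluation point t = 3 are assumed values, and the derivative is approximated by a central difference.

```python
import math

lam = 0.2  # hypothetical rate: 0.2 successes per unit time, so beta = 1/lam = 5

def cdf(t):
    """F(t) = 1 - exp(-lam * t): probability that the first success occurs by time t."""
    return 1.0 - math.exp(-lam * t)

def pdf(t):
    """f(t) = lam * exp(-lam * t): the exponential density."""
    return lam * math.exp(-lam * t)

t, h = 3.0, 1e-6
# Central-difference estimate of dF/dt at t, to compare with f(t).
numerical_derivative = (cdf(t + h) - cdf(t - h)) / (2.0 * h)
print(numerical_derivative, pdf(t))
```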



Memoryless property
The Poisson process and the exponential distribution are both memoryless.
• In a Poisson process p(x; λt), the expected number of event occurrences per unit
“time” is λ.
• The expected number of event occurrences in the interval from 0 to t is λt.
• The expected number of event occurrences in the interval from t to 2t is also λt.
• In a Poisson process, the time already elapsed does not make the chance of an event
occurring per unit time any more likely, i.e. the process is memoryless.
• In reality, not all processes are memoryless: older components may become more
likely to fail owing to wear and tear.
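
For the exponential distribution, the memoryless property can be written as P(T > s + t | T > s) = P(T > t). A minimal numerical sketch follows; the mean β = 5 and the times s = 4 and t = 3 are assumed values chosen only for illustration.

```python
import math

beta = 5.0  # hypothetical mean time between events

def survival(t):
    """P(T > t) for an exponential distribution with mean beta."""
    return math.exp(-t / beta)

s, t = 4.0, 3.0
conditional = survival(s + t) / survival(s)  # P(T > s + t | T > s)
unconditional = survival(t)                  # P(T > t)
print(conditional, unconditional)
```

The two printed values should agree up to floating-point rounding: having already waited s units of time does not change the distribution of the remaining wait.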



Example 6.6
Suppose a system contains a certain type of component whose time, in years, to
failure is given by T. The random variable T is modelled nicely by the exponential
distribution with mean time to failure β = 5. If 5 of these components are installed in
different systems, what is the probability that at least 2 are still functioning at the
end of an 8-year period?
Solution
The probability that a specific component still functions after 8 years is:
$$P(T > 8) = \frac{1}{5} \int_{8}^{\infty} e^{-t/5}\,dt = e^{-8/5} \approx 0.2.$$



Example 6.6 (cont.)
Whether a component still functions or has failed after 8 years corresponds to a Bernoulli
experiment. Therefore, the number X of functioning units follows a binomial distribution,
with n = 5 (5 components studied) and p ≃ 0.2 (the probability that a unit is still
functioning after 8 years).
$$P(X \ge 2) = \sum_{x=2}^{5} b(x; 5, 0.2) = \sum_{x=0}^{5} b(x; 5, 0.2) - \sum_{x=0}^{1} b(x; 5, 0.2) = 1 - 0.7373 = 0.2627$$
• The first sum is the probability of at least 2 (and up to 5) functioning units.
• The second sum is the cumulative distribution of 0, 1, 2, 3, 4 or 5 units functioning,
i.e. the sum of the entire binomial distribution, which is unity.
• The third sum is the cumulative distribution of 0 or 1 units functioning, which can be
looked up in Table A.1.
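
Example 6.6 combines the exponential and binomial distributions, and the whole chain can be sketched in code. The helper names `binomial_pmf` and `at_least_two` are hypothetical. The sketch evaluates the answer twice: once with the rounded p = 0.2 used with Table A.1, and once with the unrounded e^(−8/5), which gives a slightly different probability.

```python
import math

def binomial_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n Bernoulli trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def at_least_two(p, n=5):
    """P(X >= 2) = 1 - [b(0; n, p) + b(1; n, p)]."""
    return 1.0 - sum(binomial_pmf(x, n, p) for x in (0, 1))

p_survive = math.exp(-8 / 5)  # P(T > 8) for mean time to failure beta = 5
p_rounded = 0.2               # rounded value used in the example

answer_rounded = at_least_two(p_rounded)
answer_unrounded = at_least_two(p_survive)
print(round(p_survive, 4), round(answer_rounded, 4), round(answer_unrounded, 4))
```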
6.6 Chi-square distribution
The chi-squared distribution has the following density function:

$$f(x; v) = \begin{cases} \dfrac{1}{2^{v/2}\,\Gamma(v/2)}\, x^{v/2 - 1} e^{-x/2}, & x > 0, \\ 0, & \text{elsewhere}, \end{cases}$$

where Γ(α) is the gamma function defined by:

$$\Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha - 1} e^{-x}\,dx, \quad \text{for } \alpha > 0.$$
(not examinable)



6.6 Chi-square distribution
The chi-squared distribution is specified by a single parameter, v, known as the
degrees of freedom.
The chi-squared distribution is very important in statistical hypothesis testing and
estimation.
The mean and variance of the chi-squared distribution are

$$\mu = v \quad\text{and}\quad \sigma^2 = 2v$$

The cumulative probability of the chi-squared distribution is given in Table A.5.
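
The stated mean and variance can be verified numerically from the density. The following is a sketch under stated assumptions: v = 4 is an arbitrary illustrative choice, the helper names are hypothetical, `math.gamma` from the Python standard library supplies Γ, and the integrals are estimated with a midpoint rule truncated at x = 200.

```python
import math

def chi2_pdf(x, v):
    """Chi-squared density with v degrees of freedom."""
    if x <= 0:
        return 0.0
    return x ** (v / 2 - 1) * math.exp(-x / 2) / (2 ** (v / 2) * math.gamma(v / 2))

def moment(v, power, upper=200.0, steps=200_000):
    """Midpoint-rule estimate of the integral of x**power * f(x; v) over (0, upper)."""
    h = upper / steps
    return h * sum(((i + 0.5) * h) ** power * chi2_pdf((i + 0.5) * h, v)
                   for i in range(steps))

v = 4  # illustrative degrees of freedom (assumption)
area = moment(v, 0)                    # should be close to 1
mean = moment(v, 1)                    # should be close to v
variance = moment(v, 2) - mean ** 2    # should be close to 2v
print(round(area, 4), round(mean, 4), round(variance, 4))
```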



Summary (1)
• Normal distribution
• Density function
$$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}, \quad -\infty < x < \infty$$
• Mean = μ and variance = σ²

• Standard normal distribution
• Density function
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}, \quad -\infty < z < \infty$$
• Mean = 0 and variance = 1

• Transformation from normal distribution: Z = (X – μ)/σ

• Cumulative function tabulated in Table A.3.



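Summary (2)
• Log-normal distribution
• Density function
$$f(x; \mu, \sigma) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-\frac{1}{2\sigma^2}[\ln(x) - \mu]^2}, & x > 0 \\ 0, & x \le 0 \end{cases}$$
• Mean and variance
$$E(X) = e^{\mu + \sigma^2/2} \quad\text{and}\quad \mathrm{Var}(X) = e^{2\mu + \sigma^2}\,(e^{\sigma^2} - 1)$$

• Exponential distribution
• Density function
$$f(x; \beta) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$
• Mean and variance
$$\mu = \beta \quad\text{and}\quad \sigma^2 = \beta^2$$

• Chi-squared distribution
• Density function
$$f(x; v) = \begin{cases} \dfrac{1}{2^{v/2}\,\Gamma(v/2)}\, x^{v/2 - 1} e^{-x/2}, & x > 0, \\ 0, & \text{elsewhere} \end{cases}$$
• Mean and variance
$$\mu = v \quad\text{and}\quad \sigma^2 = 2v$$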