Unit 4
Introduction
Welcome to this unit. In Unit 3, we discussed various discrete probability distributions, how their
probabilities are calculated and how they are applied in real-life situations. In this unit, we treat
some specific continuous distributions and show how their probabilities are calculated and applied
in real-life situations. The most important of these, which we consider in this unit, are the
Normal, Uniform, Exponential and Gamma distributions. Unit 4 is structured under the sections
that follow.
Objectives
By the end of this unit, you should be able to:
Identify each of the continuous distributions under study by their properties
Use each of the continuous distributions to solve problems on probabilities
Apply the specific continuous probability distributions in real life situations
Section 1: Normal Distribution
Introduction
Once again, we welcome you to Section 1 of Unit 4. In this section, we consider the most
important distribution in probability and statistics, the Normal distribution. We will enumerate
the properties of the normal distribution that will enable you to identify a distribution as
normal. We then proceed to show how probabilities are calculated for the normal distribution and
how they are applied in real-life situations. We wish you well in this section.
Objectives
By the end of this section, you should be able to:
State the properties of the normal distribution
Identify the normal distribution as a continuous probability distribution
Apply the normal probability distribution in real life situations
The normal distribution is often referred to as the Gaussian distribution in honour of Gauss
(1777-1855), who derived its equation from a study of errors in repeated measurement of the
same quantity. The normal distribution is the most important continuous distribution in the
entire field of statistics. Its graph, called the normal curve, is the bell-shaped curve (Fig. 4.1)
which describes approximately many phenomena that occur in nature, industry and research.
Physical measurements in areas such as meteorological experiments, rainfall studies and
measurements of manufactured parts are often more than adequately explained with a normal
distribution. In addition, errors in scientific measurements are extremely well approximated by a
normal distribution.
A random variable X having the bell-shaped distribution above is called a normal random
variable. The mathematical equation for the probability distribution of the continuous normal
variable depends on two parameters, $\mu$ and $\sigma$, its mean and standard deviation. Hence, we shall
denote the density function of X by $n(x; \mu, \sigma)$.
Definition The probability density function of the normal random variable with mean $\mu$ and
variance $\sigma^2$ is given by
$$n(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad \text{for } -\infty < x < \infty.$$
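The density formula can be evaluated directly. The following is a minimal Python sketch, assuming illustrative parameter values that do not come from the text; the helper name `normal_pdf` is hypothetical. It compares a hand-coded version of the formula with the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density n(x; mu, sigma) of the normal distribution."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


# Illustrative parameters (assumed for this sketch, not taken from the text).
mu, sigma = 50.0, 10.0

# The curve is highest at x = mu and symmetric about it.
peak_by_formula = normal_pdf(mu, mu, sigma)
peak_by_library = NormalDist(mu, sigma).pdf(mu)
left = normal_pdf(mu - 7.0, mu, sigma)
right = normal_pdf(mu + 7.0, mu, sigma)

print(peak_by_formula, peak_by_library)
print(left, right)
```

Both peak values agree, and the two values printed on the second line are equal because the curve is symmetric about the mean.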
The two curves are identical in form but are centred at different positions along the horizontal
axis.
In Fig 4.3, we have sketched two normal curves with the same mean but different standard
deviations. This time, we see that the two curves are centred at exactly the same position on the
horizontal axis, but the curve with the larger standard deviation is lower and spreads out further.
We remind you that the area under the probability curve must be equal to 1, and therefore the
more variable the set of observations the lower and wider the corresponding curve will be.
Fig 4.4 shows the results of sketching two normal curves having different means and different
standard deviations.
Figure 4.4: Normal curves with 𝜇1 < 𝜇2 and 𝜎1 < 𝜎2
You can see that they are centred at different positions on the horizontal axis and their shapes
reflect the two different values of $\sigma$.
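We now show that the parameters $\mu$ and $\sigma^2$ are indeed the mean and the variance of the normal
distribution. To evaluate the mean, we write
$$E(X) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} x\, e^{-\frac{1}{2}[(x-\mu)/\sigma]^2}\, dx.$$
Setting $z = (x-\mu)/\sigma$, so that $x = \mu + \sigma z$ and $dx = \sigma\, dz$, we obtain
$$E(X) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (\mu + \sigma z)\, e^{-z^2/2}\, dz
= \mu\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-z^2/2}\, dz + \frac{\sigma}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z\, e^{-z^2/2}\, dz.$$
The first term on the right is $\mu$ times the area under a normal curve with mean zero and variance
1, and hence equal to $\mu$. By straightforward integration, the second term is equal to 0. Hence
$$E(X) = \mu.$$
The variance of the normal distribution is given by
$$E[(X-\mu)^2] = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} (x-\mu)^2\, e^{-\frac{1}{2}[(x-\mu)/\sigma]^2}\, dx.$$
Again setting $z = (x-\mu)/\sigma$ and $dx = \sigma\, dz$, we obtain
$$E[(X-\mu)^2] = \frac{\sigma^2}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^2 e^{-z^2/2}\, dz.$$
Integrating by parts with $u = z$ and $dv = z e^{-z^2/2}\, dz$, so that $du = dz$ and $v = -e^{-z^2/2}$, we find
$$E[(X-\mu)^2] = \frac{\sigma^2}{\sqrt{2\pi}} \left( -z e^{-z^2/2} \Big|_{-\infty}^{\infty} + \int_{-\infty}^{\infty} e^{-z^2/2}\, dz \right) = \sigma^2 (0 + 1) = \sigma^2.$$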
Many random variables have probability distributions that can be described adequately by the
normal curve once $\mu$ and $\sigma^2$ are specified. The normal distribution finds enormous application
as a limiting distribution. Under certain conditions the normal distribution provides a good
continuous approximation to the binomial and the hypergeometric distributions.
Section 2: Areas under the Normal Curve
Introduction
Dear student, we learnt in Unit 1, Section 3 that in order to find a probability for a continuous
random variable given its probability density function (p.d.f.), we need to integrate the p.d.f. over
the given interval. The difficulty encountered in solving integrals of normal density functions
necessitates the tabulation of normal curve areas for quick reference, for the special case where
$\mu = 0$ and $\sigma = 1$. In this section, we will learn how to determine probabilities for a normal
distribution without integrating the p.d.f. given above, which is tedious to calculate.
Objectives
By the end of this section, you should be able to
Calculate the probability of a given standard normal distribution
Transform an unstandardized normal distribution to the standard form
Find the probability of any normal probability distribution
The curve of any continuous probability distribution or density function is constructed so that the
area under the curve bounded by the two ordinates $x = x_1$ and $x = x_2$ equals the probability that
the random variable X assumes a value between $x = x_1$ and $x = x_2$. Thus, for the normal curve
in Fig. 4.5,
$$P(x_1 < X < x_2) = \int_{x_1}^{x_2} n(x; \mu, \sigma)\, dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx$$
is the area under the curve between $x_1$ and $x_2$.
Dear student, you will have realised that the normal curve depends on the mean $\mu$ and the
standard deviation $\sigma$ of the distribution under investigation. The difficulty encountered in
solving integrals of normal density functions necessitates the tabulation of normal curve areas for
quick reference. However, it would be a tedious task to attempt to set up separate tables for every
conceivable value of $\mu$ and $\sigma$. Fortunately, we are able to transform all the observations of
any normal random variable X to a new set of observations of a normal random variable Z with
mean 0 and variance 1. This can be done by means of the transformation
$$Z = \frac{X - \mu}{\sigma}.$$
Whenever X assumes a value $x$, the corresponding value of Z is given by $z = (x - \mu)/\sigma$.
Therefore, if X falls between the values $x = x_1$ and $x = x_2$, the random variable Z will fall
between the corresponding values $z_1 = (x_1 - \mu)/\sigma$ and $z_2 = (x_2 - \mu)/\sigma$. Consequently, we may
write
$$P(x_1 < X < x_2) = \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx
= \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-z^2/2}\, dz
= \int_{z_1}^{z_2} n(z; 0, 1)\, dz = P(z_1 < Z < z_2),$$
where Z is seen to be a normal random variable with mean 0 and variance 1.
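The equality of the two probabilities can be checked numerically. The sketch below is a minimal illustration in Python, with parameter values chosen only for demonstration (they are assumptions, not values from the text) and a hypothetical helper name `prob_via_standardisation`.

```python
from statistics import NormalDist


def prob_via_standardisation(x1: float, x2: float, mu: float, sigma: float) -> float:
    """P(x1 < X < x2), computed by converting to z values first."""
    z1 = (x1 - mu) / sigma
    z2 = (x2 - mu) / sigma
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(z2) - standard_normal.cdf(z1)


# Illustrative values (assumed for this sketch).
mu, sigma = 100.0, 15.0
x1, x2 = 85.0, 130.0

via_z = prob_via_standardisation(x1, x2, mu, sigma)

# The same probability computed directly on the original scale.
original = NormalDist(mu, sigma)
direct = original.cdf(x2) - original.cdf(x1)

print(via_z, direct)
```

The two printed values coincide, which is why a single table for the standard normal distribution is enough.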
To find the probability that a random variable having the standard normal distribution will take
on a value between a and b, we use the relation
$$P(a < Z < b) = F(b) - F(a),$$
where $F(z) = P(Z \le z)$, and, when either a or b is negative, the identity
$$F(-z) = 1 - F(z),$$
which holds due to the symmetry of the normal curve. It is very important to learn how to read the normal
table. Almost all books on statistics and statistical tables contain standard normal distribution
tables.
Example 4.1
Find the probability that a random variable having the standard normal distribution will take on a
value:
1. less than 1.72
2. less than –2.56
3. between 0.87 and 1.28
4. between –0.34 and 0.62
Solution
In solving problems involving normal distributions, it is necessary to sketch the normal curve
and indicate the area of interest.
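1. We look up the entry corresponding to z = 1.72 in the normal
table, add 0.5000 (see Fig. 4.6a), and get
0.4573 + 0.5000 = 0.9573.
Fig. 4.6a: Area under the standard normal curve to the left of z = 1.72
Fig. 4.6b: Area under the standard normal curve to the left of z = −2.56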
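The four probabilities asked for in Example 4.1 can also be obtained without tables. The following Python sketch uses the standard library's `statistics.NormalDist`; the variable names are illustrative only, and the computed values may differ from table values in the fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# 1. P(Z < 1.72): area to the left of 1.72
p_less_than_1_72 = Z.cdf(1.72)

# 2. P(Z < -2.56): area to the left of -2.56
p_less_than_neg_2_56 = Z.cdf(-2.56)

# 3. P(0.87 < Z < 1.28): difference of two left-hand areas
p_between_0_87_and_1_28 = Z.cdf(1.28) - Z.cdf(0.87)

# 4. P(-0.34 < Z < 0.62)
p_between_neg_0_34_and_0_62 = Z.cdf(0.62) - Z.cdf(-0.34)

for label, p in [
    ("less than 1.72", p_less_than_1_72),
    ("less than -2.56", p_less_than_neg_2_56),
    ("between 0.87 and 1.28", p_between_0_87_and_1_28),
    ("between -0.34 and 0.62", p_between_neg_0_34_and_0_62),
]:
    print(f"{label}: {p:.4f}")
```

The first value agrees with the table-based result 0.9573 obtained in part 1 of the solution.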
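If in a given problem $\mu \neq 0$ or $\sigma \neq 1$, then there is the need to standardise the distribution by
means of the transformation
$$Z = \frac{X - \mu}{\sigma},$$
where $\mu$ is the mean, $\sigma$ is the standard deviation and $x$ is a value of the random variable X. Therefore, if X falls
between the values $x = x_1$ and $x = x_2$, the random variable Z will fall between the corresponding
values
$$z_1 = \frac{x_1 - \mu}{\sigma} \quad \text{and} \quad z_2 = \frac{x_2 - \mu}{\sigma}.$$
Consequently, we may write
$$P(x_1 < X < x_2) = \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx
= \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-z^2/2}\, dz = P(z_1 < Z < z_2),$$
where Z is seen to be a normal random variable with $\mu = 0$ and $\sigma = 1$, and its table entry is the value of
$$F(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\, dt.$$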
To determine probabilities relating to random variables having normal distributions other than
the standard normal distribution, we make use of the transformation $z = (x - \mu)/\sigma$.
Example 4.2
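Given a normal distribution with $\mu = 50$ and $\sigma = 10$, find the probability that X assumes a value
between 45 and 62.
Solution
The z values corresponding to $x_1 = 45$ and $x_2 = 62$ are
$$z_1 = \frac{45 - 50}{10} = -0.5 \quad \text{and} \quad z_2 = \frac{62 - 50}{10} = 1.2.$$
Fig. 4.7: Area under the standard normal curve between z = −0.5 and z = 1.2
From tables we find that $P(-0.5 < Z < 1.2) = 0.1915 + 0.3849 = 0.5764$.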
Example 4.3
Given that X has a normal distribution with $\mu = 300$ and $\sigma = 50$, find the probability that X
assumes a value greater than 362.
Solution
The normal probability distribution showing the desired area is shown in Figure 4.4. To find
$P(X > 362)$, we need to evaluate the area under the normal curve to the right of $x = 362$.
This can be done by transforming $x = 362$ to the corresponding z value, obtaining the area to the
left of z from normal distribution tables and then subtracting this area from 1. We find that
$$z = \frac{362 - 300}{50} = 1.24.$$
Hence,
$$P(X > 362) = P(Z > 1.24) = 1 - P(Z < 1.24) = 1 - 0.8925 = 0.1075.$$
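As a cross-check on Examples 4.2 and 4.3, the same areas can be computed on the original scale of X. This is a minimal Python sketch; the helper `normal_prob` and its `lower`/`upper` parameters are hypothetical names introduced only for illustration.

```python
from statistics import NormalDist


def normal_prob(mu, sigma, lower=None, upper=None):
    """P(lower < X < upper) for X normal with mean mu and standard deviation sigma.

    A missing bound is treated as minus or plus infinity.
    """
    dist = NormalDist(mu, sigma)
    area_below_lower = 0.0 if lower is None else dist.cdf(lower)
    area_below_upper = 1.0 if upper is None else dist.cdf(upper)
    return area_below_upper - area_below_lower


# Example 4.2: mu = 50, sigma = 10, P(45 < X < 62)
example_4_2 = normal_prob(50, 10, lower=45, upper=62)

# Example 4.3: mu = 300, sigma = 50, P(X > 362)
example_4_3 = normal_prob(300, 50, lower=362)

print(f"Example 4.2: {example_4_2:.4f}")
print(f"Example 4.3: {example_4_3:.4f}")
```

To four decimal places the results agree with the table-based answers 0.5764 and 0.1075.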
Section 3: Applications of the Normal Distribution
Introduction
Dear student, we hope you have enjoyed this unit so far. It promises to become more interesting as
we present some applications of the normal distribution and go further to look at how some
discrete distributions can be approximated by the normal distribution. You will then realise why the
normal distribution is the most important probability distribution.
Objectives
By the end of this section, you should be able to
Apply the techniques of the normal distribution to solve real life problems
Approximate discrete distributions using the continuous normal distribution
We start this section by considering some of the many problems for which the normal
distribution is applicable by way of examples.
Example 4.3.1
A certain type of storage battery lasts on the average 3.0 years, with a standard deviation of 0.5
years. Assuming that the battery lives are normally distributed, find the probability that a given
battery will last less than 2.3 years.
Solution
First, we construct a normal distribution curve showing the distribution of battery lives and the
desired area (Fig. 4.9). It is required to find $P(X < 2.3)$.
Fig. 4.9: Area under the normal curve of battery lives to the left of x = 2.3
We therefore need to evaluate the area under the normal curve to the left of 2.3. The z value is
given by
$$z = \frac{2.3 - 3}{0.5} = -1.4.$$
From tables, we read the entry for 1.4 to get 0.4192 and subtract it from 0.5000. Therefore
$$P(X < 2.3) = P(Z < -1.4) = 0.5000 - 0.4192 = 0.0808.$$
Example 4.3.2
An electrical firm manufactures light bulbs that have a life before burn-out that is normally
distributed with mean equal to 800 hours and a standard deviation of 40 hours. Find the
probability that a bulb burns between 778 and 834 hours.
Solution
The z values corresponding to $x_1 = 778$ and $x_2 = 834$ are
$$z_1 = \frac{778 - 800}{40} = -0.55 \quad \text{and} \quad z_2 = \frac{834 - 800}{40} = 0.85.$$
Hence
$$P(778 < X < 834) = P(-0.55 < Z < 0.85) = P(Z < 0.85) - P(Z < -0.55) = 0.8023 - 0.2912 = 0.5111.$$
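The table look-ups in Examples 4.3.1 and 4.3.2 can be mimicked in code. The sketch below, a minimal Python illustration with a hypothetical helper `z_value`, rounds each z value to two decimal places, as one does before reading a normal table, and then evaluates the standard normal distribution function.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def z_value(x: float, mu: float, sigma: float) -> float:
    """Standardise x and round to two decimals, as when using a normal table."""
    return round((x - mu) / sigma, 2)


# Example 4.3.1: battery life, mu = 3.0 years, sigma = 0.5 years, P(X < 2.3)
battery = Z.cdf(z_value(2.3, 3.0, 0.5))

# Example 4.3.2: bulb life, mu = 800 hours, sigma = 40 hours, P(778 < X < 834)
bulb = Z.cdf(z_value(834, 800, 40)) - Z.cdf(z_value(778, 800, 40))

print(f"P(battery lasts less than 2.3 years) = {battery:.4f}")
print(f"P(bulb burns between 778 and 834 hours) = {bulb:.4f}")
```

The first value reproduces the 0.0808 found from tables for the battery problem.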
Solution
A percentage is found by multiplying the relative frequency by 100%. Since the relative frequency
for an interval is equal to the probability of falling in the interval, we must find the area to the
right of $x = 43$. This can be done by transforming $x = 43$ to the corresponding z value,
obtaining the area to the left of z from tables, and then subtracting this area from 1. We find
$$z = \frac{43 - 40}{2} = 1.5.$$
Therefore,
$$P(X > 43) = P(Z > 1.5) = 1 - P(Z < 1.5) = 1 - 0.9332 = 0.0668.$$
Note that although X takes on only the values 0, 1, 2, …, n, in the limit as $n \to \infty$ the distribution
of the corresponding standardised random variable is continuous and the corresponding
probability density is the standard normal density. A good rule of thumb is to use the normal
approximation to the binomial distribution when $np$ and $npq$ are both greater than 5.
Example 4.3.4
If 20% of items in a large consignment are defectives, what is the probability that in a lot of 100
randomly chosen items for inspection,
1. exactly 15 will be defective?
2. at most 15 will be defective?
Solution
In this problem n = 100 and p = 0.2. This is a binomial distribution problem. Since n is large,
we apply the normal approximation to find its solution. Here
$$\mu = np = (100)(0.2) = 20$$
and
$$\sigma = \sqrt{npq} = \sqrt{(100)(0.2)(0.8)} = 4.$$
1. We are required to find $P(X = 15)$. Since we want to apply the normal distribution
approach, whose random variable is continuous, it is meaningless to find $P(X = 15)$ directly,
because this refers to a discrete random variable.
Therefore, we create an interval between
$x_1 = 14.5$ and $x_2 = 15.5$ in which x = 15 obviously lies, so that
$$P(X = 15) \approx P(14.5 < X < 15.5).$$
2. In this case we are required to determine $P(X \le 15)$. Similarly, for the random variable
$X \le 15$, we find $P(X \le 15) \approx P(X < 15.5)$. This is to ensure that the point x = 15 is
included in the interval. For x = 15.5, we calculate the z value to be
$$z = \frac{15.5 - 20}{4} = -\frac{4.5}{4} = -1.125.$$
The required probability, $P(Z < -1.125)$, is then found from tables.
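To see how well the continuity correction works for Example 4.3.4, the normal approximation can be compared with the exact binomial probabilities. This is a minimal Python sketch taking p = 0.2 (20% defectives) and n = 100; the helper `binom_pmf` is a hypothetical name, and the approximation is evaluated numerically rather than read from tables.

```python
import math
from statistics import NormalDist


def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact binomial probability b(k; n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 100, 0.2
mu = n * p                           # mean of the binomial
sigma = math.sqrt(n * p * (1 - p))   # standard deviation of the binomial
approx = NormalDist(mu, sigma)

# Part 1: exactly 15 defectives, using the interval 14.5 to 15.5
approx_exactly_15 = approx.cdf(15.5) - approx.cdf(14.5)
exact_exactly_15 = binom_pmf(15, n, p)

# Part 2: at most 15 defectives, using the area to the left of 15.5
approx_at_most_15 = approx.cdf(15.5)
exact_at_most_15 = sum(binom_pmf(k, n, p) for k in range(16))

print(f"exactly 15: normal {approx_exactly_15:.4f}, exact {exact_exactly_15:.4f}")
print(f"at most 15: normal {approx_at_most_15:.4f}, exact {exact_at_most_15:.4f}")
```

The exact and approximate values are close, which illustrates why the approximation is acceptable when n is large.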
Example 4.3.5
The probability that a patient recovers from a rare blood disease is 0.4. If 100 people are known
to have contracted this disease, what is the probability that fewer than 30 survive?
Solution
Let the binomial variable X represent the number of patients that survive. Since n = 100 is
large, we use the normal approximation with
$$\mu = np = (100)(0.4) = 40$$
and
$$\sigma = \sqrt{npq} = \sqrt{(100)(0.4)(0.6)} = 4.899.$$
To obtain the desired probability, we have to find the area to the left of x = 29.5. The z value
corresponding to 29.5 is
$$z = \frac{29.5 - 40}{4.899} = -2.14,$$
and the probability of fewer than 30 of the 100 patients surviving is given by the shaded region in
Fig. 4.6. Hence
$$P(X < 30) \approx P(Z < -2.14) = 0.0162.$$
Example 4.3.6
A multiple-choice quiz has 200 questions, each with 4 possible answers of which only 1 is the
correct answer. What is the probability that sheer guesswork yields from 25 to 30 correct
answers for 80 of the 200 problems about which the student has no knowledge?
Solution
The probability of a correct answer for each of the 80 questions is $p = 1/4$. If X represents the
number of correct answers due to guesswork, then
$$P(25 \le X \le 30) = \sum_{x=25}^{30} b(x; 80, 1/4).$$
Using the normal-curve approximation with
$$\mu = np = (80)\left(\frac{1}{4}\right) = 20$$
and
$$\sigma = \sqrt{npq} = \sqrt{(80)(1/4)(3/4)} = 3.873,$$
we need the area between $x_1 = 24.5$ and $x_2 = 30.5$. The corresponding z values are
$$z_1 = \frac{24.5 - 20}{3.873} = 1.16 \quad \text{and} \quad z_2 = \frac{30.5 - 20}{3.873} = 2.71.$$
The probability of correctly guessing from 25 to 30 questions is given by the shaded region in Fig.
4.5. From tables we find that
$$P(25 \le X \le 30) \approx P(1.16 < Z < 2.71) = 0.9966 - 0.8770 = 0.1196.$$
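The binomial sum in this example and its normal-curve approximation can both be evaluated by computer. The following Python sketch is one way to do so; the function names `binomial_range_exact` and `binomial_range_normal` are hypothetical, and the approximation applies the same continuity correction of 0.5 used in the solution.

```python
import math
from statistics import NormalDist


def binomial_range_exact(lo: int, hi: int, n: int, p: float) -> float:
    """Exact P(lo <= X <= hi) for a binomial variable: the sum of b(x; n, p)."""
    return sum(
        math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(lo, hi + 1)
    )


def binomial_range_normal(lo: int, hi: int, n: int, p: float) -> float:
    """Normal approximation to P(lo <= X <= hi) with continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    curve = NormalDist(mu, sigma)
    return curve.cdf(hi + 0.5) - curve.cdf(lo - 0.5)


# 80 guessed questions, each correct with probability 1/4
exact = binomial_range_exact(25, 30, 80, 0.25)
approx = binomial_range_normal(25, 30, 80, 0.25)

print(f"exact binomial sum:   {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

Because the z values are not rounded here, the approximation may differ slightly in the last decimal places from a value read off a two-decimal normal table.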
Section 4: Uniform Distribution
Objectives
By the end of this section, you should be able to
Explain the concept of uniform probability distribution
Solve problems concerning uniform distributions
A random variable X has a uniform distribution on the interval from a to b if its density on that
interval is constant. The probability density function of a uniform distribution is given by
$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b \\ 0, & \text{elsewhere.} \end{cases}$$
The values of $f(x)$ at the points a and b are not defined, but this is inessential since the
probability of falling at either of them is zero. Hence, the probability of any event related to the
random variable X does not depend on the value of the density at the points a and b. All values
of x from a to b are 'equally likely' in the sense that the probability that x lies in an interval of
width $\Delta x$ entirely contained in the interval from a to b is equal to $\Delta x/(b-a)$, regardless of
the exact location of the interval. The graph of the probability density of a uniform distribution is
shown below (Figure 4.1).
Figure 4.1: Density f(x) of the uniform distribution, a rectangle of height 1/(b − a) over the interval from a to b
Figure 4.2: The density function for a random variable on the interval [a, b]
It should be emphasized that the density function forms a rectangle with base $b - a$ and constant
height $1/(b-a)$. As a result, the uniform distribution is often called the rectangular distribution.
Probabilities are simple to calculate for the uniform distribution due to the simple nature of the
density function. However, note that the application of this distribution is based on the
assumption that the probability of falling in an interval of fixed length within $[a, b]$ is constant.
Example 4.4.1
Suppose that a large conference room for a certain company can be reserved for no more than 4
hours. However, the use of the conference room is such that both long and short conferences
occur quite often. In fact, it can be assumed that the length X of a conference has a uniform
distribution on the interval [0, 4].
(a) The appropriate density function for the uniformly distributed random variable X in this
situation is
$$f(x) = \begin{cases} \dfrac{1}{4}, & 0 \le x \le 4 \\ 0, & \text{elsewhere.} \end{cases}$$
Figure 4.13: The density function for a random variable on the interval [1, 3]
(b) $P(X \ge 3) = \int_3^4 \frac{1}{4}\, dx = \frac{1}{4}$.
Example 4.4.2
In a certain experiment, the error made in determining the density of a substance is a random
variable having a uniform density with 𝑎 = −0.015 and 𝑏 = 0.015. Find the probability that
such errors will be between –0.002 and 0.003.
Solution
$$P(-0.002 < X < 0.003) = \int_{-0.002}^{0.003} \frac{1}{0.015 + 0.015}\, dx = \frac{0.005}{0.03} = 0.167.$$
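Since the uniform density is a rectangle, every probability is just a ratio of lengths. The short Python sketch below captures this; the function name `uniform_prob` is hypothetical, and the function clips the requested interval to [a, b] so that any part lying outside contributes nothing.

```python
def uniform_prob(lo: float, hi: float, a: float, b: float) -> float:
    """P(lo < X < hi) for X uniformly distributed on the interval from a to b."""
    if b <= a:
        raise ValueError("the interval must satisfy a < b")
    # Only the part of (lo, hi) that overlaps (a, b) carries probability.
    left = max(lo, a)
    right = min(hi, b)
    overlap = max(0.0, right - left)
    return overlap / (b - a)


# Example 4.4.1(b): conference length uniform on [0, 4], P(X >= 3)
conference = uniform_prob(3, 4, 0, 4)

# Example 4.4.2: error uniform on [-0.015, 0.015], P(-0.002 < X < 0.003)
density_error = uniform_prob(-0.002, 0.003, -0.015, 0.015)

print(conference)
print(round(density_error, 3))
```

The two results correspond to the answers 1/4 and 0.167 obtained by integration in the examples.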
Example 4.4.3
Buses arrive at a specific stop at 15-minute intervals starting at 7 am. That is, they arrive at 7,
7:15, 7:30, 7:45 and so on. If a passenger arrives at the stop at a time that is uniformly
distributed between 7 and 7:30, find the probability that he waits
(a) less than 5 minutes for a bus;
(b) more than 10 minutes for a bus.
Solution
(a) Let X denote the number of minutes past 7 that the passenger arrives at the stop. Since
X is a uniform random variable over the interval (0, 30), it follows that the passenger
will have to wait less than 5 minutes if (and only if) he arrives between 7:10 and 7:15 or
between 7:25 and 7:30. Hence the desired probability is
$$P(10 < X < 15) + P(25 < X < 30) = \int_{10}^{15} \frac{1}{30}\, dx + \int_{25}^{30} \frac{1}{30}\, dx = \frac{1}{3}.$$
(b) Similarly, the passenger would have to wait more than 10 minutes if he arrives between 7
and 7:05 or between 7:15 and 7:20 and so the probability is
$$P(0 < X < 5) + P(15 < X < 20) = \int_0^5 \frac{1}{30}\, dx + \int_{15}^{20} \frac{1}{30}\, dx
= \frac{x}{30}\Big|_0^5 + \frac{x}{30}\Big|_{15}^{20}
= \frac{5}{30} + \left(\frac{20}{30} - \frac{15}{30}\right) = \frac{1}{3}.$$
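The answers to the bus problem can be checked by simulation. The sketch below is a minimal Monte Carlo illustration in Python; the seed, the number of trials and the helper name `waiting_time` are arbitrary choices made for this sketch, and the estimates will only be close to 1/3, not exactly equal to it.

```python
import random


def waiting_time(arrival_minute: float, headway: float = 15.0) -> float:
    """Minutes until the next bus for a passenger arriving
    `arrival_minute` minutes past 7, with buses every `headway` minutes."""
    return (headway - arrival_minute % headway) % headway


rng = random.Random(2024)  # fixed seed so the run is repeatable
trials = 200_000

# Arrival time uniformly distributed between 7:00 and 7:30
waits = [waiting_time(rng.uniform(0.0, 30.0)) for _ in range(trials)]

p_less_than_5 = sum(w < 5 for w in waits) / trials
p_more_than_10 = sum(w > 10 for w in waits) / trials

print(f"P(wait < 5 minutes)  is about {p_less_than_5:.3f}")
print(f"P(wait > 10 minutes) is about {p_more_than_10:.3f}")
```

A passenger arriving 12 minutes past 7, for instance, waits 3 minutes for the 7:15 bus, which is what `waiting_time(12.0)` returns.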
The mean $\mu$, the variance $\sigma^2$ and the standard deviation $\sigma$ of the uniform distribution are
given respectively by
$$\mu = \int_a^b x\, \frac{1}{b-a}\, dx = \frac{a+b}{2},$$
$$\sigma^2 = \int_a^b (x-\mu)^2\, \frac{1}{b-a}\, dx = \frac{1}{12}(b-a)^2$$
and
$$\sigma = \frac{b-a}{2\sqrt{3}}.$$
For each of Examples 4.4.1 to 4.4.3, calculate the mean and the standard deviation.
Section 5: Gamma Distribution
Introduction
Although the normal distribution can be used to solve many problems in engineering and
science, there are still numerous situations that require different types of density functions. Two
such density functions, the gamma and the exponential distributions are discussed in this section
and the section that follows.
Objectives
By the end of this section you should be able to
Identify the gamma distribution as a continuous probability distribution
Solve probability problems concerning the gamma distribution
The graph of any normal probability density function is bell-shaped and thus symmetric. There
are many practical situations in which the variable of interest to the experimenter might have a
skewed distribution. A family of p.d.f.'s that yields a wide variety of skewed distribution shapes is
the gamma family. The gamma distribution derives its name from the well-known gamma
function, which plays an important role in many branches of mathematics. To define the family of
gamma distributions, we first need to introduce the gamma function.
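Definition The gamma function is defined by
$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\, dx, \quad \text{for } \alpha > 0.$$
Integrating by parts with $u = x^{\alpha-1}$ and $dv = e^{-x}\, dx$, we obtain, for $\alpha > 1$,
$$\Gamma(\alpha) = (\alpha - 1) \int_0^{\infty} e^{-x} x^{\alpha-2}\, dx,$$
which yields the recursion formula
$$\Gamma(\alpha) = (\alpha - 1)\Gamma(\alpha - 1),$$
and hence, for a positive integer n,
$$\Gamma(n) = (n - 1)!.$$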
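Both properties of the gamma function can be verified numerically with Python's standard `math.gamma`. The sketch below is only an illustration; the value $\alpha = 4.5$ is an arbitrary non-integer chosen for the check.

```python
import math

# Gamma(n) = (n - 1)! for positive integers n
factorial_check = [
    (n, math.gamma(n), math.factorial(n - 1)) for n in range(1, 8)
]
for n, gamma_n, factorial in factorial_check:
    print(f"Gamma({n}) = {gamma_n:.0f}, ({n} - 1)! = {factorial}")

# Recursion Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1) for a non-integer alpha
alpha = 4.5  # arbitrary illustrative value
lhs = math.gamma(alpha)
rhs = (alpha - 1) * math.gamma(alpha - 1)
print(lhs, rhs)
```

The last line prints two values that agree up to floating-point rounding, in line with the recursion formula.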
We now include the gamma function in the definition of the gamma distribution.
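Definition The continuous random variable X has a gamma distribution, with parameters $\alpha$
and $\beta$, if its density function is given by
$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}, & x > 0 \\ 0, & \text{elsewhere,} \end{cases}$$
where $\alpha > 0$ and $\beta > 0$. The standard gamma distribution has $\beta = 1$. The mean and variance of
the gamma distribution may be obtained by making use of the gamma function. For the mean we
have
$$\mu = E(X) = \frac{1}{\beta^{\alpha}\Gamma(\alpha)} \int_0^{\infty} x \cdot x^{\alpha-1} e^{-x/\beta}\, dx = \frac{\beta\,\Gamma(\alpha+1)}{\Gamma(\alpha)} = \alpha\beta,$$
where the integral is evaluated with the substitution $y = x/\beta$. A similar calculation gives the
variance $\sigma^2 = \alpha\beta^2$.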
Example 5.1
In a biomedical study with rats, a dose-response investigation is used to determine the effect of
the dose of a toxicant on their survival time. The toxicant is one that is frequently discharged
into the atmosphere from jet fuel. For a certain dose of the toxicant the study determines that the
survival time, in weeks, has a gamma distribution with $\alpha = 5$ and $\beta = 10$. What is the probability
that a rat survives no longer than 60 weeks?
Solution
Let the random variable X be the survival time (time to death). The required probability is
$$P(X \le 60) = \frac{1}{\beta^{5}} \int_0^{60} \frac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(5)}\, dx.$$
The integral above can be solved through the use of the incomplete gamma function, which
becomes the cumulative distribution function for the gamma distribution. This function is
written as
$$F(x; \alpha) = \int_0^{x} \frac{y^{\alpha-1} e^{-y}}{\Gamma(\alpha)}\, dy.$$
If we let $y = x/\beta$, so $x = \beta y$, we have
$$P(X \le 60) = \int_0^{6} \frac{y^{4} e^{-y}}{\Gamma(5)}\, dy,$$
which is denoted by $F(6; 5)$ in the table of the incomplete gamma function in the appendix. Note
that this allows a quick computation of probabilities for the gamma distribution. Indeed, for this
problem the probability that the rat survives no longer than 60 weeks is given by
$$P(X \le 60) = F(6; 5) = 0.715.$$
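When no table of the incomplete gamma function is at hand, the value can be computed. The Python sketch below rests on one fact not stated in the text: for a positive integer $\alpha$, $F(x; \alpha) = 1 - e^{-x}\sum_{k=0}^{\alpha-1} x^k/k!$. It cross-checks this against a plain midpoint-rule integration of the gamma density. The function names and the number of integration steps are illustrative choices.

```python
import math


def incomplete_gamma_int(x: float, alpha: int) -> float:
    """F(x; alpha) for a positive integer alpha, using the closed form
    1 - exp(-x) * sum_{k=0}^{alpha-1} x**k / k!."""
    if alpha < 1 or int(alpha) != alpha:
        raise ValueError("this closed form needs a positive integer alpha")
    partial_sum = sum(x**k / math.factorial(k) for k in range(int(alpha)))
    return 1.0 - math.exp(-x) * partial_sum


def gamma_cdf_midpoint(x: float, alpha: float, beta: float, steps: int = 20_000) -> float:
    """P(X <= x) for a gamma(alpha, beta) variable by midpoint-rule integration."""
    width = x / steps
    normaliser = beta**alpha * math.gamma(alpha)
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * width  # midpoint of the i-th subinterval
        total += t ** (alpha - 1) * math.exp(-t / beta)
    return total * width / normaliser


# Rat survival example: alpha = 5, beta = 10, P(X <= 60) = F(6; 5)
closed_form = incomplete_gamma_int(60 / 10, 5)
numeric = gamma_cdf_midpoint(60, 5, 10)

print(f"F(6; 5) by closed form:     {closed_form:.4f}")
print(f"P(X <= 60) by integration:  {numeric:.4f}")
```

Both routes give the same probability to the precision shown, so either can stand in for the appendix table in this integer-$\alpha$ case.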
Example 4.5.1
It is known, from previous data, that the length of time in months between customer complaints
about a certain product is a gamma distribution with $\alpha = 2$ and $\beta = 4$. Changes were made that
involve a tightening of quality control requirements. Following these changes, it took 20 months
before the first complaint. Does it appear as if the quality control tightening was effective?
Solution
Let X be the time to the first complaint, which, under conditions prior to the changes, follows a
gamma distribution with $\alpha = 2$ and $\beta = 4$. The question centres around how rare $X \ge 20$ is,
given that $\alpha$ and $\beta$ remain at values 2 and 4 respectively. In other words, under the prior
conditions is a 'time to complaint' as large as 20 months reasonable? Thus,
$$P(X \ge 20) = 1 - \frac{1}{\beta^{\alpha}} \int_0^{20} \frac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)}\, dx.$$
Again, using $y = x/\beta$, we have
$$P(X \ge 20) = 1 - \int_0^{5} \frac{y\, e^{-y}}{\Gamma(2)}\, dy = 1 - F(5; 2) = 1 - 0.96 = 0.04,$$
where $F(5; 2) = 0.96$ is from tables. As a result, we could conclude that the conditions of
the gamma distribution with $\alpha = 2$ and $\beta = 4$ are not supported by the data that an observed
time to complaint is as large as 20 months. It is therefore reasonable to conclude that the
quality control work was effective.
Section 6: Exponential Distribution
Introduction
Dear learner, welcome to the last section of this unit. Here, we discuss another continuous
probability distribution called the exponential distribution, which is a special case of the gamma
distribution for $\alpha = 1$.
The exponential distribution has many applications in the field of statistics, particularly in the
areas of reliability theory and waiting times or queuing problems. For example, in connection
with a Poisson process, the waiting time between successive arrivals (successes) has an
exponential distribution. That is, if in a Poisson process the mean arrival rate (average number
of arrivals per unit time) is $\lambda$, the time until the first arrival, or the waiting time between
successive arrivals, has an exponential distribution with $\beta = 1/\lambda$. To proceed, we need to
consider the definition below.
Definition The continuous random variable X has an exponential distribution, with parameter
$\beta$, if its density function is given by
$$f(x) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta}, & x > 0 \\ 0, & \text{elsewhere,} \end{cases}$$
where $\beta > 0$.
Example 4.6.1
Suppose that a system contains a certain type of component whose time in years to failure is
given by the random variable X. If five of these components are installed in different systems,
what is the probability that at least two are still functioning at the end of 8 years?
Solution
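Taking X to be exponentially distributed with mean time to failure $\beta = 5$ years, the probability
that a given component is still functioning after 8 years is given by
$$P(X > 8) = \frac{1}{5} \int_8^{\infty} e^{-x/5}\, dx = e^{-8/5} \approx 0.2.$$
Let Y represent the number of components still functioning after 8 years. Then Y is a binomial
random variable with n = 5 and p = 0.2, and the probability that at least two of the components
are still functioning at the end of 8 years is
$$P(Y \ge 2) = 1 - P(Y = 0) - P(Y = 1) = 1 - (0.8)^5 - 5(0.2)(0.8)^4 = 1 - 0.7373 = 0.2627.$$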
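The two steps of this problem, an exponential survival probability followed by a binomial count over the five components, can be sketched in Python as follows. The value $\beta = 5$ is taken from the integrand in the solution, and the independence of the five components is an assumption of the sketch; the function name `at_least_two_of_five` is hypothetical.

```python
import math

beta = 5.0  # mean time to failure in years, as read from the integrand (assumption)

# Step 1: probability that one component survives beyond 8 years, e^(-8/beta)
p_survive = math.exp(-8.0 / beta)


def at_least_two_of_five(p: float) -> float:
    """P(Y >= 2) when Y is binomial with n = 5 and success probability p."""
    n = 5
    p_none = (1 - p) ** n
    p_one = n * p * (1 - p) ** (n - 1)
    return 1.0 - p_none - p_one


# Step 2: binomial calculation, with p rounded to 0.2 and with the unrounded value
with_rounded_p = at_least_two_of_five(0.2)
with_exact_p = at_least_two_of_five(p_survive)

print(f"P(one component survives 8 years) = {p_survive:.4f}")
print(f"P(at least two of five), p = 0.2:  {with_rounded_p:.4f}")
print(f"P(at least two of five), exact p:  {with_exact_p:.4f}")
```

Rounding the survival probability to 0.2 before the binomial step changes the final answer only in the third decimal place.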
Example 4.6.2
Find the probability that a random variable having an exponential distribution with $\beta = 10$
assumes a value greater than 6.
Solution
Substituting $\beta = 10$ and $x = 6$, we find that the probability that a random variable
having an exponential distribution with $\beta = 10$ assumes a value greater than 6 is
$$P(X > 6) = \int_6^{\infty} \frac{1}{10}\, e^{-x/10}\, dx = e^{-0.6} \approx 0.5488.$$
Example 4.6.3
Use the information in Example 4.6.2 to find the probability that the random variable assumes a
value between 8 and 10.
Solution
Again, substituting the given values with $\beta = 10$, we find that the required probability is
$$P(8 < X < 10) = e^{-0.8} - e^{-1} \approx 0.4493 - 0.3679 = 0.0814.$$
The mean and variance of the exponential distribution are given respectively by $\mu = \beta$ and
$\sigma^2 = \beta^2$.
Activity Set 4
1. If Z is a random variable having the standard normal distribution, find the probabilities that
it will take on a value
(a) greater than 1.14.
(b) greater than –0.36.
(c) between –0.46 and –0.09.
(d) between –0.58 and 1.12.
2. Suppose that during periods of transcendental meditation, the reduction of a person’s oxygen
consumption is a random variable having a normal distribution with $\mu = 37.6$ cc per minute
and $\sigma = 4.6$ cc per minute. Find the probabilities that during a period of transcendental
meditation, a person’s oxygen consumption will be reduced by
(a) at least 44.5 cc per minute.
(b) at most 35.0 cc per minute.
(c) anywhere from 30.0 to 40.0 cc per minute.
3. A multiple choice examination has 200 questions each with four possible answers of which
only one is the correct answer. What is the probability that sheer guess yields from 25 to 30
correct answers for 80 of the 200 problems about which the student has no knowledge?
5. In a certain experiment, the error made in determining the density of a substance is a random
variable having the uniform density with $a = -0.025$ and $b = 0.025$. What is the probability
that such an error will be between:
(a) 0.010 and 0.015?
(b) –0.012 and 0.012?
6. The daily amount of coffee, in litres, dispensed by a machine located in an airport lobby is a
random variable having a continuous uniform distribution with A = 7 and B = 10. Find the
probability that on a given day the amount of coffee dispensed by this machine will be
(a) at most 8.8 litres
(b) more than 7.4 litres but less than 9.5 litres
(c) at least 8.5 litres
9. If a random variable X has the gamma distribution with $\alpha = 2$ and $\beta = 1$, find the
probability that the random variable will take on a value less than 4. Find also $P(1.8 < X < 2.4)$.
10. In a certain biomedical research activity, it was determined that the survival time, in weeks,
of an animal when subjected to a certain exposure of gamma radiation has a gamma
distribution with $\alpha = 5$ and $\beta = 10$.
(a) What is the mean survival time of a randomly selected animal of the type used in the
experiment?
(b) What is the standard deviation of survival time?
(c) What is the probability that an animal survives more than 30 weeks?
Solution Set 8
1. (a) 0.1271 (b) 0.6406 (c) 0.1413 (d) 0.5876
2. (a) 0.0823 (b) 0.3228 (c) 0.649
3. (a) 0.0951
4. (a)0.3085 (b) 0.0197
5. (a) 0.1 (b) 0.48
6. (a) 0.6 (b) 0.7 (c) 0.5
7. (a) 0.1889 (b) 0.0357
8. (a) 0.3968
9. (a) 0.1545
10. (a) 50 (b) 22.36 (c) 0.815