MODULE 4 – Continuous Probability Distributions

Engr. Caesar Pobre Llapitan

Topics:
I. Continuous Random Variables and their Probability Distribution
II. Expected Values of Continuous Random Variables
III. Normal Distribution
IV. Normal Approximation to the Binomial and Poisson Distribution
V. Exponential Distribution

Learning Objectives
After finishing this module, the students should be able to do the following:
1. Determine probabilities from probability density functions.
2. Determine probabilities from cumulative distribution functions, and cumulative
distribution functions from probability density functions, and the reverse.
3. Calculate means and variances for continuous random variables.
4. Understand the assumptions for each of the continuous probability distributions
presented.
5. Select an appropriate continuous probability distribution to calculate probabilities in
specific applications.
6. Calculate probabilities, determine means and variances for each of the continuous
probability distributions presented.
7. Standardize normal random variables.

I. CONTINUOUS RANDOM VARIABLES AND THEIR PROBABILITY DISTRIBUTION

A continuous random variable has a probability of zero of assuming exactly any one of its values.
Consequently, its probability distribution cannot be given in tabular form.

A. Introduction
Many random variables observed in real life are not discrete random variables because the
number of values they can assume is not countable. In contrast to discrete random variables,
these variables can take on any value within an interval. For example: the daily rainfall at some
location, the strength of a steel bar and the intensity of sunlight at a particular time of day.
These random variables are called continuous random variables.

For a continuous random variable, we deal with probabilities of intervals rather than
probabilities of particular individual values.

Definition 1
Let X be a continuous random variable assuming any value in the interval (−∞, +∞). Then
the cumulative distribution function F(x) of the variable X is defined as follows

$$F(x) = P(X \le x) \tag{1}$$

i.e., F(x) is equal to the probability that the variable X assumes values, which are less than
or equal to x.

Note that here and from now on we denote a continuous random variable by the letter X and
a point on the number line by x.

The following properties follow directly from the definition of the cumulative distribution
function F(x).

B. Properties of the cumulative distribution function F(x) for a continuous random variable X
1. 0 ≤ F(x) ≤ 1.
2. F(x) is a monotonically non-decreasing function, that is, if a ≤ b then F(a) ≤ F(b)
for any real numbers a and b.
3. P(a ≤ X ≤ b) = F(b) − F(a).
4. F(x) → 0 as x → −∞ and F(x) → 1 as x → +∞.

A large data set can be described by means of a relative frequency distribution. If the data
represent measurements on a continuous random variable and if the amount of data is very
large, we can reduce the width of the class intervals until the distribution appears to be a smooth
curve. A probability density is a theoretical model for this distribution.

The distinction between discrete random variables and continuous random variables is usually
based on the difference in their cumulative distribution functions.

Definition 2
Let X be a random variable. We say that X is a continuous random variable if there
exists a nonnegative function f, defined for all real x ∈ (−∞, ∞), having the property that, for
any set B of real numbers,

$$P\{X \in B\} = \int_B f(x)\,dx \tag{2}$$

The function f is called the probability density function of the random variable X.

In words, Equation (2) states that the probability that X will be in B may be obtained by
integrating the probability density function over the set B. Since X must assume some value, f
must satisfy

$$1 = P\{X \in (-\infty, \infty)\} = \int_{-\infty}^{\infty} f(x)\,dx \tag{3}$$

The density function for a continuous random variable X, the model for some real-life
population of data, will usually be a smooth curve as shown in Figure 1.

Figure 1 Density function f(x) for a continuous random variable

It follows that $F(x) = \int_{-\infty}^{x} f(t)\,dt$. Thus, the cumulative area under the curve
between −∞ and a point x0 is equal to F(x0).

All probability statements about X can be answered in terms of f. If an interval is likely to contain
a value for X, its probability is large and it corresponds to large values for f(x).

Definition 3
The probability that X is between a and b is determined as the integral of f(x) from a to b.

$$P\{a \le X \le b\} = \int_a^b f(x)\,dx \tag{4}$$

P(a ≤ X ≤ b) = area of shaded region

Figure 2 Probability density function f

The density function for a continuous random variable must always satisfy the two properties
given below.

Definition 4
For a continuous random variable X, a probability density function is a function such that
1. f(x) ≥ 0
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P(a \le X \le b) = \int_a^b f(x)\,dx$ = area under f(x) from a to b, for any a and b.

Properties of a density function
1. f(x) ≥ 0
2. $\int_{-\infty}^{+\infty} f(x)\,dx = F(\infty) = 1$

Examples
1. Suppose that X is a continuous random variable whose probability density function is given
by

$$f(x) = \begin{cases} C(4x - 2x^2), & 0 < x < 2 \\ 0, & \text{otherwise} \end{cases}$$

a) What is the value of C?


b) Find P{X > 1}.

Solution:
a) Since f is a probability density function, we must have $\int_{-\infty}^{\infty} f(x)\,dx = 1$, implying that

$$C \int_0^2 (4x - 2x^2)\,dx = 1$$
$$C \left[ 2x^2 - \frac{2x^3}{3} \right]_{x=0}^{x=2} = 1$$
$$C = \frac{3}{8}$$

b) Hence,

$$P\{X > 1\} = \int_1^{\infty} f(x)\,dx = \frac{3}{8} \int_1^2 (4x - 2x^2)\,dx = \frac{1}{2}$$

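The two results of this example can be checked numerically. The sketch below is not part of the original module: it assumes a hand-written composite Simpson's rule (the helper name `simpson` is only illustrative) and uses it to recover the constant C and the probability P{X > 1}.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Unnormalized density of the example: 4x - 2x^2 on (0, 2)
g = lambda x: 4 * x - 2 * x ** 2

C = 1 / simpson(g, 0, 2)        # constant that makes the total area equal to 1
p_gt_1 = C * simpson(g, 1, 2)   # P{X > 1}
print(round(C, 6), round(p_gt_1, 6))  # 0.375 0.5
```

Simpson's rule is exact for polynomials up to degree three, so the printed values agree with C = 3/8 and P{X > 1} = 1/2 up to floating-point rounding.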
2. The amount of time in hours that a computer functions before breaking down is a
continuous random variable with probability density function given by

$$f(x) = \begin{cases} \lambda e^{-x/100}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

What is the probability that


a) a computer will function between 50 and 150 hours before breaking down?
b) it will function for fewer than 100 hours?

Solution:
a. Since

$$1 = \int_{-\infty}^{\infty} f(x)\,dx = \lambda \int_0^{\infty} e^{-x/100}\,dx$$

we obtain

$$1 = -\lambda (100)\, e^{-x/100} \Big|_0^{\infty} = 100\lambda \quad \text{or} \quad \lambda = \frac{1}{100}$$

Hence, the probability that a computer will function between 50 and 150 hours before
breaking down is given by

$$P\{50 < X < 150\} = \int_{50}^{150} \frac{1}{100} e^{-x/100}\,dx = -e^{-x/100} \Big|_{50}^{150} = e^{-1/2} - e^{-3/2} \approx 0.383$$

b. Similarly,

$$P\{X < 100\} = \int_0^{100} \frac{1}{100} e^{-x/100}\,dx = -e^{-x/100} \Big|_0^{100} = 1 - e^{-1} \approx 0.632$$

In other words, approximately 63.2 percent of the time, a computer will fail before
registering 100 hours of use.
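As a cross-check of both parts, the probabilities can be evaluated directly from the cumulative distribution function F(x) = 1 − e^(−x/100). The following is a minimal sketch using only Python's standard `math` module; the names `RATE` and `cdf` are illustrative and do not come from the module.

```python
import math

RATE = 1 / 100  # lambda from part (a), per hour

def cdf(x):
    """F(x) = 1 - exp(-x/100) for x >= 0, and 0 otherwise."""
    return 1 - math.exp(-RATE * x) if x >= 0 else 0.0

p_50_150 = cdf(150) - cdf(50)   # P{50 < X < 150}
p_lt_100 = cdf(100)             # P{X < 100}
print(f"{p_50_150:.4f} {p_lt_100:.4f}")  # 0.3834 0.6321
```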

3. The lifetime in hours of a certain kind of radio tube is a random variable having a probability
density function given by

$$f(x) = \begin{cases} 0, & x \le 100 \\ \dfrac{100}{x^2}, & x > 100 \end{cases}$$

What is the probability that exactly 2 of 5 such tubes in a radio set will have to be replaced
within the first 150 hours of operation? Assume that the events Ei, i =1, 2, 3, 4, 5, that the ith
such tube will have to be replaced within this time are independent.

Solution:
From the statement of the problem, we have
$$P(E_i) = \int_0^{150} f(x)\,dx = 100 \int_{100}^{150} x^{-2}\,dx = \frac{1}{3}$$

Hence, from the independence of the events Ei, it follows that the desired probability is

$$\binom{5}{2} \left( \frac{1}{3} \right)^2 \left( \frac{2}{3} \right)^3 = \frac{80}{243}$$

The relationship between the cumulative distribution F and the probability density f is
expressed by

$$F(a) = P\{X \in (-\infty, a]\} = \int_{-\infty}^{a} f(x)\,dx$$

Differentiating both sides of the preceding equation yields

$$\frac{d}{da} F(a) = f(a)$$

That is, the density is the derivative of the cumulative distribution function. A somewhat
more intuitive interpretation of the density function may be obtained from

$$P\left\{ a - \frac{\varepsilon}{2} \le X \le a + \frac{\varepsilon}{2} \right\} = \int_{a - \varepsilon/2}^{a + \varepsilon/2} f(x)\,dx \approx \varepsilon f(a)$$

when ε is small and when f (·) is continuous at x = a. In other words, the probability that X
will be contained in an interval of length ε around the point a is approximately εf (a). From
this result we see that f (a) is a measure of how likely it is that the random variable will be
near a.
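The approximation P{a − ε/2 ≤ X ≤ a + ε/2} ≈ εf(a) can be illustrated numerically. The sketch below reuses the exponential density of Example 2; the point a = 100 and the interval length ε = 1 are arbitrary choices made here for illustration only.

```python
import math

def f(x):
    """Density from Example 2: (1/100) * exp(-x/100) for x >= 0."""
    return math.exp(-x / 100) / 100 if x >= 0 else 0.0

def F(x):
    """Its cdf: 1 - exp(-x/100) for x >= 0."""
    return 1 - math.exp(-x / 100) if x >= 0 else 0.0

a, eps = 100.0, 1.0                      # illustrative point and interval length
exact = F(a + eps / 2) - F(a - eps / 2)  # P{a - eps/2 <= X <= a + eps/2}
approx = eps * f(a)                      # eps * f(a)
print(f"{exact:.8f} {approx:.8f}")
```

The two printed numbers agree to about seven decimal places, and the agreement improves as ε shrinks.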

4. If X is continuous with distribution function F_X and density function f_X, find the density
function of Y = 2X.

Solution:
We determine f_Y by deriving, and then differentiating, the distribution function of Y:

$$F_Y(a) = P\{Y \le a\} = P\{2X \le a\} = P\{X \le a/2\} = F_X(a/2)$$

Differentiation gives

$$f_Y(a) = \frac{1}{2} f_X(a/2)$$
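To see the result f_Y(a) = (1/2) f_X(a/2) at work, the sketch below picks a concrete X. The choice of an exponential distribution with rate 1 is an assumption made only for this illustration. It then compares the formula with a central-difference derivative of F_Y.

```python
import math

def F_X(x):
    """cdf of the illustrative X: exponential with rate 1."""
    return 1 - math.exp(-x) if x >= 0 else 0.0

def f_X(x):
    """pdf of the illustrative X."""
    return math.exp(-x) if x >= 0 else 0.0

def F_Y(a):
    # Y = 2X, so F_Y(a) = P{2X <= a} = F_X(a / 2)
    return F_X(a / 2)

def f_Y(a):
    # Result of the example: f_Y(a) = (1/2) * f_X(a / 2)
    return 0.5 * f_X(a / 2)

a, h = 1.3, 1e-5
numeric = (F_Y(a + h) - F_Y(a - h)) / (2 * h)  # central-difference derivative of F_Y
print(abs(numeric - f_Y(a)) < 1e-8)  # True
```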

Classroom Activity 1
1. Suppose that f(x) = x/8 for 3 < x < 5. Determine the following probabilities:
a. P(X < 4)
b. P(X > 3.5)
c. P(4 < X < 5)
d. P(X < 3.5 or X > 4.5)
2. The probability density function of the time to failure of an electronic component in a
copier (in hours) is $f(x) = \frac{e^{-x/1000}}{1000}$ for x > 0. Determine the probability that
a. A component lasts more than 3000 hours before failure.
b. A component fails in the interval from 1000 to 2000 hours.
c. A component fails before 1000 hours.
d. Determine the number of hours at which 10% of all components have failed.
3. The probability density function of the net weight in pounds of a packaged chemical
herbicide is f(x) = 2.0 for 49.75 < x < 50.25 pounds.
a. Determine the probability that a package weighs more than 50 pounds.
b. How much chemical is contained in 90% of all packages?

Definition 3
The cumulative distribution function of a continuous random variable X is

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du \tag{5}$$

for −∞ < x < ∞.

Figure 3 A pdf and associated cdf

Extending the definition of f(x) to the entire real line enables us to define the cumulative
distribution function for all real numbers.

Example 1
Let X, the thickness of a certain metal sheet, have a uniform distribution on [A, B]. The
density function is shown in the Figure.

For x < A, F(x) = 0, since there is no area under the graph of the density function to the left
of such an x. For x ≥ B, F(x) = 1, since all the area is accumulated to the left of such an x.
Finally, for A ≤ x ≤ B,

$$F(x) = \int_{-\infty}^{x} f(y)\,dy = \int_A^x \frac{1}{B - A}\,dy = \frac{1}{B - A} \cdot y \,\Big|_{y=A}^{y=x} = \frac{x - A}{B - A}$$

The entire cdf is

$$F(x) = \begin{cases} 0, & x < A \\ \dfrac{x - A}{B - A}, & A \le x < B \\ 1, & x \ge B \end{cases}$$

The graph of this cdf appears in the accompanying figure.

Example 2
The time until a chemical reaction is complete (in milliseconds) is approximated by the
cumulative distribution function

$$F(x) = \begin{cases} 0, & x < 0 \\ 1 - e^{-0.01x}, & 0 \le x \end{cases}$$

Determine the probability density function of X. What proportion of reactions is complete
within 200 milliseconds?

Solution:
Using the result that the probability density function is the derivative of F(x), we obtain

$$f(x) = \begin{cases} 0, & x < 0 \\ 0.01 e^{-0.01x}, & 0 \le x \end{cases}$$

The probability that a reaction completes within 200 milliseconds is

$$P(X < 200) = F(200) = 1 - e^{-2} = 0.8647$$
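The answer can be confirmed from both directions: by evaluating F(200) and by integrating the density from 0 to 200. The sketch below assumes a simple trapezoidal rule; the helper name `trapezoid` is illustrative and not part of the module.

```python
import math

def F(x):
    """cdf of the reaction time (ms)."""
    return 1 - math.exp(-0.01 * x) if x >= 0 else 0.0

def f(x):
    """pdf obtained by differentiating F."""
    return 0.01 * math.exp(-0.01 * x) if x >= 0 else 0.0

def trapezoid(func, a, b, n=20000):
    """Composite trapezoidal rule with n subintervals."""
    h = (b - a) / n
    s = 0.5 * (func(a) + func(b)) + sum(func(a + i * h) for i in range(1, n))
    return s * h

area = trapezoid(f, 0, 200)          # area under f between 0 and 200
print(f"{F(200):.4f} {area:.4f}")    # 0.8647 0.8647
```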

Classroom Activity 2
1. Suppose the cumulative distribution function of the random variable X is

$$F(x) = \begin{cases} 0, & x < -2 \\ 0.25x + 0.5, & -2 \le x < 2 \\ 1, & 2 \le x \end{cases}$$

Determine the following:
a. P(X < 1.8)
b. P(X < -2)
c. P(X > -1.5)
d. P(-1 < X < 1)
2. The gap width is an important property of a magnetic recording head. In coded units, if
the width is a continuous random variable over the range from 0 < x < 2 with f(x) = 0.5x,
determine the cumulative distribution function of the gap width.

C. Using F(x) to Compute Probabilities


The importance of the cdf here is that probabilities of various intervals can be computed from
a formula for or table of F(x).

Proposition 1
Let X be a continuous rv with pdf f (x) and cdf F(x). Then for any number a,

P(X > a) = 1 − F(a) (6)

and for any two numbers a and b with a < b,

$$P(a \le X \le b) = F(b) - F(a) \tag{7}$$

Figure 4 illustrates the second part of this proposition; the desired probability is the shaded area
under the density curve between a and b, and it equals the difference between the two shaded
cumulative areas. This is different from what is appropriate for a discrete integer-valued random
variable (e.g., binomial or Poisson):

P(a ≤ X ≤ b) = F(b) − F(a − 1) when a and b are integers.

Figure 4 Computing P(a ≤ X ≤ b) from cumulative probabilities

Example 3
Suppose the pdf of the magnitude X of a dynamic load on a bridge (in newtons) is given by

$$f(x) = \begin{cases} \dfrac{1}{8} + \dfrac{3}{8}x, & 0 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}$$

For any number x between 0 and 2,

$$F(x) = \int_{-\infty}^{x} f(y)\,dy = \int_0^x \left( \frac{1}{8} + \frac{3}{8}y \right) dy = \frac{x}{8} + \frac{3}{16}x^2$$

Thus,

$$F(x) = \begin{cases} 0, & x < 0 \\ \dfrac{x}{8} + \dfrac{3}{16}x^2, & 0 \le x \le 2 \\ 1, & 2 < x \end{cases}$$

The graphs of f(x) and F(x) are shown in the accompanying figure.

The probability that the load is between 1 and 1.5 is

$$P(1 \le X \le 1.5) = F(1.5) - F(1) = \left[ \frac{1}{8}(1.5) + \frac{3}{16}(1.5)^2 \right] - \left[ \frac{1}{8}(1) + \frac{3}{16}(1)^2 \right] = \frac{19}{64} = 0.297$$

The probability that the load exceeds 1 is

$$P(X > 1) = 1 - P(X \le 1) = 1 - F(1) = 1 - \left[ \frac{1}{8}(1) + \frac{3}{16}(1)^2 \right] = \frac{11}{16} = 0.688$$

Once the cdf has been obtained, any probability involving X can easily be calculated without
any further integration.
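The bridge-load example shows this: once F(x) is coded, each probability is a single subtraction. The sketch below uses Python's `fractions.Fraction` so that the results come out as exact fractions; the function name `F` simply mirrors the notation of the example.

```python
from fractions import Fraction

def F(x):
    """cdf of the load: 0 for x < 0, x/8 + 3x^2/16 on [0, 2], 1 for x > 2."""
    x = Fraction(x)
    if x < 0:
        return Fraction(0)
    if x > 2:
        return Fraction(1)
    return x / 8 + 3 * x ** 2 / 16

p_between = F(Fraction(3, 2)) - F(1)   # P(1 <= X <= 1.5)
p_exceeds = 1 - F(1)                   # P(X > 1)
print(p_between, p_exceeds)            # 19/64 11/16
```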

Obtaining f(x) from F(x)


For X discrete, the pmf is obtained from the cdf by taking the difference between two F(x) values.
The continuous analog of a difference is a derivative. The following result is a consequence of
the Fundamental Theorem of Calculus.

Proposition 2
If X is a continuous random variable with pdf f(x) and cdf F(x), then at every x at which
the derivative F'(x) exists, F'(x) = f (x).

Example 4
Refer to Example 1. When X has a uniform distribution, F(x) is differentiable except at x = A
and x = B, where the graph of F(x) has sharp corners. Since F(x) = 0 for x < A and F(x) = 1 for
x > B, F'(x) = 0 = f(x) for such x. For A < x < B,

$$F'(x) = \frac{d}{dx} \left( \frac{x - A}{B - A} \right) = \frac{1}{B - A} = f(x)$$

D. Percentiles of a Continuous Distribution


When we say that an individual’s test score was at the 85th percentile of the population, we
mean that 85% of all population scores were below that score and 15% were above. Similarly,
the 40th percentile is the score that exceeds 40% of all scores and is exceeded by 60% of all
scores (having a value corresponding to a high percentile is not necessarily good; e.g., you
would not want to be at the 99th percentile for blood alcohol content).

Definition 4
Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a
continuous random variable X, denoted by η(p), is defined by

$$p = F(\eta(p)) = \int_{-\infty}^{\eta(p)} f(y)\,dy \tag{8}$$

According to Expression (8), η(p) is that value on the measurement axis such that 100p% of the
area under the graph of f(x) lies to the left of η(p) and 100(1 − p)% lies to the right. Thus η(.75),
the 75th percentile, is such that the area under the graph of f(x) to the left of η(.75) is .75. Figure
5 illustrates the definition.

Figure 5 The (100p)th percentile of a continuous distribution

Example 5
The distribution of the amount of gravel (in tons) sold by a particular construction supply
company in a given week is a continuous rv X with pdf

$$f(x) = \begin{cases} \dfrac{3}{2}(1 - x^2), & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$

The cdf of sales for any x between 0 and 1 is

$$F(x) = \int_0^x \frac{3}{2}(1 - y^2)\,dy = \frac{3}{2} \left( y - \frac{y^3}{3} \right) \Big|_{y=0}^{y=x} = \frac{3}{2} \left( x - \frac{x^3}{3} \right)$$

The graphs of both f(x) and F(x) appear in the accompanying figure.

The (100p)th percentile of this distribution satisfies the equation

$$p = F(\eta(p)) = \frac{3}{2} \left[ \eta(p) - \frac{(\eta(p))^3}{3} \right]$$

That is,

$$(\eta(p))^3 - 3\eta(p) + 2p = 0$$

For the 50th percentile, p = 0.5, and the equation to be solved is η³ − 3η + 1 = 0; the solution
is η = η(.5) = 0.347.

If the distribution remains the same from week to week, then in the long run 50% of all
weeks will result in sales of less than 0.347 ton and 50% in more than 0.347 ton.
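The cubic equation for the percentile has no convenient closed form, so in practice it is solved numerically. The sketch below is one possible approach, not the method used in the module: it applies bisection to F(η) = p, which works because F is increasing on [0, 1]. The function name `percentile` and the tolerance are illustrative choices.

```python
def F(x):
    """cdf of weekly gravel sales on [0, 1]: F(x) = 1.5 * (x - x^3 / 3)."""
    return 1.5 * (x - x ** 3 / 3)

def percentile(p, lo=0.0, hi=1.0, tol=1e-10):
    """Solve F(eta) = p by bisection; F is increasing on [0, 1]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

median = percentile(0.5)
print(round(median, 3))  # 0.347
```

Calling `percentile(p)` with other values of p between 0 and 1 returns the corresponding percentiles of the same distribution.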

Exercises 1
1. Let the continuous random variable X denote the current measured in a thin copper wire in
milliamperes. Assume that the range of X is [0, 20 mA], and assume that the probability
density function of X is f(x) = 0.05 for 0 ≤ x ≤ 20. What is the probability that a current
measurement is less than 10 milliamperes?
2. Let the continuous random variable X denote the diameter of a hole drilled in a sheet metal
component. The target diameter is 12.5 millimeters. Most random disturbances to the
process result in larger diameters. Historical data show that the distribution of X can be
modeled by a probability density function $f(x) = 20e^{-20(x - 12.5)}$, x ≥ 12.5.
a. If a part with a diameter larger than 12.60 millimeters is scrapped, what proportion
of parts is scrapped?
b. What proportion of parts is between 12.5 and 12.6 millimeters?
3. The current in a certain circuit as measured by an ammeter is a continuous random variable
X with the following density function:

$$f(x) = \begin{cases} 0.075x + 0.2, & 3 \le x \le 5 \\ 0, & \text{otherwise} \end{cases}$$

a. Graph the pdf and verify that the total area under the density curve is indeed 1.
b. Calculate P(X ≤ 4). How does this probability compare to P(X < 4)?
c. Calculate P(3.5 ≤ X ≤ 4.5) and also P(4.5 < X).
4. Suppose the reaction temperature X (in °C) in a certain chemical process has a uniform
distribution with A = -5 and B = 5.
a. Compute P(X < 0).
b. Compute P(-2.5 < X < 2.5).
c. Compute P(-2 ≤ X ≤ 3).
d. For k satisfying -5 < k < k + 4 < 5, compute P(k < X < k + 4).
5. The article “Second Moment Reliability Evaluation vs. Monte Carlo Simulations for
Weld Fatigue Strength” (Quality and Reliability Engr. Intl., 2012: 887–896) considered
the use of a uniform distribution with A = 0.20 and B = 4.25 for the diameter X of a certain
type of weld (mm).
a. Determine the pdf of X and graph it.
b. What is the probability that diameter exceeds 3 mm?
c. What is the probability that diameter is within 1 mm of the mean diameter?
d. For any value a satisfying .20 < a < a + 1 < 4.25, what is P(a < X < a + 1)?
6. Let X denote the amount of time a book on two-hour reserve is actually checked out, and
suppose the cdf is

$$F(x) = \begin{cases} 0, & x < 0 \\ \dfrac{x^2}{4}, & 0 \le x < 2 \\ 1, & 2 \le x \end{cases}$$

a. Calculate P(X ≤ 1).
b. Calculate P(X < 1.5).
c. Calculate P(0.5 ≤ X ≤ 1).

II. EXPECTED VALUES OF CONTINUOUS RANDOM VARIABLES

For a discrete random variable X, E(X) was obtained by summing x · p(x) over possible X values.
Here we replace summation by integration and the pmf by the pdf to get a continuous weighted
average.

Definition 5
Let X be a continuous random variable with density function f(x). Then the mean or the
expected value of X is

$$E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx \tag{9}$$

Definition 6
Let X be a continuous random variable with density function f(x) and let g(x) be a function of
x. Then the mean or the expected value of g(X) is

$$E[g(X)] = \int_{-\infty}^{+\infty} g(x) f(x)\,dx \tag{10}$$

That is, just as E(X) is a weighted average of possible X values, where the weighting function is
the pdf f(x), E[g(X)] is a weighted average of g(X) values.

In the discrete case, the variance of X was defined as the expected squared deviation from µ and
was calculated by summation. Here again integration replaces summation.

Definition 7
Let X be a continuous random variable with the expected value E(X) = µ. Then the
variance of X is

$$\sigma^2 = E[(X - \mu)^2] \tag{11}$$

The standard deviation of X is the positive square root of the variance, $\sigma = \sqrt{\sigma^2}$.

Example 1
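The pdf of weekly gravel sales X was

$$f(x) = \begin{cases} \dfrac{3}{2}(1 - x^2), & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$

So,

$$E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx = \int_0^1 x \cdot \frac{3}{2}(1 - x^2)\,dx = \frac{3}{8}$$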

When the pdf f (x) specifies a model for the distribution of values in a numerical population,
then µ is the population mean, which is the most frequently used measure of population location
or center.

Example 2
Two species are competing in a region for control of a limited amount of a certain resource.
Let X = the proportion of the resource controlled by species 1 and suppose X has pdf

$$f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$

which is a uniform distribution on [0, 1]. (In her book Ecological Diversity, E. C. Pielou
calls this the “broken-stick” model for resource allocation, since it is analogous to breaking
a stick at a randomly chosen point.) Then the species that controls the majority of this
resource controls the amount

$$h(X) = \max(X, 1 - X) = \begin{cases} 1 - X, & 0 \le X < \frac{1}{2} \\ X, & \frac{1}{2} \le X \le 1 \end{cases}$$

The expected amount controlled by the species having majority control is then

$$E[h(X)] = \int_{-\infty}^{\infty} \max(x, 1 - x) \cdot f(x)\,dx = \int_0^1 \max(x, 1 - x) \cdot 1\,dx = \int_0^{1/2} (1 - x) \cdot 1\,dx + \int_{1/2}^1 x \cdot 1\,dx = \frac{3}{4}$$
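The expectation E[h(X)] is an integral of h(x)·f(x), so it can also be approximated numerically without splitting the range by hand. The sketch below assumes a midpoint rule; the helper name `expectation` and the number of subintervals are illustrative choices, not part of the module.

```python
def expectation(g, pdf, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g(x) * pdf(x) over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) * pdf(a + (i + 0.5) * h) for i in range(n)) * h

uniform_pdf = lambda x: 1.0             # U(0, 1) density on [0, 1]
h_majority = lambda x: max(x, 1 - x)    # share held by the majority species

expected_share = expectation(h_majority, uniform_pdf, 0.0, 1.0)
print(round(expected_share, 4))  # 0.75
```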

The variance and standard deviation give quantitative measures of how much spread there is in
the distribution or population of x values. Again, σ is roughly the size of a typical deviation from
µ. Computation of σ² is facilitated by using the same shortcut formula employed in the discrete
case.

$$V(X) = E(X^2) - [E(X)]^2 \tag{12}$$

Example 3
For X = weekly gravel sales, we computed E(X) = 3/8. Since

$$E(X^2) = \int_{-\infty}^{\infty} x^2 \cdot f(x)\,dx = \int_0^1 x^2 \cdot \frac{3}{2}(1 - x^2)\,dx = \int_0^1 \frac{3}{2}(x^2 - x^4)\,dx = \frac{1}{5},$$

we have

$$V(X) = \frac{1}{5} - \left( \frac{3}{8} \right)^2 = \frac{19}{320} = 0.059 \quad \text{and} \quad \sigma_X = 0.244$$
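Because the gravel-sales density is a polynomial, the mean, second moment and variance can be reproduced exactly with rational arithmetic. The sketch below is a check under that observation; the helper `integrate_poly` is a hypothetical name introduced here, not something defined in the module.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(c_k * x**k); coeffs[k] is c_k."""
    total = Fraction(0)
    for k, c in enumerate(coeffs):
        total += Fraction(c) * (Fraction(b) ** (k + 1) - Fraction(a) ** (k + 1)) / (k + 1)
    return total

half3 = Fraction(3, 2)
# f(x) = (3/2)(1 - x^2) on [0, 1], so
#   x   * f(x) = (3/2)(x   - x^3)
#   x^2 * f(x) = (3/2)(x^2 - x^4)
mean = integrate_poly([0, half3, 0, -half3], 0, 1)
second_moment = integrate_poly([0, 0, half3, 0, -half3], 0, 1)
variance = second_moment - mean ** 2        # shortcut formula (12)
print(mean, second_moment, variance)        # 3/8 1/5 19/320
```

Taking `float(variance) ** 0.5` gives the standard deviation, about 0.244.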

Classroom Activity 3
1. Suppose f ( x ) = 0.125 x for 0 < x < 4. Determine the mean and variance of X.
2. The thickness of a conductive coating in micrometers has a density function of 600x⁻²
for 100 µm < x < 120 µm.
a. Determine the mean and variance of the coating thickness.
b. If the coating costs $0.50 per micrometer of thickness on each part, what is the
average cost of the coating per part?
3. The probability density function for the diameter of a drilled hole in millimeters is
10e^(−10(x − 5)) for x > 5 mm. Although the target diameter is 5 millimeters, vibrations, tool
wear, and other nuisances produce diameters larger than 5 millimeters.
a. Determine the mean and variance of the diameter of the holes.
b. Determine the probability that a diameter exceeds 5.1 millimeters.

III. UNIFORM DISTRIBUTION

The continuous random variable X has the uniform distribution between θ₁ and θ₂, with θ₁ < θ₂, if

$$f(x) = \begin{cases} \dfrac{1}{\theta_2 - \theta_1}, & \theta_1 \le x \le \theta_2 \\ 0, & \text{otherwise} \end{cases} \tag{13}$$

X ~ U(θ₁, θ₂) for short.

The density function forms a rectangle with base θ₂ − θ₁ and constant height 1/(θ₂ − θ₁). As a
result, the uniform distribution is often called the rectangular distribution. Note, however,
that the interval may not always be closed: [θ₁, θ₂]. It can be (θ₁, θ₂) as well.

Mean and variance:

$$\mu = \frac{\theta_1 + \theta_2}{2}, \qquad \sigma^2 = \frac{(\theta_2 - \theta_1)^2}{12} \tag{14}$$

Occurrence of the Uniform distribution


1. Waiting times from random arrival time until a regular event (see below)
2. Engineering tolerances: e.g. if a diameter is quoted as "±0.1 mm", it is sometimes assumed
(probably incorrectly) that the error has a U(-0.1, 0.1) distribution.
3. Simulation: programming languages often have a standard routine for simulating the
U(0, 1) distribution. This can be used to simulate other probability distributions.

Example 1
Suppose that a large conference room at a certain company can be reserved for no more
than 4 hours. Both long and short conferences occur quite often. In fact, it can be assumed
that the length X of a conference has a uniform distribution on the interval [0, 4].
a. What is the probability density function?
b. What is the probability that any given conference lasts at least 3 hours?

Solution:
a. The appropriate density function for the uniformly distributed random variable X in
this situation is

$$f(x) = \begin{cases} \dfrac{1}{4}, & 0 \le x \le 4 \\ 0, & \text{otherwise} \end{cases}$$

b. $P[X \ge 3] = \int_3^4 \frac{1}{4}\,dx = \frac{1}{4}$

Example 2 Disk wait times


In a hard disk drive, the disk rotates at 7200 rpm. The wait time is defined as the time
between the read/write head moving into position and the beginning of the required
information appearing under the head.
a. Find the distribution of the wait time.
b. Find the mean and standard deviation of the wait time.
c. Booting a computer requires that 2000 pieces of information are read from
random positions. What is the total expected contribution of the wait time to
the boot time, and rms deviation?

Solution:
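Rotation time = 60 s / 7200 = 8.33 ms.

Wait time can be anything between 0 and 8.33 ms and each time in this range is as likely
as any other time.

Therefore, the distribution of the wait time is U(0, 8.33 ms), i.e., uniform with lower limit 0 and
upper limit 8.33 ms.

$$\mu = \frac{0 + 8.33}{2}\ \text{ms} = 4.17\ \text{ms} \approx 4.2\ \text{ms}$$
$$\sigma^2 = \frac{(8.33 - 0)^2}{12}\ \text{ms}^2 = 5.8\ \text{ms}^2 \;\Rightarrow\; \sigma = 2.4\ \text{ms}$$

For 2000 reads the mean time is 2000 × 4.17 ms = 8.3 s.

Assuming the reads are independent, their variances add: the variance is 2000 × 5.8 ms² = 0.012 s², so σ = 0.11 s.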
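The arithmetic of this example follows directly from the uniform mean and variance formulas. The sketch below repeats it without intermediate rounding; it assumes, as the solution implicitly does, that the 2000 reads are independent so that their variances add. Variable names are illustrative.

```python
import math

rpm = 7200
rotation_ms = 60_000 / rpm          # one revolution, in milliseconds
a, b = 0.0, rotation_ms             # wait time modeled as U(a, b)

mean_ms = (a + b) / 2               # uniform mean
var_ms2 = (b - a) ** 2 / 12         # uniform variance
sd_ms = math.sqrt(var_ms2)

reads = 2000                        # reads during boot, assumed independent
total_mean_s = reads * mean_ms / 1000
total_sd_s = math.sqrt(reads * var_ms2) / 1000   # variances of independent reads add
print(f"{mean_ms:.2f} {sd_ms:.2f} {total_mean_s:.2f} {total_sd_s:.2f}")
```

The printed values agree with the rounded figures in the solution: a mean wait of about 4.2 ms, a standard deviation of about 2.4 ms, and a total of about 8.3 s with an rms deviation of about 0.11 s.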

Classroom Activity 4
1. Suppose X has a continuous uniform distribution over the interval [1.5, 5.5].
a. Determine the mean, variance and standard deviation of X.
b. What is P(X < 2.5)?
2. The net weight in pounds of a packaged chemical herbicide is uniform for 49.75 < x <
50.25 pounds.
a. Determine the mean and variance of the weight of packages.
b. Determine the cumulative distribution function of the weight of packages.
c. Determine P(X < 50.1)
3. Suppose the time it takes a data collection operator to fill out an electronic form for a
database is uniformly distributed between 1.5 and 2.2 minutes.
a. What are the mean and variance of the time it takes an operator to fill out the form?
b. What is the probability that it will take less than two minutes to fill out the form?
c. Determine the cumulative distribution function of the time it takes to fill out the
form.
4. The probability density function of the time it takes a hematology cell counter to
complete a test on a blood sample is f(x) = 0.04 for 50 < x < 75 seconds.
a. What percentage of tests require more than 70 seconds to complete?
b. What percentage of tests require less than one minute to complete?
c. Determine the mean and variance of the time to complete a test on a sample.

IV. NORMAL DISTRIBUTION

The normal (or Gaussian) density function was proposed by C. F. Gauss (1777-1855) as a model
for the relative frequency distribution of errors, such as errors of measurement. Amazingly, this
bell-shaped curve provides an adequate model for the relative frequency distributions of data
collected from many different scientific areas.

A. The Density Function, Mean and Variance for A Normal Random Variable

A density curve is a curve that:
▪ is always on or above the horizontal axis
▪ has an area of exactly 1 underneath it

Measures of center and spread apply to density curves as well as to actual sets of
observations.

o Density curves are lines that show the location of the individuals along the horizontal
axis and within the range of possible values.
o They help researchers to investigate the distribution of a variable.
o Some density curves have certain properties that help researchers draw conclusions
about the entire population.

A density curve describes the overall pattern of a distribution. The area under the curve and
above any range of values on the horizontal axis is the proportion of all observations that fall in
that range.

Distinguishing the Median and Mean of a Density Curve


▪ The median of a density curve is the equal-areas point, the point that divides the area
under the curve in half.
▪ The mean of a density curve is the balance point, at which the curve would balance if
made of solid material.
▪ The median and the mean are the same for a symmetric density curve. They both lie at
the center of the curve.

A Normal distribution is described by a Normal density curve. Any particular Normal
distribution is completely specified by two numbers: its mean µ and its standard deviation σ.
▪ The mean of a Normal distribution is the center of the symmetric Normal curve.
▪ The standard deviation is the distance from the center to the change-of-curvature points
on either side.
▪ The Normal distribution with mean µ and standard deviation σ is abbreviated as N(µ, σ).

Definition 8
A continuous random variable X is said to have a normal distribution with parameters µ
and σ (or µ and σ²), where −∞ < µ < ∞ and σ > 0, if the probability density function of X is

$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / (2\sigma^2)}, \qquad -\infty < x < \infty \tag{15}$$

The parameters µ and σ² are the mean and the variance, respectively, of the normal random
variable.

There is an infinite number of normal density functions, one for each combination of µ and σ.
The mean measures the location of the distribution and the variance measures its spread. Several
different normal density functions are shown in Figure 6.

Figure 6 Several normal distributions:
Curve 1 with µ = 3, σ = 1, Curve 2 with µ = −1, σ = 0, and Curve 3 with µ = 0, σ = 1.5

The 68-95-99.7 Rule

P(|X − µ| < σ) = 0.6827
P(|X − µ| < 2σ) = 0.9545
P(|X − µ| < 3σ) = 0.9973

These equalities are known as the σ, 2σ and 3σ rules, respectively, and are often used in statistics.
Namely, if a population of measurements has approximately a normal distribution, the
probability that a randomly selected observation falls within the intervals (µ − σ, µ + σ),
(µ − 2σ, µ + 2σ), and (µ − 3σ, µ + 3σ) is approximately 0.6827, 0.9545 and 0.9973, respectively.
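These three probabilities do not depend on µ or σ, and they can be computed from the error function: for any normal X, the probability of falling within k standard deviations of the mean equals erf(k/√2). The sketch below uses `math.erf` from the Python standard library; the function name `within_k_sigma` is illustrative.

```python
import math

def within_k_sigma(k):
    """P(|X - mu| < k*sigma) for any normal X, via the error function."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, f"{within_k_sigma(k):.4f}")
# 1 0.6827
# 2 0.9545
# 3 0.9973
```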

Characteristics of the Normal distribution


1. Normal distributions are symmetric around their mean.
2. The mean, median, and mode of a normal distribution are equal.
3. The area under the normal curve is equal to 1.0.
4. Normal distributions are denser in the center and less dense in the tails.
5. Normal distributions are defined by two parameters, the mean (μ) and the standard
deviation (σ).
6. Approximately 68% of the area of a normal distribution is within one standard deviation of the mean.
7. Approximately 95% of the area of a normal distribution is within two standard
deviations of the mean.

Areas Under Normal Distributions


Figure 7 shows a normal distribution with a mean of 50 and a standard deviation of 10. The
shaded area between 40 and 60 contains 68% of the distribution.

Figure 7 Normal distribution with a mean of 50 and standard deviation of 10. 68% of the area is within one standard deviation (10) of the mean (50)

Figure 8 shows a normal distribution with a mean of 100 and a standard deviation of 20. As in
Figure 7, 68% of the distribution is within one standard deviation of the mean.

Figure 8 Normal distribution with a mean of 100 and standard deviation of 20. 68% of the area is within one standard deviation (20) of the mean (100).

The normal distributions shown in Figures 7 and 8 are specific examples of the general rule that
68% of the area of any normal distribution is within one standard deviation of the mean.

Figure 9 shows a normal distribution with a mean of 75 and a standard deviation of 10. The
shaded area contains 95% of the area and extends from 55.4 to 94.6. For all normal distributions,
95% of the area is within 1.96 standard deviations of the mean. For quick approximations, it is
sometimes useful to round off and use 2 rather than 1.96 as the number of standard deviations
you need to extend from the mean so as to include 95% of the area.

Figure 9 A normal distribution with a mean of 75 and a standard deviation of 10. 95% of the area is within 1.96 standard deviations of the mean

B. Standard Normal Distribution


The computation of P(a ≤ X ≤ b) when X is a normal rv with parameters μ and σ requires
evaluating

$$\int_a^b \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}\,dx \qquad (16)$$

None of the standard integration techniques can be used to accomplish this. Instead, for μ = 0
and σ = 1, Expression (16) has been calculated using numerical techniques and tabulated for
certain values of a and b. This table can also be used to compute probabilities for any other
values of μ and σ under consideration.
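To illustrate what "calculated using numerical techniques" can mean, here is a minimal sketch that evaluates Expression (16) with composite Simpson's rule. The choice of Simpson's rule, the function names and the number of subintervals are assumptions made for this example; the module does not say which method was used to build the table.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density f(x; mu, sigma)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even interior points weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

# Area under the standard normal curve between -1 and 1.
area = simpson(normal_pdf, -1.0, 1.0)
print(round(area, 4))
```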

Definition 9
The normal distribution with parameter values μ = 0 and σ = 1 is called the standard
normal distribution. A random variable having a standard normal distribution is called a
standard normal random variable and will be denoted by Z. The probability density
function of Z is

$$f(z;0,1) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \qquad -\infty < z < \infty \qquad (17)$$

The distribution with this density function is called the standardized normal distribution.

The graph of f (z; 0, 1) is called the standard normal (or z) curve. Its inflection points are at 1 and
-1.

Figure 10 The standardized normal density distribution

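If X is a normal random variable with mean μ and variance σ², then

1) the variable

$$Z = \frac{X-\mu}{\sigma} \qquad (18)$$

is the standardized normal random variable;

2) P(|X − μ| < nσ) = 2Φ₀(n), where

$$\Phi_0(x) = \frac{1}{\sqrt{2\pi}} \int_0^x e^{-t^2/2}\,dt \qquad (19)$$

This function is called the Laplace function and it is tabulated. Because its integral starts at 0
rather than at −∞, it differs from the cumulative distribution function Φ(z) = P(Z ≤ z) used in
the worked examples of this module: Φ(z) = 0.5 + Φ₀(z).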

Figure 11 Standard normal cumulative areas tabulated



Table 1. A portion of a table of the standard normal distribution.

How to Use Table 3 to Calculate Probabilities under the Standard Normal Curve
▪ To calculate the area to the left of a z-value, find the area directly from Table 3.
▪ To calculate the area to the right of a z-value, find the area in Table 3, and subtract from
1.
▪ To calculate the area between two values of z, find the two areas in Table 3, and subtract
one area from the other.
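The three table rules can be written as small functions. This is a sketch that replaces the printed table with Python's standard-library `statistics.NormalDist`; the helper names `area_left`, `area_right` and `area_between` are made up for this illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1

def area_left(z):
    """Area to the left of z (what the table gives directly)."""
    return Z.cdf(z)

def area_right(z):
    """Area to the right of z: subtract the table value from 1."""
    return 1.0 - Z.cdf(z)

def area_between(z1, z2):
    """Area between z1 and z2: subtract one table value from the other."""
    return Z.cdf(z2) - Z.cdf(z1)

print(round(area_left(1.25), 4))
print(round(area_right(1.25), 4))
print(round(area_between(-0.38, 1.25), 4))
```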

Properties of a Normal Curve


1. The mode, which is the point on the horizontal axis where the curve is a maximum,
occurs at x = μ.
2. The curve is symmetric about a vertical axis through the mean μ.
3. The curve has its points of inflection at x = μ ± σ; it is concave downward if μ − σ < X < μ + σ
and is concave upward otherwise.
4. The normal curve approaches the horizontal axis asymptotically as we proceed in either
direction away from the mean.
5. The total area under the curve and above the horizontal axis is equal to 1.

A value from any normal distribution can be transformed into its corresponding value on a
standard normal distribution using the following formula:

$$Z = \frac{X-\mu}{\sigma} \qquad (20)$$

where Z is the value on the standard normal distribution, X is the value on the original
distribution, μ is the mean of the original distribution, and σ is the standard deviation of the
original distribution.

Figure 11 The original and transformed normal distributions

If all the values in a distribution are transformed to Z scores, then the distribution will have a mean
of 0 and a standard deviation of 1. This process of transforming a distribution to one with a mean
of 0 and a standard deviation of 1 is called standardizing the distribution.
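As a small illustration of standardizing, the sketch below converts a list of values to Z scores and confirms that the result has mean 0 and standard deviation 1. The data values are made up for the example, and the population standard deviation (`pstdev`) is assumed to be the intended σ.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Return the Z score (v - mu) / sigma for every value."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]

scores = [40.0, 50.0, 60.0, 70.0, 80.0]  # hypothetical data
z_scores = standardize(scores)

print([round(z, 3) for z in z_scores])
print(round(fmean(z_scores), 6), round(pstdev(z_scores), 6))
```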

C. Areas under the Normal Curve


The curve of any continuous probability distribution or density function is constructed so that
the area under the curve bounded by the two ordinates x = x1 and x = x2 equals the probability
that the random variable X assumes a value between x = x1 and x = x2.

Figure 12 P(x1 < X < x2) = area of the shaded region.


$$P(x_1 < X < x_2) = \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-(x-\mu)^2/(2\sigma^2)}\,dx$$

Definition 10
The distribution of a normal random variable with mean 0 and variance 1 is called a
standard normal distribution.

Empirical Rule
If the population distribution of a variable is (approximately) normal, then
1. Roughly 68% of the values are within 1 SD of the mean.
2. Roughly 95% of the values are within 2 SDs of the mean.
3. Roughly 99.7% of the values are within 3 SDs of the mean.

Figure 13 Probabilities associated with a normal distribution.

General Procedure
1. We first convert the problem into an equivalent one dealing with a normal variable
measured in standardized deviation units, called a standardized normal variable. To do
this, if X ∼ N(μ, σ²), then

$$Z = \frac{X-\mu}{\sigma}$$

2. A table of standardized normal values can then be used to obtain an answer in terms of
the converted problem.
3. If necessary, we can then convert back to the original units of measurement. To do this,
simply note that, if we take the formula for Z, multiply both sides by σ, and then add μ
to both sides, we get

$$X = Z\sigma + \mu$$

4. The interpretation of Z values is straightforward. Since Z has a standard deviation of 1, if Z = 2, the corresponding
X value is exactly 2 standard deviations above the mean. If Z = −1, the corresponding X
value is one standard deviation below the mean. If Z = 0, X = the mean, i.e. μ.

Let’s determine the following standard normal probabilities.

Example 1
(a) P(Z ≤ 1.25), (b) P(Z > 1.25), (c) P(Z ≤ −1.25), (d) P(−0.38 ≤ Z ≤ 1.25), and (e) P(Z ≤ 5).

Solution:

a) P(Z ≤ 1.25) = Φ(1.25), a probability that is tabulated in Appendix Table A.3 at the
intersection of the row marked 1.2 and the column marked .05. The number there is
.8944, so P(Z ≤ 1.25) = .8944.

b) P(Z > 1.25) = 1 − P(Z ≤ 1.25) = 1 − Φ(1.25), the area under the z curve to the right of 1.25
(an upper-tail area). Then Φ(1.25) = .8944 implies that P(Z > 1.25) = .1056.

Since Z is a continuous rv, P(Z ≥ 1.25) = .1056 as well.

c) P(Z ≤ −1.25) = Φ(−1.25), a lower-tail area. Directly from Appendix Table A.3, Φ(−1.25) =
.1056. By symmetry of the z curve, this is the same answer as in part (b).

d) P(−0.38 ≤ Z ≤ 1.25) is the area under the standard normal curve above the interval whose
left endpoint is −0.38 and whose right endpoint is 1.25. If X is a continuous rv with cdf
F(x), then P(a ≤ X ≤ b) = F(b) − F(a).

Thus P(−0.38 ≤ Z ≤ 1.25) = Φ(1.25) − Φ(−0.38) = .8944 − .3520 = .5424.

e) P(Z ≤ 5) = Φ(5), the cumulative area under the z curve to the left of 5. This probability
does not appear in the table because the last row is labeled 3.4. However, the last entry
in that row is Φ(3.49) = .9998. That is, essentially all of the area under the curve lies to
the left of 3.49 (at most 3.49 standard deviations to the right of the mean). Therefore
we conclude that P(Z ≤ 5) ≈ 1.

Example 2
Given a random variable X having a normal distribution with μ = 50 and σ = 10, find the
probability that X assumes a value between 45 and 62.

Solution:

The z values corresponding to x1 = 45 and x2 = 62 are

$$z_1 = \frac{45-50}{10} = -0.5, \qquad z_2 = \frac{62-50}{10} = 1.2$$

Therefore, P(45 < X < 62) = P(−0.5 < Z < 1.2)

This area may be found by subtracting the area to the left of the ordinate z = −0.5 from the
entire area to the left of z = 1.2. Using Table A.3, we have

P(45 < X < 62) = P(−0.5 < Z < 1.2) = P(Z < 1.2) − P(Z < −0.5)
= 0.8849 − 0.3085 = 0.5764
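The result of Example 2 can be cross-checked without a table. This sketch uses `statistics.NormalDist` from the Python standard library in place of Table A.3; the variable names are chosen for this illustration only.

```python
from statistics import NormalDist

# X is normal with mu = 50 and sigma = 10, as in Example 2.
X = NormalDist(mu=50, sigma=10)

# Standardize the endpoints by hand ...
z1 = (45 - 50) / 10
z2 = (62 - 50) / 10

# ... and compare with working directly in the original units.
p_from_z = NormalDist().cdf(z2) - NormalDist().cdf(z1)
p_direct = X.cdf(62) - X.cdf(45)

print(z1, z2, round(p_direct, 4))
```

Both routes give the same probability, which is the point of the standardizing step.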

Example 3
Given a normal distribution with μ = 40 and σ = 6, find the value of x that has
a. 45% of the area to the left and
b. 14% of the area to the right.

Solution:

a) We require a z value that leaves an area of 0.45 to the left. From Table A.3 we find
P(Z < −0.13) = 0.45, so the desired z value is −0.13. Hence,

x = (6)(−0.13) + 40 = 39.22.

b) This time we require a z value that leaves 0.14 of the area to the right and hence an area
of 0.86 to the left. Again, from Table A.3, we find P(Z < 1.08) = 0.86, so the desired z value
is 1.08 and

x = (6)(1.08) + 40 = 46.48.
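Example 3 runs the table in reverse: from an area to a z value and then to x = σz + μ. A hedged sketch of the same computation uses `NormalDist.inv_cdf` (the inverse of the cumulative distribution function) instead of a table lookup. Because the table only lists z to two decimals, the computed values differ slightly from 39.22 and 46.48.

```python
from statistics import NormalDist

X = NormalDist(mu=40, sigma=6)

# (a) 45% of the area to the left of x.
x_a = X.inv_cdf(0.45)

# (b) 14% of the area to the right means 86% to the left.
x_b = X.inv_cdf(1 - 0.14)

print(round(x_a, 2), round(x_b, 2))
```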

Example 4
Suppose that Z is a standard normal random variable. Find the value w so that P(|Z| ≤ w) =
0.60.

Solution:

The problem is asking for w such that P(−w ≤ Z ≤ w) = 0.60. Note that

P(Z < −w) + P(−w ≤ Z ≤ w) + P(Z > w) = 1

Also, P(Z < −w) = P(Z > w). In this case, we must have that P(Z < −w) = P(Z > w) = 0.20.

Therefore, P(Z ≤ w) = 0.80.

Now, Φ(w) = 0.80. The normal CDF table tells us that

Φ(0.84) = 0.7995 and Φ(0.85) = 0.8023

We take w to be the closer of these two values, i.e. w ≈ 0.84.
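The reasoning in Example 4 (split the leftover probability equally between the two tails, then invert the CDF) can be sketched as follows. The variable names are illustrative, and `NormalDist.inv_cdf` stands in for the table lookup.

```python
from statistics import NormalDist

Z = NormalDist()
central = 0.60                 # required P(-w <= Z <= w)

tail = (1 - central) / 2       # each tail holds 0.20
w = Z.inv_cdf(1 - tail)        # Phi(w) = 0.80

# Verify that the symmetric interval really has the required probability.
check = Z.cdf(w) - Z.cdf(-w)
print(round(w, 2), round(check, 2))
```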

Classroom Activity 5
Let Z be a standard normal random variable and calculate the following probabilities, drawing
pictures wherever appropriate.
a. P(0 ≤ Z ≤ 2.17)
b. P(0 ≤ Z ≤ 1)
c. P(−2.50 ≤ Z ≤ 0)
d. P(−2.50 ≤ Z ≤ 2.50)
e. P(Z ≤ 1.37)
f. P(−1.75 ≤ Z)
g. P(−1.50 ≤ Z ≤ 2.00)
h. P(1.37 ≤ Z ≤ 2.50)
i. P(1.50 ≤ Z)
j. P(|Z| ≤ 2.50)

Exercises 2
1. Given a standard normal distribution, find the area under the curve that lies
a. to the left of z = −1.39;
b. to the right of z = 1.96;
c. between z = −2.16 and z = −0.65;
d. to the left of z = 1.43;
e. to the right of z = −0.89;
f. between z = −0.48 and z = 1.74.
2. Find the value of z if the area under a standard normal curve
a. to the right of z is 0.3622;
b. to the left of z is 0.1131;
c. between 0 and z, with z > 0, is 0.4838;
d. between −z and z, with z > 0, is 0.9500.
3. Given a standard normal distribution, find the value of k such that
a. P(Z > k) = 0.2946;
b. P(Z < k) = 0.0427;
c. P(−0.93 < Z < k) = 0.7235.
4. Given a normal distribution with μ = 30 and σ = 6, find
a. the normal curve area to the right of x = 17;
b. the normal curve area to the left of x = 22;
c. the normal curve area between x = 32 and x = 41;
d. the value of x that has 80% of the normal curve area to the left;
e. the two values of x that contain the middle 75% of the normal curve area.
5. Given the normally distributed variable X with mean 18 and standard deviation 2.5,
find

a. P(X <15);
b. the value of k such that P(X <k) = 0.2236;
c. the value of k such that P(X >k) = 0.1814;
d. P(17 < X < 21).

D. Applications of the Normal Distribution

1. Most graduate schools of business require applicants for admission to take the Graduate
Management Admission Council’s GMAT examination. Scores on the GMAT are roughly
normally distributed with a mean of 527 and a standard deviation of 112. What is the
probability of an individual scoring above 500 on the GMAT?

Solution:

Normal distribution with μ = 527 and σ = 112, standardized using

$$Z = \frac{X-\mu}{\sigma}$$

$$z = \frac{500-527}{112} = -0.24107$$

P{X > 500} = P{Z > -0.24} = 1 – 0.4052 = 0.5948

How high must an individual score on the GMAT in order to score in the highest 5%?

P(X > ?) = 0.05 ⇒ P(Z > ?) = 0.05

P(Z < ?) = 1 − 0.05 = 0.95 ⇒ Z = 1.645

X = Zσ + μ: X = 527 + 1.645(112)
X = 527 + 184.24 = 711.24

2. The length of human pregnancies from conception to birth approximates a normal
distribution with a mean of 266 days and a standard deviation of 16 days. What proportion
of all pregnancies will last between 240 and 270 days (roughly between 8 and 9 months)?

Solution:

Normal distribution: μ = 266, σ = 16

$$z_1 = \frac{240-266}{16} = -1.625, \qquad z_2 = \frac{270-266}{16} = 0.25$$

P(240 < X < 270) = P(-1.63 < Z < 0.25)


P(-1.63 < Z < 0.25) = P(Z < 0.25) - P(Z < -1.63)
P(-1.63 < Z < 0.25) = 0.5987 – 0.0516 = 0.5471

What length of time marks the shortest 70% of all pregnancies?

P(X < ?) = 0.70 ⇒ P(Z < ?) = 0.70 ⇒ Z = 0.52

X = Zσ + μ: X = 266 + 0.52(16)
X = 266 + 8.32 = 274.32

Examples from Walpole and Myers (9th edition)


1. A certain type of storage battery lasts, on average, 3.0 years with a standard deviation of 0.5
year. Assuming that battery life is normally distributed, find the probability that a given
battery will last less than 2.3 years.

Solution:

To find P(X < 2.3), we need to evaluate the area under the normal curve to the left of 2.3.
This is accomplished by finding the area to the left of the corresponding z value. Hence, we
find that

2.3 − 3
z= = −1.4
0.5

and then, using Table A.3, we have: P(X <2.3) = P(Z < −1.4) = 0.0808.

2. An electrical firm manufactures light bulbs that have a life, before burn-out, that is normally
distributed with mean equal to 800 hours and a standard deviation of 40 hours. Find the
probability that a bulb burns between 778 and 834 hours.
3. In an industrial process, the diameter of a ball bearing is an important measurement. The
buyer sets specifications for the diameter to be 3.0 ± 0.01 cm. The implication is that no part
falling outside these specifications will be accepted. It is known that in the process the
diameter of a ball bearing has a normal distribution with mean μ = 3.0 and standard
deviation σ = 0.005. On average, how many manufactured ball bearings will be scrapped?
4. Gauges are used to reject all components for which a certain dimension is not within the
specification 1.50 ± d. It is known that this measurement is normally distributed with mean
1.50 and standard deviation 0.2. Determine the value d such that the specifications “cover”
95% of the measurements.
5. A certain machine makes electrical resistors having a mean resistance of 40 ohms and a
standard deviation of 2 ohms. Assuming that the resistance follows a normal distribution
and can be measured to any degree of accuracy, what percentage of resistors will have a
resistance exceeding 43 ohms?

Classroom Activity 6
1. Hamburger Meat The meat department at a local supermarket specifically prepares its “1-
pound” packages of ground beef so that there will be a variety of weights, some slightly more
and some slightly less than 1 pound. Suppose that the weights of these “1-pound” packages
are normally distributed with a mean of 1.00 pound and a standard deviation of .15 pound.
a. What proportion of the packages will weigh more than 1 pound?
b. What proportion of the packages will weigh between .95 and 1.05 pounds?
c. What is the probability that a randomly selected package of ground beef will weigh less
than .80 pound?
d. Would it be unusual to find a package of ground beef that weighs 1.45 pounds? How
would you explain such a large package?
2. Braking Distances For a car traveling 30 miles per hour (mph), the distance required to
brake to a stop is normally distributed with a mean of 50 feet and a standard deviation of 8
feet. Suppose you are traveling 30 mph in a residential area and a car moves abruptly into
your path at a distance of 60 feet.
a. If you apply your brakes, what is the probability that you will brake to a stop within 40
feet or less? Within 50 feet or less?
b. If the only way to avoid a collision is to brake to a stop, what is the probability that you
will avoid the collision?

Nonstandard Normal Distributions

Proposition 3
If X has a normal distribution with mean μ and standard deviation σ, then

$$Z = \frac{X-\mu}{\sigma}$$

has a standard normal distribution. Thus

$$P(a \le X \le b) = P\left(\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right) \qquad (21)$$

$$P(X \le a) = \Phi\left(\frac{a-\mu}{\sigma}\right), \qquad P(X \ge b) = 1 - \Phi\left(\frac{b-\mu}{\sigma}\right) \qquad (22)$$

Examples:
1. An expert witness in a paternity suit testifies that the length (in days) of human gestation is
approximately normally distributed with parameters μ = 270 and σ2 = 100. The defendant in
the suit is able to prove that he was out of the country during a period that began 290 days
before the birth of the child and ended 240 days before the birth. If the defendant was, in
fact, the father of the child, what is the probability that the mother could have had the very
long or very short gestation indicated by the testimony?

Solution:
Let X denote the length of the gestation, and assume that the defendant is the father. Then
the probability that the birth could occur within the indicated period is

$$\begin{aligned}
P\{X > 290 \text{ or } X < 240\} &= P\{X > 290\} + P\{X < 240\} \\
&= P\left\{\frac{X-270}{10} > 2\right\} + P\left\{\frac{X-270}{10} < -3\right\} \\
&= 1 - \Phi(2) + 1 - \Phi(3) \\
&\approx 0.0241
\end{aligned}$$

2. A machine that dispenses corn flakes into packages provides amounts that are
approximately normally distributed with mean weight 20 ounces and standard deviation 0.6
ounce. Suppose that the weights and measures law under which you must operate allows
you to have only 5% of your packages under the weight stated on the package. What weight
should you print on the package?

Solution:
Let X be the weight in ounces of a typical package; this is approximately normally distributed
with mean μ = 20 and standard deviation σ = 0.6. We seek a printed weight, w, such that
P(X < w) = 0.05.

Since Z = (X − μ)/σ, we have the following relation:

$$0.05 = P(X < w) = P\left(\frac{X-\mu}{\sigma} < \frac{w-20}{0.6}\right) = P\left(Z < \frac{w-20}{0.6}\right) = \Phi\left(\frac{w-20}{0.6}\right)$$

Thus,

$$\frac{w-20}{0.6} = \Phi^{-1}(0.05)$$

With a normal table: Φ(−1.64) = 0.0505 and Φ(−1.65) = 0.0495

So, Φ⁻¹(0.05) ≈ −1.645

Finally, (w − 20)/0.6 = −1.645, or w = 19.01.

In practice we would round this and print “19 ounces” on the box.
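The printed weight can be verified with an inverse-CDF call. This is a sketch using `statistics.NormalDist`; the variable names are made up for the illustration. Note that the z value is negative, because w must lie below the mean for only 5% of packages to fall under it.

```python
from statistics import NormalDist

fill = NormalDist(mu=20.0, sigma=0.6)   # dispensed weight in ounces

z = NormalDist().inv_cdf(0.05)          # about -1.645
w = fill.inv_cdf(0.05)                  # same as 20 + z * 0.6

print(round(z, 3), round(w, 2))
```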

Classroom Activity 7
1. The time that it takes a driver to react to the brake lights on a decelerating vehicle is
critical in helping to avoid rear-end collisions. The article “Fast-Rise Brake Lamp as a
Collision-Prevention Device” (Ergonomics, 1993: 391–395) suggests that reaction
time for an in-traffic response to a brake signal from standard brake lights can be
modeled with a normal distribution having mean value 1.25 sec and standard deviation
of .46 sec. What is the probability that reaction time is between 1.00 sec and 1.75 sec?
2. The breakdown voltage of a randomly chosen diode of a particular type is known to be
normally distributed. What is the probability that a diode’s breakdown voltage is within
1 standard deviation of its mean value?
3. The top 5% of applicants (as measured by GRE scores) will receive scholarships. If GRE
~ N(500, 100²), how high does your GRE score have to be to qualify for a scholarship?
4. Family income ~ N($25,000, $10,000²). If the poverty level is $10,000, what percentage of
the population lives in poverty?
5. A new tax law is expected to benefit “middle income” families, those with incomes
between $20,000 and $30,000. If Family income ~ N($25,000, $10,000²), what percentage
of the population will benefit from the law?

Exercises 3
1. The dressed weights of Excelsior Chickens are approximately normally distributed with
mean 3.20 pounds and standard deviation 0.40 pound. About what proportion of the
chickens have dressed weights greater than 3.60 pounds?
2. Suppose that an automobile muffler is designed so that its lifetime (in months) is
approximately normally distributed with mean 26 months and standard deviation 4 months.
a) The manufacturer has decided to use a marketing strategy in which the muffler is
covered by warranty for 18 months. Approximately what proportion of the mufflers will
fail before the warranty expires?
b) Suppose that the manufacturer in the previous example would like to extend the
warranty time to 24 months. Approximately what proportion of the mufflers will fail
before the extended warranty expires?
c) Of all the mufflers that fail under the extended warranty, what proportion of them have
failures in the interval (18 months; 24 months)?
3. Suppose that the daily demand for change (meaning coins) in a particular store is
approximately normally distributed with mean $800.00 and standard deviation $60.00.
a) What is the probability that, on any particular day, the demand for change will be below
$600?
b) Find the amount M of change to keep on hand if one wishes, with certainty 99%, to have
enough change. That is, find M so that P(X ≤ M) = 0.99.
4. Christmas Trees The diameters of Douglas firs grown at a Christmas tree farm are normally
distributed with a mean of 4 inches and a standard deviation of 1.5 inches.
a) What proportion of the trees will have diameters between 3 and 5 inches?
b) What proportion of the trees will have diameters less than 3 inches?
c) Your Christmas tree stand will expand to a diameter of 6 inches. What proportion of the
trees will not fit in your Christmas tree stand?
5. Cerebral Blood Flow Cerebral blood flow (CBF) in the brains of healthy people is normally
distributed with a mean of 74 and a standard deviation of 16.
a) What proportion of healthy people will have CBF readings between 60 and 80?
b) What proportion of healthy people will have CBF readings above 100?
c) If a person has a CBF reading below 40, he is classified as at risk for a stroke. What
proportion of healthy people will mistakenly be diagnosed as “at risk”?
6. A Phosphate Mine The discharge of suspended solids from a phosphate mine is normally
distributed, with a mean daily discharge of 27 milligrams per liter (mg/l) and a standard
deviation of 14 mg/l. On what proportion of days will the daily discharge exceed 50 mg/l?

7. Bacteria in Drinking Water Suppose the numbers of a particular type of bacteria in
samples of 1 milliliter (ml) of drinking water tend to be approximately normally distributed,
with a mean of 85 and a standard deviation of 9. What is the probability that a given 1-ml
sample will contain more than 100 bacteria?
8. Breathing Rates The number of times x an adult human breathes per minute when at rest
has a probability distribution that is approximately normal, with the mean equal to 16 and
the standard deviation equal to 4. If a person is selected at random and the number x of
breaths per minute while at rest is recorded, what is the probability that x will exceed 22?
9. Mall Rats An article in American Demographics claims that more than twice as many
shoppers are out shopping on the weekends than during the week. Not only that, such
shoppers also spend more money on their purchases on Saturdays and Sundays! Suppose
that the amount of money spent at shopping centers between 4 P.M. and 6 P.M. on Sundays
has a normal distribution with mean $85 and with a standard deviation of $20. A shopper is
randomly selected on a Sunday between 4 P.M. and 6 P.M. and asked about his spending
patterns.
a) What is the probability that he has spent more than $95 at the mall?
b) What is the probability that he has spent between $95 and $115 at the mall?
c) If two shoppers are randomly selected, what is the probability that both shoppers have
spent more than $115 at the mall?
10. Sunflowers An experimenter publishing in the Annals of Botany investigated whether the
stem diameters of the dicot sunflower would change depending on whether the plant was
left to sway freely in the wind or was artificially supported. Suppose that the unsupported
stem diameters at the base of a particular species of sunflower plant have a normal
distribution with an average diameter of 35 millimeters (mm) and a standard deviation of 3
mm.
a) What is the probability that a sunflower plant will have a basal diameter of more than
40 mm?
b) If two sunflower plants are randomly selected, what is the probability that both plants
will have a basal diameter of more than 40 mm?
c) Within what limits would you expect the basal diameters to lie, with probability .95?
d) What diameter represents the 90th percentile of the distribution of diameters?

V. NORMAL APPROXIMATION TO THE BINOMIAL AND POISSON DISTRIBUTION

Although the normal distribution is continuous, it can sometimes be used to approximate
discrete distributions. In particular, we can use the normal distribution to approximate the
binomial probability distribution.

Suppose we have a binomial distribution defined by two parameters: the number of trials n and
the probability of success p. The normal distribution with the parameters μ and σ will be a good
approximation for that binomial distribution if both

$$\mu - 2\sigma = np - 2\sqrt{np(1-p)} \quad \text{and} \quad \mu + 2\sigma = np + 2\sqrt{np(1-p)}$$

lie between 0 and n.

If X ~ B(n, p), n is large, and p is not too near 0 or 1, then X is approximately N(np, np(1 − p)).

For example, the binomial distribution with n = 10 and p = 0.5 is well approximated by the
normal distribution with μ = np = 10 × 0.5 = 5.0 and σ = √(np(1 − p)) = 0.5 × √10 = 1.58. See Figure
14 or Table 2.

Figure 14 Approximation of binomial distribution (bar graph) with n=10, p=0.5 by a normal
distribution (smoothed curve)

Table 2 The binomial and normal probability distributions for the same values of x

 X     Binomial distribution     Normal distribution
 0     0.000977                  0.0017
 1     0.009766                  0.010285
 2     0.043945                  0.041707
 3     0.117188                  0.113372
 4     0.205078                  0.206577
 5     0.246094                  0.252313
 6     0.205078                  0.206577
 7     0.117188                  0.113372
 8     0.043945                  0.041707
 9     0.009766                  0.010285
10     0.000977                  0.0017
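The entries of Table 2 can be regenerated with a few lines of code. In this sketch the "Normal distribution" column is taken to be the normal density evaluated at each integer x, an interpretation inferred from the tabulated values rather than stated in the module.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 10, 0.5
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

rows = []
for x in range(n + 1):
    binom = comb(n, x) * p**x * (1 - p) ** (n - x)   # exact binomial P(X = x)
    rows.append((x, binom, approx.pdf(x)))            # normal density at x

for x, b, d in rows:
    print(f"{x:2d}  {b:.6f}  {d:.6f}")
```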

Proposition 4
Let X be a binomial rv based on n trials with success probability p. Then if the
binomial probability histogram is not too skewed, X has approximately a normal
distribution with μ = np and σ = √(npq). In particular, for x = a possible value of X,

$$P(X \le x) = B(x; n, p) \approx \left(\begin{array}{c}\text{area under the normal curve} \\ \text{to the left of } x + 0.5\end{array}\right) = \Phi\left(\frac{x + 0.5 - np}{\sqrt{npq}}\right) \qquad (23)$$

In practice, the approximation is adequate provided that both np ≥ 10 and nq ≥ 10
(i.e., the expected number of successes and the expected number of failures are both
at least 10), since there is then enough symmetry in the underlying binomial
distribution.

Rule of Thumb
The approximation is good for
np ≥ 5 and n(1 − p) ≥ 5

The probability of getting k from the binomial distribution can be approximated as the
probability under a normal distribution of getting x in the range from k − ½ to k + ½.

For example, P(k ≤ 6) can be approximated as

$$\int_{-\infty}^{6.5} f(x)\,dx$$

where f(x) is the normal density.

The normal approximation will, in general, be quite good for values of n satisfying σ² = np(1 − p)
≥ 10.

Examples.
1. Suppose that 25% of all students at a large public university receive financial aid. Let X be
the number of students in a random sample of size 50 who receive financial aid, so that p =
.25. Then μ = 12.5 and σ = 3.06. Since np = 50(.25) = 12.5 ≥ 10 and nq = 37.5 ≥ 10, the
approximation can safely be applied. The probability that at most 10 students receive aid is

$$P(X \le 10) = B(10; 50, 0.25) \approx \Phi\left(\frac{10 + 0.5 - 12.5}{3.06}\right) = \Phi(-0.65) = 0.2578$$

Similarly, the probability that between 5 and 15 (inclusive) of the selected students receive
aid is

$$P(5 \le X \le 15) = B(15; 50, 0.25) - B(4; 50, 0.25) \approx \Phi\left(\frac{15 + 0.5 - 12.5}{3.06}\right) - \Phi\left(\frac{4 + 0.5 - 12.5}{3.06}\right) = 0.8320$$
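To see how close the approximation in this example is, the sketch below computes both the exact binomial probabilities and the continuity-corrected normal approximations. The function names `binom_cdf` and `normal_approx_cdf` are made up for this illustration. The z values are not rounded to two decimals here, so the approximate results differ from the table-based figures in the third decimal place.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

def normal_approx_cdf(x, n, p):
    """Approximate P(X <= x) using Phi((x + 0.5 - np) / sqrt(npq))."""
    return NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(x + 0.5)

n, p = 50, 0.25
exact_a = binom_cdf(10, n, p)
approx_a = normal_approx_cdf(10, n, p)
exact_b = binom_cdf(15, n, p) - binom_cdf(4, n, p)
approx_b = normal_approx_cdf(15, n, p) - normal_approx_cdf(4, n, p)

print(f"P(X <= 10):      exact {exact_a:.4f}, approx {approx_a:.4f}")
print(f"P(5 <= X <= 15): exact {exact_b:.4f}, approx {approx_b:.4f}")
```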

2. Suppose 50% of the population approves of the job the governor is doing, and that 20
individuals are drawn at random from the population. Solve the following, using both the
binomial distribution and the normal approximation to the binomial.
a. What is the probability that exactly 7 people will support the governor?
b. What is the probability that 7 or fewer people will support the governor?
c. What is the probability that exactly 11 will support the governor?
d. What is the probability that 11 or fewer will support the governor?

Solution:

Note that N = 20, p = .5, so μ = Np = 10 and σ² = Npq = 5, σ = 2.236. Since Npq ≥ 3, it is probably
safe to assume that X has approximately a N(10, 5) distribution.

a. For the binomial, find P(X = 7). Appx. E, Table II shows P(7) = .0739.

For the normal, find P(6.5 ≤ X ≤ 7.5).

We convert 6.5 and 7.5 to their corresponding z-scores (-1.57 and -1.12), and the problem
becomes finding

P(-1.57 ≤ Z ≤ -1.12) = Φ(1.57) - Φ(1.12) = .9418 - .8686 = .0732.

b. To use the binomial distribution, find P(X ≤ 7). Using Appx. E, we get

P(7) + P(6) + P(5) + P(4) + P(3) + P(2) + P(1) + P(0)
= .0739 + .0370 + .0148 + .0046 + .0011 + .0002 + 0 + 0 = .1316.

To use the normal approximation to the binomial, find P(X ≤ 7.5).

As noted above, the z-score that corresponds to 7.5 is -1.12.

Φ(-1.12) = 1 - Φ(1.12) = 1 - .8686 = .1314.

c. For the binomial, find P(X = 11). Appx. E shows P(11) = .1602.

For the normal, find P(10.5 ≤ X ≤ 11.5).

If we convert 10.5 and 11.5 to their corresponding z-scores, the problem becomes a matter
of finding

P(.22 ≤ Z ≤ .67) = Φ(.67) - Φ(.22) = .7486 - .5871 = .1615.

d. For the binomial, find P(X ≤ 11). From Appx. E, Table II, you can determine that this is
.7483.

For the normal, find P(X ≤ 11.5).

The z-score that corresponds to 11.5 is .67, and Φ(.67) = .7486.

In all of the above, note that the results obtained using the binomial distribution and
the normal approximation to the binomial are almost identical.
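Parts (a) and (b) of this example can be reproduced in code. The sketch below is illustrative only: it computes the binomial probabilities directly instead of reading them from Appendix E, and it does not round the z-scores, so the approximate values may differ from the table-based ones in the third or fourth decimal place.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.5
approx = NormalDist(n * p, sqrt(n * p * (1 - p)))   # N(10, 5)

def pmf(k):
    """Exact binomial P(X = k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# (a) exactly 7: the interval 6.5 to 7.5 under the normal curve.
exact_eq7 = pmf(7)
approx_eq7 = approx.cdf(7.5) - approx.cdf(6.5)

# (b) 7 or fewer: the area to the left of 7.5.
exact_le7 = sum(pmf(k) for k in range(8))
approx_le7 = approx.cdf(7.5)

print(f"P(X = 7):  exact {exact_eq7:.4f}, approx {approx_eq7:.4f}")
print(f"P(X <= 7): exact {exact_le7:.4f}, approx {approx_le7:.4f}")
```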

3. In each of 25 races, the Democrats have a 60% chance of winning. What are the odds that
the Democrats will win 19 or more races? Use the normal approximation to the binomial.

Solution.
Np = 15, Npq = 6, so X ~ N(15, 6).

Using the normal approximation to the binomial, we want to find P(X ≥ 18.5).

Let Z = (X - 15)/√6. When x = 18.5, z = 3.5/√6 = 1.43.

Hence,
P(X ≥ 18.5) = P(Z ≥ 1.43) = 1 - Φ(1.43) = 1 - .9236 = .0764.

Hence, Democrats have a little less than an 8% chance of winning 19 or more races.

Incidentally, note that, since N = 25 is not included in Appendix E, Table II, it would be
very tedious to calculate this using the binomial distribution.

4. In a family of 11 children, what is the probability that there will be more boys than girls?
Use the normal approximation to the binomial.

Solution
μ = Np = 5.5, σ² = Npq = 2.75, so X ∼ N(5.5, 2.75).

If we were using the binomial distribution, we would find P(X ≥ 6); since we are using
the normal approximation to the binomial, we find P(X ≥ 5.5).

Hence,
P(X ≥ 5.5) = P(Z ≥ 0) = .5.

5. Let X be the number of times that a fair coin that is flipped 40 times lands on heads. Find
the probability that X = 20. Use the normal approximation and then compare it with the
exact solution.

Solution
To employ the normal approximation, note that because the binomial is a discrete
integer-valued random variable, whereas the normal is a continuous random variable, it
is best to write P{X = i} as P{i − 1/2 < X < i + 1/2} before applying the normal approximation
(this is called the continuity correction). Doing so gives

P{X = 20} = P{19.5 < X < 20.5}
          = P{(19.5 − 20)/√10 < (X − 20)/√10 < (20.5 − 20)/√10}
          ≈ P{−0.16 < (X − 20)/√10 < 0.16}
          ≈ Φ(0.16) − Φ(−0.16) ≈ 0.1272

The exact result is

P{X = 20} = C(40, 20)(1/2)^40 ≈ 0.1254

where C(40, 20) is the binomial coefficient.

6. The ideal size of a first-year class at a particular college is 150 students. The college, knowing
from past experience that, on the average, only 30 percent of those accepted for admission
will actually attend, uses a policy of approving the applications of 450 students. Compute
the probability that more than 150 first-year students attend this college.

Solution
If X denotes the number of students that attend, then X is a binomial random variable
with parameters n = 450 and p = .3. Using the continuity correction, we see that the
normal approximation yields

P{X ≥ 150.5} = P{(X − (450)(0.3))/√(450(0.3)(0.7)) ≥ (150.5 − (450)(0.3))/√(450(0.3)(0.7))}
             ≈ 1 − Φ(1.59)
             ≈ 0.0559

Hence, less than 6 percent of the time do more than 150 of the first 450 accepted actually
attend. (What independence assumptions have we made?)

7. To determine the effectiveness of a certain diet in reducing the amount of cholesterol in the
bloodstream, 100 people are put on the diet. After they have been on the diet for a sufficient
length of time, their cholesterol count will be taken. The nutritionist running this
experiment has decided to endorse the diet if at least 65 percent of the people have a lower
cholesterol count after going on the diet. What is the probability that the nutritionist
endorses the new diet if, in fact, it has no effect on the cholesterol level?

Solution
Let us assume that if the diet has no effect on the cholesterol count, then, strictly by
chance, each person’s count will be lower than it was before the diet with probability
1/2. Hence, if X is the number of people whose count is lowered, then the probability
that the nutritionist will endorse the diet when it actually has no effect on the
cholesterol count is

Σ (i = 65 to 100) C(100, i)(1/2)^100 = P{X ≥ 64.5}
    = P{(X − (100)(1/2))/√(100(1/2)(1/2)) ≥ 2.9}
    ≈ 1 − Φ(2.9)
    ≈ 0.0019

Classroom Activity 8
1. Suppose that X is a binomial random variable with n = 200 and p = 0.4.
a. Approximate the probability that X is less than or equal to 70.
b. Approximate the probability that X is greater than 70 and less than 90.
2. The manufacturing of semiconductor chips produces 2% defective chips. Assume the
chips are independent and that a lot contains 1000 chips.
a. Approximate the probability that more than 25 chips are defective.
b. Approximate the probability that between 20 and 30 chips are defective.
3. An electronic office product contains 5000 electronic components. Assume that the
probability that each component operates without failure during the useful life of the
product is 0.999, and assume that the components fail independently. Approximate the
probability that 10 or more of the original 5000 components fail during the useful life of
the product.

Exercises 4
1. Suppose that X is a binomial random variable with n = 100 and p = 0.1.
a) Compute the exact probability that X is less than 4.
b) Approximate the probability that X is less than 4 and compare to the result in part (a).
c) Approximate the probability that 8 < X < 12.

2. The reliability of an electrical fuse is the probability that a fuse, chosen at random from
production, will function under its designed conditions. A random sample of 1000 fuses was
tested and x = 27 defectives were observed. Calculate the approximate probability of
observing 27 or more defectives, assuming that the fuse reliability is .98.
3. A producer of soft drinks was fairly certain that her brand had a 10% share of the soft drink
market. In a market survey involving 2500 consumers of soft drinks, x = 211 expressed a
preference for her brand. If the 10% figure is correct, find the probability of observing 211 or
fewer consumers who prefer her brand of soft drink.
4. Consider babies born in the “normal” range of 37–43 weeks gestational age. Extensive data
supports the assumption that for such babies born in the United States, birth weight is
normally distributed with mean 3432 g and standard deviation 482 g. [The article “Are
Babies Normal?” (The American Statistician, 1999: 298–302) analyzed data from a
particular year; for a sensible choice of class intervals, a histogram did not look at all normal,
but after further investigations it was determined that this was due to some hospitals
measuring weight in grams and others measuring to the nearest ounce and then converting
to grams. A modified choice of class intervals that allowed for this gave a histogram that was
well described by a normal distribution.]
a) What is the probability that the birth weight of a randomly selected baby of this type
exceeds 4000 g? Is between 3000 and 4000 g?
b) What is the probability that the birth weight of a randomly selected baby of this type is
either less than 2000 g or greater than 5000 g?
c) What is the probability that the birth weight of a randomly selected baby of this type
exceeds 7 lb?
d) How would you characterize the most extreme .1% of all birth weights?
e) If X is a random variable with a normal distribution and a is a numerical constant (a ≠
0), then Y = aX also has a normal distribution. Use this to determine the distribution of
birth weight expressed in pounds (shape, mean, and standard deviation), and then
recalculate the probability from part (c). How does this compare to your previous
answer?
5. A supplier ships a lot of 1000 electrical connectors. A sample of 25 is selected at random,
without replacement. Assume the lot contains 100 defective connectors.
a) Using a binomial approximation, what is the probability that there are no defective
connectors in the sample?
b) Use the normal approximation to answer the result in part (a). Is the approximation
satisfactory?
c) Redo parts (a) and (b) assuming the lot size is 500. Is the normal approximation to the
probability that there are no defective connectors in the sample satisfactory in this case?

VI. NORMAL APPROXIMATION TO THE POISSON

If Y ~ Poisson with parameter λ, and λ is large (> 7, say), then Y has approximately a N(λ, λ)
distribution.

Example: Stock Control


At a given hospital, patients with a particular virus arrive at an average rate of once every
five days. Pills to treat the virus (one per patient) have to be ordered every 100 days. You are
currently out of pills; how many should you order if the probability of running out is to be less
than 0.005?

Solution
Assume the patients arrive independently, so this is a Poisson process, with rate 0.2/day.

Therefore Y, the number of pills needed in 100 days, is ~ Poisson with λ = 100 × 0.2 = 20.

We want P(Y > n) < 0.005, or P(Y ≤ n + ½) > 0.995 under the normal approximation,
where a probability of 0.995 corresponds (from tables) to z = 2.575.

Since μ = σ² = λ = 20, this corresponds to n + ½ = 20 + 2.575√20 = 31.5, so we need to order
n ≥ 32 pills.

Don’t use approximations that are too simple if their failure might be important!
Rare events in particular are often a lot more likely than predicted by (too-) simple
approximations for the probability distribution.
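
In the spirit of that warning, the stock-control answer can be compared against the exact Poisson distribution. The sketch below assumes Python 3.8 or later; the helper poisson_cdf and the variable names are illustrative, not part of the module. It finds the order size from the normal approximation and then the smallest n for which the exact Poisson probability of running out, P(Y > n), falls below 0.005.

```python
from math import ceil, exp, sqrt
from statistics import NormalDist

lam = 100 * 0.2      # expected number of patients in 100 days
risk = 0.005         # acceptable probability of running out

# Normal approximation: need n + 1/2 >= lam + z * sqrt(lam)
z = NormalDist().inv_cdf(1 - risk)
n_normal = ceil(lam + z * sqrt(lam) - 0.5)

def poisson_cdf(n, lam):
    """P(Y <= n) for Y ~ Poisson(lam), summed term by term."""
    term = total = exp(-lam)        # P(Y = 0)
    for k in range(1, n + 1):
        term *= lam / k             # P(Y = k) from P(Y = k - 1)
        total += term
    return total

# Exact Poisson: smallest n with P(Y > n) < risk
n_exact = 0
while 1 - poisson_cdf(n_exact, lam) >= risk:
    n_exact += 1

print("order size, normal approximation:", n_normal)
print("order size, exact Poisson:       ", n_exact)
```

Running the same comparison with a much smaller risk level is one way to see how the two answers can drift apart in the far tail.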

VII. EXPONENTIAL DISTRIBUTION

Definition 11
The random variable X that equals the distance between successive counts of a Poisson
process with mean λ > 0 is an exponential random variable with parameter λ. The
probability density function of X is

f(x) = λe^(−λx) for 0 ≤ x < ∞ (21)

The exponential distribution obtains its name from the exponential function in the probability
density function.

For any value of λ, the exponential distribution is quite skewed. The following results are easily
obtained.

If the random variable X has an exponential distribution with parameter λ,

μ = E(X) = 1/λ and σ² = V(X) = 1/λ² (22)

It is important to use consistent units in the calculation of probabilities, means, and variances
involving exponential random variables.
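
A quick simulation can make result (22) concrete. The sketch below assumes an illustrative rate of λ = 0.8 events per unit time and a fixed random seed (both are assumptions, not values from the module) and uses only the Python standard library. The sample mean and variance should land close to 1/λ and 1/λ², both expressed in the same time unit as the rate.

```python
import random
from statistics import fmean, variance

random.seed(0)       # fixed seed so the run is repeatable
lam = 0.8            # assumed rate: events per unit time

# Draw exponential waiting times with rate lam
sample = [random.expovariate(lam) for _ in range(200_000)]

sample_mean = fmean(sample)
sample_var = variance(sample)

print(f"sample mean     {sample_mean:.3f}  vs  1/lam    = {1 / lam:.3f}")
print(f"sample variance {sample_var:.3f}  vs  1/lam**2 = {1 / lam**2:.3f}")
```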

Four exponential probability density functions are shown in Figure 1 on the same scale. Note
that they all have the same shape. The greater the rate λ, the more likely it is that the
corresponding exponential random variable takes a small value. This makes sense: if the events
are occurring at a high rate, it will tend to be a short time until the first event, and vice versa.

[Figure 1. Exponential distributions with λ = 1.33, λ = 1, λ = 0.8, and λ = 0.67]
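
The effect of the rate on small values can also be seen numerically. Integrating the density (21) from 0 to x gives P(X ≤ x) = 1 − e^(−λx). The sketch below evaluates this at x = 1 for the four rates plotted above; the function name prob_at_most and the choice x = 1 are illustrative assumptions.

```python
from math import exp

def prob_at_most(x, lam):
    """P(X <= x) for an exponential random variable with rate lam."""
    return 1 - exp(-lam * x)

rates = [0.67, 0.8, 1.0, 1.33]
probs = [prob_at_most(1.0, lam) for lam in rates]

# A larger rate puts more probability on small values of X
for lam, prob in zip(rates, probs):
    print(f"rate {lam:4.2f}:  P(X <= 1) = {prob:.4f}")
```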

An exponential random variable can be regarded as the waiting time until the first event in a
Poisson process with rate λ.

It is appropriate to think of a ‘random process’ in which events occur in time, independently of
each other, at a rate λ per unit time. This means that processes that are systematic (such as train
timetables) or approximately regular (the arrival of waves on a beach) are not Poisson processes.

Examples of phenomena that might be suitably modelled with this distribution include:
▪ radioactive decay
▪ the occurrences of a rare disease in a large population
▪ arrival of a packet of information on the internet.

Lack of Memory Property


An even more interesting property of an exponential random variable is concerned with
conditional probabilities.

Roughly speaking, it is as the name suggests: the process ‘does not remember what has
happened up until now’ and the distribution of the waiting time, given that it has already
exceeded some amount of time t0, has the same exponential-distribution form, just translated
by t0.

The lack of memory property is quite readily established. For t0 > 0 and t > t0:

P(X > t | X > t0) = P(X > t and X > t0) / P(X > t0)   (rule of conditional probability)
                  = P(X > t) / P(X > t0)              (since "X > t" ⟹ "X > t0")
                  = e^(−λt) / e^(−λt0)                (since P(X > x) = e^(−λx) for x ≥ 0)
                  = e^(−λ(t − t0))
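
The identity can be checked numerically for any particular choice of values. The sketch below uses the assumed values λ = 0.5, t0 = 2 and t = 5 (chosen only for illustration) and confirms that the conditional probability of waiting beyond t, given that the wait has already exceeded t0, equals the unconditional probability of waiting beyond t − t0.

```python
from math import exp, isclose

lam, t0, t = 0.5, 2.0, 5.0   # assumed illustrative values

def survival(x):
    """P(X > x) = exp(-lam * x) for an exponential random variable."""
    return exp(-lam * x)

conditional = survival(t) / survival(t0)   # P(X > t | X > t0)
fresh_start = survival(t - t0)             # P(X > t - t0)

print(f"P(X > t | X > t0) = {conditional:.6f}")
print(f"P(X > t - t0)     = {fresh_start:.6f}")
print("equal:", isclose(conditional, fresh_start))
```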

References
1. Jay L. Devore (2016). Probability and Statistics for Engineering and the Sciences,
9th Edition. Cengage Learning
2. Mendenhall, William III et. al. (2013). Introduction to Probability and Statistics, 14th
Edition. Brooks/Cole, Cengage Learning
3. Montgomery, Douglas C. And George C. Runger (2003). Applied Statistics and
Probability for Engineers (3rd Edition). USA: John Wiley & Sons, Inc.
4. Walpole, Ronald E., Raymond H Myers and Sharon L Myers (2014). Probability and
Statistics for Engineers and Scientists, (9th ed.). Pearson Education Limited
