Comm 214 Chapter 7
Distributions
Discrete vs. Continuous Random Variable
Discrete R.V.
• Takes on a finite or countably infinite number of different values,
• “Gaps” exist between values along the number line,
• Possible to list all possible values with associated probabilities (Probability
Distribution).
Continuous R.V.
• Takes on ANY value in an interval,
• No “gaps” between values,
• Point probability P(X=x) = 0 for any x.
Continuous Probability Distributions
• A continuous random variable is a variable that can assume any value on a
continuum (can assume an uncountable number of values)
• thickness of an item
• time required to complete a task
• temperature of a solution
• height, in inches
• These can potentially take on any value depending only on the ability to
precisely and accurately measure
Continuous Probability Distributions
The probability of the random variable assuming a value within some given
interval from x1 to x2 is defined to be the area under the graph of the
probability density function that is between x1 and x2.
[Figure: three density curves, each with the interval from x1 to x2 marked on the x axis]
Point Probabilities are Zero
Because a continuous random variable has uncountably many possible values, the
probability of each individual value is 0.
• E.g. with a discrete random variable like tossing a die, it is meaningful to talk about
P(X=5), say.
• In a continuous setting (e.g. with time as a random variable), the probability that the random
variable of interest, say task length, is exactly 5 minutes is zero:
P(X=5) = 0.
It is meaningful to talk about P(X ≤ 5).
Probability Density Function…
Instead of P(x), a continuous R.V. uses a Probability Density Function (pdf)
• The technical name of the curve,
• Often represented as f(x)
• f(x) ≥ 0 for all x (probabilities can’t be negative)
• ∫ f(x) dx = 1 (The total area under the curve between a and b is 1.0)
[Figure: density curve f(x) over the interval from a to b, with total area = 1]
Normal Distribution
The Normal Distribution
• ‘Bell Shaped’
• Symmetrical
• Mean, Median and Mode are Equal
Location is determined by the mean, μ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical range: −∞ to +∞
[Figure: bell curve f(X) against X, centered at μ with spread σ; Mean = Median = Mode]
The Normal Distribution
Density Function
• The formula for the normal probability density function is
$$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}$$
Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
X = any value of the continuous variable
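As a rough check on the density formula above, the sketch below codes it directly and compares it with Python's standard-library `statistics.NormalDist`. The helper names `normal_pdf` and `area`, the trapezoid-rule integration, and the values μ = 50, σ = 10 are illustrative assumptions, not part of the original slides.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density f(X), coded term by term from the formula."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coeff * math.exp(exponent)

def area(mu, sigma, lo, hi, steps=10_000):
    """Trapezoid-rule approximation of the area under the density from lo to hi."""
    width = (hi - lo) / steps
    total = 0.5 * (normal_pdf(lo, mu, sigma) + normal_pdf(hi, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lo + i * width, mu, sigma)
    return total * width

mu, sigma = 50.0, 10.0                      # illustrative values (assumption)
peak = normal_pdf(mu, mu, sigma)            # height of the curve at the mean
reference = NormalDist(mu, sigma).pdf(mu)   # same height from the standard library
total_area = area(mu, sigma, mu - 8 * sigma, mu + 8 * sigma)

print(peak, reference)   # the two heights should agree
print(total_area)        # should be very close to 1
```

The integration range of ±8σ stands in for the infinite theoretical range; the area outside it is negligible.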
NORMAL PROBABILITY DISTRIBUTION
Three normal distribution curves with different means but the same
standard deviation.
STANDARD NORMAL DISTRIBUTION
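Definition
Standard Normal Distribution
The normal distribution with μ = 0 and σ = 1 is called the standard normal distribution.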
Definition
z Values or z Scores
The units marked on the horizontal axis of the standard normal curve are
denoted by z and are called the z values or z scores. A specific value of z
gives the distance between the mean and the point represented by z in
terms of the standard deviation.
Area under the standard normal curve.
Basis for the empirical rules
• 68.26% of the area under the curve falls within +/- 1 Stdev.
• 95.44% of the area under the curve falls within +/- 2 Stdev.
• 99.73% of the area under the curve falls within +/- 3 Stdev.
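The three percentages above can be reproduced, approximately, from the standard normal cumulative distribution. This is a minimal sketch using `statistics.NormalDist`; the helper name `within` is an assumption for illustration. Small differences in the last digit (for example 68.27% versus 68.26%) come from the rounding in printed tables.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def within(k):
    """Area under the standard normal curve between -k and +k."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/- {k} Stdev: {within(k):.4%}")
```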
Probability as
Area Under the Curve
The total area under the curve is 1.0, and the curve is symmetric, so half is
above the mean, half is below
P(−∞ < x < μ) = 0.5 and P(μ < x < ∞) = 0.5
[Figure: normal curve f(x) centered at μ, with an area of 0.5 on each side of the mean]
• Average adult male height is 70 inches with standard deviation of 2 inches. You are 73 inches
tall. What is your height percentile?
Translation to the Standard
Normal Distribution
$$z = \frac{x - \mu}{\sigma}$$
z is the number of standard deviation units that
x is away from the population mean
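Applied to the height question above (mean 70 inches, standard deviation 2 inches, x = 73), the standardization can be sketched as follows. The function name `z_score` is an illustrative assumption, and the percentile is read here as the area to the left of z under the standard normal curve.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviation units that x lies from the mean."""
    return (x - mu) / sigma

z = z_score(73, 70, 2)             # (73 - 70) / 2
percentile = NormalDist().cdf(z)   # area to the left of z

print(z)                     # 1.5
print(round(percentile, 4))  # 0.9332, i.e. roughly the 93rd percentile
```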
To Standardize Normal Distribution
• We can use the following function to convert ANY normal
random variable X to a standard normal random variable Z:
$$Z = \frac{X - \mu}{\sigma}$$
This shifts the mean
of X to zero…
[Figure: the distribution of X shifted so that its mean lies at 0]
For any continuous random variable, the probability of observing a specific
value of the random variable is 0. For example, for a standard normal
random variable Z, P(Z = a) = 0 for any value of a. This is because there is no area
under the standard normal curve associated with a single value, so the
probability must be 0. Therefore, the following probabilities are equivalent:
P(a < Z < b) = P(a ≤ Z < b) = P(a < Z ≤ b) = P(a ≤ Z ≤ b)
Calculating Normal Probabilities…
P(45 < X < 60) ? …mean of 50 minutes and a
standard deviation of 10 minutes…
The Standard Normal Table
• The Standard Normal Table gives the probability
between the mean and a certain z value
• The table entry for a z value ALWAYS refers to the area between that
value (-z or +z) and the mean
• Since the distribution is symmetrical, the Standard
Normal Table only displays probabilities for ½ of the
full distribution
EXAMPLE Finding the Area Under the Standard Normal Curve
Find the area under the standard normal curve to the right of Z = 1.25.
Find the area under the standard normal curve to the left of z = -0.38.
Find the area under the standard normal curve between z = -1.02 and z = 2.94.
Here the area to the left of z = 2.94 is larger than the area we want, because
it also includes the area to the left of z = -1.02, which we do not want.
To find the area of interest, subtract the smaller area from the larger;
the difference is the area between the two z values.
Area between -1.02 and 2.94 = (Area left of z = 2.94) – (area left of z = -1.02)
= 0.9984 – 0.1539
= 0.8445
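The three cases in this example (area to the right, area to the left, area between) can be checked with a short sketch. It assumes the cumulative areas are taken from `statistics.NormalDist().cdf` rather than from the printed table, so the results may differ from table values in the fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# Area to the right of z = 1.25: total area (1) minus the area to the left.
right_of_1_25 = 1 - Z.cdf(1.25)

# Area to the left of z = -0.38: the cumulative area itself.
left_of_neg_0_38 = Z.cdf(-0.38)

# Area between z = -1.02 and z = 2.94: larger left area minus smaller left area.
between = Z.cdf(2.94) - Z.cdf(-1.02)

print(round(right_of_1_25, 4))     # about 0.1056
print(round(left_of_neg_0_38, 4))  # about 0.3520
print(round(between, 4))           # about 0.8445
```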
Example
Find the area under the standard normal curve to the left of z = 1.95.
Area Under the Standard Normal Curve to the Left of z = 1.95
Area to the left of z = 1.95.
Example
Find the area under the standard normal curve from z = -2.17 to z = 0.
Example: Solution
To find the area from z = -2.17 to z = 0, first we find the areas to the left of z = 0
and to the left of z = -2.17 in Table IV. As shown in Table 6.3, these two areas
are .5 and .0150, respectively. Next we subtract .0150 from .5 to find the
required area: .5 - .0150 = .4850.
(a) To find the area to the right of z=2.32, first we find the area to the left of
z=2.32. Then we subtract this area from 1.0, which is the total area under the
curve. The required area is 1.0 - .9898 = .0102.
Area to the right of z = 2.32.
Example: Solution
(b) To find the area under the standard normal curve to the left of z=-1.54,
we find the area in Table IV that corresponds to -1.5 in the z column and
.04 in the top row. This area is .0618.
$$z = \frac{x - \mu}{\sigma}$$
where μ and σ are the mean and standard deviation of the normal
distribution of x, respectively.
Example
Let x be a continuous random variable that has a normal distribution with a
mean of 50 and a standard deviation of 10. Convert the following x values to z
values and find the probability to the left of these points.
(a) x = 55
(b) x = 35
Example: Solution
(a) x = 55
$$z = \frac{x - \mu}{\sigma} = \frac{55 - 50}{10} = .50$$
P(x < 55) = P(z < .50) = .6915
z value for x = 55.
Example: Solution
(b) x = 35
$$z = \frac{x - \mu}{\sigma} = \frac{35 - 50}{10} = -1.50$$
P(x < 35) = P(z < -1.50) = .0668
z value for x = 35.
Example
Let x be a continuous random variable that is normally distributed with a
mean of 25 and a standard deviation of 4.
Find the area
(a) between x = 25 and x = 32
(b) between x = 18 and x = 34
Example: Solution
For x = 32:
$$z = \frac{x - \mu}{\sigma} = \frac{32 - 25}{4} = 1.75$$
For x = 34:
$$z = \frac{34 - 25}{4} = 2.25$$
Let x be a normal random variable with its mean equal to 40 and standard
deviation equal to 5. Find the following probabilities for this normal
distribution
(a) P (x > 55)
(b) P (x < 49)
Example: Solution
(a) For x = 55:
$$z = \frac{55 - 40}{5} = 3.00$$
P (x > 55) = P (z > 3.00)
= 1.0 - .9987
= .0013
Finding P (x > 55).
Example: Solution
(b) For x = 49:
$$z = \frac{49 - 40}{5} = 1.80$$
For x = 39:
$$z = \frac{39 - 50}{8} = -1.38$$
For x = 135:
$$z = \frac{135 - 80}{12} = 4.58$$
This section presents examples that illustrate the applications of the normal
distribution.
Example
According to the Kaiser Family Foundation, U.S. workers who had employer-
provided health insurance paid an average premium of $4129 for family
coverage during 2011 (USA TODAY, October 10, 2011). Suppose that the
premiums for family coverage paid this year by all such workers are normally
distributed with a mean of $4129 and a standard deviation of $600. Find the
probability that the premium paid this year by a randomly selected such
worker is between $3331 and $4453.
Example: Solution
For x = $3331:
$$z = \frac{3331 - 4129}{600} = -1.33$$
For x = $4453:
$$z = \frac{4453 - 4129}{600} = .54$$
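With the two z values in hand, the required probability is the area between them. The sketch below is one way to finish the calculation, assuming cumulative areas from `statistics.NormalDist` in place of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 4129, 600   # mean and standard deviation of the premiums
Z = NormalDist()        # standard normal

z_low = (3331 - mu) / sigma    # about -1.33
z_high = (4453 - mu) / sigma   # about .54

# P(3331 < x < 4453) = (area left of z_high) - (area left of z_low)
prob = Z.cdf(z_high) - Z.cdf(z_low)

print(round(z_low, 2), round(z_high, 2))
print(round(prob, 4))   # roughly .61
```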
Thus, the probability is .8944 that this worker will finish assembling this
racing car before the company closes for the day.
Area to the left of x = 60.
Example
For x = 36:
$$z = \frac{36 - 54}{8} = -2.25$$
P(x < 36) = P (z < -2.25) = .0122
Look for .9950 in the body of the normal distribution table. Table VII does
not contain .9950.
Because .05 is less than .5 and it is the area in the left tail, the value of z
is negative.
Look for .0500 in the body of the normal distribution table. The value
closest to .0500 in Table IV is either .0505 or .0495.
For a normal curve, with known values of μ and σ and for a given area
under the curve to the left of x, the x value is calculated as
x = μ + zσ
Example
Recall Example 6-14. It is known that the life of a calculator manufactured by
Calculators Corporation has a normal distribution with a mean of 54 months
and a standard deviation of 8 months. What should the warranty period be
to replace a malfunctioning calculator if the company does not want to
replace more than 1% of all the calculators sold?
Example: Solution
Area to the left of x = .01 or 1%
Find the z value from the normal distribution table for .0100. Table IV does
not contain a value that is exactly .0100. The closest value in the table is
.0099, which corresponds to z = -2.33.
x = μ + zσ = 54 + (-2.33)(8)
= 54 – 18.64 = 35.36
Example: Solution
Thus, the company should replace all calculators that start to malfunction
within 35.36 months (which can be rounded to 35 months) of the date of
purchase so that they will not have to replace more than 1% of the
calculators.
Finding an x value.
Example
According to the College Board, the mean combined (mathematics and
critical reading) SAT score for all college-bound seniors was 1012 with a
standard deviation of 213 in 2011. Suppose that the current distribution of
combined SAT scores for all college-bound seniors is approximately normal
with a mean of 1012 and a standard deviation of 213. Jennifer is one of the
college-bound seniors who took this test. It is found that 10% of all current
college-bound seniors have SAT scores higher than Jennifer. What is Jennifer’s
SAT score?
Example: Solution
Area to the left of the x value = 1.0 - .10 = .9000
Look for .9000 in the body of the normal distribution table. The value closest
to .9000 in Table IV is .8997, and the z value is 1.28.
x = μ + zσ = 1012 + 1.28(213)
= 1012 + 272.64 = 1284.64 ≈ 1285
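Both the calculator warranty example and the SAT example run the table lookup in reverse: start from an area, find z, then apply x = μ + zσ. A minimal sketch of that procedure follows; it assumes `statistics.NormalDist.inv_cdf` as a stand-in for searching the body of the table, and `x_for_left_area` is a hypothetical helper name. Because the inverse is not rounded to two decimals, the results differ slightly from the hand calculations.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def x_for_left_area(mu, sigma, area):
    """Return the x value that has the given area to its left."""
    z = Z.inv_cdf(area)    # z value for the given cumulative area
    return mu + z * sigma  # x = mu + z * sigma

warranty = x_for_left_area(54, 8, 0.01)      # calculator life, 1% in the left tail
jennifer = x_for_left_area(1012, 213, 0.90)  # SAT score with 10% of scores above it

print(round(warranty, 2))  # close to the 35.36 months found by hand
print(round(jennifer))     # close to the 1285 found by hand
```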
$$P(x) = {}_{n}C_{x}\, p^{x} q^{n-x}$$
THE NORMAL APPROXIMATION OF THE BINOMIAL DISTRIBUTION
x = 19, n – x = 30 – 19 = 11
µ = np = 30(.50) = 15
$$\sigma = \sqrt{npq} = \sqrt{30(.50)(.50)} = 2.73861279$$
Definition
The addition of .5 and/or subtraction of .5 from the value(s) of x when the
normal distribution is used as an approximation to the binomial
distribution, where x is the number of successes in n trials, is called the
continuity correction factor.
Example: Solution
For x = 18.5:
$$z = \frac{18.5 - 15}{2.73861279} = 1.28$$
For x = 19.5:
$$z = \frac{19.5 - 15}{2.73861279} = 1.64$$
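To see how well the continuity correction works for this case (n = 30, p = .50, x = 19), the sketch below compares the exact binomial probability with the normal-curve area between 18.5 and 19.5. It uses only the Python standard library; the variable names are illustrative, and the unrounded z values mean the approximation differs slightly from a table-based answer.

```python
import math
from statistics import NormalDist

n, p, x = 30, 0.50, 19
q = 1 - p

# Exact binomial probability: nCx * p^x * q^(n-x)
exact = math.comb(n, x) * p**x * q**(n - x)

# Normal approximation with the continuity correction: area from x - .5 to x + .5
mu = n * p
sigma = math.sqrt(n * p * q)
normal = NormalDist(mu, sigma)
approx = normal.cdf(x + 0.5) - normal.cdf(x - 0.5)

print(round(exact, 4))   # about 0.0509
print(round(approx, 4))  # about 0.0504
```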
For x = 122.5:
$$z = \frac{122.5 - 128}{9.32952303} = -.59$$
Example: Solution
Thus, the probability that 108 to 122 people in a sample of 400 who work
from home will say that the biggest advantage of working from home is
that there is no commute is approximately .2637.
Area between x = 107.5 and x = 122.5
Example
According to a poll, 55% of American adults do not know that GOP stands for
Grand Old Party (Time, October 17, 2011). Assume that this percentage is true
for the current population of American adults. What is the probability that
397 or more American adults in a random sample of 700 do not know that
GOP stands for Grand Old Party?
Example: Solution
Thus, the probability that 397 or more American adults in a random sample
of 700 will not know that GOP stands for Grand Old Party is approximately
.1922.
Area to the right of x = 396.5
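The two larger examples above (108 to 122 people out of 400, and 397 or more adults out of 700) follow the same pattern and can be checked together. In this sketch, `normal_approx` and `exact_binomial` are hypothetical helper names, and p = 0.32 for the work-from-home example is inferred from μ = 128 with n = 400 rather than stated in the text. The approximations use unrounded z values, so they differ slightly from the table-based answers.

```python
import math
from statistics import NormalDist

def normal_approx(n, p, low=None, high=None):
    """Approximate P(low <= X <= high) for a binomial X, with continuity correction."""
    dist = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
    left = 0.0 if low is None else dist.cdf(low - 0.5)     # subtract .5 at the lower end
    right = 1.0 if high is None else dist.cdf(high + 0.5)  # add .5 at the upper end
    return right - left

def exact_binomial(n, p, low, high):
    """Exact P(low <= X <= high) by summing binomial probabilities."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(low, high + 1))

# 108 to 122 people out of 400 (p = 0.32 is an inferred assumption)
wfh_approx = normal_approx(400, 0.32, low=108, high=122)

# 397 or more adults out of 700 with p = .55
gop_approx = normal_approx(700, 0.55, low=397)
gop_exact = exact_binomial(700, 0.55, 397, 700)

print(round(wfh_approx, 4))  # near the .2637 found with the table
print(round(gop_approx, 4))  # near the .1922 found with the table
print(round(gop_exact, 4))   # exact binomial value for comparison
```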