RRL M10
http://www.ftj.agh.edu.pl/~lenda/stat/wykl_04_V_charact.pdf
https://courses.lumenlearning.com/wmopen-concepts-statistics/chapter/introduction-to-normal-random-variables-1-of-6/
LEARNING OUTCOMES
Use a normal probability distribution to estimate probabilities and identify unusual events.
In Summarizing Data Graphically and Numerically, we encountered data sets, such as height and
weight, with distributions that are fairly symmetric with a central peak. We call these bell-shaped.
Many variables, such as weight, shoe sizes, foot lengths, and other human physical characteristics,
exhibit these properties. The symmetry indicates that the variable is just as likely to take a value a
certain distance below its mean as it is to take a value that same distance above its mean. The bell
shape indicates that values closer to the mean are more likely, and it becomes increasingly unlikely to
take values far from the mean in either direction.
We use a mathematical model with a smooth bell-shaped curve to describe these bell-shaped data
distributions. These models are called normal curves or normal distributions. They were first
called “normal” because the pattern occurred in many different types of common measurements.
The general shape of the mathematical model used to generate a normal curve looks like this:
Observations of Normal Distributions
There are many normal curves. Even though all normal curves have the same bell shape, they vary in
their center and spread.
Because normal curves are mathematical models, we use Greek letters to represent the mean and
standard deviation of a normal curve. The mean of a normal distribution locates its center. We use
the Greek letter μ (pronounced “mu”) to represent the mean. We use the Greek letter σ (pronounced
“sigma”) to represent the standard deviation of a normal distribution. The standard
deviation determines the spread of the distribution. In fact, the shape of a normal curve is completely
determined by specifying its standard deviation. As we will see, if two normal distributions have the
same standard deviation, then the shapes of their normal curves will be identical.
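For reference, the mathematical model behind a normal curve is usually written as the density function below. The formula is not given in the original text; it is added here as the standard textbook form, with μ and σ as defined above.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```

In this form μ only shifts the curve left or right, while σ alone controls its width and height, which is why two normal curves with the same σ have identical shapes.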
Following are some observations we can make as we look at the figure above:
The black and the red normal curves have means or centers at μ = 10. However, the red curve
is more spread out and thus has a larger standard deviation. Notice that the red normal curve is
also shorter. This makes sense because these curves are probability density curves, so the
area under each curve has to be 1.
The black and the green normal curves have the same standard deviation or spread.
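The trade-off between spread and height can be checked numerically. The sketch below uses Python's standard-library statistics.NormalDist. The σ values 1 and 2 are assumptions chosen for illustration (the actual values behind the figure are not stated), and the helper names peak_height and area_under_curve are hypothetical.

```python
from statistics import NormalDist

def peak_height(mu, sigma):
    """Height of the normal density curve at its center (x = mu)."""
    return NormalDist(mu, sigma).pdf(mu)

def area_under_curve(mu, sigma, steps=20_000):
    """Midpoint Riemann sum of the density from mu - 8*sigma to mu + 8*sigma."""
    dist = NormalDist(mu, sigma)
    low = mu - 8 * sigma
    width = 16 * sigma / steps
    return sum(dist.pdf(low + (i + 0.5) * width) for i in range(steps)) * width

# Assumed spreads: sigma = 1 for the narrower curve, sigma = 2 for the wider one.
for sigma in (1, 2):
    print(f"sigma = {sigma}: peak height = {peak_height(10, sigma):.4f}, "
          f"area = {area_under_curve(10, sigma):.4f}")
```

Under these assumptions, doubling σ halves the peak height while the total area stays at 1, which matches the observation that the more spread-out curve is also shorter.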
Comment
We use $\bar{x}$ to represent the mean of data in a sample. We use μ to represent the mean of a
density curve defined by a mathematical model.
We use SD or $s_x$ to represent the standard deviation of data in a sample.
We use σ to represent the standard deviation of a density curve defined by a mathematical
model.
The normal curve has a central role in statistical inference, as we’ll see in Linking Probability to
Statistical Inference. Understanding the normal distribution is an important step in the direction of our
overall goal, which is to relate sample means or proportions to population means or proportions. The
goal of this section is to help you better understand normal random variables and their distributions.
All normal curves share a basic geometry. While the mean locates the center of a normal curve, it is
the standard deviation that is in control of the geometry. To see how, let’s examine a few pictures of
normal curves to see what they reveal.
EXAMPLE
Let’s start with a random variable X that has a normal distribution with mean = 10 and standard deviation
= 2. Let’s practice our new notation. Here we would write μ = 10 and σ = 2.
The normal curve for X is shown below.
As expected, the mean μ = 10 is located at the center of the normal curve. The other two arrows point to
values 1 standard deviation on each side of the mean.
The point 1 standard deviation less than the mean is represented by μ − σ. Since μ = 10 and σ = 2, this
point is located at 10 − 2 = 8, as shown.
The point 1 standard deviation more than the mean is represented by μ + σ. Since μ = 10 and σ = 2, this
point is located at 10 + 2 = 12, as shown.
You will notice we have indicated that the area of the green region is 0.68. So we can say that the probability
of X being between 8 and 12 equals 0.68.
Or, using our probability notation, we could write:
$P(8 < X < 12) = 0.68$
Now here is an interesting fact. If we took any normal distribution and drew a similar picture, the probability
that a value falls within 1 standard deviation of the mean is always the same. Here are several ways to express
this idea:
For any normal curve, the central area within 1 standard deviation of the mean equals 0.68.
Roughly 68% of the time we will expect X to have a value within 1 standard deviation of the
mean.
$P(\mu - \sigma < X < \mu + \sigma) = 0.68$.
This is a big deal. It is one of the things that makes normal curves special. In general, probability density
curves for continuous random variables with different shapes don’t have this special property.
Let’s put this idea in context. If the weight of babies at birth follows a normal distribution with mean μ = 3,500
grams and standard deviation σ = 600 grams, then we can conclude that most babies – that is, about 68% –
will weigh somewhere between 2,900 grams (i.e., 3,500 − 600 = 2,900) and 4,100 grams (i.e., 3,500 + 600 =
4,100).
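The claim that the central area is the same for every normal curve can be checked with a short computation. This is a minimal sketch using Python's standard-library statistics.NormalDist. The function name prob_within_one_sd is hypothetical, and the (50, 7) case is an arbitrary extra distribution that does not appear in the text.

```python
from statistics import NormalDist

def prob_within_one_sd(mu, sigma):
    """P(mu - sigma < X < mu + sigma) for a normal random variable X."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + sigma) - dist.cdf(mu - sigma)

# (10, 2) and (3500, 600) come from the text; (50, 7) is an arbitrary extra case.
for mu, sigma in [(10, 2), (3500, 600), (50, 7)]:
    low, high = mu - sigma, mu + sigma
    print(f"mu={mu}, sigma={sigma}: P({low} < X < {high}) = "
          f"{prob_within_one_sd(mu, sigma):.4f}")
```

Each case gives roughly 0.68, regardless of the mean and standard deviation used.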
https://courses.lumenlearning.com/wmopen-concepts-statistics/chapter/introduction-to-normal-random-variables-6-of-6/
Now we use the simulation and the standard normal curve to find the probabilities associated with
any normal density curve.
EXAMPLE
Length of Human Pregnancy
The length (in days) of a randomly chosen human pregnancy is a normal random variable with μ = 266, σ = 16.
So X = length of pregnancy (in days)
(a) What is the probability that a randomly chosen pregnancy will last less than 246 days?
We want P(X < 246). To find this probability, we first convert X = 246 to a z-score:
$Z = \frac{246 - 266}{16} = \frac{-20}{16} = -1.25$
Now we can use the simulation to find P(Z < −1.25). This is the area under the normal probability curve to
the left of Z = −1.25.
The probability that a randomly chosen pregnancy lasts less than 246 days is 0.1056. In other words, there is
about an 11% chance that a randomly selected pregnancy will last less than 246 days.
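For readers without access to the simulation, the same probability can be reproduced in code. The sketch below is one possible substitute, not the course's tool: it assumes Python's standard-library statistics.NormalDist, whose default arguments give the standard normal curve.

```python
from statistics import NormalDist

mu, sigma = 266, 16          # pregnancy length in days, from the example
x = 246

z = (x - mu) / sigma         # convert the x-value to a z-score
p = NormalDist().cdf(z)      # area to the left of z under the standard normal curve

print(f"z = {z}, P(X < {x}) = {p:.4f}")
```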
(b) Suppose a pregnant woman’s husband has scheduled his business trips so that he will be in town between
the 235th and 295th days of her pregnancy. What is the probability that the birth will take place during that
time?
Compute the z-scores for each of these x-values:
$Z = \frac{235 - 266}{16} = \frac{-31}{16} = -1.94$
and
$Z = \frac{295 - 266}{16} = \frac{29}{16} = 1.81$
Use the simulation to find the area under the standard normal curve between these two z-scores.
So the desired probability is 0.9387.
$P(235 < X < 295) = P(-1.94 < Z < 1.81) = 0.9387$
There is about a 94% probability that he will be home for the birth. Looks like he planned well.
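The area between two z-scores is the difference of two left-tail areas. The following sketch illustrates that idea with Python's standard-library statistics.NormalDist as an assumed stand-in for the simulation. The z-scores are rounded to two decimals to mirror the hand calculation above.

```python
from statistics import NormalDist

mu, sigma = 266, 16
standard_normal = NormalDist()   # mean 0, standard deviation 1

z_low = round((235 - mu) / sigma, 2)    # rounded to two decimals, as in the text
z_high = round((295 - mu) / sigma, 2)

# Area between the two z-scores = left-tail area at z_high minus left-tail area at z_low.
p_between = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)
print(f"P({z_low} < Z < {z_high}) = {p_between:.4f}")
```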
The previous examples all followed the same general form: Given values of a normal random
variable, we found an associated probability. The two basic steps in the solution process were as
follows:
1. Convert x-value to a z-score.
2. Use the simulation to find associated probability.
The next example is a different type of problem: Given a probability, we will find the associated value
of the normal random variable. The solution process will go in reverse order.
These types of problems are informally called “work-backwards” problems. We will use a new
simulation for these types of problems. The new simulation requires us to enter a probability and then
gives us the associated z-score. This is backwards from the simulation we worked with previously
where we entered a z-score to find a probability. We will use this simulation in the next example.
EXAMPLE
Comments
In the preceding example (specifically step 2), we found the x-value by reasoning about the meaning
of the z-score. We can also develop a formula for this process.
Recall the definition of z-score. In words, the z-score of an x-value is the number of standard
deviations X is away from the mean. As a formula, this is
$Z = \frac{x - \mu}{\sigma}$
$\frac{x - \mu}{\sigma} = Z$
$x - \mu = Z \cdot \sigma$
$x = \mu + Z \cdot \sigma$
This gives us a formula for finding X from Z. You can use this formula in step 2 of a work-backwards
problem.
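A work-backwards calculation can be sketched in a few lines. The worked example is not reproduced in this text, so the probability 0.90 below is an assumed input chosen only for illustration. The distribution is the pregnancy-length model (μ = 266, σ = 16) from the earlier example, and Python's statistics.NormalDist.inv_cdf stands in for the second simulation.

```python
from statistics import NormalDist

mu, sigma = 266, 16      # pregnancy-length distribution from the earlier example
p = 0.90                 # assumed probability, chosen only for illustration

z = NormalDist().inv_cdf(p)   # step 1: probability -> z-score
x = mu + z * sigma            # step 2: z-score -> x-value, using x = mu + Z * sigma

print(f"z = {z:.4f}, x = {x:.1f} days")
```

Feeding the resulting x back into the forward calculation (convert to a z-score, then find the left-tail area) should return the probability that was entered.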
Let’s Summarize
In “Continuous Random Variables,” we made the transition from discrete to continuous random
variables. A continuous random variable is not limited to distinct values. It is a measurement such as foot
length. We cannot display the probability distribution for a continuous random variable with a table or
histogram. We use a density curve to assign probabilities to intervals of x-values. We use the area under
the density curve to find probabilities.
We use a normal density curve to model the probability distribution for many variables, such as weight,
shoe sizes, foot lengths, and other human physical characteristics. Normal curves are mathematical
models. We use μ to represent the mean of a normal curve and σ to represent the standard deviation of
a normal curve. We use Greek letters to remind us that the normal curve is not a distribution of real data.
It is a mathematical model based on a mathematical equation. We use this mathematical model to
represent the perfect bell-shaped distribution.
The empirical rule for normal curves tells us that about 68% of the observations fall within 1
standard deviation of the mean, about 95% within 2 standard deviations of the mean, and about 99.7% within 3
standard deviations of the mean.
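The three percentages in the empirical rule are rounded values of areas under the standard normal curve. A minimal check, assuming Python's standard-library statistics.NormalDist (the helper name prob_within is hypothetical):

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, standard deviation 1

def prob_within(k):
    """Probability that a normal variable falls within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within(k):.4f}")
```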
To compare x-values from different distributions, we standardize the values by finding a z-score:
$Z = \frac{x - \mu}{\sigma}$
A z-score measures how far X is from the mean in standard deviations. In other words, the z-score is the
number of standard deviations X is from the mean of the distribution. For example, Z = 1 means the x-
value is 1 standard deviation above the mean.
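As a small illustration of comparing values from different distributions, the sketch below standardizes one birth weight and one pregnancy length. The two distributions come from the examples in this module. The x-values 4,100 grams and 290 days, and the function name z_score, are assumptions made for the illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Distributions from earlier in the text; the x-values are assumed for illustration.
weight_z = z_score(4100, mu=3500, sigma=600)   # birth weight in grams
length_z = z_score(290, mu=266, sigma=16)      # pregnancy length in days

print(f"birth weight z = {weight_z}, pregnancy length z = {length_z}")
```

Although grams and days cannot be compared directly, the z-scores can: here the pregnancy length sits farther above its mean (1.5 standard deviations) than the birth weight does (1 standard deviation).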
If we convert the x-values into z-scores, the distribution of z-scores is also a normal density curve. This
curve is called the standard normal distribution. We use a simulation with the standard normal curve
to find probabilities for any normal distribution.
We can also work backwards and find the x-value for a given probability. We used a different simulation
to work backwards from probabilities to x-values. With this simulation, we found x-values corresponding
to quartiles and percentiles.
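Quartiles and percentiles of a normal distribution can also be computed directly. The sketch below is one possible way to do this, assuming Python's standard-library statistics.NormalDist and reusing the pregnancy-length model (μ = 266, σ = 16) from the earlier example. The choice of the 10th percentile is arbitrary.

```python
from statistics import NormalDist

pregnancy = NormalDist(mu=266, sigma=16)   # distribution from the earlier example

q1, median, q3 = pregnancy.quantiles(n=4)  # cut points at 25%, 50%, and 75%
p10 = pregnancy.inv_cdf(0.10)              # 10th percentile (arbitrary choice)

print(f"Q1 = {q1:.1f}, median = {median:.1f}, Q3 = {q3:.1f}, "
      f"10th percentile = {p10:.1f}")
```

Because the normal curve is symmetric, the median equals the mean, and the first and third quartiles lie the same distance below and above it.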