
ESci 117

Engineering Data Analysis


STUDENT LEARNING GUIDE
Module 5. Random Variables and Probability Distributions
TP-IMD-02 v0 No. CET.ESC SLG20-06

Dr. Jacqueline M. Guarte

College of Engineering and Technology

Department of Agricultural and Biosystems Engineering
2020

Student Learning Guide in ESci 117: Engineering Data Analysis

Lesson 5.2: The Normal Distribution

Lesson Summary
The normal distribution is a continuous probability distribution that is very
important in the theory and application of classical statistical inference. This
distribution is used to model the behavior of quantitative characteristics of
interest whose relative frequency or density histograms can be approximated
by the normal curve. The standard normal random variable Z can be used to
determine the probabilities of events associated with any normal random
variable X if its mean and variance are known.

Learning Outcome
At the end of the lesson, the students should be able to solve problems
involving the normal distribution/model.

Motivation Question
Why is the normal distribution considered by many as the most important
distribution in Statistics?

Discussion
Some students and researchers may already be acquainted with the normal
distribution as the “bell-shaped curve.” Johnson and Bhattacharyya (2011)
share some interesting history and facts on this distribution as follows:

“Carl Gauss derived the normal distribution mathematically as the
probability distribution of the error measurements, which he called the
‘normal law of errors.’ Subsequently, astronomers, physicists, and,
somewhat later, data collectors in a wide variety of fields found that their
histograms exhibited the common feature of first rising gradually in height
to a maximum and then decreasing in a symmetric manner. Although the
normal curve is not unique in exhibiting this form, it has been found to
provide a reasonable approximation in a great many situations.
Unfortunately, at one time during the early stages of the development of
statistics, it had many overzealous admirers. Apparently, they felt that all
real-life data must conform to the bell-shaped normal curve, or otherwise,
the process of data collection should be suspect. It is in this context that
the distribution became known as the normal distribution. However,
scrutiny of data has often revealed inadequacies of the normal
distribution. In fact, the universality of the normal distribution is only a
myth, and examples of quite nonnormal distributions abound in virtually
every field of study. Still, the normal distribution plays a central role in
statistics, and inference procedures derived from it have wide applicability
and form the backbone of current methods of statistical analysis.”

With this background information, we are now ready to learn more about the
normal distribution, drawing mainly on Almeda et al. (2010).

Many of the classical statistical procedures that we will learn (starting in
Module 7) will assume that the sample comes from a normal (or at least
approximately normal) distribution. This assumption is telling us that we can
take the observed sample values to be actual values of a random variable
whose behavior can be modeled by a normal distribution. That is, the
characteristic of interest has a mound-shaped or bell-shaped relative
frequency/density histogram that can be approximated by a normal curve.

Although we will not be using the following formula, we need to know that a
normal curve can be drawn using the normal probability density function
given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

for any real number $x$ that is a value of a continuous random variable $X$. The
constants $\mu$ and $\sigma$ are such that $-\infty < \mu < \infty$ and $\sigma > 0$. The values $\pi$ and $e$
are just our mathematical constants $\pi \approx 3.14159$ and $e \approx 2.71828$. The
value $f(x)$ is just the height of the normal curve at $x$.
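The density formula above can be evaluated directly. The following is a minimal Python sketch, not part of the original guide; the function name `normal_pdf` is hypothetical, and the values $\mu = 50$ and $\sigma = 9$ are borrowed from the worked example later in this lesson purely for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Illustrative parameters (assumed here; taken from the lesson's later example).
mu, sigma = 50.0, 9.0

peak = normal_pdf(mu, mu, sigma)           # height at the mean
left = normal_pdf(mu - sigma, mu, sigma)   # one standard deviation below the mean
right = normal_pdf(mu + sigma, mu, sigma)  # one standard deviation above the mean

print(peak, left, right)
```

The heights at equal distances on either side of the mean agree, and the height at the mean is the largest of the three, which is what a bell-shaped curve symmetric about $\mu$ requires.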

We say that $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$ and write
this as $X \sim N(\mu, \sigma^2)$. We call the constants, $\mu$ and $\sigma^2$, the parameters of the
distribution which determine its specific form. Recall from Module 2 that these
are just the population mean and variance that we wish to estimate from the
sample. We note that the population of interest is the set of all realized values
of $X$ if we were to repeat the experiment endlessly, with mean $E(X) = \mu$ and
variance $\mathrm{Var}(X) = \sigma^2$. It also follows that the standard deviation of $X$ is just $\sigma$.

Now, let us look at some important features of the normal curve that need our
special attention, as seen from Figure 1 (Ott and Longnecker, 2016).
Clearly, the bell-shaped normal curve is symmetrical about its mean $\mu$, which
locates the peak of the bell (see Figure 1 (a)). Note that its tails “approach
the x-axis without ever touching it” (Almeda et al., 2010). Recall from Lesson
5.1 that the total area under a PDF curve is 1. Since the normal curve is
symmetric about the mean, which is also the median, we can divide the total
area into two equal parts at $\mu$.

Although the normal random variable may theoretically assume values from
$-\infty$ to $\infty$, approximately all (99.7%) of the measurements are within 3
standard deviations ($3\sigma$) of $\mu$, as shown by Figure 1 (d). The probability is
approximately 0.683 that a value of the normal random variable will lie
within 1 standard deviation of its mean, which we can write as
$P(\mu - \sigma < X < \mu + \sigma) \approx 0.683$ (see Figure 1 (b)). We also know that the probability is
approximately 0.954 that a value will lie within 2 standard deviations of its
mean, or $P(\mu - 2\sigma < X < \mu + 2\sigma) \approx 0.954$ (see Figure 1 (c)).
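These three areas can be checked numerically. The sketch below is an illustration added for this purpose, assuming Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `prob_within` is hypothetical.

```python
from statistics import NormalDist

def prob_within(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma < X < mu + k*sigma) for a normal X with the given parameters."""
    X = NormalDist(mu, sigma)
    return X.cdf(mu + k * sigma) - X.cdf(mu - k * sigma)

for k in (1, 2, 3):
    # Rounded to three decimals to compare with 0.683, 0.954, and 0.997.
    print(k, round(prob_within(k), 3))
```

The result does not depend on the particular values of $\mu$ and $\sigma$, so calling `prob_within(2, mu=50, sigma=9)` gives the same area as `prob_within(2)`.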

Figure 1. Normal curve and areas/probabilities of selected regions under the
curve

Source: Taken from R.L. Ott and M. Longnecker’s An Introduction to
Statistical Methods and Data Analysis, 7th edn., Cengage Learning,
Boston, MA, USA, 2016, p.181.

Let us now compare three normally distributed random variables
with varying values of the parameters, $\mu$ and $\sigma^2$ (Figure 2). All
three curves are symmetric about their respective means (each of which is also
the median and the mode because of symmetry) and are mesokurtic. Note
that one of them has the smallest mean (so that the peak of its PDF curve is the
leftmost) and the smallest variance (so that its PDF curve is the narrowest,
with values more concentrated about its mean).

Figure 2. Three normal curves with differing means and variances.

Source: Taken from J.V. Almeda et al., Elementary Statistics, The University
of the Philippines Press, Quezon City, 2010, p.350.

Recall from Lesson 5.1 the concept of kurtosis in which a distribution may be
described as mesokurtic, leptokurtic, or platykurtic. We will now relate
these types of distributions to the normal distribution based on the
comparisons made by Kenton (Investopedia, 2020) as follows.

The “peakedness” of a normal distribution matches that of a mesokurtic
distribution since the probability of extreme, rare, or outlier data is zero or
close to zero (i.e., very unlikely). A leptokurtic distribution, however, has
fatter tails compared to a normal distribution. It has more values in the
distribution tails and more values close to the mean. This tells us that the
probability of extreme events is greater under a leptokurtic distribution than
under a normal distribution. In contrast, a platykurtic distribution has lighter
tails compared to a normal distribution. It has fewer values in the tails and
fewer values close to the mean. This indicates that the probability of extreme
events is lower under a platykurtic distribution than under a normal
distribution.

Suppose we have a normal random variable with mean $\mu = 0$ and variance
$\sigma^2 = 1$. It is called the standard normal random variable, denoted by $Z$, and
we write this as $Z \sim N(0, 1)$. The graphs of its probability density function (PDF)
and cumulative distribution function (CDF) are presented in Figure 3. Note
that $F(z) = P(Z \le z)$. In particular, we can see that $F(0) = P(Z \le 0) = 0.5$
since the standard normal curve is symmetric about its mean.

Figure 3. The PDF and CDF of the standard normal random variable.

Source: Taken from J.V. Almeda et al., Elementary Statistics, The
University of the Philippines Press, Quezon City, 2010, p.351.

We have a table of the values of the CDF of $Z$ which we can use to evaluate
the probability of any event expressed in terms of $Z$. Therefore, we just need
to express any normal random variable $X$ in terms of $Z$ so that there is no
need for us to have a separate table of the values of the CDF for each
normal random variable.

We can use one important property of a normally distributed random variable
$X$ to transform it to $Z$. It states that any random variable that is a linear
transformation of $X$ will also be normally distributed. This is formally stated as
follows:

If $X \sim N(\mu, \sigma^2)$ and $Y = aX + b$ (where $a$ and $b$ are constants), then $Y$ will
be normally distributed with mean $E(Y) = a\mu + b$ and variance
$\mathrm{Var}(Y) = a^2\sigma^2$.
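The property can be illustrated with Python's `statistics.NormalDist` (Python 3.8 or later), which supports multiplication and addition by a constant. This is only a sketch: the constants `a = 2` and `b = -10` are hypothetical, and $\mu = 50$, $\sigma = 9$ are borrowed from the worked example later in this lesson.

```python
from statistics import NormalDist

# Assumed illustrative parameters: X ~ N(50, 9^2).
mu, sigma = 50.0, 9.0
X = NormalDist(mu, sigma)

# Hypothetical constants for the linear transformation Y = aX + b.
a, b = 2.0, -10.0
Y = X * a + b  # NormalDist returns the distribution of aX + b

# Compare with the stated results E(Y) = a*mu + b and Var(Y) = a^2 * sigma^2.
print(Y.mean, a * mu + b)
print(Y.variance, a ** 2 * sigma ** 2)
```

Both printed pairs should agree, which matches the mean and variance given in the statement of the property.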

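Using this property, we can show that $Z$ is a linear transformation of $X$ if we
express it in terms of $X$ as

$$Z = \frac{X - \mu}{\sigma} = \frac{1}{\sigma}X - \frac{\mu}{\sigma}$$

since we can have $a = \frac{1}{\sigma}$ and $b = -\frac{\mu}{\sigma}$, which will lead us to

$$E(Z) = \frac{1}{\sigma}(\mu) + \left(-\frac{\mu}{\sigma}\right) = 0$$

and

$$\mathrm{Var}(Z) = \left(\frac{1}{\sigma}\right)^2 \sigma^2 = 1$$

as the mean and variance of $Z$ should be. We can see that the value of $Z$ for
any value of $X$ is just the difference between that value and the mean of $X$,
divided by the standard deviation of $X$. This value of $Z$ is also called the
z-score associated with the value of $X$ and is interpreted as the number of
standard deviations that the value lies away from its mean (Ott and
Longnecker, 2016).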

Let us now see the values of $Z$ corresponding to selected values of $X$ (Figure
4). In particular, if the value of $X$ is just its mean $\mu$, the value of $Z$ will be 0,
which is also its mean. If the value of $X$ is $\mu + \sigma$, the value of $Z$ will be 1,
which is also the sum of its mean and standard deviation. We can see that a
value of $X$ that is 3 standard deviations below (to the left of) its mean
corresponds to $z = -3$. Note that we can also determine the value of $X$ for
any given value of $Z$, if we know $\mu$ and $\sigma$, by the expression $X = \mu + \sigma Z$.
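The two conversions, from a value of $X$ to its z-score and back, can be sketched in Python as follows. The function names `z_score` and `x_from_z` are hypothetical, and $\mu = 50$, $\sigma = 9$ are assumed values taken from the worked example later in this lesson.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies away from the mean mu."""
    return (x - mu) / sigma

def x_from_z(z, mu, sigma):
    """Value of X that corresponds to a given z-score."""
    return mu + sigma * z

# Assumed illustrative parameters.
mu, sigma = 50.0, 9.0

print(z_score(mu, mu, sigma))              # the mean maps to z = 0
print(z_score(mu + sigma, mu, sigma))      # one standard deviation above maps to z = 1
print(z_score(mu - 3 * sigma, mu, sigma))  # three standard deviations below maps to z = -3
print(x_from_z(-3.0, mu, sigma))           # converting z = -3 back gives mu - 3*sigma
```

Applying `x_from_z` to the output of `z_score` returns the original value, so the two expressions are inverses of each other.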

Figure 4. Relationship between selected values of $X$ and $Z$

Source: Taken from R.L. Ott and M. Longnecker’s An Introduction to
Statistical Methods and Data Analysis, 7th edn., Cengage Learning,
Boston, MA, USA, 2016, p.182.

We will now determine the probabilities of events associated with normal
random variables expressed in terms of $Z$ using the table on the CDF values
of $Z$ (Table B.1, on the last two pages of this lesson). We will just be using the
formulas presented in Lesson 5.1 but this time expressed in terms of $Z$.
These are:

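1. $P(Z \le z) = P(Z < z) = F(z)$, the area below $z$ under the standard
normal curve.

2. $P(Z \ge z) = P(Z > z) = 1 - F(z)$, the area above $z$ under the standard
normal curve.

3. $P(z_1 \le Z \le z_2) = P(z_1 < Z \le z_2) = P(z_1 \le Z < z_2) = P(z_1 < Z < z_2)
= F(z_2) - F(z_1)$, the area between $z_1$ and $z_2$ under the standard normal
curve.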
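The three formulas above can be sketched as small Python functions, with `statistics.NormalDist` (Python 3.8 or later) standing in for a lookup in Table B.1. The function names are hypothetical and chosen only to mirror the three descriptions.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_below(z):
    """P(Z <= z) = F(z)."""
    return Z.cdf(z)

def area_above(z):
    """P(Z >= z) = 1 - F(z)."""
    return 1.0 - Z.cdf(z)

def area_between(z1, z2):
    """P(z1 <= Z <= z2) = F(z2) - F(z1), assuming z1 <= z2."""
    return Z.cdf(z2) - Z.cdf(z1)

print(area_below(0.0))
print(area_above(0.8))
print(area_between(-1.0, 1.0))
```

Because $Z$ is continuous, it makes no difference whether the inequalities are strict, which is why each function covers both the strict and non-strict versions of its formula.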

We note that to use Table B.1 directly, we need to express the probabilities in
terms of the area below $z$ under the standard normal curve. We will now use the
given formulas to solve probability problems concerning a normal random
variable $X$ and also learn how to use Table B.1 through the following example.

Example: (Johnson and Bhattacharyya, 2011)

Records suggest that the normal distribution with mean 50 and standard
deviation 9 is a plausible model for a measurement of the amount of
suspended solids (ppm) in river water. Find the probability that a
measurement of the amount of suspended solids in river water is:

a) Less than 46.4 ppm.
b) Greater than 57.2 ppm.
c) Between 52.5 and 60.9 ppm inclusive.

Solution:

Let $X$ denote the amount of suspended solids (ppm) in river water. Then, we
can transform $X$ into the standard normal random variable

$$Z = \frac{X - 50}{9}.$$

a) For $x = 46.4$, the corresponding value of $Z$ is $z = \frac{46.4 - 50}{9} = -0.4$.
Therefore,

$$P(X < 46.4) = P(Z < -0.4) = F(-0.4) = 0.3446.$$

The probability value 0.3446 corresponds to the area below $z = -0.4$ under the
standard normal curve. We can say that the probability of this event
happening is 0.345. We can conclude that 34.5% or about 34% of the
measurements to be taken will be less than 46.4 ppm under this model.

b) For $x = 57.2$, the corresponding value of $Z$ is $z = \frac{57.2 - 50}{9} = 0.8$. We find

$$P(X > 57.2) = P(Z > 0.8) = 1 - F(0.8) = 1 - 0.7881 = 0.2119.$$

The probability value 0.2119 corresponds to the area above $z = 0.8$ under the
standard normal curve. We can say that the probability of this event
occurring is 0.212. We can infer that 21.2% of the measurements to be
taken will be greater than 57.2 ppm under this model.

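c) The $z$ values corresponding to $x = 52.5$ and $x = 60.9$ are

$$z = \frac{52.5 - 50}{9} \approx 0.28 \quad \text{and} \quad z = \frac{60.9 - 50}{9} \approx 1.21,$$

respectively. We calculate

$$P(52.5 \le X \le 60.9) = P(0.28 \le Z \le 1.21)
= F(1.21) - F(0.28) = 0.8869 - 0.6103 = 0.2766.$$

The probability value 0.2766 is the area between $z = 0.28$ and $z = 1.21$ under
the standard normal curve. We can say that the probability of observing
this event is 0.277. We can predict that about 28% of the measurements
to be taken will lie in the interval [52.5, 60.9] under this model.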
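The three answers can be cross-checked without the table. The sketch below is an added illustration, not the guide's method; it uses `statistics.NormalDist` (Python 3.8 or later) with the mean 50 and standard deviation 9 given in the example.

```python
from statistics import NormalDist

# Model stated in the example: X ~ N(50, 9^2).
X = NormalDist(mu=50, sigma=9)

p_a = X.cdf(46.4)                # a) P(X < 46.4)
p_b = 1 - X.cdf(57.2)            # b) P(X > 57.2)
p_c = X.cdf(60.9) - X.cdf(52.5)  # c) P(52.5 <= X <= 60.9)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Parts (a) and (b) should agree with the table-based answers to three decimals. Part (c) can differ slightly in the third decimal place, because the table solution rounds the z-scores to two decimals before the lookup while this computation does not.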

Learning Activity
Solve the following problems concerning the normal distribution.

1. The force required to puncture a cardboard mailing tube with a sharp object
is normally distributed with mean 14.5 kilos and standard deviation 1.8
kilos. What is the probability that a tube will puncture if it is struck by a
9-kilo blow with the object?
(Hint: Let $X$ be the force required to puncture a cardboard mailing tube with
a sharp object. We are computing the probability that the required force
is at most 9 kilos.)

2. Steel used for water pipelines is often coated on the inside with cement
mortar to prevent corrosion. In a study of the mortar coatings of a pipeline
used in a water transmission project in California, the mortar thickness was
specified to be 7/16 inch. A very large number of thickness measurements
produced a mean equal to 0.635 inch and a standard deviation equal to
0.082 inch. If the thickness measurements were normally distributed,
approximately what percentage was less than 7/16 inch?
(Hint: Let $X$ be the thickness measurement of the mortar coating. We are
computing the probability that the thickness of the mortar coating is less
than 7/16 inch.)

References
ALMEDA, J.V., T.S. CAPISTRANO, and G.M.F. SARTE. 2010. Elementary
Statistics. The University of the Philippines Press, Quezon City.
pp. 348-353, 395, 603-602.

JOHNSON, R.A. and G.K. BHATTACHARYYA. 2011. Statistics Principles
and Methods, 6th edn. John Wiley & Sons (Asia) Pte Ltd. pp. 230, 241-242.

OTT, R.L. and M. LONGNECKER. 2016. An Introduction to Statistical
Methods and Data Analysis, 7th edn. Cengage Learning, Boston, MA,
USA. pp. 180-182.

investopedia.com/terms/m/mesokurtic.asp. Retrieved on June 20, 2020.

Table B.1. Standard Normal Cumulative Distribution and 100(1-α)th Percentiles, $z_\alpha$
DEPARTMENT OF AGRICULTURAL AND BIOSYSTEMS ENGINEERING
College of Engineering and Technology

For inquiries, contact:

ENGR. ELDON P. DE PADUA
+63 53 565 0600 Local 1015

Use this code when referring to this material:
TP-IMD-02 v0 07-15-20 • No. CET.ESC SLG20-06

Visca, Baybay City, Leyte
Philippines 6521
+63 53 565 0600
