Applied Statistics
for Engineers
and Scientists
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
third edition
Applied Statistics
for Engineers
and Scientists
Jay Devore
California Polytechnic State University, San Luis Obispo
Nicholas Farnum
California State University, Fullerton
Jimmy Doi
California Polytechnic State University, San Luis Obispo
This is an electronic version of the print textbook. Due to electronic rights restrictions, some third party content may be suppressed. Editorial
review has deemed that any suppressed content does not materially affect the overall learning experience. The publisher reserves the right to
remove content from this title at any time if subsequent rights restrictions require it. For valuable information on pricing, previous
editions, changes to current editions, and alternate formats, please visit www.cengage.com/highered to search by
ISBN#, author, title, or keyword for materials in your areas of interest.
Applied Statistics for Engineers and Scientists, Third Edition
Jay Devore, Nicholas Farnum, Jimmy Doi

Publisher: Richard Stratton
Senior Sponsoring Editor: Molly Taylor
Development Editor: Laura Wheel
Editorial Assistant: Danielle Hallock
Associate Media Editor: Andrew Coppola
Brand Manager: Gordon Lee
Content Project Manager: Jill Quinn
Senior Art Director: Linda May
Manufacturing Planner: Sandee Milewski
Rights Acquisition Specialist: Shalice Shah-Caldwell
Production Service: Prashant Kumar Das, MPS Limited
Text and Cover Designer: Jenny Willingham
Cover Image: Female Scientist: wavebreakmedia/Shutterstock.com; Solar Panels: portumen/Shutterstock.com; Nanotubes: PASIEKA/SPL/Getty Images
Compositor: MPS Limited

© 2014, 2005, 2000 Cengage Learning
WCN: 02-200-203

ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transmitted, stored, or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, web distribution, information networks, or information storage and retrieval systems, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the publisher.

For product information and technology assistance, contact us at Cengage Learning Customer & Sales Support, 1-800-354-9706.
For permission to use material from this text or product, submit all requests online at www.cengage.com/permissions.
Further permissions questions can be emailed to [email protected]

Library of Congress Control Number: 2013944181
ISBN-13: 978-1-133-11136-8
ISBN-10: 1-133-11136-X

Cengage Learning
200 First Stamford Place, 4th Floor
Stamford, CT 06902
USA
This book is dedicated to
Contents
Preface
Purpose
The use of statistical models and methods for describing and analyzing data has become
common practice in virtually all scientific disciplines. This book provides a comprehen-
sive introduction to those models and methods most likely to be encountered and used
by students in their careers in engineering and the natural sciences. It is appropriate for
courses of one term (semester or quarter) in duration.
Approach
Students in a statistics course designed to serve other majors are too often initially skepti-
cal of the value and relevance of the subject matter. Our experience, however, is that
students can be turned on to the subject by the use of good examples and exercises that
blend their everyday experiences with their scientific interests. We have worked hard to
find examples involving real, rather than artificial, data—data that someone thought
was worth collecting and analyzing. Many of the methods presented throughout the
book are illustrated by analyzing data taken from a published source.
The exercises form a very important component of the book. A really good lecturer
can deceive students into thinking they have an excellent mastery of the subject, only
to discover otherwise when they start working problems. We have therefore provided a
rich assortment of exercises designed to reinforce understanding of the material. A sub-
stantial majority of these are based on real data, and we have tried as much as possible
to avoid mathematical manipulation for its own sake. Someone who attempts a good
portion of the exercises will gain a greater appreciation of the scope and applicability of
the subject than would be gleaned simply by reading the text.
Sometimes the reader may be unfamiliar with the context of a particular problem
situation (as indeed we often were), but we believe that students will find scenarios,
such as the one below, more appealing than they would in patently artificial situations
dealing with widgets or brand A versus brand B.
64. The use of microorganisms to dissolve metals from ores has offered an ecologically friendly and less expensive alternative to traditional methods. The dissolution of metals by this method can be done in a two-stage bioleaching process: (1) microorganisms are grown in culture to produce metabolites (e.g., organic acids), and (2) ore is added to the culture medium to initiate leaching. The article "Two-Stage Fungal Leaching of Vanadium from Uranium Ore Residue of the Leaching Stage using Statistical Experimental Design" (Annals of Nuclear Energy, 2013: 48–52) reported on a two-stage bioleaching process of vanadium by using the fungus Aspergillus niger. In one study, the authors examined the impact of the variables x1 = pH, x2 = sucrose concentration (g/L), and x3 = spore population (10^6 cells/ml) on y = oxalic acid production (mg/L). The accompanying SAS output resulted from a request to fit the model with predictors x1, x2, and x3 only.

    Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
    Model             3         5861301      1953767     7.53  0.0052
    Error            11         2855951       259632
    Corrected Total  14         8717252

Fitting the complete second-order model resulted in SSResid = 541,632. Carry out a test at significance level .01 to decide whether at least one of the second-order predictors provides useful information about oxalic acid production.
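The requested comparison is a partial (extra sum of squares) F test of the first-order model against the complete second-order model. Here is a minimal sketch of the arithmetic from the quoted output, under the assumption that the complete second-order model adds three quadratic and three interaction terms to the three first-order predictors:

```python
# Partial F test: do the second-order terms improve the first-order model?
# SS values are taken from the SAS output and exercise text; the model sizes
# are assumptions (3 linear predictors in the reduced model; the complete
# second-order model adds 3 quadratic and 3 interaction terms).
n = 15                       # corrected total df = 14, so n = 15 observations
ss_resid_reduced = 2855951   # Error sum of squares, first-order model
ss_resid_full = 541632       # SSResid, complete second-order model
k_reduced, k_full = 3, 9     # number of predictors in each model
df_num = k_full - k_reduced            # 6 predictors under test
df_den = n - (k_full + 1)              # 15 - 10 = 5 error df for the full model
f_stat = ((ss_resid_reduced - ss_resid_full) / df_num) / (ss_resid_full / df_den)
print(round(f_stat, 2))      # 3.56, well below the .01 critical value
                             # F_.01(6, 5) ~ 10.67, so the second-order terms
                             # would not be declared useful at this level
```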
Example 10.2 Over the past decade researchers and consumers have shown increased interest
in renewable fuels such as biodiesel, a form of diesel fuel derived from vegetable
oils and animal fats. According to www.fueleconomy.gov, compared to petroleum
diesel, the advantages of using biodiesel include its nontoxicity, biodegradability,
and lower greenhouse gas emissions. One popular biodiesel fuel is fatty acid ethyl
ester (FAEE). The authors of “Application of the Full Factorial Design to Opti-
mization of Base-Catalyzed Sunflower Oil Ethanolysis” (Fuel, 2013: 433−442)
performed an experiment to determine optimal process conditions for producing
FAEE from the ethanolysis of sunflower oils. In one study, the effects of three pro-
cess factors on FAEE purity (%) were investigated.
(See page 467 for the complete data.)

[Figure: Interaction plots for FAEE purity (data means). Panels show the two-factor interaction plots for the three process factors, including RATIO (25, 50, 75) and LOAD (0.75, 1.00, 1.25), with mean purity (%) on the vertical axis.]

We have written this book for an audience whose primary interest is in statistical methodology and the analysis of data. The ordering of topics herein is rather different from
what is found in virtually all competing texts. The usual approach is to inject a heavy
dose of probability at the outset, then develop probability distributions and use these as
a basis for inferential methods (drawing conclusions from data). Unfortunately, an intro-
ductory one-term course rarely allows sufficient time for comprehensive treatments of
both probability and statistical inference. If probability is emphasized, statistics gets short
shrift. An additional problem is that many students find probability to be a difficult and
intimidating subject, so starting out in this way creates an aura of mathematical formal-
ism that makes it all too easy to lose sight of the applied and practical aspects of statistics.
Certainly descriptive statistical methods can be developed in detail with virtu-
ally no probability background, and even an understanding of the most commonly
used inferential techniques requires familiarity with only the most basic of probability
properties. So we decided to proceed along a path first blazed by David Moore and
George McCabe in their book Introduction to the Practice of Statistics, written for a non-
science audience. In their Chapter 1, the normal distribution is introduced and em-
ployed to address many interesting questions, whereas probability does not surface
until much later in the book. Our Chapter 1 first presents some basic concepts and
terminology, continues with an introduction to some descriptive techniques, and then
extends the notion of a histogram for sample data to a distribution of values for an entire
population or process. This allows us to develop and use not only the family of normal
distributions but also other continuous and discrete distributions such as the lognormal,
Weibull, Poisson, and binomial. Chapter 2 covers numerical summary measures for sample data (e.g., the sample mean x̄ and sample standard deviation s) in tandem with analogous measures for populations and processes (e.g., the population or process mean μ and standard deviation σ).
The focus of the first two chapters is on univariate data (observations on or values of
a single variable, such as tensile strength). In the third chapter we consider descriptive
methods for bivariate data (e.g., measuring both thickness and strength for wire speci-
mens) and then multivariate data, emphasizing in particular correlation and regression.
This chapter should be especially useful for courses in which there is insufficient time
to cover regression models from a probabilistic viewpoint (such models and inferences
based on them are the subject of Chapter 11).
Most other books intended for our target audience say rather little about how data
is obtained. Yet statistics has much to say not only about how to analyze data once it is
available but also about sensible and efficient techniques for collecting data. Several
lower-level texts, notably the one by Moore and McCabe cited earlier, successfully and
entertainingly covered this territory prior to probability and inference, and we follow
their lead with our Chapter 4. Sampling and experimental design are discussed, and the
last section contains an introduction to various aspects of measurement.
At last probability makes its appearance in Chapter 5. Our minimalist treatment
of this subject is intended to move readers expeditiously into the inferential part of the
book. Since only the notion of probability as limiting or long-run relative frequency is
needed to understand the basis for most of the usual inferential procedures, little time
is spent on topics such as addition and multiplication rules and conditional probability,
and no material on counting techniques is included here (combinations enter briefly in
Chapter 1 in connection with the binomial distribution). The concept of a random vari-
able and its probability distribution is then introduced and related to the distributional
material in Chapter 1. Finally, the notion of a statistic and its sampling distribution is
discussed and illustrated.
The remaining six chapters focus on the most widely used methods from statistical
inference. Descriptive techniques from earlier chapters, such as boxplots and quantile
plots, are employed in many of our examples. Chapter 6 covers topics from quality con-
trol and reliability. Estimation and various statistical intervals—confidence, prediction,
Acknowledgments
We greatly appreciate the feedback and useful advice from the many individuals who
reviewed various parts of our manuscript: Christine Anderson-Cook, Virginia Tech;
Olcay Arslan, St. Cloud State; Peyton Cook, The University of Tulsa; Jean-Yves “Pip”
Courbois, University of Washington; Charles Donaghey, University of Houston; Dale O.
Everson, University of Idaho; William P. Fox, United States Military Academy; William
Fulkerson, Deere & Company; Roger Hoerl, General Electric Company; Marianne
Huebner, Michigan State University; Alan M. Johnson, University of Arkansas, Little
Rock; Steven L. Johnson, University of Arkansas; Janusz Kawczak, University of North
Carolina, Charlotte; Mohammed Kazemi, University of North Carolina, Charlotte;
David P. Kessler, Purdue University; Barbara McKinney, Western Michigan University;
Jang W. Ra, University of Alaska, Anchorage; John Ramberg, University of Arizona;
Stephen E. Rigdon, Southern Illinois University at Edwardsville; Amy L. Rocha,
San Jose State University; Joe Romano, Stanford University; Lewis H. Shoemaker,
Millersville University; and Paul Wilson, Rochester Institute of Technology.
The editorial and production services provided by numerous people from
Cengage Learning are greatly appreciated, especially the support of Shaylin Walsh,
Laura Wheel, and Jill Quinn. It was indeed a great pleasure to have Prashant Kumar
Das overseeing production of the book; his attention to detail, timely feedback, and
willingness to tolerate the authors’ idiosyncrasies made our work during production
much more tolerable than would otherwise have been the case. A special thanks goes
to Soma Roy for her accuracy checking and work on the solutions manuals. Finally, the
continuing support of family, colleagues, and friends has helped smooth out the bumps
in the road. We are truly grateful to all of you.
1
Data and Distributions
1.1 Populations, Samples, and Processes
1.2 Visual Displays for Univariate Data
1.3 Describing Distributions
1.4 The Normal Distribution
1.5 Other Continuous Distributions
1.6 Several Useful Discrete Distributions
Introduction
Statistical concepts and methods are not only useful but indeed often indispensable in
understanding the world around us. They provide ways of gaining new insights into the
behavior of many phenomena that you will encounter in your chosen field of specializa-
tion in engineering or science.
The discipline of statistics teaches us how to make intelligent judgments and in-
formed decisions in the presence of uncertainty and variation. Without uncertainty
or variation, there would be little need for statistical methods or statisticians. If every
component of a particular type had exactly the same lifetime, if all resistors produced
by a certain manufacturer had the same resistance value, if pH determinations for soil
specimens from a particular locale gave identical results, and so on, then a single obser-
vation would reveal all desired information.
An interesting manifestation of variation appeared in connection with an effort
to determine the “greenest” way to travel. The article titled “Carbon Conundrum”
( , 2008: 9) described websites that help consumers calculate carbon
output. The results for carbon output for a flight from New York to Los Angeles appear
in the accompanying table.
and carrying out an experiment. Chapter 5 formalizes the notion of randomness and un-
certainty by introducing the language of probability. The remainder of the book focuses
on the development of inferential methods for drawing interesting conclusions from data
in a wide variety of situations. We hope you will find the subject matter and our presenta-
tion to be as interesting, relevant, and exciting as we do.
We have bivariate data when observations are made on each of two variables. Our data set
might consist of a (height, weight) pair for each basketball player on a team, with the first
observation as (72, 168), the second as (75, 212), and so on. If an engineer determines the
value of both x = component lifetime and y = reason for component failure, the resulting
data set is bivariate with one variable numerical and the other categorical. Multivariate
data arises when observations are made on more than two variables. For example, a re-
search physician might determine the systolic blood pressure, diastolic blood pressure, and
serum cholesterol level for each patient participating in a study. Each observation would be
a triple of numbers, such as (120, 80, 146). In many multivariate data sets, some variables
are numerical and others are categorical. Thus the annual automobile issue of Consumer
Reports gives values of such variables as type of vehicle (small, sporty, compact, mid-size,
large), city fuel efficiency (mpg), highway fuel efficiency (mpg), drivetrain type (rear wheel,
front wheel, four wheel), and so on.
Branches of Statistics
An investigator who has collected data may wish simply to summarize and describe
important features of the data. This entails using methods from descriptive statis-
tics. Some of these methods are graphical in nature—the construction of histograms,
boxplots, and scatterplots are primary examples. Other descriptive methods involve cal-
culation of numerical summary measures, such as means, standard deviations, and cor-
relation coefficients. The wide availability of statistical computer software packages has
made these tasks much easier to carry out than they used to be. Computers are much
more efficient than human beings at calculation and the creation of pictures (once they
have received appropriate instructions from the user!). This means that the investigator
doesn’t have to expend much effort on “grunt work” and will have more time to study
the data and extract important messages. Throughout this book, we will present output
from various packages such as Minitab, SAS, and R. The R software can be downloaded
without charge from www.r-project.org.
Example 1.1 Charity is a big business in the United States. The website charitynavigator.com gives
information on approximately 5500 charitable organizations, and many smaller chari-
ties fly below the navigator’s radar screen. Some charities operate very efficiently, with
fund-raising and administrative expenses only a small percentage of total expenses,
whereas others spend a high percentage of what they take in to perform the same
activities. Here is data on fund-raising expenses as a percentage of total expenditures
for a random sample of 60 charities:
6.1 12.6 34.7 1.6 18.8 2.2 3.0 2.2 5.6 3.8
2.2 3.1 1.3 1.1 14.1 4.0 21.0 6.1 1.3 20.4
7.5 3.9 10.1 8.1 19.5 5.2 12.0 15.8 10.4 5.2
6.4 10.8 83.1 3.6 6.2 6.3 16.3 12.7 1.3 0.8
8.8 5.1 3.7 26.3 6.0 48.0 8.2 11.7 7.2 3.9
15.3 16.6 8.8 12.0 4.7 14.7 6.4 17.0 2.5 16.2
Without any organization, making sense of the data’s most prominent features is dif-
ficult: What is a typical (i.e., representative) value? Are values highly concentrated
about a typical value or are they quite dispersed? Are there any gaps in the data?
What fraction of the values are less than 20%? Figure 1.1 shows what is called a stem-
and-leaf display as well as a histogram. In Section 1.2, we will discuss construction
and interpretation of these data summaries. For the moment, we hope you see how
they begin to describe how the percentages are distributed over the range of possible
values from 0 to 100. A substantial majority of the charities in the sample obviously
spend less than 20% on fund-raising, and only a few percentages might be viewed as
beyond the bounds of sensible practice.
[Figure 1.1: Stem-and-leaf display and histogram of the fund-raising percentages (horizontal axis: FundRsng, 0–90; vertical axis: Frequency, 0–40)]
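The questions just posed can also be answered numerically. Here is a quick sketch in Python (any of the packages mentioned earlier would do equally well), using the 60 percentages from the table in this example:

```python
import statistics

# Fund-raising expenses (% of total expenditures) for the 60 sampled charities,
# transcribed from the table above
pcts = [6.1, 12.6, 34.7, 1.6, 18.8, 2.2, 3.0, 2.2, 5.6, 3.8,
        2.2, 3.1, 1.3, 1.1, 14.1, 4.0, 21.0, 6.1, 1.3, 20.4,
        7.5, 3.9, 10.1, 8.1, 19.5, 5.2, 12.0, 15.8, 10.4, 5.2,
        6.4, 10.8, 83.1, 3.6, 6.2, 6.3, 16.3, 12.7, 1.3, 0.8,
        8.8, 5.1, 3.7, 26.3, 6.0, 48.0, 8.2, 11.7, 7.2, 3.9,
        15.3, 16.6, 8.8, 12.0, 4.7, 14.7, 6.4, 17.0, 2.5, 16.2]

median_pct = statistics.median(pcts)                  # a "typical" value
frac_below_20 = sum(p < 20 for p in pcts) / len(pcts)
print(round(median_pct, 1))   # 6.8
print(frac_below_20)          # 0.9 -- 54 of the 60 charities spend under 20%
print(max(pcts))              # 83.1, the extreme outlier visible in Figure 1.1
```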
Techniques for generalizing from a sample to a population are gathered within the branch
of our discipline called inferential statistics.
Example 1.2 Material strength investigations provide a rich area of application for statistical
methods. The article “Effects of Aggregates and Microfillers on the Flexural Prop-
erties of Concrete” (Magazine of Concrete Research, 1997: 81–98) reported on
a study of strength properties of high-performance concrete obtained by using
superplasticizers and certain binders. The compressive strength of such concrete
had previously been investigated, but not much was known about flexural strength
(a measure of ability to resist failure in bending). The accompanying data on flexural strength (in megapascals, MPa, where 1 Pa (pascal) = 1.45 × 10⁻⁴ psi) appeared in the article cited:
5.9 7.2 7.3 6.3 8.1 6.8 7.0 7.6 6.8 6.5 7.0 6.3 7.9 9.0
8.2 8.7 7.8 9.7 7.4 7.7 9.7 7.8 7.7 11.6 11.3 11.8 10.7
Suppose we want an estimate of the average value of flexural strength for all beams
that could be made in this way (if we conceptualize a population of all such beams,
we are trying to estimate the population mean). It can be shown that, with a high
degree of confidence, the population mean strength is between 7.48 MPa and 8.80
MPa; we call this a confidence interval or interval estimate. Alternatively, this data
could be used to predict the flexural strength of a single beam of this type. With a
high degree of confidence, the strength of a single such beam will exceed 7.35 MPa;
the number 7.35 is called a lower prediction bound.
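The interval quoted above is a one-sample t confidence interval for a mean, x̄ ± t·s/√n. A minimal sketch of the computation follows; the 95% confidence level and the tabled critical value t.025,26 ≈ 2.056 are assumptions, since the example does not state the level used:

```python
import statistics

# Flexural strength (MPa) for the 27 concrete beams listed above
mpa = [5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8, 6.5, 7.0, 6.3, 7.9, 9.0,
       8.2, 8.7, 7.8, 9.7, 7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7]

n = len(mpa)                          # 27 observations
xbar = statistics.mean(mpa)           # sample mean
s = statistics.stdev(mpa)             # sample standard deviation
t_crit = 2.056                        # t_.025 with n - 1 = 26 df (tabled value)
half_width = t_crit * s / n ** 0.5
lo, hi = xbar - half_width, xbar + half_width
print(f"({lo:.2f}, {hi:.2f})")        # (7.48, 8.80), matching the interval quoted
```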
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
1.1 Populations, Samples, and Processes
chapter 1 Data and Distributions
Statistical information now appears with increasing frequency in the popular media, and
occasionally the spotlight is even turned on statisticians. For example, “Behind Cancer
Guidelines, Quest for Data,” a New York Times article from November 23, 2009, reported
that the new science for cancer investigations and more sophisticated methods for data
analysis spurred the U.S. Preventive Services Task Force to reexamine guidelines for how
frequently middle-aged and older women should have mammograms. The panel com-
missioned six independent groups to do statistical modeling. The result was a new set of
conclusions, in particular one that mammograms every two years give nearly the same
benefit as annual ones and confer only half the risk of harm. Donald Berry, a promi-
nent biostatistician, was quoted as saying he was pleasantly surprised that the task force
took the new research to heart in making its recommendations. The task force’s report
has generated much controversy among cancer organizations, politicians, and women
themselves.
We hope you will become increasingly convinced of the importance and relevance
of the discipline of statistics as you dig more deeply into the book and subject. We also
anticipate you’ll be intrigued enough to want to continue your statistical education beyond
your current course.
Example 1.3 The process of making ignition keys for automobiles consists of trimming and press-
ing raw key blanks, cutting grooves and notches, and then plating the keys. Dimen-
sions associated with groove and notch cutting are crucial to proper key functioning.
There will always be “normal” variation in dimensions because of fluctuations in
materials, worker behavior, and environmental conditions. It is important, though,
to monitor production to ensure that there are no unusual sources of variation, such
as incorrect machine settings or contaminated material, which might result in non-
conforming units or substantial changes in product characteristics. For this purpose,
a sample (subgroup) of five keys is selected every 20 minutes, and critical dimensions
are measured. Here are a few of the resulting observations for one particular dimen-
sion (in thousandths of an inch):
Subgroup 1: 6.1 8.4 7.6 7.5 4.4
Subgroup 2: 8.8 8.3 5.9 7.4 7.6
Subgroup 3: 8.0 7.5 7.0 6.8 9.3
This is indeed sample data, which can be used as a basis for drawing conclusions.
However, the conclusions are about production process behavior rather than about
a particular population of keys.
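The subgroup summaries on which such process monitoring rests — the per-subgroup mean and range that feed an X-bar and R chart — can be sketched directly from the data above:

```python
from statistics import mean

# Key-dimension measurements (thousandths of an inch), from Example 1.3.
subgroups = [
    [6.1, 8.4, 7.6, 7.5, 4.4],
    [8.8, 8.3, 5.9, 7.4, 7.6],
    [8.0, 7.5, 7.0, 6.8, 9.3],
]
for i, s in enumerate(subgroups, start=1):
    # Subgroup mean tracks process location; subgroup range tracks spread.
    print(f"subgroup {i}: mean {mean(s):.2f}, range {max(s) - min(s):.2f}")
```

Plotted over time against control limits, these statistics are what reveal unusual sources of variation.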
Analytic studies sometimes involve figuring out what actions to take to improve the
performance of a future product.
Example 1.4 Failure in fluorescent lamps occurs when their luminosity falls below a predeter-
mined level. The article “Using Degradation Data to Improve Fluorescent Lamp
Reliability” (J. of Quality Technology, 1995: 363–369) described a case study involv-
ing fluorescent lamps of a certain type. The project engineer suggested focusing on
three factors thought to be crucial to reliability: current, mercury concentration,
and argon concentration.
Two levels, low and high, of each factor were established, leading to eight
combinations of factor levels (e.g., low current, high mercury concentration, and
low argon concentration). Luminance levels were then monitored over time for
certain factor-level combinations. (Because of limited resources, only four of
the eight combinations were included in the experiment, with five lamps used
at each one.) Here is data for one particular lamp for which all factor levels
were low:
Time (hr): 100 500 1000 2000 3000 4000 5000 6000
Luminance (lumens): 2810 2490 2460 2370 2320 2160 2140 2080
Statistical methods were used on the resulting data to draw conclusions about how
lamp reliability could be improved. In particular, it was recommended that high
concentration levels should be used with a low current level.
Stem-and-Leaf Displays
A stem-and-leaf display can be an effective way to organize numerical data without
expending much effort. It is based on separating each observation into two parts:
(1) a stem, consisting of one or more leading digits, and (2) a leaf, consisting of
the remaining or trailing digit(s). Suppose, for example, that data on calibration
times (sec) for certain test devices has been gathered and that the smallest and
largest times are 11.3 and 18.8, respectively. Then we could use the tens and ones
digits as the stem of an observation, leaving the tenths digit for the leaf. Thus 11.3
would have a stem of 11 and a leaf of 3, 16.0 would have a stem of 16 and a leaf of
0, and so on. Once stem values have been chosen, they should be listed in a single
column. Then the leaf of each observation should be placed on the row of the cor-
responding stem.
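The construction just described translates directly into code. The sketch below uses hypothetical calibration times chosen to span the stated extremes 11.3 and 18.8, and splits each value with string formatting so the tenths digit becomes the leaf:

```python
from collections import defaultdict

# Hypothetical calibration times (sec); smallest 11.3, largest 18.8.
times = [11.3, 12.7, 12.1, 14.8, 16.0, 14.2, 18.8, 12.5]

rows = defaultdict(str)
for t in times:
    stem, leaf = f"{t:.1f}".split(".")   # e.g. 11.3 -> stem "11", leaf "3"
    rows[int(stem)] += leaf

# List every stem value in a single column, even stems with no leaves.
for stem in range(min(rows), max(rows) + 1):
    print(f"{stem:2d} | {rows.get(stem, '')}")
```

Each printed row is one stem with its leaves placed in order of occurrence in the data.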
1.2 Visual Displays for Univariate Data
Example 1.5 The use of alcohol by college students is of great concern not only to those in the
academic community but also, because of potential health and safety consequences,
to society at large. The article “Health and Behavioral Consequences of Binge
Drinking in College” (J. of the Amer. Med. Assoc., 1994: 1672–1677) reported on
a comprehensive study of heavy drinking on campuses across the United States. A
binge episode was defined as five or more drinks in a row for males and four or more
for females. Figure 1.2 shows a stem-and-leaf display of 140 values of x = the
percentage of undergraduate students who are binge drinkers. (These values were not given
in the cited article, but our display agrees with a picture of the data that did appear.)
0 | 4
1 | 1345678889
2 | 1223456666777889999                  Stem: tens digit
3 | 0112233344555666677777888899999      Leaf: ones digit
4 | 111222223344445566666677788888999
5 | 00111222233455666667777888899
6 | 01111244455666778
The first leaf on the stem 2 row is 1, which tells us that 21% of the students at one
of the colleges in the sample were binge drinkers. Without the identification of stem
digits and leaf digits on the display, we wouldn’t know whether the stem 2, leaf 1 obser-
vation should be read as 21%, 2.1%, or .21%.
When creating a display by hand, ordering the leaves from smallest to largest on
each line can be time-consuming, and this ordering usually contributes little if any
extra information. Suppose the observations had been listed in alphabetical order by
school name, as
16% 33% 64% 37% 31% ...
Then placing these values on the display in this order would result in the stem 1 row
having 6 as its first leaf, and the beginning of the stem 3 row would be
3 | 371 . . .
The display suggests that a typical or representative value is in the stem 4 row,
perhaps in the mid-40% range. The observations are not highly concentrated about
this typical value, as would be the case if all values were between 20% and 49%. The
display rises to a single peak as we move downward, and then declines; there are no
gaps in the display. The shape of the display is not perfectly symmetric, but instead
appears to stretch out a bit more in the direction of low leaves than in the direction of
high leaves. Lastly, there are no observations that are unusually far from the bulk of
the data (no outliers), as would be the case if one of the 26% values had instead been
86%. The most surprising feature of this data is that at most colleges in the sample, at
least one-quarter of the students are binge drinkers. The problem of heavy drinking on
campuses is much more pervasive than many had suspected.
A stem-and-leaf display conveys information about the following aspects of the data:
Identification of a typical or representative value
Extent of spread about the typical value
Presence of any gaps in the data
Extent of symmetry in the distribution of values
Number and location of peaks
Presence of any outlying values
Suppose in Example 1.5 that each observation had included a tenths digit as well as
the tens and ones digits: 16.4%, 36.5%, and so on. We could use two-digit leaves, so that
16.4 would have a stem of 1 and a leaf of 64; in this case, the decimal point can be omitted,
but commas are necessary between successive leaves. Because such a display can become
very unwieldy, it is customary to use single-digit leaves obtained by truncation (not round-
ing). Thus 36.7 would have stem 3 and leaf 6, and information about the tenths digit would
be suppressed.
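Truncation (not rounding) is easy to express in code; this small sketch simply drops the tenths digit, exactly as described:

```python
# Truncate to get a single-digit leaf: 36.7 -> stem 3, leaf 6
# (the tenths digit .7 is discarded, not rounded).
def stem_leaf_truncated(x):
    whole = int(x)               # int() truncates toward zero: int(36.7) == 36
    return whole // 10, whole % 10

assert stem_leaf_truncated(36.7) == (3, 6)
assert stem_leaf_truncated(16.4) == (1, 6)
```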
Consider a data set consisting of exam scores all of which are in the 70s, 80s, and 90s
(an instructor’s dream!). A stem-and-leaf display with the tens digit as the stem would have
only three rows. However, a more informative display can be created by repeating each
stem value twice, once for the low leaves 0, 1, 2, 3, 4 and again for the high leaves 5, 6, 7,
8, 9. A display of the binge-drinking data with repeated stems is shown in Figure 1.3. (The
11 on the far left in the fourth row indicates that there are 11 observations on or above that
row; the (14) row contains the middle data value.)
Suppose that a final exam in physics contained questions worth a total of 200 points
and that the only student who scored in the 100s earned 186 points. Rather than include
rows 10, 11, . . . , and 18 just to show the extreme outlier 186, it is better to stop the display
with a stem 9 row and place the information HI: 186 in a prominent place to the right of
the display. The same thing can be done with outliers on the low end.
Consider two different data sets, each consisting of observations on the same variable,
for example, exam scores for two different classes or stopping distances for cars equipped
with two different braking systems. An investigator would naturally want to know in what
ways the two sets were similar and how they differed. This can be accomplished by using
a comparative stem-and-leaf display, in which the leaves for one data set are listed to the
right of the stems and the leaves for the other to the left. Figure 1.4 shows a small example;
the two sides of the display are quite similar, except that the right side appears to be shifted
up one row (about 10 points) from the other side.
           | 9 | 658618
      9447 | 8 | 13754380
2208965655 | 7 | 5312267
   2432875 | 6 | 45104
      5882 | 5 | 9
Dotplots
A dotplot is an attractive summary of numerical data when the data set is reasonably small or
there are relatively few distinct data values. Each observation is represented by a dot above the
corresponding location on a horizontal measurement scale. When a value occurs more than
once, there is a dot for each occurrence, and these dots are stacked vertically. As with a stem-and-
leaf display, a dotplot gives information about location, spread, extremes, and gaps.
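A rough text rendering of a dotplot — turned sideways, with one row per distinct value and a dot per occurrence — can be sketched as follows (the data values here are hypothetical, for illustration only):

```python
from collections import Counter

# Hypothetical small data set with repeated values.
data = [4.0, 4.5, 4.0, 5.2, 4.5, 4.0, 6.1]

counts = Counter(data)
for value in sorted(counts):
    # One dot per occurrence of the value, stacked along the row.
    print(f"{value:4.1f} | {'.' * counts[value]}")
```

As with a real dotplot, repeated values produce stacks of dots, and gaps in the value axis show up as missing rows.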
Example 1.6 Here is data on state-by-state appropriations for higher education as a percentage of
state and local tax revenue for fiscal year 2009–2010 (from the Statistical Abstract of the
United States). Values are listed in order of state abbreviations (AL first, WY last):
14.0 3.1 8.6 9.6 7.4 4.0 4.5 6.5 6.1 8.8
8.2 8.6 6.4 6.7 8.0 8.5 9.4 9.5 4.6 6.8
3.9 6.9 6.3 11.9 5.8 5.8 9.9 5.9 2.7 4.2
14.9 4.0 12.1 8.0 5.2 9.2 6.8 4.3 3.9 9.6
8.0 8.6 8.6 8.7 3.1 5.8 6.2 8.7 6.8 8.9
Figure 1.5 shows a dotplot of the data. The most striking feature is the substan-
tial state-to-state variability. The largest values (for New Mexico, Alabama, North
Carolina, and Mississippi) are somewhat separated from the bulk of the data and
may possibly qualify as outliers.
If the data set discussed in Example 1.6 had consisted of many more observations
(e.g., average per-pupil spending for each school district in the U.S.), it would be quite
cumbersome to construct a corresponding dotplot. Our next technique is well suited to
such situations.
Histograms
Some numerical data is obtained by counting to determine the value of a variable (the
number of traffic citations a person received during the last year, the number of persons ar-
riving for service during a particular period), whereas other data is obtained by taking mea-
surements (weight of an individual, reaction time to a particular stimulus). The prescription
for drawing a histogram is different for these two cases.
definitionS A variable is discrete if its set of possible values either is finite or else can be listed
in an infinite sequence (one in which there is a first number, a second number,
and so on). A variable is continuous if its possible values consist of an entire
interval on the number line.
A discrete variable x almost always results from counting, in which case possible
values are 0, 1, 2, 3, . . . or some subset of these integers. Continuous variables arise from
making measurements. For example, if x is the pH of a chemical substance, then in
theory x could be any number between 0 and 14: 7.0, 7.03, 7.032, and so on. Of course,
in practice there are limitations on the degree of accuracy of any measuring instrument,
so we may not be able to determine pH, reaction time, height, and concentration to an
arbitrarily large number of decimal places. However, from the point of view of creating
mathematical models for distributions of data, it is helpful to imagine an entire con-
tinuum of possible values.
Consider data consisting of observations on a discrete variable x. The frequency of
any particular x value is the number of times that value occurs in the data set. The relative
frequency of a value is the fraction or proportion of time the value occurs:
Suppose, for example, that our data set consists of 200 observations on x = the number of
major defects on a new car of a certain type. If 70 of these x values are 1, then
frequency of the x value 1:  70
relative frequency of the x value 1:  70/200 = .35
Multiplying a relative frequency by 100 gives a percentage; in the defect example, 35% of the
cars in the sample had just one major defect. The relative frequencies, or percentages, are usually
of more interest than the frequencies themselves. In theory, the relative frequencies should sum
to 1, but in practice the sum may differ slightly from 1 because of rounding.
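These definitions translate directly into code. The defect counts below are hypothetical (n = 20 rather than the 200 of the example), but the calculation is the same:

```python
from collections import Counter

# Hypothetical observations on x = number of major defects per car.
defects = [0, 1, 1, 0, 2, 1, 0, 0, 3, 1, 0, 2, 1, 0, 1, 1, 0, 0, 2, 1]

n = len(defects)
freq = Counter(defects)                       # frequency of each x value
rel_freq = {x: f / n for x, f in freq.items()}  # proportion of observations
print(freq[1], rel_freq[1])
```

Multiplying any `rel_freq` entry by 100 gives the corresponding percentage.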
This construction ensures that the area of each rectangle is proportional to the relative
frequency of the value. Thus if the relative frequencies of x = 1 and x = 5 are .35 and .07,
respectively, then the area of the rectangle above 1 is five times the area of the rectangle
above 5.
Example 1.7 Every corporation has a governing board of directors. The number of individuals on a
board varies from one corporation to another. One of the authors of the article “Does
Optimal Corporate Board Size Exist? An Empirical Analysis” (Journal of Applied
Finance, 2010: 57–69) provided the accompanying data on the number of directors
on the boards of a random sample of 204 corporations.
Board Size    Frequency    Relative Frequency
 4                 3             0.0147
 5                12             0.0588
 6                13             0.0637
 7                25             0.1225
 8                24             0.1176
 9                42             0.2059
10                23             0.1127
11                19             0.0931
12                16             0.0784
13                11             0.0539
14                 5             0.0245
15                 4             0.0196
16                 1             0.0049
17                 3             0.0147
18                 0             0.0000
19                 0             0.0000
20                 0             0.0000
21                 1             0.0049
22                 0             0.0000
23                 0             0.0000
24                 1             0.0049
25                 0             0.0000
26                 0             0.0000
27                 0             0.0000
28                 0             0.0000
29                 0             0.0000
30                 0             0.0000
31                 0             0.0000
32                 1             0.0049
Total            204             0.9997
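The tabulated relative frequencies can be verified directly; as the text notes below, the rounded values sum to 0.9997 rather than 1 because of rounding:

```python
# Board-size frequencies from Example 1.7 (board sizes with zero
# frequency are omitted since they contribute nothing to the sums).
freq = {4: 3, 5: 12, 6: 13, 7: 25, 8: 24, 9: 42, 10: 23, 11: 19,
        12: 16, 13: 11, 14: 5, 15: 4, 16: 1, 17: 3, 21: 1, 24: 1, 32: 1}

n = sum(freq.values())                           # 204 corporations
rel = {x: round(f / n, 4) for x, f in freq.items()}
print(n, round(sum(rel.values()), 4))            # rounded values sum to 0.9997
```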
The corresponding histogram in Figure 1.6 rises to a peak and then declines. The
histogram extends a bit more on the right (toward large values) than it does on the
left—a slight positive skew.
[Figure 1.6: frequency histogram of board size; x-axis: board size, 4 to 32; y-axis: frequency, 0 to 40]
From either the tabulated information or the histogram itself, we can determine
the following:
of which is 31.4. Then we could use the class boundaries 27.5, 28.0, 28.5, . . . , and 31.5.
A potential difficulty is that an observation such as 29.0 lies on a class boundary so it doesn’t
lie in exactly one interval. One way to deal with this problem is to use boundaries like
27.55, 28.05, . . . , 31.55. Adding a hundredths digit to the class boundaries prevents obser-
vations from falling on the resulting boundaries. Another way to deal with this problem is
to use the classes 27.5–<28.0, 28.0–<28.5, . . . , 31.0–<31.5. Then 29.0 falls in the
class 29.0–<29.5 rather than in the class 28.5–<29.0. In other words, with this convention,
an observation on a boundary is placed in the interval to the right of the boundary.
This is how Minitab constructs a histogram.
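This left-closed convention can be sketched with the standard library's bisect module, which naturally places a boundary observation in the interval to its right:

```python
import bisect

# Class boundaries 27.5, 28.0, ..., 31.5, defining the left-closed
# classes [27.5, 28.0), [28.0, 28.5), ..., [31.0, 31.5).
edges = [27.5 + 0.5 * k for k in range(9)]

def find_class(x):
    # bisect_right returns the index of the first edge strictly greater
    # than x, so an observation equal to a boundary goes to the right.
    i = bisect.bisect_right(edges, x)
    return edges[i - 1], edges[i]

print(find_class(29.0))   # the boundary value 29.0 lands in [29.0, 29.5)
```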
Example 1.8 Power companies need information about customer usage to obtain accurate fore-
casts of demand. Investigators from Wisconsin Power and Light determined energy
consumption (BTUs) during a particular period for a sample of 90 gas-heated homes.
An adjusted consumption value was calculated as follows:
    adjusted consumption = consumption / [(weather, in degree days)(house area)]
This resulted in the accompanying data (part of the stored data set FURNACE.
MTW available in Minitab, which we have ordered from smallest to largest):
2.97 4.00 5.20 5.56 5.94 5.98 6.35 6.62 6.72 6.78
6.80 6.85 6.94 7.15 7.16 7.23 7.29 7.62 7.62 7.69
7.73 7.87 7.93 8.00 8.26 8.29 8.37 8.47 8.54 8.58
8.61 8.67 8.69 8.81 9.07 9.27 9.37 9.43 9.52 9.58
9.60 9.76 9.82 9.83 9.83 9.84 9.96 10.04 10.21 10.28
10.28 10.30 10.35 10.36 10.40 10.49 10.50 10.64 10.95 11.09
11.12 11.21 11.29 11.43 11.62 11.70 11.70 12.16 12.19 12.28
12.31 12.62 12.69 12.71 12.91 12.92 13.11 13.38 13.42 13.43
13.47 13.60 13.96 14.24 14.35 15.12 15.24 16.06 16.90 18.26
We let Minitab select the class intervals. The most striking feature of the histogram in
Figure 1.7 is its resemblance to a bell-shaped (and therefore symmetric) curve, with
the point of symmetry at roughly 10.
[Figure 1.7: histogram of adjusted consumption (BTU) on a percent scale; class marks from 1 to 19]
There are no hard-and-fast rules concerning either the number of classes or the choice
of classes themselves. Between 5 and 20 classes will be satisfactory for most data sets. Gener-
ally, the larger the number of observations in a data set, the more classes should be used. A
reasonable rule of thumb is
number of classes ≈ √(number of observations)
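The rule of thumb is a one-liner; for the 90 gas-heated homes of Example 1.8 it suggests roughly 9 or 10 classes:

```python
import math

# Rule of thumb: number of classes is about the square root of n.
n_obs = 90
suggested = round(math.sqrt(n_obs))
print(suggested)
```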
Equal-width classes may not be a sensible choice if a data set has at least one “stretched-
out tail.” Figure 1.8 (page 19) shows a dotplot of such a data set. Using a small number of
(a)
(b)
(c)
Figure 1.8 Selecting class intervals when there are outliers: (a) many short
equal width intervals; (b) a few wide equal-width intervals; (c) unequal-width
intervals
equal-width classes results in almost all observations falling in just one or two of the classes.
If a large number of equal-width classes are used, many classes will have zero frequency. A
sound choice is to use a few wider intervals near extreme observations and narrower inter-
vals in the region of high concentration.
The resulting rectangle heights are usually called densities, and the vertical scale is the
density scale. This prescription will also work when class widths are equal.
Example 1.9 Corrosion of reinforcing steel is a serious problem in concrete structures located
11.5 12.1 9.9 9.3 7.8 6.2 6.6 7.0 13.4 17.1 9.3 5.6
5.7 5.4 5.2 5.1 4.9 10.7 15.2 8.5 4.2 4.0 3.9 3.8
3.6 3.4 20.6 25.5 13.8 12.6 13.1 8.9 8.2 10.7 14.2 7.6
5.2 5.5 5.1 5.0 5.2 4.8 4.1 3.8 3.7 3.6 3.6 3.6
[Density histogram of the bond strength data; class boundaries 2, 4, 6, 8, 12, 20, 30; density scale from 0.00 to 0.15]
When class widths are unequal, not using a density scale will give a picture with dis-
torted areas. For equal class widths, the divisor is the same in each density calculation, and
the extra arithmetic simply results in a rescaling of the vertical axis (i.e., the histogram us-
ing relative frequency and the one using density will have exactly the same appearance). A
density histogram does have one interesting property. Multiplying both sides of the formula
for density by the class width gives
    (density)(class width) = relative frequency
That is, the area of each rectangle is the relative frequency of the corresponding class.
Furthermore, since the sum of relative frequencies must be 1.0 (except for roundoff),
the total area of all rectangles in a density histogram is 1. It is always possible to draw a
histogram so that the area equals the relative frequency (this is true also for a histogram
of discrete data—just use the density scale). This property will play an important role in
creating models for distributions in Section 1.3.
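The density calculation can be checked against the bond strength data of Example 1.9, taking the unequal-width class boundaries shown in the accompanying histogram (2, 4, 6, 8, 12, 20, 30):

```python
# Bond strength observations from Example 1.9.
data = [11.5, 12.1, 9.9, 9.3, 7.8, 6.2, 6.6, 7.0, 13.4, 17.1, 9.3, 5.6,
        5.7, 5.4, 5.2, 5.1, 4.9, 10.7, 15.2, 8.5, 4.2, 4.0, 3.9, 3.8,
        3.6, 3.4, 20.6, 25.5, 13.8, 12.6, 13.1, 8.9, 8.2, 10.7, 14.2, 7.6,
        5.2, 5.5, 5.1, 5.0, 5.2, 4.8, 4.1, 3.8, 3.7, 3.6, 3.6, 3.6]
edges = [2, 4, 6, 8, 12, 20, 30]       # unequal-width class boundaries

n = len(data)
areas = []
for lo, hi in zip(edges, edges[1:]):
    count = sum(lo <= x < hi for x in data)     # left-closed classes
    density = (count / n) / (hi - lo)           # density = rel. freq. / width
    areas.append(density * (hi - lo))           # rectangle area = rel. freq.
    print(f"{lo}-<{hi}: {count} observations, density {density:.4f}")
print(round(sum(areas), 6))
```

The rectangle areas sum to 1, illustrating the property just stated.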
Histogram Shapes
Histograms come in a variety of shapes. A unimodal histogram is one that rises to a single
peak and then declines. A bimodal histogram has two different peaks. Bimodality occurs
when the data set consists of observations on two quite different kinds of individuals or
objects. For example, consider a large data set consisting of driving times for automobiles
traveling between San Luis Obispo, California, and Monterey, California (exclusive of
stopping time for sightseeing, eating, etc.). This histogram would show two peaks, one for
those cars that took the inland route (roughly 2.5 hours) and another for those cars travel-
ing up the coast (3.5–4 hours). However, bimodality does not automatically follow in such
situations. Only if the two separate histograms are “far apart” relative to their spreads will
bimodality occur in the histogram of combined data. Thus a large data set consisting of
heights of college students should not result in a bimodal histogram because the typical
male height of about 69 inches is not far enough above the typical female height of about
64–65 inches. A histogram with more than two peaks is said to be multimodal. Of course,
the number of peaks may well depend on the choice of class intervals, particularly with a
small number of observations. The larger the number of classes, the more likely it is that
bimodality or multimodality will manifest itself.
Example 1.10 Figure 1.10(a) shows a Minitab histogram of the weights (lbs) of the 121 players
listed on the rosters of the San Francisco 49ers and the New England Patriots as of
November 28, 2012. Figure 1.10(b) is a smoothed histogram (actually what is called
a density estimate) of the data from the R software package. Both the histogram and
the smoothed histogram show three distinct peaks: The one on the right is for line-
men, the middle peak corresponds to linebacker weights, and the peak on the left is
for all other players (wide receivers, quarterbacks, etc.).
Figure 1.10 NFL player weights: (a) histogram, (b) smoothed histogram
Figure 1.10 (continued)
A histogram is symmetric if the left half is a mirror image of the right half. A bell-
shaped histogram is symmetric, but there are other unimodal symmetric histograms that
are not bell-shaped; histograms with more than one peak can also be symmetric. A uni-
modal histogram is positively skewed if the right or upper tail is stretched out compared
with the left or lower tail, and negatively skewed if the longer tail extends to the left.
Figure 1.11 shows “smoothed” histograms, obtained by superimposing a smooth curve on
the rectangles, that illustrate the various possibilities.
Categorical Data
A histogram for categorical data is often called a bar chart. In some cases, there will
be a natural ordering of classes (for example, freshman, sophomore, junior, senior,
graduate student), whereas in other cases, the order will be arbitrary (Honda, Yamaha,
1.2 Exercises
Harley-Davidson, etc.). A Pareto diagram is a bar chart resulting from a quality control
study in which each category represents a different type of product nonconformity or
production problem. The categories appear in order of decreasing frequency (if a mis-
cellaneous category is needed, it is the last one).
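The ordering rule for a Pareto diagram — decreasing frequency, with any miscellaneous category last — can be sketched as follows (the counts here are hypothetical, not the Example 1.11 data):

```python
# Hypothetical nonconformity counts from a quality control study.
counts = {"low copper plating": 112, "poor electroless coverage": 35,
          "lamination problems": 10, "plating separation": 8,
          "etching problems": 5, "miscellaneous": 12}

# Sort by decreasing frequency, but force "miscellaneous" to the end
# regardless of its count, as the Pareto convention requires.
order = sorted(counts, key=lambda c: (c == "miscellaneous", -counts[c]))
print(order)
```

The bars of the diagram are then drawn in this order, left to right.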
Example 1.11 In the manufacture of printed circuit boards, finished boards are subjected to a final
inspection before they are shipped to customers. Here is data on the type of defect for
each board rejected at final inspection during a particular time period:
[Pareto diagram of the defect frequencies (vertical axis 0 to 120), with categories in decreasing order: low copper plating, poor electroless coverage, lamination problems, plating separation, etching problems, miscellaneous]
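The ordering rule for a Pareto diagram (categories by decreasing frequency, with any miscellaneous category forced to the end) is easy to sketch in Python. The defect counts below are hypothetical, since the example's data table is not reproduced here:

```python
# Order defect categories for a Pareto diagram: decreasing frequency,
# with "miscellaneous" (if present) always placed last.
counts = {
    # hypothetical frequencies; the example's actual table is not shown here
    "low copper plating": 112,
    "poor electroless coverage": 35,
    "lamination problems": 10,
    "plating separation": 8,
    "etching problems": 5,
    "miscellaneous": 12,
}

misc = counts.pop("miscellaneous", None)
ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
if misc is not None:
    ordered.append(("miscellaneous", misc))

for category, freq in ordered:
    print(f"{category:28s} {freq}")
```

Note that "miscellaneous" stays last even when its count exceeds that of another category; that matches the convention stated above.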
1.2 Exercises

1. Consider the strength data for beams given in Example 1.2.
   a. Construct a stem-and-leaf display of the data. What appears to be a representative strength value? Do the observations appear to be highly concentrated about the representative value or rather spread out?
   b. Does the display appear to be reasonably symmetric about a representative value, or would you describe its shape in some other way?
   b. Construct a stem-and-leaf display based on two-digit stems and two-digit leaves, with successive leaves separated by either a comma or a space.
   c. Construct a stem-and-leaf display in which the leaf of each observation is its tens digit (so the ones digit is truncated). Does this display appear to be significantly less informative about course lengths than the display of part (b)? What advantage would this display have over the one in part (b) if there had been 200 courses in the sample?

6. Construct two stem-and-leaf displays for the accompanying set of exam scores, one in which each stem value appears just once and the other in which stem values are repeated:

   74 89 80 93 64 67 72 70 66 85 89 81
   81 71 74 82 85 63 72 81 81 95 84 81
   80 70 69 66 60 83 85 98 84 68 90 82
   69 72 87 88

   What feature of the data is revealed by the display with repeated stems that is not so readily apparent in the first display?

7. Temperature transducers of a certain type are shipped in batches of 50. A sample of 60 batches was selected, and the number of transducers in each batch not conforming to design specifications was determined, resulting in the following data:

   2 1 2 4 0 1 3 2 0 5 3 3 1 3 2 4 7 0 2 3
   0 4 2 1 3 1 1 3 4 1 2 3 2 2 8 4 5 1 3 1
   5 0 2 3 2 1 0 6 4 2 1 6 0 3 3 3 6 1 2 3

   a. Determine frequencies and relative frequencies for the observed values of x = number of nonconforming transducers in a batch.
   b. What proportion of batches in the sample have at most five nonconforming transducers? What proportion have fewer than five? What proportion have at least five nonconforming units?
   c. Draw a histogram of the data using relative frequency on the vertical scale, and comment on its features.

8. In a study of author productivity ("Lotka's Test," Collection Mgmt., 1982: 111–118), a large number of authors were classified according to the number of articles they had published during a certain period. The results were presented in the accompanying frequency distribution:

   Number of papers:  1   2   3  4  5  6  7  8
   Frequency:        784 204 127 50 33 28 19 19

   Number of papers:  9 10 11 12 13 14 15 16 17
   Frequency:         6  7  6  7  4  4  5  3  3

   a. Construct a histogram corresponding to this frequency distribution. What is the most interesting feature of the shape of the distribution?
   b. What proportion of these authors published at least five papers? At least ten papers? More than ten papers?
   c. Suppose the five 15s, three 16s, and three 17s had been lumped into a single category displayed as "≥15." Would you be able to draw a histogram? Explain.
   d. Suppose that instead of the values 15, 16, and 17 being listed separately, they had been combined into a 15–17 category with frequency 11. Would you be able to draw a histogram? Explain.

9. The number of contaminating particles on a silicon wafer prior to a certain rinsing process was determined for each wafer in a sample of size 100, resulting in the following frequencies:

   Number of particles: 0 1 2  3  4  5  6  7
   Frequency:           1 2 3 12 11 15 18 10

   Number of particles:  8 9 10 11 12 13 14
   Frequency:           12 4  5  3  1  2  1

   a. What proportion of the sampled wafers had at least one particle? At least five particles?
   b. What proportion of the sampled wafers had between five and ten particles, inclusive? Strictly between five and ten particles?
   c. Draw a histogram using relative frequency on the vertical axis. How would you describe the shape of the histogram?
15. Automated electron backscattered diffraction is now being used in the study of fracture phenomena. The following information on misorientation angle (degrees) was extracted from the article "Observations on the Faceted Initiation Site in the Dwell-Fatigue Tested Ti-6242 Alloy: Crystallographic Orientation and Size Effects" (Metallurgical and Materials Trans., 2006: 1507–1518):

   Class:    0–<5  5–<10  10–<15  15–<20
   Rel Freq: .177  .166   .175    .136

   Class:    20–<30  30–<40  40–<60  60–<90
   Rel Freq: .194    .078    .044    .030

   a. Is it true that more than 50% of the sampled angles are smaller than 15°, as asserted in the paper?
   b. What proportion of the sampled angles are at least 30°?
   c. Roughly what proportion of angles are between 10° and 25°?
   d. Construct a histogram and comment on any interesting features.

16. A transformation of data values by means of some mathematical function, such as √x or 1/x, can often yield a set of numbers that has "nicer" statistical properties than the original data. In particular, it may be possible to find a function for which the histogram of transformed values is more symmetric (or, even better, more like a bell-shaped curve) than the original data. For example, the article "Time Lapse Cinematographic Analysis of Beryllium–Lung Fibroblast Interactions" (Envir. Research, 1983: 34–43) reported the results of experiments designed to study the behavior of certain individual cells that had been exposed to beryllium. An important characteristic of such an individual cell is its interdivision time (IDT). IDTs were determined for a number of cells both in exposed (treatment) and in unexposed (control) conditions. The authors of the article used a logarithmic transformation. Consider the following representative IDT data:

   Construct a histogram of this data based on classes with boundaries 10, 20, 30, . . . . Then calculate log10(x) for each observation, and construct a histogram of the transformed data using class boundaries 1.1, 1.2, 1.3, . . . . What is the effect of the transformation?

17. The accompanying data set consists of observations on shear strength (lb) of ultrasonic spot welds made on a certain type of alclad sheet. Construct a relative frequency histogram based on ten equal-width classes with boundaries 4000, 4200, . . . . (The histogram will agree with the one in "Comparison of Properties of Joints Prepared by Ultrasonic Welding and Other Means," J. of Aircraft, 1983: 552–556.) Comment on its features.

   5434 4948 4521 4570 4990 5702 5241
   5112 5015 4659 4806 4637 5670 4381
   4820 5043 4886 4599 5288 5299 4848
   5378 5260 5055 5828 5218 4859 4780
   5027 5008 4609 4772 5133 5095 4618
   4848 5089 5518 5333 5164 5342 5069
   4755 4925 5001 4803 4951 5679 5256
   5207 5621 4918 5138 4786 4500 5461
   5049 4974 4592 4173 5296 4965 5170
   4740 5173 4568 5653 5078 4900 4968
   5248 5245 4723 5275 5419 5205 4452
   5227 5555 5388 5498 4681 5076 4774
   4931 4493 5309 5582 4308 4823 4417
   5364 5640 5069 5188 5764 5273 5042
   5189 4986

18. The paper "Study on the Life Distribution of Microdrills" (J. of Engr. Manufacture, 2002: 301–305) reported the following observations, listed in increasing order, on drill lifetimes (number of holes that a drill machines before it breaks) when holes were drilled in a certain brass alloy.
   a. Why can a frequency distribution not be based on the class intervals 0–50, 50–100, 100–150, and so on?
   b. Construct a frequency distribution and histogram of the data using class boundaries 0, 50, 100, . . . and then comment on interesting characteristics.
   c. Construct a frequency distribution and histogram of the natural logarithms of the lifetime observations and comment on interesting characteristics.
   d. What proportion of the lifetime observations in this sample are less than 100? What proportion of the observations are at least 200?
Continuous Distributions
Let x be a continuous variable, one whose value is determined by making a measurement of
some sort. Suppose we have a sample of x values from a population or ongoing process. For
example, the sample might consist of fuel efficiencies of cars selected from a large rental
fleet (a population) or waiting times for a succession of patients entering a large medical
clinic (a patient arrival process). If the sample size is small, a histogram based on only a
small number of relatively wide class intervals is appropriate. For a large sample size, many
narrow classes should be used. Let’s agree to draw our histograms using the density scale
discussed in Section 1.2 so that
For each rectangle,  area = relative frequency of the class
Total area of all rectangles = 1
With a large amount of data, a histogram based on any reasonable choice of classes should
have roughly the same shape and can very frequently be well approximated by a smooth
curve. This type of approximation is illustrated in Figure 1.13.
Many approximating curves that arise in practice can be obtained as graphs of reason-
ably simple mathematical functions. Such a mathematical function provides a very concise
description of the x distribution.
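The density-scale convention above is simple to compute: each rectangle's height is its class relative frequency divided by the class width, so areas are relative frequencies and the total area is 1. A minimal sketch, with a made-up sample and made-up class boundaries:

```python
# Density-scale histogram: height = (relative frequency) / (class width),
# so each rectangle's area equals its relative frequency and total area is 1.
data = [2.3, 3.1, 4.8, 5.2, 5.9, 6.4, 7.0, 7.7, 8.5, 9.9]  # hypothetical sample
edges = [0, 4, 6, 8, 10]  # class boundaries; unequal widths are allowed

n = len(data)
heights = []
for a, b in zip(edges, edges[1:]):
    count = sum(a <= x < b for x in data)  # frequency of class [a, b)
    heights.append(count / n / (b - a))    # density-scale height

total_area = sum(h * (b - a) for h, a, b in zip(heights, edges, edges[1:]))
print(heights, total_area)
```

Because heights are divided by class width, the total area comes out to 1 even with unequal-width classes, which is exactly why the density scale is preferred when class widths differ.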
1.3 Describing Distributions
Figure 1.13 Histograms of continuous data: (a) small number of wide classes; (b) large number of narrow
classes; (c) approximation by a smooth curve
DEFINITIONS A density function f(x) is used to describe (at least approximately) the population or process distribution of a continuous variable x. The graph of f(x) is called the density curve. The following properties must be satisfied:

1. f(x) ≥ 0
2. ∫[−∞ to ∞] f(x) dx = 1 (the total area under the density curve is 1.0)
3. For any two numbers a and b with a < b,

   proportion of x values between a and b = ∫[a to b] f(x) dx

   (This proportion is the area under the density curve and above the interval with endpoints a and b, as illustrated in Figure 1.14.)
[Figure 1.14: Shaded area = proportion of values between a and b]
There is no area under the density curve and above a single value (e.g., above 2.50), which
implies that
proportion of x values satisfying a ≤ x ≤ b = proportion of x values satisfying a < x < b
That is, the area under the curve between a and b does not depend on whether the two
interval endpoints are included or excluded.
Example 1.12 A certain daily program on a public radio station lasts 1 hour. Let x denote the
amount of time (hr) during which music is played. (There are no advertisements, but
the host provides occasional commentary and makes announcements.) A potential
program sponsor is interested in knowing how the value of x varies from program to
program. Consider the density function
f(x) = 90x^8(1 − x)   for 0 ≤ x ≤ 1
f(x) = 0              otherwise
This looks complicated, but the corresponding density curve in Figure 1.15 has a
simple and appealing shape.
[Figure 1.15: the density curve f(x) for Example 1.12 on 0 ≤ x ≤ 1]
We see immediately that most x values are quite close to 1 and very few are smaller than .5 (almost all programs consist of at least a half hour of music). The constant 90 in f(x) ensures that the total area under the density curve is 1.0 [f(x) = kx^8(1 − x) is a legitimate density function only for k = 90]. Various proportions of interest can now be obtained by integration. For example,

proportion of programs with x between .7 and .9
    = ∫[.7 to .9] 90x^8(1 − x) dx = 90 ∫[.7 to .9] x^8 dx − 90 ∫[.7 to .9] x^9 dx
    = 90(x^9/9 − x^10/10) evaluated from .7 to .9 = .587

proportion of programs for which x is at least .8
    = ∫[.8 to 1] 90x^8(1 − x) dx = .624
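Both integrals are easy to check numerically. A short sketch using the antiderivative 90(x^9/9 − x^10/10) = 10x^9 − 9x^10:

```python
# Antiderivative of 90x^8(1 - x): F(x) = 90(x^9/9 - x^10/10) = 10x^9 - 9x^10
def F(x):
    return 10 * x**9 - 9 * x**10

prop_between = F(0.9) - F(0.7)   # proportion of programs with .7 <= x <= .9
prop_at_least = F(1.0) - F(0.8)  # proportion of programs with x >= .8
print(round(prop_between, 3), round(prop_at_least, 3))  # 0.587 0.624
```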
What duration value c separates the smallest 50% of all x values from the largest
50%? Figure 1.16 shows the location of c; the corresponding equation is
∫[0 to c] 90x^8(1 − x) dx = .5

which becomes

90(c^9/9 − c^10/10) = .5
Newton's method or some other numerical technique is used to obtain the solution: c ≈ .838. That is, about 50% of all programs have music for more than .838 hr, and about 50% have music for less than .838 hr. The value .838 is called the median of the x distribution.
[Figure 1.16: Shaded area = .5 to the left of the median = .838]
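The median equation can be solved with any root-finding routine. A bisection sketch, using the fact that the cumulative area F(x) = 10x^9 − 9x^10 is increasing on [0, 1]:

```python
# Solve F(c) = .5 for the median of the distribution with density
# f(x) = 90x^8(1 - x), where F(x) = 10x^9 - 9x^10 is the cumulative area.
def F(x):
    return 10 * x**9 - 9 * x**10

lo, hi = 0.0, 1.0
for _ in range(60):          # bisection: the bracket halves each pass
    mid = (lo + hi) / 2
    if F(mid) < 0.5:
        lo = mid
    else:
        hi = mid

median = (lo + hi) / 2
print(round(median, 3))  # 0.838
```

Newton's method would converge faster, but bisection needs only that F is increasing and is harder to get wrong.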
Example 1.13 Let x denote the response time (sec) at a certain on-line computer; that is, x is the
time between the end of a user’s inquiry and the beginning of the system’s response
to that inquiry. The value of x varies from inquiry to inquiry. Suppose the density
function for the distribution of x is
f(x) = .2e^(−.2x)   for x ≥ 0
f(x) = 0            otherwise
where e represents the base of the natural logarithm system and approximately equals
2.71828. A graph of f (x) is shown in Figure 1.17. By inspection, f (x) ≥ 0, and
[Figure: the density curve f(x) = .2e^(−.2x), with shaded upper-tail area .10 to the right of 11.5, the 90th percentile]
Figure 1.17 The density curve and 90th percentile for Example 1.13
Setting the upper-tail area equal to .10 gives e^(−.2c) = .10, from which c = −[ln(.1)]/.2 = 11.5. Only about 10% of all inquiries will have response times exceeding 11.5 sec.
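The percentile calculation inverts the exponential upper-tail area: the area to the right of c is e^(−.2c), so setting that area to .10 and solving gives c = −ln(.1)/.2. A quick check:

```python
import math

rate = 0.2                     # the constant in f(x) = .2e^(-.2x)
c = -math.log(0.1) / rate      # 90th percentile: upper-tail area .10
tail = math.exp(-rate * c)     # area under the curve to the right of c

print(round(c, 2), round(tail, 2))  # 11.51 0.1
```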
The density function in Example 1.13 is a particular case of a more general function.
between successive arrivals at a service facility, the amount of time to complete a specified
task, and the 1-hr concentration of carbon monoxide in an air sample. In Sections 1.4 and
1.5, we introduce several other important continuous distributions.
Discrete Distributions
Let’s focus on a variable x whose possible values are nonnegative integers; usually the value
of x results from counting something. A histogram of sample data will have rectangles
centered at values 0, 1, 2, . . . (or some subset of these) regardless of the sample size. How-
ever, as the sample size increases, the relative frequencies (sample proportions of various
x values) tend to get closer and closer to their true population or process counterparts. We
will use the following notation:
p(0) = proportion of x values in the population that equal 0, or the long-run proportion of x values in a process that equal 0
and so on. None of these proportions can be negative, and their sum must be 1 (so that
100% of the x values are included).
Example 1.14 Consider a package of four batteries of a particular type, and let x denote the number
of satisfactory (i.e., nondefective) batteries in the package. Possible values of x are 0,
1, 2, 3, and 4. One reasonable distribution for x is specified by the following mass
function:
p(x) = [24 / (x!(4 − x)!)] (.9)^x (.1)^(4−x),   x = 0, 1, 2, 3, 4
where "!" is the factorial symbol (e.g., 4! = (4)(3)(2)(1) = 24, 1! = 1, and 0! = 1).
This looks a bit intimidating, but there is an intuitive argument leading to p(x) that
we will mention shortly. Substituting x 5 3, we get
p(3) = [24 / ((6)(1))] (.9)^3(.1)^1 = .2916
That is, roughly 29% of all packages will have three good batteries. Substituting the
other x values gives us the following tabulation:
x: 0 1 2 3 4
p(x): .0001 .0036 .0486 .2916 .6561
The proportion of packages with at least two good batteries is
proportion of packages with x values between 2 and 4 (inclusive) = p(2) + p(3) + p(4) = .9963
More than 99% of all packages have at least two good batteries.
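The tabulation can be reproduced directly from the mass function; a minimal Python sketch, where the helper name `p` simply mirrors the text's notation:

```python
from math import factorial

def p(x, n=4, s=0.9):
    """Mass function of Example 1.14: chance of x good batteries among n."""
    return factorial(n) // (factorial(x) * factorial(n - x)) * s**x * (1 - s)**(n - x)

table = {x: round(p(x), 4) for x in range(5)}
print(table)                                  # matches the text's tabulation
print(round(sum(p(x) for x in range(5)), 4))  # the proportions sum to 1
print(round(p(2) + p(3) + p(4), 4))           # 0.9963
```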
In Section 1.6, we will generalize the distribution of Example 1.14 and introduce one
additional important discrete distribution.
1.3 Exercises
   a. What proportion of rainfall durations at this location are at least 2 hours? At most 3 hours? Between 2 and 3 hours?
   b. What must the duration of a rainfall be to place it among the longest 5% of all times?

23. Extensive experience with fans of a certain type used in diesel engines has suggested that the exponential distribution with λ = .00004 provides a good model for time until failure (hr).
   a. Sketch a graph of the density function.
   b. What proportion of fans will last at least 20,000 hr? At most 30,000 hr? Between 20,000 and 30,000 hr?
   c. What must the lifetime of a fan be to place it among the best 1% of all fans? Among the worst 1%?

24. The article "Probabilistic Fatigue Evaluation of Riveted Railway Bridges" (J. of Bridge Engr., 2008: 237–244) suggested the exponential distribution with λ = 1/6 as a model for the distribution of stress range (MPa) in certain bridge connections.
   a. What proportion of stress ranges are at least 2 MPa? At most 7 MPa? Between 5 and 10 MPa?
   b. What value separates the highest 2% of the stress ranges from the remaining 98%?

25. The actual tracking weight of a stereo cartridge set to track at 3 g can be regarded as a continuous variable x with density function f(x) = c[1 − (x − 3)^2] for 2 < x < 4 and f(x) = 0 otherwise.
   a. Determine the value of c [you might find it helpful to graph f(x)].
   b. What proportion of actual tracking weights exceed the target weight?
   c. What proportion of actual tracking weights are within .25 g of the target weight?

26. Let x represent the number of underinflated tires on an automobile.
   a. Which of the following p(x) functions specifies a legitimate distribution for x, and why are the other two not legitimate?
      (i) p(0) = .3, p(1) = .2, p(2) = .1, p(3) = .05, p(4) = .05
      (ii) p(0) = .4, p(1) = p(2) = p(3) = .1, p(4) = .3
      (iii) p(x) = .2(3 − x) for x = 0, 1, 2, 3, 4
   b. For the legitimate distribution of part (a), determine the long-run proportion of cars having at most two underinflated tires, the proportion having fewer than two underinflated tires, and the proportion having at least one underinflated tire.

27. A mail-order computer business has six telephone lines. Let x denote the number of lines in use at a specified time. Suppose the mass function of x is given by

      x:    0   1   2   3   4   5   6
      p(x): .10 .15 .20 .25 .20  ?   ?

   a. In the long run, what proportion of the time will at most three lines be in use? Fewer than three lines?
   b. In the long run, what proportion of the time will at least five lines be in use?
   c. In the long run, what proportion of the time will between two and four lines, inclusive, be in use?
   d. In the long run, what proportion of the time will at least four lines not be in use?

28. A contractor is required by a county planning department to submit 1, 2, 3, 4, or 5 forms (depending on the nature of the project) when applying for a building permit. Let y denote the number of forms required for an application, and suppose the mass function is given by p(y) = cy for y = 1, 2, 3, 4, or 5. Determine the value of c, as well as the long-run proportion of applications that require at most three forms and the long-run proportion that require between two and four forms, inclusive.

29. Many manufacturers have quality control programs that include inspection of incoming materials for defects. Suppose a computer manufacturer receives computer boards in batches of five. Two boards are randomly selected from each batch for inspection. Consider batches for which exactly two of the boards are defective; for convenience, number the defective boards as 1 and 2, and the nondefective boards as 3, 4, and 5. Let x denote the number of defective boards among the two actually inspected, and determine the mass function of x. Hint: One possible sample of size 2 consists of boards 1 and 2, another of boards 1 and 3, and so on. How many such samples are there, and what is the value of x for each sample?
1.4 The Normal Distribution

[Figure: a normal density curve; vertical axis 0 to .08, horizontal axis 80 to 120]
DEFINITION A continuous variable x is said to have a normal distribution with parameters μ and σ, where −∞ < μ < ∞ and σ > 0, if the density function of x is

f(x) = [1 / (√(2π) σ)] e^(−(x−μ)² / (2σ²)),   −∞ < x < ∞

Again, e denotes the base of the natural logarithm system and has an approximate value of 2.71828, whereas π represents the familiar mathematical constant approximately equal to 3.14159.
Clearly, f (x) ≥ 0 for any number x, but techniques from multivariable calculus must be used to show that ∫[−∞ to ∞] f(x) dx = 1. The graph of f (x)—the density curve—is always a
bell-shaped curve (and hence symmetric) centered at μ, so μ is the median of the distribution. If the value of σ is close to zero, the normal curve is highly concentrated about μ (little variability in the distribution), whereas a large value of σ corresponds to a curve that spreads out a great deal (a substantial amount of variability). Figure 1.19 displays several different normal density curves. Any normal curve has two inflection points—points at which the curve changes from being concave downward to concave upward—that are equidistant from μ. It can be shown that the value of σ is the distance from μ to each inflection point, as illustrated in Figure 1.20.
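The inflection-point claim can be checked numerically: the second derivative of the normal density is negative just inside μ ± σ and positive just outside. A sketch for μ = 0, σ = 1:

```python
import math

def f(x, mu=0.0, sigma=1.0):
    """Normal density function."""
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)

def second_deriv(x, h=1e-4):
    """Central finite-difference estimate of f''(x) for the standard case."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

# Concave down just inside x = sigma = 1, concave up just outside it,
# so the sign change (inflection point) occurs at distance sigma from mu.
print(second_deriv(0.9) < 0, second_deriv(1.1) > 0)  # True True
```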
[Figure 1.19: normal density curves with μ = 40, σ = 2.5; μ = 10, σ = 5; μ = 70, σ = 10. Figure 1.20: the two inflection points lie a distance σ on either side of μ]
Suppose that capacitors of a certain type have resistances that vary according to a normal distribution, with μ = 800 megohms and σ = 200 megohms. If a particular application requires a resistance between 775 megohms and 850 megohms, the proportion of capacitors with satisfactory values of resistance (x) is

∫[775 to 850] [1 / (√(2π) · 200)] e^(−(x−800)² / (2 · 200²)) dx

Unfortunately, none of the standard integration techniques can be used to evaluate this integral. To calculate proportions of this sort, a special normal reference distribution is needed.
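Although the integral has no closed form, it can be evaluated numerically. One convenient route, which anticipates the standardization idea introduced next, uses the error function erf from Python's math module:

```python
import math

def normal_cum_area(x, mu, sigma):
    """Cumulative area under the normal(mu, sigma) curve to the left of x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 800, 200  # megohms, from the capacitor example
prop = normal_cum_area(850, mu, sigma) - normal_cum_area(775, mu, sigma)
print(round(prop, 4))  # 0.1484
```

So roughly 15% of these capacitors have satisfactory resistance values.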
DEFINITIONS The normal distribution with parameter values μ = 0 and σ = 1 is called the standard normal distribution. We shall use the letter z to denote a variable that has this distribution. The corresponding density function is

f(z) = [1 / √(2π)] e^(−z²/2),   −∞ < z < ∞

The standard normal density curve, or z curve, is shown in Figure 1.21. It is centered at 0 and has inflection points at ±1.
Appendix Table I, which also appears on the inside front cover of the book, is a tabulation of cumulative z curve areas; that is, the table gives areas under the z curve to the left of various values (i.e., from −∞ up to the value), as illustrated in Figure 1.21. Entries in this table were obtained by using numerical integration techniques, since the standard normal density function cannot be integrated in a straightforward way. Let's first use this table to obtain various z curve areas and other z curve information, and then see how the table applies to any normal curve.
[Figure 1.21: the z curve, with a shaded cumulative area to the left of a value]
Example 1.15 The proportion of values in a standard normal distribution that are less than 1.25 is

proportion of z values satisfying z < 1.25
    = entry in Appendix Table I at the intersection of the 1.2 row and .05 column
    = .8944

It is also true that

proportion of z values satisfying z ≤ 1.25 = .8944

Similarly,

proportion of z values satisfying z < −.38
    = entry in −0.3 row and .08 column of Appendix Table I
    = .3520
Figure 1.22 illustrates the simple relationship between an upper-tail area and a cumulative area.

[Figure 1.22 Obtaining an "area to the right" from a cumulative curve area: area to the right of a value = 1 (total area) − cumulative area to the left of that value]
In particular,

proportion of z values satisfying z > 1.25
    = area under z curve to the right of 1.25
    = 1 − area to the left of 1.25
    = 1 − .8944
    = .1056
What about the area under the z curve and above the interval between −.38 and 1.25? Figure 1.23 shows that this is a difference between two cumulative areas:

proportion of z values satisfying −.38 < z < 1.25
    = (area to the left of 1.25) − (area to the left of −.38)
    = .8944 − .3520 = .5424

Figure 1.23 The area above an interval is the difference between two cumulative areas
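The tabled areas in Example 1.15 can be reproduced without the table, again via the error function; the values agree with Appendix Table I to four decimals:

```python
import math

def cum_area(z):
    """Cumulative area under the z curve to the left of z (Appendix Table I)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(cum_area(1.25), 4))                    # area to the left of 1.25
print(round(cum_area(-0.38), 4))                   # area to the left of -.38
print(round(cum_area(1.25) - cum_area(-0.38), 4))  # area above the interval
```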
In Example 1.15, a value on the horizontal z scale was specified and a curve area was
determined. We now reverse this process by showing how to select a value or values to cap-
ture a specified curve area.
Example 1.16 What value c on the horizontal z axis is such that the area under the z curve to the
left of c is .67? Figure 1.24 illustrates the situation.
In Appendix Table I, we must look in the main body for .6700 (or the closest
entry to it). The value .6700 does indeed appear; it is at the intersection of the 0.4 row
and the .04 column. Thus c = .44. That is, 67% of the area under the z curve lies to
the left of .44. Another way of expressing this is to say that .44 is the 67th percentile of
the standard normal distribution. If .6710 replaces .6700 in the question posed, the
closest tabulated entry is .6700. Rather than use linear interpolation, we generally
recommend simply using the closest entry to answer the question; our answer to the
revised question would also be (approximately) .44.
What value c captures the upper-tail z curve area .05, as illustrated in
Figure 1.25? The cumulative area to the left of c must be .9500. A search for this area
in Appendix Table I reveals the following information about the two closest entries:

    .9495 is in the 1.6 row and .04 column
    .9505 is in the 1.6 row and .05 column

Because the desired area .9500 is halfway between the two closest entries, we use
interpolation to find c = 1.645 (1.64 or 1.65 would also be acceptable answers).
Finally, what interval, symmetrically placed about zero, captures 95% of the
area under the z curve? This situation is illustrated in Figure 1.26.
Figure 1.25 Finding the value c to capture a specified upper-tail area
Figure 1.26 Determining c to capture a specified central curve area
Since the lower-tail area to the left of −c must be .025, the cumulative area to
the left of c is .9500 + .0250 = .9750. This cumulative area is in the 1.9 row and .06
column of the z table, so c = 1.96. Alternatively, the desired lower-tail area .0250 lies
in the −1.9 row and .06 column of the z table, so −c = −1.96 and again c = 1.96.
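The reverse lookups above (finding a value that captures a given area) correspond to the inverse of the cumulative distribution function. A short Python sketch, again assuming scipy and not part of the original text:

```python
from scipy.stats import norm

# 67th percentile: the value with cumulative area .67 to its left
c67 = norm.ppf(0.67)
print(round(c67, 2))    # approximately .44

# Value capturing upper-tail area .05 (cumulative area .95 to its left)
c95 = norm.ppf(0.95)
print(round(c95, 3))    # approximately 1.645

# c such that (-c, c) captures central area .95: cumulative area .975
c975 = norm.ppf(0.975)
print(round(c975, 2))   # approximately 1.96
```

`norm.ppf` (the percent-point function) replaces the search for an area in the main body of the z table.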
Proposition   Let x have a normal distribution with parameters μ and σ. Then the standardized
variable

    z = (x − μ)/σ

has a standard normal distribution. This implies that if we form the standardized limits

    a* = (a − μ)/σ        b* = (b − μ)/σ

then

    proportion of x values satisfying a < x < b = proportion of z values satisfying a* < z < b*
Example 1.17   The time that it takes a driver to react to the brake light on a decelerating vehicle
is critical in avoiding rear-end collisions. The article "Fast-Rise Brake Lamp as a
Collision-Prevention Device" (Ergonomics, 1993: 391–395) suggests that reaction
time for an in-traffic response to a brake signal from standard brake lights can be
modeled with a normal distribution having parameters μ = 1.25 sec and σ = .46 sec.
In the long run, what proportion of reaction times will be between 1.00 sec and
1.75 sec? Let x denote reaction time. The standardized limits are

    (1.00 − 1.25)/.46 = −.54        (1.75 − 1.25)/.46 = 1.09
Thus

    proportion of x values satisfying 1.00 < x < 1.75
        = proportion of z values satisfying −.54 < z < 1.09
        = (entry in 1.0 row, .09 column of z table) − (entry in −0.5 row, .04 column of z table)
        = .8621 − .2946
        = .5675
This calculation is illustrated in Figure 1.27.
Figure 1.27 The proportion of x values between 1.00 and 1.75 for the normal distribution with μ = 1.25, σ = .46
What proportion of reaction times will exceed 2 sec? Standardizing gives
(2.00 − 1.25)/.46 = 1.63, so

    proportion of x values that exceed 2.0 = proportion of z values that exceed 1.63
                                           = 1 − area under z curve to the left of 1.63
                                           = 1 − .9484
                                           = .0516
Only a bit more than 5% of all reaction times will exceed 2 sec.
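Software can standardize automatically: scipy's `loc` and `scale` arguments correspond to μ and σ. A sketch reproducing Example 1.17 (scipy assumed available; small differences from the table values come from rounding z to two decimals in the text):

```python
from scipy.stats import norm

mu, sigma = 1.25, 0.46   # reaction-time parameters from the article

# Proportion of reaction times between 1.00 and 1.75 sec
p_between = norm.cdf(1.75, loc=mu, scale=sigma) - norm.cdf(1.00, loc=mu, scale=sigma)

# Proportion of reaction times exceeding 2 sec
p_exceed = 1 - norm.cdf(2.00, loc=mu, scale=sigma)

print(round(p_between, 4), round(p_exceed, 4))   # approximately .5675 and .0516
```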
Example 1.18   The amount of distilled water dispensed by a certain machine has a normal distribution
with μ = 64 oz and σ = .78 oz. What container size c will ensure that overflow
occurs only .5% of the time? Let x denote the amount of water dispensed. The density
curve for x is pictured in Figure 1.28, which shows that c captures a cumulative
area of .995 under this normal curve. That is, c is the 99.5th percentile of this normal
distribution. Standardizing then tells us that

    proportion of x values satisfying x < c = proportion of z values satisfying z < (c − 64)/.78
                                            = .995

How can we capture cumulative area .9950 under the z curve? The 2.5 row of
Appendix Table I has entries .9949 and .9951 in the .07 and .08 columns, respectively.
Let's use the value 2.58 (a more detailed tabulation gives 2.576). This implies that

    (c − 64)/.78 = 2.58

giving

    c = 64 + 2.58(.78) = 64 + 2.0 = 66 oz
Notice that the general form of the expression for c in Example 1.18 is

    c = μ + (z critical value)σ

where the z critical value captures the desired cumulative area under the z curve. Once we
know how to capture a particular cumulative area under the z curve, it is easy to determine
how to capture the same area under any other normal curve.
A histogram of sample data may suggest that a normal curve specifies a reasonable
population or process distribution, but appropriate values of μ and σ still remain to be
chosen. In Chapter 2, we begin to see how this can be done.
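The percentile calculation of Example 1.18 can be done in one step with software, since the percent-point function accepts μ and σ directly. A sketch (scipy assumed; not part of the original text):

```python
from scipy.stats import norm

# 99.5th percentile of the normal(64, .78) distribution: the container
# size c for which overflow occurs only .5% of the time
c = norm.ppf(0.995, loc=64, scale=0.78)
print(round(c, 1))   # approximately 66.0 oz

# Equivalent "c = mu + (z critical value) * sigma" form
z_crit = norm.ppf(0.995)   # the more detailed value, about 2.576
print(round(64 + z_crit * 0.78, 1))
```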
(Figure: normal approximation to a discrete distribution; shaded area = proportion of values, with continuity correction at 10.5)
1.4 Exercises

30. Suppose that values are repeatedly chosen from a standard normal distribution.
    a. In the long run, what proportion of values will be at most 2.15? Less than 2.15?
    b. What is the long-run proportion of selected values that will exceed 1.50? That will
       exceed −2.00?
    c. What is the long-run proportion of values that will be between −1.23 and 2.85?
    d. What is the long-run proportion of values that will exceed 5? That will exceed −5?
    e. In the long run, what proportion of selected values z will satisfy |z| < 2.50?

31. In the long run, what proportion of values selected from the standard normal
    distribution will satisfy each of the following conditions?
    a. Be at most 1.78        b. Exceed .55
    c. Exceed 2.80            d. Be between .21 and 1.21
    e. Be either at most −2.00 or at least 2.00
    f. Be at most −4.2        g. Be at least 4.33

32. a. What value z* is such that the area under the standard normal curve to the left
       of z* is .9082?
    b. What value z* is such that the area under the standard normal curve to the left
       of that value is .9080?
    c. What value z* is such that the area under the standard normal curve to the right
       of z* is .121?
    d. What value z* is such that the area under the standard normal curve between
       −z* and z* is .754?
    e. How far to the right of 0 would you have to go to capture an upper-tail z curve
       area of .002? How far to the left of 0 would you have to go to capture this same
       lower-tail area?

33. Suppose that values are successively chosen from the standard normal distribution.
    a. How large must a value be to be among the largest 15% of all values selected?
    b. How small must a value be to be among the smallest 25% of all values selected?
    c. What values are among the 4% that are farthest from 0?

34. Determine the following percentiles for the standard normal distribution:
    a. 91st    b. 9th    c. 22nd    d. 99.9th

35. Suppose that the thicknesses of bolts (mm) manufactured by a certain process can
    be modeled with a normal distribution having μ = 10 and σ = 1. Note: The density
    curve here is just the standard normal curve shifted to be centered at 10 rather than 0.
    a. What is the long-run proportion of bolts whose thicknesses are at most 11 mm?
       Hint: The corresponding normal curve area is identical to what z curve area?
    b. In the long run, what proportion of these bolts will have thickness values between
       7.5 mm and 12.5 mm?
    c. In the long run, what proportion of these bolts will have thicknesses that exceed
       11.5 mm?

36. Suppose the flow of current (milliamps) in wire strips of a certain type under
    specified conditions can be modeled with a normal distribution having μ = 20 and
    σ = 1 (think about how the corresponding density curve relates to the standard
    normal curve).
    a. What proportion of strips will have a current flow of between 18.5 and
       22 milliamps?
    b. What proportion of strips will have a current flow exceeding 15 milliamps?
    c. How large must a current flow be to be among the largest 5% of all flows?

37. Mopeds (small motorcycles with an engine capacity below 50 cm3) are popular in
    Europe because of their mobility, ease of operation, and low cost. The article
    "Procedure to Verify the Maximum Speed of Automatic Transmission Mopeds in
    Periodic Motor Vehicle Inspections" (J. of Automobile Engr., 2008: 1615–1623)
    described a rolling bench test for determining maximum vehicle speed. A normal
    distribution with μ = 46.8 km/h and σ = 1.75 km/h is postulated.
    a. What proportion of mopeds have a maximum speed that is at most 50 km/h?
    b. What proportion of mopeds have a maximum speed that is at least 48 km/h?
    c. What speed separates the fastest 75% of all mopeds from the others?

38. Spray drift is a constant concern for pesticide applicators and agricultural
    producers. The inverse relationship between droplet size and drift potential is well
    known. The paper "Effects of 2,4-D Formulation and Quinclorac on Spray Droplet
    Size and Deposition" (Weed Technology, 2005: 1030–1036) investigated the effects
    of herbicide formulation on spray atomization. A figure in the paper suggested the
    normal distribution with μ = 1050 μm and σ = 150 μm was a reasonable model for
    droplet size for water (the "control treatment") sprayed through a 760 ml/min nozzle.
    a. What proportion of all droplets have a size that is less than 1500 μm? At least
       1000 μm?
    b. What proportion of all droplets have a size that is between 1000 and 1500 μm?
    c. How would you characterize the smallest 2% of all droplets?

39. The article "Reliability of Domestic-Waste Biofilm Reactors" (J. of Envir. Engr.,
    1995: 785–790) suggests that substrate concentration (mg/cm3) of influent to a
    reactor is normally distributed with μ = .30 and σ = .06.
    a. What proportion of concentration values exceed .25?
    b. What proportion of concentration values are at most .10?
    c. How would you characterize the largest 5% of all concentration values?

40. Consider babies born in the "normal range" of 37–43 weeks gestational age.
    Extensive data supports the assumption that for such babies born in the United
    States, birth weight is normally distributed with μ = 3432 g and σ = 482 g. [The
    article "Are Babies Normal?" (The American Statistician, 1999: 298–302) analyzed
    data from a particular year; for a sensible choice of class intervals, a histogram did
    not look normal, but further investigation revealed that this was because some
    hospitals measured weight in grams and others measured to the nearest ounce and
    then converted the data to grams. A modified choice of class intervals that allowed
    for this gave a histogram that was well described by a normal distribution.]
    a. For babies of this type, what proportion of all birth weights exceeds 4000 g?
    b. For babies of this type, what proportion of all birth weights is between 3000 and
       4000 g?
    c. How would you characterize the highest .1% of all birth weights?
    d. What value c is such that the interval (3432 − c, 3432 + c) includes 98% of all
       birth weights?

41. Let x denote the number of flaws along a 100-m reel of magnetic tape (values of x
    are whole numbers). Suppose x has approximately a normal distribution with
    μ = 25 and σ = 5.
    a. What proportion of reels will have between 20 and 40 flaws, inclusive?
    b. What proportion of reels will have at most 30 flaws? Fewer than 30 flaws?

42. Based on extensive data from an urban freeway near Toronto, Canada, "it is
    assumed that free speeds can best be represented by a normal distribution"
    ("Impact of Driver Compliance on the Safety and Operational Impacts of Freeway
    Variable Speed Limit Systems" (J. of Transp. Engr., 2011: 260–268)). The values of
    μ and σ reported in the article were 119 km/h and 13.1 km/h, respectively.
    a. What percentage of vehicles have speeds that are between 100 and 120 km/h?
    b. What speed characterizes the fastest 10% of all speeds?
    c. The posted speed limit was 100 km/h. What percentage of vehicles were
       traveling at speeds exceeding this posted limit?
    d. What two values, symmetrically placed about 119, capture 90% of all vehicle
       speeds?
    e. What values symmetrically placed about 119 separate .1% of the most extreme
       vehicle speeds from the rest?
1.5 Other Continuous Distributions
Figure 1.30 illustrates density curves for several different combinations of μ and σ.
Every lognormal distribution is positively skewed. The following example shows that by
taking logarithms, calculation of any lognormal curve area reduces to a normal
distribution computation.
Figure 1.30 Lognormal density curves for (μ = 1, σ = 1), (μ = 3, σ = √3), and (μ = 3, σ = 1)
Example 1.19   According to the article "Predictive Model for Pitting Corrosion in Buried Oil and
Gas Pipelines" (Corrosion, 2009: 332–342), the lognormal distribution has been
reported as the best option for describing the distribution of maximum pit depth data
from cast iron pipes in soil. The authors suggest that a lognormal distribution with
μ = .353 and σ = .754 is appropriate for maximum pit depth (mm) of buried pipelines.
Since x < 2 is equivalent to ln(x) < ln(2) = .693,

    proportion of pipelines with x < 2 = proportion of pipelines with ln(x) < .693
                                       = area under normal (.353, .754) curve to the left of .693
                                       = area under z curve to the left of (.693 − .353)/.754
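The log-transformation trick of Example 1.19 can be checked numerically. Note that scipy parameterizes the lognormal with shape `s` = σ and `scale` = e^μ; the sketch below (not part of the original text) confirms that the direct lognormal area equals the z curve area:

```python
import math
from scipy.stats import lognorm, norm

mu, sigma = 0.353, 0.754   # lognormal parameters for maximum pit depth (mm)

# Direct lognormal area to the left of 2
p_direct = lognorm.cdf(2, sigma, scale=math.exp(mu))

# Same area via the z curve, as in the text: z = (ln(2) - mu) / sigma
p_via_z = norm.cdf((math.log(2) - mu) / sigma)

print(round(p_direct, 4), round(p_via_z, 4))   # both approximately .674
```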
definition   A variable x has a Weibull distribution with parameters α and β if the density
function of x is

    f(x) = (α/β^α)·x^(α−1)·e^(−(x/β)^α)    for x > 0
    f(x) = 0                               for x ≤ 0

When α = 1, the Weibull density function reduces to the exponential density function
(with λ = 1/β). Figure 1.31 shows several Weibull density curves. Some combinations
of α and β result in a positive skew and others, a negative skew.
Figure 1.31 Weibull density curves: left panel, (α = 1, β = 1) (exponential), (α = 2, β = 1), and (α = 2, β = .5); right panel, (α = 10, β = .5), (α = 10, β = 1), and (α = 10, β = 2)
Thus, rather than needing a table of cumulative areas, such as the z table for normal dis-
tribution calculations, we use a simple mathematical function to get this information.
Example 1.20   In recent years the Weibull distribution has been used to model engine emissions
of various pollutants. Let x denote the amount of NOx emission (g/gal) from a certain
type of four-stroke engine, and suppose that x has a Weibull distribution with
α = 2 and β = 10 (suggested by information in the article "Quantification of Variability
and Uncertainty in Lawn and Garden Equipment NOx and Total Hydrocarbon
Emission Factors," J. of the Air and Waste Management Assoc., 2002: 435–448).
The corresponding density curve looks exactly like the one in Figure 1.31 for α = 2,
β = 1 except that now the values 50 and 100 replace 5 and 10 on the horizontal axis
(because β is a "scale parameter"). The proportion of engines emitting at most
x g/gal is then 1 − e^(−(x/10)^2).
The proportion of engines emitting at most 25 g/gal is .998, so the distribution is almost
entirely concentrated on values between 0 and 25. The value c which separates the 5% of
all engines having the largest amounts of NOx emissions from the remaining 95% satisfies

    .95 = 1 − e^(−(c/10)^2)

Isolating the exponential term on one side, taking logarithms, and solving the
resulting equation gives c ≈ 17.3 as the 95th percentile of the emissions distribution.
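The Weibull cumulative area and percentile of Example 1.20 can be sketched in Python with scipy's `weibull_min` (shape = α, `scale` = β; the code is illustrative, not part of the original text):

```python
from scipy.stats import weibull_min

alpha, beta = 2, 10   # Weibull shape and scale for NOx emissions (g/gal)

# Proportion of engines emitting at most 25 g/gal: 1 - exp(-(25/10)^2)
p25 = weibull_min.cdf(25, alpha, scale=beta)
print(round(p25, 3))   # approximately .998

# 95th percentile: solves .95 = 1 - exp(-(c/10)^2)
c = weibull_min.ppf(0.95, alpha, scale=beta)
print(round(c, 1))     # approximately 17.3
```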
1.5 Exercises

43. A theoretical justification based on a certain material failure mechanism underlies
    the assumption that ductile strength of a material has a lognormal distribution.
    Suppose the values of the parameters are μ = 5 and σ = .1.
    a. What proportion of material specimens have a ductile strength exceeding 120?
       What proportion have a ductile strength of at least 120?
    b. What proportion of material specimens have a ductile strength between 110
       and 130?
    c. If the smallest 5% of strength values were unacceptable, what would be the
       minimum acceptable strength?

44. Nonpoint source loads are chemical masses that travel to the main stem of a river
    and its tributaries in flows that are distributed over relatively long stream reaches
    in contrast to those that enter at well-defined and regulated points. The article
    "Assessing Uncertainty in Mass Balance Calculation of River Nonpoint Source
    Loads" (J. of Envir. Engr., 2008: 247–258) suggested that for a certain time period
    and location, x = nonpoint source load of total dissolved solids (in kg/day/km)
    could be modeled with a lognormal distribution having μ = 9.164 and σ = .385.
    a. What proportion of source loads are at most 15,000 kg/day/km?
    b. What interval (a, b) is such that 95% of all source loads have values in this
       interval, 2.5% have values less than a, and 2.5% have values exceeding b?

45. The article "Response of SiGf/Si3N4 Composites Under Static and Cyclic
    Loading—An Experimental and Statistical Analysis" (J. of Engr. Materials and
    Technology, 1997: 186–193) suggests that tensile strength (MPa) of composites
    under specified conditions can be modeled by a Weibull distribution with α = 9
    and β = 180.
    a. Sketch a graph of the density function.
    b. What proportion of specimens of this type have strength values exceeding 175?
    c. What proportion of specimens of this type have strength values between 150
       and 175?
    d. What strength value separates the weakest 10% of all specimens from the
       remaining 90%?
46. Suppose that fracture strength (MPa) of silicon nitride braze joints under certain
    conditions has a Weibull distribution with α = 5 and β = 125 (suggested by data in
    the article "Heat-Resistant Active Brazing of Silicon Nitride: Mechanical
    Evaluation of Braze Joints," Welding J., August 1997: 300s–304s).
    a. What proportion of such joints have a fracture strength of at most 100? Between
       100 and 150?
    b. What strength value separates the weakest 50% of all joints from the strongest 50%?
    c. What strength value characterizes the weakest 5% of all joints?

47. The Weibull distribution discussed in this section has a positive density function for
    all x > 0. In some situations, the smallest possible value of x will be some number γ
    that exceeds zero. A shifted Weibull distribution, appropriate in such situations, has
    a density function for x > γ obtained by replacing x with x − γ in the earlier density
    function formula. The article "Predictive Posterior Distributions from a Bayesian
    Version of a Slash Pine Yield Model" (Forest Science, 1996: 456–463) suggests that
    the values γ = 1.3 cm, α = 4, and β = 5.8 specify an appropriate distribution for
    diameters of trees in a particular location.
    a. What proportion of trees have diameters between 2 and 4 cm?
    b. What proportion of trees have diameters that are at least 5 cm?
    c. What is the median diameter of trees, that is, the value separating the smallest
       50% from the largest 50% of all diameters?

48. The paper "Study on the Life Distribution of Microdrills" (J. of Engr. Manufacture,
    2002: 301–305) reported the following observations, listed in increasing order, on
    drill lifetime (number of holes that a drill machines before it breaks) when holes
    were drilled in a certain brass alloy.

        11  14  20  23  31  36  39  44
        47  50  59  61  65  67  68  71
        74  76  78  79  81  84  85  91
        93  96  99  101 104 105 105 112
        118 123 136 139 141 148 158 161
        168 184 206 248 263 289 322 388
        513

    a. Construct a histogram of the data using class boundaries 0, 50, 100, . . . , and
       then comment on interesting characteristics.
    b. Construct a histogram of the natural logarithms of the lifetime observations,
       and comment on interesting characteristics.

49. The authors of the paper from which the data in the previous exercise was extracted
    suggested that a reasonable probability model for drill lifetime was a lognormal
    distribution with μ = 4.5 and σ = .8.
    a. What proportion of lifetime values are at most 100?
    b. What proportion of lifetime values are at least 200? Greater than 200?

50. The article cited in Example 1.20 proposed the lognormal distribution with μ = 4.5
    and σ = .625 as a model for total hydrocarbon emissions (g/gal).
    a. What proportion of engines emit at least 50 g/gal? Between 50 and 150 g/gal?
    b. What value c separates the best 1% of engines with respect to THC emissions
       from the remaining 99%?

51. The article "On Assessing the Accuracy of Offshore Wind Turbine Reliability-Based
    Design Loads from the Environmental Contour Method" (Intl. J. of Offshore and
    Polar Engr., 2005: 132–140) proposes the Weibull distribution with α = 1.817 and
    β = .863 as a model for 1-hour significant wave height (m) at a certain site.
    a. What proportion of wave heights are at most 0.5 m?
    b. What proportion of wave heights are between 0.2 and 0.6 m?
    c. What is the 90th percentile of the wave height distribution? The 10th percentile?
1.6 Several Useful Discrete Distributions
values that equal 1, and so on. We now introduce the two discrete distributions
that appear most frequently in statistical applications: the binomial and the Poisson
distributions.
The population or long-run process proportion of packages having x = 2 is then the sum
of these six values of .0256, or 6(.0256) = .1536. Similarly, there are four possibilities for
x = 1—the single satisfactory cartridge could be the first, second, third, or fourth one in
the package. The proportion of SFFF's is (.8)(.2)(.2)(.2) = .0064, which is also the
proportion of FSFF's, FFSF's, and FFFS's. Adding .0064 four times gives .0256 as the
proportion of packages with x = 1.

    p(x) = [n!/(x!(n − x)!)]·π^x·(1 − π)^(n−x)        x = 0, 1, . . . , n
In the case of a population, the formula gives good approximations as long as the total
number of items examined in all batches is at most 5% of the population size (answers
are exact if the population size is infinite). For a process, it is required that the value of π
remain constant over time (a stable process).
In the mass function formula, π^x(1 − π)^(n−x) generalizes the multiplications
(.8)(.2)^3 and (.8)^2(.2)^2 in the pen cartridge example. The factorial expression is the
number of possible outcomes for a batch of size n that have x S's. For example, when
n = 4 and x = 2,

    n!/(x!(n − x)!) = 4!/((2!)(2!)) = (4)(3)(2)(1)/((2)(1)(2)(1)) = (4)(3)/((2)(1)) = 6

as we saw previously. You can find a derivation of this formula in several of the
references listed in the bibliography.
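The counting factor and the resulting proportion can be checked with a few lines of Python (standard library only; this sketch is not part of the original text):

```python
from math import comb

# Number of orderings of a batch of n = 4 with x = 2 S's: n!/(x!(n-x)!)
print(comb(4, 2))   # 6

# Proportion of packages with x = 2 satisfactory cartridges (pi = .8):
# 6 orderings, each occurring with proportion (.8)^2 * (.2)^2 = .0256
p2 = comb(4, 2) * 0.8**2 * 0.2**2
print(round(p2, 4))   # .1536
```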
Example 1.21   The binomial distribution is used extensively in genetic applications. An early
genetics article ("The Progeny in Generation F12 to F17 of a Cross Between a
Yellow-Wrinkled and a Green-Round Seeded Pea," J. of Genetics, 1923: 255–331)
reported on an experiment in which four-seeded pea pods from a dihybrid cross
were examined. The variable of interest was x = the number of YR (yellow and
round) peas in a pod. Mendelian laws of inheritance imply that π = 9/16 = .5625
[from (3/4)(3/4)]. Now consider peas with eight-seeded pods. The proportion of all
pods with five YR peas is

    (proportion with x = 5) = [8!/((5!)(3!))](.5625)^5(.4375)^3
                            = 56(.5625)^5(.4375)^3 = .2641

In the long run, slightly more than 50% of all pods will have five or more YR peas
and slightly less than 50% will have four or fewer YR peas. The complete distribution
of x is as follows:

    x:    0     1     2     3     4     5     6     7     8
    p(x): .0013 .0138 .0621 .1598 .2567 .2641 .1698 .0624 .0100
Figure 1.32 shows a picture of this distribution. The binomial histogram has a slight
negative skew (it is symmetric only when π = .5).
Figure 1.32 The binomial distribution with n = 8, π = .5625
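The pea-pod proportions of Example 1.21 can be reproduced with scipy's binomial mass function (a sketch, assuming scipy is available; not part of the original text):

```python
from scipy.stats import binom

n, pi = 8, 0.5625   # eight-seeded pods, pi = 9/16

# Proportion of pods with exactly five YR peas
p5 = binom.pmf(5, n, pi)
print(round(p5, 4))   # approximately .2641

# Proportion with five or more YR peas (slightly more than 50%)
p_atleast5 = 1 - binom.cdf(4, n, pi)
print(round(p_atleast5, 4))
```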
Use of the binomial distribution formula can be tedious when n is large. Appendix
Table II gives a tabulation of p(x) for a few selected values of n and π. This will allow
you to practice binomial calculations without referring to the formula. Alternatively,
values of p(x) for any n and π can be obtained from Minitab and other statistical
computer packages.
A variable x has a Poisson distribution with parameter λ (λ > 0) if its mass function is
p(x) = e^(−λ)·λ^x/x! for x = 0, 1, 2, . . . .
The condition p(x) ≥ 0 is clearly satisfied. The fact that Σ p(x) = 1 (summing over
x = 0, 1, 2, . . .) is a consequence of multiplying both sides of the following infinite
series expansion by e^(−λ):

    e^λ = 1 + λ + λ²/2! + λ³/3! + · · ·

We shall see in Chapter 2 that λ can be interpreted as the average rate at which events
occur.
Example 1.22   Let x denote the number of creatures of a particular type captured in a trap during
a given time period. Suppose that x has a Poisson distribution with λ = 4.5, so, on
average, traps will contain 4.5 creatures. [The article "Dispersal Dynamics of the
Bivalve Gemma gemma in a Patchy Environment" (Ecological Monographs, 1995:
1–20) suggests this model; the bivalve Gemma gemma is a small clam.] The proportion
of traps with five creatures is

    (proportion with x = 5) = e^(−4.5)(4.5)^5/5! = .1708
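The Poisson computation of Example 1.22 takes one line with scipy, or can be done from the mass function directly (a sketch, not part of the original text):

```python
import math
from scipy.stats import poisson

lam = 4.5   # average number of creatures per trap

# Proportion of traps with exactly five creatures: e^(-4.5) * 4.5^5 / 5!
p5 = poisson.pmf(5, lam)
print(round(p5, 4))   # approximately .1708

# Same value from the mass function written out
print(round(math.exp(-lam) * lam**5 / math.factorial(5), 4))
```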
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
1.6 Several Useful Discrete Distributions

[Figure: Poisson mass function with λ = 4.5 — proportion on the vertical axis (0 to .20) versus x = 0, 2, 4, …, 12]

A small tabulation of the Poisson mass function for selected values of λ appears in Appendix Table III.
    [n!/(x!(n − x)!)] π^x (1 − π)^(n−x) ≈ e^(−λ)λ^x/x!   where λ = nπ

A more formal statement of this result is that the Poisson mass function on the right-hand side is the limit of the binomial mass function on the left as n → ∞ and π → 0 in such a way that nπ → λ.
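The limit statement can be watched happening numerically: hold λ = nπ fixed and let n grow while π shrinks. A Python sketch with the illustrative values λ = 2 and x = 3 (our choices, not the text's):

```python
from math import comb, exp, factorial

def binom_pmf(x, n, pi):
    """Binomial mass function: C(n, x) * pi^x * (1 - pi)^(n - x)."""
    return comb(n, x) * pi ** x * (1 - pi) ** (n - x)

def poisson_pmf(x, lam):
    return exp(-lam) * lam ** x / factorial(x)

lam, x = 2.0, 3
for n in (10, 100, 1000, 10000):
    pi = lam / n  # n grows, pi shrinks, n * pi held fixed at lam
    print(n, round(binom_pmf(x, n, pi), 6))
print("Poisson limit:", round(poisson_pmf(x, lam), 6))
```

The binomial values settle onto the Poisson value as n increases, exactly as the limit statement promises.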
Example 1.23  Components of a certain type are shipped from a supplier to customers in lots of 5000. Because the purchaser cannot check the condition of each component, a sample of 100 is selected and tested. The entire lot will then be accepted only if the number of components x that do not conform to specification is at most three (so here S's are nonconforming units, not what we usually think of as a success). Suppose that .5% of all components are nonconforming, giving λ = 100(.005) = .5. Then the proportion of acceptable lots is
    proportion of lots with x ≤ 3
      = p(0) + p(1) + p(2) + p(3)
      = [100!/(0!100!)](.005)⁰(.995)¹⁰⁰ + … + [100!/(3!97!)](.005)³(.995)⁹⁷
      ≈ e^(−.5)(.5)⁰/0! + … + e^(−.5)(.5)³/3!
      = .6065 + .3033 + .0758 + .0126
      = .9982
The exact proportion using the binomial mass function is .6058 + .3044 + .0757 + .0124 = .9983.
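Both the exact binomial calculation and the Poisson approximation are easy to reproduce; in Python (our sketch, using the same n, π, and λ as the example):

```python
from math import comb, exp, factorial

n, pi = 100, 0.005
lam = n * pi  # 0.5

exact = sum(comb(n, x) * pi ** x * (1 - pi) ** (n - x) for x in range(4))
approx = sum(exp(-lam) * lam ** x / factorial(x) for x in range(4))
print(round(exact, 4))   # 0.9983  (binomial)
print(round(approx, 4))  # 0.9982  (Poisson approximation)
```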
1.6 Exercises
whether any particular signal is red is independent of whether any other one is red.
a. On what proportion of days will our friend encounter at most two red lights? At least five red lights?
b. On what proportion of days will our friend encounter between two and five (inclusive) red lights?

55. Suppose that 10% of all bits transmitted through a digital communication channel are erroneously received and that whether any particular bit is erroneously received is independent of whether any other bit is erroneously received. Consider sending a very large number of messages, each consisting of 20 bits.
a. What proportion of these messages will have at most 2 erroneously received bits?
b. What proportion of these messages will have at least 5 erroneously received bits?
c. For what proportion of these messages will more than half the bits be erroneously received?

56. Components arrive at a distributor in very large batches. A batch can be characterized as acceptable only if the fraction of defective components in the batch is at most .10. The distributor decides to randomly select ten components from the batch, test each one, and accept the batch only if the sample contains at most two defective components. Assume that the condition of any particular component is independent of any other.
a. If the actual fraction of defectives in each batch is only π = .01, what proportion of batches will be accepted? Repeat this calculation for the following values of π: .05, .10, .20, and .25.
b. A graph of the proportion of batches accepted versus the actual fraction of defectives π is called the operating characteristic curve. Use the results of part (a) to sketch this curve for 0 ≤ π ≤ 1 (proportion of batches accepted is on the vertical axis and π is on the horizontal axis).
c. Suppose the distributor decides to be more demanding by accepting a batch only if the sample contains at most one defective component. Repeat parts (a) and (b) with this new acceptance sampling plan. Does this plan appear more satisfactory than the original plan?

57. Suppose that the number of drivers who travel between a particular origin and destination during a designated time period has a Poisson distribution with parameter λ = 20 (suggested in the article "Dynamic Ride Sharing: Theory and Practice," J. of Transp. Engr., 1997: 308–312). In the long run, in what proportion of time periods will the number of drivers
a. Be at most 10?
b. Exceed 20?
c. Be between 10 and 20, inclusive? Be strictly between 10 and 20?

58. Let x be the number of material anomalies occurring in a particular region of an aircraft gas-turbine disk. The article "Methodology for Probabilistic Life Prediction of Multiple-Anomaly Materials" (Amer. Inst. of Aeronautics and Astronautics J., 2006: 787–793) proposes a Poisson distribution for x. Suppose that λ = 4.
a. What proportion of gas-turbine disks have exactly one anomaly?
b. What proportion of gas-turbine disks have at least three anomalies?
c. What proportion of gas-turbine disks have between one and six anomalies inclusive?

59. Let x denote the number of trees in a quarter-acre plot within a certain forest. Suppose that x has a Poisson distribution with λ = 20 (corresponding to an average density of 80 trees per acre). In what proportion of such plots will there be at least 15 trees? At most 25 trees?

60. An article in the Los Angeles Times (Dec. 3, 1993) reports that 1 in 200 people carry the defective gene that causes colon cancer. Let x denote the number of people in a group of size 1000 who carry this defective gene. What is the approximate distribution of x? Use this approximate distribution to determine the proportion of all such groups having at least 8 people who carry the defective gene, as well as the proportion of all such groups for which between 5 and 10 people (inclusive) carry the defective gene.
Supplementary Exercises
61. The accompanying frequency distribution of fracture strength (MPa) observations for ceramic bars fired in a particular kiln appeared in the article "Evaluating Tunnel Kiln Performance" (Amer. Ceramic Soc. Bull., August 1997: 59–63).

Class:     81–<83  83–<85  85–<87  87–<89  89–<91  91–<93  93–<95  95–<97  97–<99
Frequency:    6       7      17      30      43      28      22      13       3

a. Construct a histogram based on relative frequencies, and comment on any interesting features.
b. What proportion of the strength observations are at least 85? Less than 95?
c. Roughly what proportion of the observations are less than 90?

62. The article cited in Exercise 61 presented compelling evidence for assuming that fracture strength (MPa) of ceramic bars fired in a particular kiln is normally distributed (while commenting that the Weibull distribution is traditionally used as a model). Suppose that μ = 90 and σ = 3.75, which is consistent with data given in the article.
a. In the long run, what proportion of bars would have strength values less than 90? Less than 95? At least 95?
b. In the long run, what proportion of bars would have strength values between 85 and 95? Between 80 and 100?
c. What value is exceeded by 90% of the fracture strengths for all such bars?
d. What interval centered at 90 includes 99% of all fracture strength values?

63. Once an individual has been infected with a certain disease, let x represent the time (days) that elapses before the individual becomes infectious. The article "The Probability of Containment for Multitype Branching Process Models for Emerging Epidemics" (J. of Applied Probability, 2011: 173–188) proposes a Weibull distribution with α = 2.2 and β = 1.1 for x − .5 (i.e., the Weibull density curve is shifted to the right of 0 by .5; Minitab refers to .5 as the value of the threshold parameter).
a. What proportion of elapsed times exceed 1.5 days?
b. What is the 90th percentile of the elapsed time distribution?

64. Let x denote the distance (m) that an animal moves from its birth site to the first territorial vacancy it encounters. Suppose that for banner-tailed kangaroo rats, x has an exponential distribution with parameter λ = .01386 (as suggested in the article "Competition and Dispersal from Multiple Nests," Ecology, 1997: 873–883).
a. What proportion of distances are at most 100 m? At most 200 m? Between 100 m and 200 m?
b. What proportion of distances are at least 50 m?
c. What is the median distance, that is, the value that separates the smallest 50% of all distances from the largest 50%?

65. Suppose the unloading time x (centiminutes) of a forwarder in a harvesting operation could be assumed to be lognormal with μ = 6.5 and σ = .75, as suggested in the article "Simulating a Harvester-Forwarder Softwood Thinning" (Forest Products J., May 1997: 36–41).
a. What proportion of unloading times exceed 1000? 2000? 3000?
b. What proportion of times are between 2500 and 5000?
c. What value characterizes the fastest 10% of all times?
d. Sketch a graph of the density function of x. Is the positive skewness quite pronounced?

66. In an experiment, 25 laminated glass units configured in a particular way are subjected to an impact test (cf. "Performance of Laminated Glass Units Under Simulated Windborne Debris Impacts," J. of Architectural Engr., 1996: 95–99). We are interested in the number of units that sustain an inner glass ply fracture. Suppose that the long-run proportion of all such units that fracture is .20. In the long run, for what proportion of such experiments will the number of fractures be
a. At least 10?
b. At most 5?
c. Between 5 and 10 inclusive?
d. Strictly between 5 and 10?
67. Airlines frequently overbook flights. Suppose that for a plane with 100 seats, an airline takes 110 reservations. Let x represent the number of people with reservations who actually show up for a sold-out flight. From past experience, we know that the distribution of x is as follows:

x:      95    96    97    98    99    100   101   102   103
p(x):  .05   .10   .12   .14   .24   .17   .06   .04   .03

x:     104   105   106    107    108    109     110
p(x):  .02   .01   .005   .005   .005   .0037   .0013

a. For what proportion of such flights is the airline able to accommodate everyone who shows up for the flight?
b. For what proportion of all such flights is it not possible to accommodate all passengers?
c. For someone who is trying to get a seat on such a flight and is number 1 on the standby list, what proportion of the time is such an individual able to take the flight? Answer the ques-

68.
a. Construct a stem-and-leaf display of the data.
b. What is a typical or representative flow value? Does the data appear to be highly concentrated or quite spread out about this typical value?
c. Does the distribution of values appear to be reasonably symmetric? If not, how would you describe the departure from symmetry?
d. Does the data set appear to contain any outliers?
e. Construct a histogram using class boundaries 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, and 20. From your histogram, approximately what proportion of the observations are at most 11? Compare this with the exact proportion that are at most 11.

69. Let x denote the vibratory stress (psi) on a wind turbine blade at a particular wind speed in a wind tunnel. The article "Blade Fatigue Life Assessment with Applications to VAWTS" (J. of Solar Energy Engr., 1982: 107–111) proposes the Rayleigh distribution as a model; the density function is

    f(x) = (x/β²) · e^(−x²/(2β²))   x > 0
one suggested in "The Statistical Properties of Freeway Traffic" (Transportation Research, 1977: 221–228):

    f(x) = .15e^(−.15(x−.5))   x > .5
    f(x) = 0                    otherwise

a. Sketch the corresponding density curve, and verify that f(x) is a legitimate density function.
b. What proportion of time headways are at most 5 sec? Between 5 and 10 sec?
c. What value separates the smallest 50% of all time headways from the largest 50%?
d. What value characterizes the largest 10% of all time headways?

72. A k-out-of-n system is one that will function if and only if at least k out of the n individual components in the system function. If individual components function independently of one another and the long-run proportion of components that function is .9, what is the long-run proportion of 3-out-of-5 systems that will function?

73. An insurance company offers its policyholders a number of different premium payment options. Let x denote the number of months between successive payments chosen by a policyholder. For any particular number k, the proportion of x values that are at most k (i.e., ≤ k) is called a cumulative proportion. Consider the following cumulative proportions: 0 for x < 1, .30 for 1 ≤ x < 3, .40 for 3 ≤ x < 4, .45 for 4 ≤ x < 6, .60 for 6 ≤ x < 12, and 1 for x ≥ 12.
a. Graph this cumulative proportion function, that is, graph (proportion of x values ≤ k) versus k.
b. Determine the mass function of x. Hint: The cumulative proportion function jumps only at possible values of x.
c. Use the cumulative proportion function to determine the proportion of all policyholders for which 3 ≤ x ≤ 6, and check to see that the mass function gives this same proportion.

74. Based on data from a dart-throwing experiment, the article "Shooting Darts" (Chance, Summer 1997, 16–19) proposed that the horizontal and vertical errors from aiming at a point target should be independent of one another, each with a normal distribution having parameters μ = 0 and σ. It can then be shown that the density function of the distance v from the target to the landing point is

    f(v) = (v/σ²) · e^(−v²/(2σ²))   v > 0

a. This pdf is a member of what family introduced in this chapter?
b. If σ = 20 mm (close to the value suggested in the paper), what proportion of darts will land within 25 mm (roughly 1 in.) of the target?

75. The bursting strength of wine bottles of a certain type is normally distributed with parameters μ = 250 psi and σ = 30 psi. If these bottles are shipped 12 to a carton, in what proportion of cartons will at least one of the bottles have a bursting strength exceeding 300 psi? Hint: Think of a bottle as a success S if its bursting strength exceeds 300 psi.
Bibliography

Chambers, John, William Cleveland, Beat Kleiner, and Paul Tukey, Graphical Methods for Data Analysis, Wadsworth, Belmont, CA, 1983. A very readable source for information on constructing histograms, checking the plausibility of various distributions, and other visual techniques.

Cleveland, William, The Elements of Graphing Data (2nd ed.), Hobart Press, Summit, NJ, 1994. An informal and informative introduction to various aspects of graphical analysis.

Johnson, Norman, Samuel Kotz, and Adrienne Kemp, Univariate Discrete Distributions (2nd ed.), Wiley, New York, 1992. A veritable encyclopedia of information on discrete distributions.

Johnson, Norman, Samuel Kotz, and N. Balakrishnan, Continuous Univariate Distributions (vol. 1, 2nd ed.), Wiley, New York, 1994. An encyclopedic reference for continuous distributions.

Olkin, Ingram, Cyrus Derman, and Leon Gleser, Probability Models and Applications (2nd ed.), Macmillan, New York, 1994. Contains in-depth discussions of both general properties of discrete and continuous distributions and results for specific distributions.
2
Numerical Summary Measures

2.1 Measures of Center
2.2 Measures of Variability
2.3 More Detailed Summary Quantities
2.4 Quantile Plots
Introduction
In Chapter 1, we learned how to describe sample data using either a stem-
and-leaf display or a histogram. We then saw how a density function or mass
function could be used to represent the distribution of a variable in an entire
population or process. Often an investigator will want to obtain or convey in-
formation about particular characteristics of data. In this chapter, we first in-
troduce several numerical summary measures that describe where a sample or
distribution is centered. Another important aspect of a sample or distribution
is the extent of spread about the center. In Section 2.2, we develop the most
useful measures of variability. In Section 2.3, we consider more detailed data
summaries and how they can be combined to yield concise yet informative data
descriptions. Once sample data has been obtained, it is often important to know
whether it is plausible that the data came from a particular type of distribution,
such as a normal distribution or a Weibull distribution. In Section 2.4, we show
how to construct a picture from which the plausibility of any particular type of
underlying distribution can be judged.
2.1 Measures of Center
    x̄ = (x₁ + x₂ + ⋯ + xₙ)/n = (Σ_{i=1}^{n} xᵢ)/n

The numerator of x̄ can be written more informally as Σxᵢ, where the summation is over all sample observations.

For reporting x̄, we recommend using decimal accuracy of one digit more than the accuracy of the xᵢ's. Thus if observations are stopping distances with x₁ = 125, x₂ = 131, and so on, we might have x̄ = 127.3 ft.
Example 2.1 In recent years there has been growing commercial interest in the use of what is
known as internally cured concrete. This concrete contains porous inclusions most
commonly in the form of lightweight aggregate (LWA). In the article “Characterizing
Lightweight Aggregate Desorption at High Relative Humidities Using a Pressure Plate
Apparatus” (J. of Materials in Civil Engr., 2012: 961–969), researchers examined
various physical properties of 14 LWA specimens. The following are the 24-hour
water absorption percentages for the 14 specimens:
x₁ = 16.0   x₂ = 30.5   x₃ = 17.7   x₄ = 17.5   x₅ = 14.1
x₆ = 10.0   x₇ = 15.6   x₈ = 15.0   x₉ = 19.1   x₁₀ = 17.9
x₁₁ = 18.9  x₁₂ = 18.5  x₁₃ = 12.2  x₁₄ = 6.0
Figure 2.1 shows a stem-and-leaf display of the data (the tenths digit is truncated); a water absorption percentage in the midteens appears to be "typical." With Σxᵢ = 229.0, the sample mean is x̄ = 229.0/14 = 16.36, a value consistent with information conveyed by the stem-and-leaf display.
The mean suffers from one deficiency that makes it an inappropriate measure of center under some circumstances: Its value can be greatly affected by the presence of even a single outlier (unusually large or small observation). In Example 2.1, the value x₂ = 30.5 is obviously an outlier. Without this observation, x̄ = 15.27; the outlier increases the mean by more than 1%. If the 30.5 observation were replaced by the relatively large value 90.0, a really extreme outlier, then x̄ = 288.5/14 = 20.61, which is larger than any of the other observations!
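The sensitivity of x̄ to a single observation is easy to demonstrate; here is a short Python check using the absorption data of Example 2.1 (Python is our illustration; the book itself leans on Minitab):

```python
absorption = [16.0, 30.5, 17.7, 17.5, 14.1, 10.0, 15.6,
              15.0, 19.1, 17.9, 18.9, 18.5, 12.2, 6.0]

print(round(sum(absorption) / len(absorption), 2))  # 16.36

# Dropping the outlier 30.5 pulls the mean down noticeably.
no_outlier = [x for x in absorption if x != 30.5]
print(round(sum(no_outlier) / len(no_outlier), 2))  # 15.27

# Replacing it by the extreme value 90.0 drags the mean above
# every other observation.
extreme = [90.0 if x == 30.5 else x for x in absorption]
print(round(sum(extreme) / len(extreme), 2))  # 20.61
```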
The Sample Median
An alternative measure of center that resists the effects of outliers is the median. The median strip of a roadway divides the roadway into two equal parts, and the sample median does the same for the sample. If, for example, n = 5 and the observations are ordered from smallest to largest, the third observation from either end is the median. When n = 6, though, there are two middle values in the ordered list; the median is the average of these two values.
definition  The sample median, denoted by x̃, is obtained by first ordering the sample observations from smallest to largest. Then

    x̃ = the single middle value, the [(n + 1)/2]th from either end, if n is odd
    x̃ = the average of the (n/2)th and (n/2 + 1)th values, if n is even
Example 2.2 People not familiar with classical music might tend to believe that a composer’s in-
structions for playing a particular piece are so specific that the duration would not
depend at all on the performer(s). However, there is typically plenty of room for
interpretation, and orchestral conductors and musicians take full advantage of this.
We went to the website ArkivMusic.com and selected a sample of 12 recordings of
Beethoven’s stunningly beautiful Symphony No. 9 (the “Chorale”), and found the
following durations (min) listed in increasing order:
62.3 62.8 63.6 65.2 65.7 66.4 67.4 68.4 68.8 70.8 75.7 79.0
Figure 2.2 is a dotplot of the data: [Figure 2.2: dotplot of the 12 durations, horizontal axis labeled Duration, running from 60 to 80 min]
Since n = 12 is even, the sample median is the average of the (n/2) = sixth and (n/2 + 1) = seventh values from the ordered list:

    x̃ = (66.4 + 67.4)/2 = 66.90
Note that if the largest observation 79.0 had not been included in the sample, then the resulting sample median for the n = 11 remaining observations would have been the single middle value 66.4 [the (n + 1)/2 = sixth ordered value, i.e., the sixth value in from either end of the ordered list]. The sample mean is x̄ = (Σxᵢ)/n = 816.1/12 = 68.01, a bit more than a full minute larger than the median. The mean is pulled out a bit relative to the median because the sample "stretches out" somewhat more on the upper end than on the lower end.
The largest observation or even the largest two or three observations in Example 2.2 can be increased by an arbitrary amount without impacting x̃. Similarly, decreasing several of the smallest observations by any amount does not affect the median. In contrast to x̄, the median is impervious to many outliers.
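The two cases of the definition translate directly into code; here is a minimal Python version applied to the duration data of Example 2.2 (our sketch):

```python
def sample_median(data):
    """Middle value if n is odd; average of the two middle values if n is even."""
    s = sorted(data)
    n = len(s)
    if n % 2 == 1:
        return s[n // 2]                    # the ((n + 1)/2)th ordered value
    return (s[n // 2 - 1] + s[n // 2]) / 2  # average of the two middle values

durations = [62.3, 62.8, 63.6, 65.2, 65.7, 66.4,
             67.4, 68.4, 68.8, 70.8, 75.7, 79.0]

print(round(sample_median(durations), 2))  # 66.9  (n = 12, even)
# With the largest observation removed, n = 11 and the median is the
# single middle (sixth) ordered value.
print(sample_median(durations[:-1]))       # 66.4
```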
Trimmed Means
A trimmed mean is a compromise between x̄ and x̃; it is less sensitive to outliers than the mean but more sensitive than the median. The observations are again first ordered from smallest to largest. Then a trimming percentage 100r% is chosen, where r is a number between 0 and .5. Suppose that r = .1, so the trimming percentage is 10%. Then if n = 20, 10% of 20 is 2; the 10% trimmed mean results from deleting (trimming) the largest two and the smallest two observations, and then averaging the remaining 16 values. Notice that the trimming percentage specifies the number of observations to be deleted from each end of the ordered list. The sample mean is a 0% trimmed mean, whereas the median is a trimmed mean corresponding to the largest possible trimming percentage (e.g., a 45% trimmed mean when n = 20).
Example 2.3 Consider the following 20 observations, ordered from smallest to largest, each repre-
senting the lifetime (hr) of a certain type of incandescent lamp:
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
2.1 Measures of Center 65
612 623 666 744 883 898 964 970 983 1003
1016 1022 1029 1058 1085 1088 1122 1135 1197 1201
The effect of trimming here is to produce a central value that is somewhat larger than the mean yet considerably below the median. Similarly, the 20% trimmed mean averages the middle 12 values to obtain x̄_tr(20) = 999.9, which is even closer to the median. The various measures of center are illustrated in the dotplot of Figure 2.3.

[Figure 2.3: Dotplot of lifetimes and measures of center (including x̄_tr(10)) for Example 2.3]
Both the mean value and the median are frequently used measures for continuous distributions.
Discrete Distributions
Plastic parts manufactured using an injection molding process may exhibit one or more
defects, including sinks, scratches, black spots, and so on. Let x represent the number of
defects on a single part, and suppose the distribution of x is as follows:
x: 0 1 2 3 4
p(x): .80 .14 .03 .02 .01
A picture of the distribution appears in Figure 2.4. Where is this distribution centered?
That is, what is the mean or long-run average value of x? A first thought might be to sim-
ply average the five possible values of x to obtain a mean value of 2.0. But this entails
giving the same weight to each possible value, whereas the distribution indicates that
x 5 0 occurs much more frequently than any of the other values. So what is needed is
a weighted average of x values.
[Figure 2.4: Mass function of x, the number of defects — proportion on the vertical axis (0 to .80) versus x = 0, 1, 2, 3, 4]
definition  The mean value (alternatively, expected value) of a discrete variable x, denoted by μₓ or just μ [alternatively, E(x)], is given by

    μₓ = Σ x · p(x)
Example 2.4  We return now to the plastic part scenario introduced at the outset of this subsection. The mean value of x, the number of defects on a part, is

    μₓ = Σ_{x=0}^{4} x · p(x) = 0(.80) + 1(.14) + 2(.03) + 3(.02) + 4(.01) = .30
When we consider the population of all such parts, the population mean value of x is .30. Alternatively, .30 is the long-run average value of x when part after part is monitored. It can also be shown that the histogram of the distribution of Figure 2.4 will balance on the tip of a fulcrum placed on the horizontal axis only if the tip is at .30; μ is the balance point of the distribution.
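The weighted average is a one-liner in Python (our illustration):

```python
x_values = [0, 1, 2, 3, 4]
p = {0: 0.80, 1: 0.14, 2: 0.03, 3: 0.02, 4: 0.01}

mu = sum(x * p[x] for x in x_values)  # weighted average, weights p(x)
print(round(mu, 2))  # 0.3
```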
In Example 2.4, μ is not a possible value of x. In the same way, if x is the number of children in a household, the population mean value of x might be 1.7 even though there are no households with 1.7 children.
In Chapter 1, we introduced two important types of discrete distributions, the binomial distribution and the Poisson distribution. The binomial distribution models the number of "successes" in a group of n items when conditions of individual items are independent of one another and the long-run proportion of successes is π (a number between 0 and 1). The mean value of x is

    μₓ = Σ_{x=0}^{n} x · [n!/(x!(n − x)!)] π^x (1 − π)^(n−x)
The summation looks very intimidating, but fortunately some algebraic manipulation yields an extremely simple result: μₓ = nπ. For a Poisson distribution with parameter λ, similar manipulation gives the mean value directly:

    μₓ = Σ_{x=0}^{∞} x · e^(−λ)λ^x/x! = Σ_{x=1}^{∞} x · e^(−λ)λ^x/x!
       = λ Σ_{x=1}^{∞} e^(−λ)λ^(x−1)/(x − 1)! = λ

since the remaining sum is the total of the Poisson mass function, which equals 1.
Suppose, for example, that x is the number of burnt potato chips in a 13-oz bag. If x has a Poisson distribution with parameter λ = 2.5, then μₓ = 2.5; the population mean number of burnt chips per bag is 2.5.
Continuous Distributions
A distribution for a continuous variable x is specified by a density function f(x) whose graph is a smooth curve. To obtain μ, we replace summation in the discrete case by integration and replace the mass function p(x) by the density function.
definition  The mean value (or expected value) of a continuous variable x with density function f(x) is given by

    μₓ = ∫_{−∞}^{∞} x · f(x) dx
Just as in the discrete case μ is the balance point for the histogram corresponding to p(x), in the continuous case μ is the balance point for the density curve corresponding to f(x).
Example 2.5 The distribution of the amount of gravel (tons) sold by a particular construction sup-
ply company in a given week is a continuous variable x with density function
    f(x) = 1.5(1 − x²)   0 ≤ x ≤ 1

(f(x) = 0 outside the interval from 0 to 1). The density curve is shown in Figure 2.5.
Knowledge of the mean value of x will help the company decide on a price for the gravel:
    μₓ = ∫_{−∞}^{∞} x f(x) dx = ∫₀¹ x[1.5(1 − x²)] dx
       = 1.5 ∫₀¹ (x − x³) dx = 1.5 (x²/2 − x⁴/4)|₀¹ = .375
Figure 2.5 The density curve for the gravel sales distribution of Example 2.5
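The integration in Example 2.5 can also be checked numerically. The sketch below (ours, not the book's) applies a simple midpoint-rule quadrature to x·f(x):

```python
def f(x):
    # density from Example 2.5: f(x) = 1.5(1 - x^2) on [0, 1]
    return 1.5 * (1 - x * x)

def integrate(g, a, b, n=10_000):
    # midpoint-rule quadrature; adequate for a smooth integrand
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

mean = integrate(lambda x: x * f(x), 0.0, 1.0)   # close to the exact .375
```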
For a variable x having a normal distribution with parameters μ and σ, splitting the defining integral for $\mu_x$ into two pieces gives
$$\mu_x = \mu\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx + \int_{-\infty}^{\infty}(x-\mu)\,\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx$$
$$= \mu\cdot 1 + \frac{\sigma}{\sqrt{2\pi}}\int_{-\infty}^{\infty} y\,e^{-y^2/2}\,dy \qquad \text{using } y = \frac{x-\mu}{\sigma}$$
The latter integral is zero because the integrand g(y) is an odd function (g(−y) = −g(y)), which gives the desired result: $\mu_x = \mu$.
A lognormal variable x is one for which ln(x) has a normal distribution with mean value μ and standard deviation σ. That is, $\mu_{\ln(x)} = \mu$. Therefore, it might seem that $\mu_x = e^{\mu}$, but this is not the case. It can be shown that
$$\mu_x = e^{\mu + \sigma^2/2}$$
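A small simulation makes the distinction concrete. The sketch below (an illustration, not from the text) uses Python's `random.lognormvariate`, which generates $e^Z$ for Z normal with the given μ and σ, and compares the simulated mean with both $e^{\mu}$ and $e^{\mu+\sigma^2/2}$:

```python
import math
import random

random.seed(1)
mu, sigma = 1.0, 0.5
sample = [random.lognormvariate(mu, sigma) for _ in range(200_000)]
sim_mean = sum(sample) / len(sample)

naive = math.exp(mu)                   # e^mu underestimates the true mean
formula = math.exp(mu + sigma**2 / 2)  # e^(mu + sigma^2/2)
```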
μ and x̄
If x1, . . . , xn have been randomly selected from some population or process distribution with mean value μ, then the sample mean x̄ gives a point estimate for μ. In Example 2.1, we calculated x̄ = 16.36, so a reasonable educated guess for the population mean water-absorption percentage is 16.36%. Estimation—both point (a single number) and interval—will be discussed in Chapter 7.
The median μ̃ of a continuous distribution divides the area under the density curve into two equal halves. The defining condition is
$$\int_{-\infty}^{\tilde{\mu}} f(x)\,dx = .5$$
Example 2.6 (Example 2.5 continued) The median for the distribution of weekly gravel sales satisfies
$$\int_0^{\tilde{\mu}} 1.5(1 - x^2)\,dx = .5$$
Using c in place of μ̃, we have the cubic equation 1.5(c − c³/3) = .5, whose solution is c = μ̃ ≈ .347. The mean value μ = .375 is somewhat larger than the median because the distribution is positively skewed (see Figure 2.5).
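The cubic equation can also be solved numerically rather than by trial and error. A bisection sketch (ours, not the text's), exploiting the fact that the area to the left of c increases with c:

```python
def area_left(c):
    # integral of 1.5(1 - x^2) from 0 to c
    return 1.5 * (c - c**3 / 3)

lo, hi = 0.0, 1.0
for _ in range(60):          # bisection: halve the bracket each step
    mid = (lo + hi) / 2
    if area_left(mid) < 0.5:
        lo = mid
    else:
        hi = mid
median = (lo + hi) / 2       # about .347, below the mean .375
```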
Figure 2.6 shows the relationship between the mean and the median for various
types of unimodal distributions or (smoothed) histograms. The median of a discrete dis-
tribution can also be defined; see one of the chapter references for details.
Figure 2.6 The relationship between the mean and the median for a continuous
distribution or smoothed histogram
Just as the sample mean gives a point estimate of the population mean μ, the sample median x̃ gives a point estimate of the population median. If the population distribution is symmetric (as is any normal distribution), both x̄ and x̃ are estimates of the same population characteristic, namely, the point of symmetry. The issue of which estimate to use will be addressed in Section 7.1.
. . . for Endotoxin—An Essential but Underestimated Issue" (Indoor Air, 2006: 20–27) considered various issues associated with determining endotoxin concentration. The following data on concentration (EU/mg) in settled dust for one sample of urban homes and another of farm homes was kindly supplied by the authors of the article.
U: 6.0 5.0 11.0 33.0 4.0 5.0 80.0 18.0 35.0 17.0 23.0
F: 4.0 14.0 11.0 9.0 9.0 8.0 4.0 20.0 5.0 8.9 21.0 9.2 3.0 2.0 0.3
. . . London's Victoria and Albert Museum ("Enigmas of Bidri," Surface Engr., 2005: 333–339), which are listed in increasing order:
2.0 2.4 2.5 2.6 2.6 2.7 2.7 2.8 3.0 3.1 3.2 3.3 3.3 3.4 3.4 3.6 3.6 3.6 3.6 3.7 4.4 4.6 4.7 4.8 5.3 10.1
a. Construct a stem-and-leaf display of the data. How does it suggest that the sample mean and median will compare?
b. Calculate the values of the sample mean and median. Hint: Σ xi = 95.0.
c. By how much could the largest observation, 10.1, be increased without affecting the value of the sample median? By how much could this value be decreased without affecting the value of the sample median?

4. Suppose that after computing x̄ based on n sample observations x1, . . . , xn, another observation xn+1 becomes available. What is the relationship between the mean of the first n observations, the new observation, and the mean of all n + 1 observations? The mean of the 10 observations in Exercise 1 is 640.5. If an 11th property had sold at a price of 780, what would be the mean sale price for all 11 properties?

5. In the article "Evaluation of Optimal Power Options for Base Transceiver Stations of Mobile Telephone Networks Cameroon" (Solar Energy, 2012: 2935–2949), researchers recorded site-specific information for remote telecommunications stations throughout Cameroon. The following observations are daily energy demand readings (kWh) for 12 stations:
17.76 23.44 24.58 26.99 27.23 30.77 31.79 35.57 36.59 36.59 40.51 59.31
Without doing any computation, how do you think the sample mean compares to the sample median? What would you report as representative, or typical, of the daily energy demand for these stations? What prompted your choice?

6. Blood pressure values are often reported to the nearest 5 mmHg (100, 105, 110, and so on). Suppose the actual blood pressure values for nine randomly selected individuals are
118.6 127.4 138.4 130.0 113.7 122.0 108.3 131.5 133.2
a. What is the median of the reported blood pressure values?
b. Suppose the blood pressure of the second individual is 127.6 rather than 127.4 (a small change in a single value). How does this affect the median of the reported values? What does this say about the sensitivity of the median to rounding or grouping in the data?

7. An experiment to study the lifetime (hr) for a certain type of component involved putting ten components into operation and observing them for 100 hours. Eight of the components failed during that period, and those lifetimes were recorded. Denote the lifetimes of the two components still functioning after 100 hours by 100+. The resulting sample observations were 48, 79, 100+, 35, 92, 86, 57, 17, 100+, and 29. Which of the measures of center discussed in this section can be calculated, and what are the values of those measures? Note: The data from this experiment is said to be "censored on the right"; patient lifetimes in medical experimentation are sometimes obtained in this way.

8. A target is located at the point 0 on a horizontal axis. Let x be the landing point of a shot aimed at the target, a continuous variable with density function f(x) = .75(1 − x²) for −1 ≤ x ≤ 1. What is the mean value of x?

9. Let x denote the amount of time for which a book on 2-hour reserve at a college library is checked out by a student, and suppose that x has density function f(x) = .5x for 0 < x < 2.
a. What is the mean value of x? Why is the mean value not 1, the midpoint of the interval of positive density?
b. What is the median of this distribution, and how does it compare to the mean value?
c. What proportion of checkout times are within one-half hour of the mean time? What proportion are within one-half hour of the median time?

10. Let x have a uniform distribution on the interval from a to b, so the density function of x is f(x) = 1/(b − a) for a ≤ x ≤ b. What is the mean value of x?

11. The weekly demand for propane gas (1000s of gallons) at a certain facility is a continuous variable with density function
$$f(x) = \begin{cases} 2\left(1 - \dfrac{1}{x^2}\right) & 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$
Determine both the mean value and the median. In the long run, in what proportion of weeks will the value of x be between the mean value and the median?
12. Refer to Exercise 27 of Section 1.3, in which x was . . .
. . . the mean value of any function h(x) is computed . . .
Figure 2.7 Samples with identical measures of center but different
amounts of variability
2.2 Measures of Variability
Our primary measures of variability involve quantities called deviations from the mean: x1 − x̄, x2 − x̄, . . . , xn − x̄. That is, the deviations from the mean are obtained by subtracting x̄ from each of the n sample observations. A deviation will be positive if the observation is larger than the mean (to the right of the mean on the measurement axis) and negative if the observation is smaller than the mean. If all the deviations are small in magnitude, then all xi's are close to the mean and there is little variability. On the other hand, if some of the deviations are large in magnitude, then some xi's lie far from x̄, suggesting a greater amount of variability. A simple way to combine the deviations into a single quantity is to average them (sum them and divide by n). Unfortunately, there is a major problem with this suggestion:
$$\text{sum of deviations} = \sum_{i=1}^{n}(x_i - \bar{x}) = 0$$
n
so that the average deviation is always zero (because ^ i51 x 5 x 1 1 x 5 nx 5
n
^ i51 xi). In practice, the sum of the deviations may not be identically zero because of
rounding in x. The greater the decimal accuracy used in x, the closer the sum will be
to zero.
How can we change the deviations to nonnegative quantities so the positive and negative deviations do not counteract one another when they are combined? One possibility is to work with the absolute values of the deviations and calculate the average absolute deviation Σ|xi − x̄|/n. Because the absolute value operation leads to a number of theoretical difficulties, consider instead the squared deviations (x1 − x̄)², (x2 − x̄)², . . . , (xn − x̄)². We might now use the average squared deviation Σ(xi − x̄)²/n, but for several reasons we will divide the sum of squared deviations by n − 1 rather than n.
$$s^2 = \frac{\sum(x_i - \bar{x})^2}{n-1} = \frac{S_{xx}}{n-1}$$
The sample standard deviation, denoted by s, is the (positive) square root of the variance:
$$s = \sqrt{s^2}$$
Example 2.7 The website www.fueleconomy.gov contains a wealth of information about the fuel
characteristics of various vehicles. In addition to EPA mileage ratings, there are many
vehicles for which users have reported their own values of fuel efficiency (mpg). Con-
sider the following sample of n = 11 efficiencies for the 2009 Ford Focus equipped
with an automatic transmission (for this model, EPA reports an overall rating of
27 mpg—24 mpg in city driving and 33 mpg in highway driving):
Car   xi     xi − x̄    (xi − x̄)²
1     27.3   −5.96     35.522
2     27.9   −5.36     28.730
3     32.9   −0.36      0.130
4     35.2    1.94      3.764
5     44.9   11.64    135.490
6     39.9    6.64     44.090
7     30.0   −3.26     10.628
8     29.7   −3.56     12.674
9     28.5   −4.76     22.658
10    32.0   −1.26      1.588
11    37.6    4.34     18.836
Σ xi = 365.9   Σ(xi − x̄) = .04   Σ(xi − x̄)² = 314.110   x̄ = 33.26
Effects of rounding account for the sum of deviations differing slightly from zero. The numerator of s² is Sxx = 314.110, from which
$$s^2 = \frac{S_{xx}}{n-1} = \frac{314.110}{11-1} = 31.41 \qquad s = 5.60$$
The size of a representative deviation from the sample mean x̄ = 33.26 is roughly 5.6 mpg.
Note: Of the nine people who also reported driving behavior, only three did more
than 80% of their driving in highway mode; we bet you can guess which cars they
drove. We haven’t a clue why all 11 reported values exceed the EPA figure: Maybe
only drivers with really good fuel efficiencies communicate their results.
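The computation in Example 2.7 can be reproduced with Python's standard statistics module, whose variance and stdev functions use the n − 1 divisor described below (our check, not part of the text):

```python
import statistics

mpg = [27.3, 27.9, 32.9, 35.2, 44.9, 39.9, 30.0, 29.7, 28.5, 32.0, 37.6]

xbar = statistics.mean(mpg)       # about 33.26
s2 = statistics.variance(mpg)     # about 31.41, using the n - 1 divisor
s = statistics.stdev(mpg)         # about 5.60
```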
One explanation for the use of n − 1 in s² goes back to the fact that Σ(xi − x̄) = 0. Suppose that n = 5 and that x1 − x̄ = −4, x2 − x̄ = 6, x3 − x̄ = 1, and x5 − x̄ = −8. Since the sum of these four deviations is −5, the remaining deviation must be x4 − x̄ = 5 (so that the sum of all five deviations is zero). More generally, once any n − 1 of the deviations are available, the value of the remaining deviation is determined. The n deviations actually contain only n − 1 independent pieces of information about variability. Statisticians express this by saying that s² and s are based on n − 1 degrees of freedom (df). Many inferential procedures encountered in later chapters are based on some appropriate number of df.
definitions The variance of a discrete distribution for a variable x specified by mass function p(x), denoted by $\sigma_x^2$ or just σ² (alternatively, V(x)), is given by
$$\sigma^2 = \sum (x - \mu)^2 \cdot p(x)$$
where the sum is over all possible x values. The standard deviation is σ, the positive square root of the variance.
Example 2.8 Consider a computer system consisting of the computer itself, a monitor, and a printer. Let x denote the number of system components that need service while under warranty; possible x values are 0, 1, 2, and 3. Suppose that p(0) = .532, p(1) = .389, p(2) = .076, and p(3) = .003 (these come from individual component failure proportions of .2, .3, and .05 along with an assumption of component independence, so that these proportions can be multiplied as we originally did in a binomial calculation). Then μ = .55 and
$$\sigma^2 = \sum (x - \mu)^2 \cdot p(x)$$
$$= (0 - .55)^2(.532) + (1 - .55)^2(.389) + (2 - .55)^2(.076) + (3 - .55)^2(.003)$$
$$= .16093 + .07877 + .15979 + .01801 = .41750$$
from which σ = .646.
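The calculation in Example 2.8 is a direct translation of the definition; a minimal Python sketch (ours, not the book's):

```python
pmf = {0: 0.532, 1: 0.389, 2: 0.076, 3: 0.003}   # p(x) from Example 2.8

mu = sum(x * p for x, p in pmf.items())               # 0.55
var = sum((x - mu) ** 2 * p for x, p in pmf.items())  # about .4175
sd = var ** 0.5                                       # about .646
```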
The standard deviation of a binomial distribution is then σ = √(nπ(1 − π)). Note that σ = 0 if π = 0 (in which case, every item is a failure, so x = 0 always) or π = 1 (every item a success, so x = n always). The variance and standard deviation are largest when π = .5 [π(1 − π) is maximized for this value], that is, when there is a 50–50 split between successes and failures. As π moves toward either 0 or 1, the variance and standard deviation decrease. If identical components are shipped in groups of size 25 and the long-run success (doesn't need warranty service) proportion is π = .9, then σ = √(25(.9)(.1)) = 1.5.
The mean value of a Poisson distribution with parameter λ is λ itself, and this is also the variance of the distribution:
$$\sigma^2 = \sum_{x=0}^{\infty} (x - \lambda)^2\,\frac{e^{-\lambda}\lambda^x}{x!} = \lambda$$
(Again, much summation manipulation is required.) The standard deviation is, of course, σ = √λ. If the number of blemishes x on surfaces of a certain part has a Poisson distribution with parameter λ = 3.5, then the mean value is 3.5 and the standard deviation is √3.5 = 1.87.
$$\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 \cdot f(x)\,dx$$
The standard deviation σ is again the positive square root of the variance.
Example 2.9 The distribution of x = gravel sales during a given week (tons), introduced in Example 2.5, was specified by the density function f(x) = 1.5(1 − x²) for x between 0 and 1. We found the mean value to be μ = .375. The variance of the distribution is
$$\sigma^2 = \int_0^1 (x - .375)^2 \cdot 1.5(1 - x^2)\,dx = .0594$$
so the standard deviation is σ = .244.
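This integral can be evaluated numerically as a check; a midpoint-rule sketch (ours, not the text's):

```python
def f(x):
    # density from Example 2.5: f(x) = 1.5(1 - x^2) on [0, 1]
    return 1.5 * (1 - x * x)

def integrate(g, a, b, n=10_000):
    # midpoint-rule quadrature; adequate for a smooth integrand
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

mu = 0.375
var = integrate(lambda x: (x - mu) ** 2 * f(x), 0.0, 1.0)  # about .0594
sd = var ** 0.5                                            # about .244
```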
For a normal distribution, the parameter σ is the standard deviation of the distribution. That is, a bit of integration manipulation shows that
$$V(x) = \int_{-\infty}^{\infty} (x - \mu)^2\,\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx = \sigma^2$$
Let k be some fixed positive number. Consider the area under a normal curve with parameters μ and σ that lies within k standard deviations of the mean value. That is, we wish to determine the proportion of x values that lie in the interval from μ − kσ to μ + kσ. Standardizing the interval limits gives
$$\frac{(\mu - k\sigma) - \mu}{\sigma} = -k \qquad\qquad \frac{(\mu + k\sigma) - \mu}{\sigma} = k$$
Thus the desired proportion is the area under the standard normal (z) curve between −k and k. This shows that the area within k standard deviations of the mean under any normal curve depends only on k and not on the particular normal curve under consideration. For k = 1, the desired proportion is the area under the z curve between −1 and 1. From Appendix Table I, this area is .8413 − .1587 = .6826 ≈ .68. Similar calculations for k = 2 and k = 3 give .9544 and .9974, respectively. Thus for any variable x whose distribution is well approximated by a normal curve:
distribution is well approximated by a normal curve:
Approximately 68% of the values are within 1 standard deviation of the mean.
Approximately 95% of the values are within 2 standard deviations of the mean.
Approximately 99.7% of the values are within 3 standard deviations of the mean.
These three statements together are often referred to as the empirical rule; the name
reflects the fact that histograms of a great many data sets have at least roughly the shape
of a normal curve.
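These areas can be reproduced without tables using the error function in Python's math module; the area between −k and k under the z curve is erf(k/√2) (our check, not the text's, and slightly more precise than the table-based values):

```python
import math

def area_within(k):
    # area under the standard normal curve between -k and k
    return math.erf(k / math.sqrt(2))

areas = {k: round(area_within(k), 4) for k in (1, 2, 3)}
# areas: {1: 0.6827, 2: 0.9545, 3: 0.9973}
```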
The variance of a variable having a Weibull distribution is even more complicated than
the mean value; consult one of the chapter references.
σ² and s²
The sample mean x̄ is a sensible estimate (educated guess) for the value of the population or process mean μ. Similarly, the sample variance should be defined so that it gives a reasonable estimate of the population or process variance σ². Recall that σ² involves squared deviations from μ, that is, quantities of the form (x − μ)². If the value of μ were known to an investigator, a good estimate of σ² based on sample observations x1, . . . , xn would be Σ(xi − μ)²/n. It is natural to replace μ by x̄ when the value of the former quantity is unknown. However, it can be shown that Σ(xi − x̄)² < Σ(xi − μ)² unless x̄ = μ, so x̄ is "closer" to the sample observations than is μ. To compensate for this reduction in sum of
squares, the value of the denominator n should also be reduced. According to a technical criterion called unbiasedness, the sample size n should be replaced by the number of df, n − 1. The resulting sample variance s² will tend to provide good estimates of σ².
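The unbiasedness claim can be illustrated by simulation: over many samples, the average of Σ(xi − x̄)²/(n − 1) settles near σ², while dividing by n settles below it. A sketch (ours, using an arbitrary normal population with σ² = 4):

```python
import random

random.seed(7)
n, reps = 5, 40_000

avg_nminus1 = avg_n = 0.0
for _ in range(reps):
    xs = [random.gauss(0.0, 2.0) for _ in range(n)]   # sigma^2 = 4
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    avg_nminus1 += ss / (n - 1) / reps   # s^2, the n - 1 divisor
    avg_n += ss / n / reps               # the n divisor
# avg_nminus1 is close to 4.0; avg_n is close to (n-1)/n * 4.0 = 3.2
```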
2.2 Exercises
Calculate and interpret the values of the sample mean and sample standard deviation for this data.

20. Use the alternative computing formula for Sxx as shown in Exercise 18 to determine the sample standard deviation for the average porosity measurements presented in Exercise 19.

21. Consider the following information on ultimate tensile strength (lb/in.) for a sample of n = 4 hard zirconium copper wire specimens (from "Characterization Methods for Fine Copper Wire," Wire J. Intl., August 1997: 74–80):
x̄ = 76,831   s = 180   smallest xi = 76,683   largest xi = 77,048
Determine the values of the two middle sample observations (and don't do it by successive guessing!). Hint: See Exercise 18 part b.

22. The federal test procedure (FTP) for determining the levels of various types of vehicle emissions is time-consuming and expensive to perform. According to the article "Motor Vehicle Emissions Variability" (J. of the Air and Waste Mgmnt. Assoc., 1996: 667–675), there is a widespread belief that repeated FTP measurements on the same vehicle would yield identical (or nearly identical) results. The accompanying data is from one particular vehicle characterized as a high emitter:
HC (gm/mi): 13.8 18.3 32.2 32.5
CO (gm/mi): 118 149 232 236
a. Compute the sample standard deviations for the HC and CO observations. Does the widespread belief appear to be justified?
b. The sample coefficient of variation s/x̄ (or 100s/x̄) assesses the extent of variability relative to the mean. Values of this coefficient for several different data sets can be compared to determine which data sets exhibit more or less variation. Carry out such a comparison for the given HC and CO data.

23. Suppose, as in Exercise 57 of Chapter 1, that the number of drivers traveling between a particular origin and destination during a designated time period has a Poisson distribution with λ = 20. In the long run, during what proportion of such periods will the number of drivers be
a. Within 5 of the mean value?
b. Within 1 standard deviation of the mean value?

24. Suppose that x, the number of flaws on the surface of a boiler of a certain type, has a Poisson distribution with λ = 5. For what proportion of such boilers will the number of flaws
a. Be within 1 standard deviation of the mean number of flaws?
b. Exceed the mean number of flaws by more than 2 standard deviations?

25. Let x represent the number of underinflated tires on an automobile of a certain type, and suppose that p(0) = .4, p(1) = p(2) = p(3) = .1, and p(4) = .3, from which μ = 1.8.
a. Calculate the standard deviation of x.
b. For what proportion of such cars will the number of underinflated tires be within 1 standard deviation of the mean value? More than 3 standard deviations from the mean value?

26. Use the fact that (x − μ)² = x² − 2μx + μ² to show that σ² = Σ x²p(x) − μ² for a discrete variable x. Then use this result to compute the variance for the variable whose distribution is given in the previous problem. Hint: Substitute the alternative expression for (x − μ)² in the definition of σ², and break the summation into three separate terms; the argument in the continuous case involves replacing summation with integration.

27. If x has a uniform distribution on the interval from a to b [f(x) = 1/(b − a)], from which μ = (a + b)/2, show that σ² = (b − a)²/12. If task completion time is uniformly distributed with a = 4 and b = 6, what proportion of times will be farther than 1 standard deviation from the mean value of completion time?

28. Suppose that bearing diameter x has a normal distribution. What proportion of bearings have diameters that are within 1.5 standard deviations of the mean diameter? That exceed the mean diameter by more than 2.5 standard deviations?

29. Historical data implies that 20% of all components of a certain type need service while under warranty. Suppose that whether any particular component needs warranty service is independent of whether
any other component does. If these components are shipped in batches of 25 and x denotes the number of components in a batch that need warranty service, determine the standard deviation of x and then the proportion of batches for which the number of components that need warranty service exceeds the mean number by more than 2 standard deviations.

30. If the unloading time of a forwarder in a harvesting operation is lognormally distributed with a mean value of 900 and a standard deviation of 725, what are the values of the parameters μ and σ? Note: An expression for the mean value of a lognormal variable is given in Section 2.1, and an expression for the variance appears in this section.

31. If component lifetime is exponentially distributed with parameter λ, obtain an expression for the proportion of components whose lifetime exceeds the mean value by more than 1 standard deviation. Hint: According to Exercise 26, $\sigma^2 = \int_0^{\infty} x^2 f(x)\,dx - \mu^2$; now use integration by parts.

32. The sample mean and sample standard deviation for the sample of n = 100 shear strength observations given in Exercise 17 of Section 1.2 are 5049.16 and 351.45, respectively. What percentage of the observations in the sample are within 1 standard deviation of the mean, and how does this compare to the corresponding percentage given by the empirical rule? Answer this question also for 2 standard deviations and for 3 standard deviations.
Let's first consider quartiles for sample data. There are several different sensible ways to define the sample quartiles. We will use a definition that requires a minimal amount of computation.
definitions Separate the n ordered sample observations into a lower half and an upper half; if n is an odd number, include the median x̃ in each half. Then
lower quartile = median of the lower half
upper quartile = median of the upper half
The interquartile range (IQR) is the difference between the two quartiles: IQR = upper quartile − lower quartile.
Example 2.10 Reconsider the flexural strength data for beams given in Example 1.2. A stem-and-
leaf display of the 27 observations follows:
5 9
6 3 3 5 8 8
7 0 0 2 3 4 6 7 7 8 8 9 Stem: ones digit
8 1 2 7 Leaf: tenths digit
9 0 7 7
10 7
11 3 6 8
Because n = 27 is odd, the median x̃ = 7.7 is included in each half of the data:
Lower half: 5.9 6.3 6.3 6.5 6.8 6.8 7.0 7.0 7.2 7.3 7.4 7.6 7.7 7.7
Upper half: 7.7 7.8 7.8 7.9 8.1 8.2 8.7 9.0 9.7 9.7 10.7 11.3 11.6 11.8
The lower quartile is the median of the lower half, (7.0 + 7.0)/2 = 7.0, and the upper quartile is the median of the upper half, (8.7 + 9.0)/2 = 8.85, so IQR = 8.85 − 7.0 = 1.85.
Notice that if the largest observation, 11.8, were increased by any amount, the up-
per quartile and therefore the IQR would not be affected, whereas such an increase
would change the sample variance and standard deviation. Similarly, a decrease in
several of the smallest observations has no impact on the quartiles or the IQR.
The following output is from the summary and IQR commands from the R
software. The former command requests that the values of various summary quanti-
ties be calculated:
> summary(flexural)
Min. 1st Qu. Median Mean 3rd Qu. Max.
5.900 7.000 7.700 8.141 8.850 11.800
> IQR(flexural)
[1] 1.85
Minitab’s reported value for the quartile Q3 is 9.000, a bit different from what R returns.
For a continuous distribution, the median μ̃ satisfies
$$\int_{-\infty}^{\tilde{\mu}} f(x)\,dx = .5$$
(so that half the area under the density curve lies to the left of μ̃). The lower quartile q1 satisfies $\int_{-\infty}^{q_1} f(x)\,dx = .25$, and the upper quartile qu satisfies $\int_{-\infty}^{q_u} f(x)\,dx = .75$.
Example 2.11 The exponential distribution with parameter λ has density function $\lambda e^{-\lambda x}$ for x > 0. For any positive number c,
$$\int_{-\infty}^{c} f(x)\,dx = \int_0^c \lambda e^{-\lambda x}\,dx = 1 - e^{-\lambda c} \qquad\qquad \int_c^{\infty} \lambda e^{-\lambda x}\,dx = e^{-\lambda c}$$
Equating either of these quantities to .5 and solving for c gives c = μ̃ = −ln(.5)/λ = .693/λ. Suppose, for example, that times (min) between successive arrivals at a shipping terminal are exponentially distributed with λ = .1. Then q1 = 2.88 min, μ̃ = 6.93 min, and qu = 13.86 min. The upper quartile is much farther from the median than is the lower quartile because the distribution has a substantial positive skew (the mean value of x is 1/λ = 10, much larger than the median).
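Solving 1 − e^(−λc) = p for c gives c = −ln(1 − p)/λ, so any exponential quantile is available in closed form; a quick check (ours, not the book's) of the numbers in Example 2.11:

```python
import math

def exp_quantile(p, lam):
    # solve 1 - exp(-lam * c) = p for c
    return -math.log(1 - p) / lam

lam = 0.1                        # arrival rate from the text's illustration
q1 = exp_quantile(0.25, lam)     # about 2.88 min
med = exp_quantile(0.50, lam)    # about 6.93 min
qu = exp_quantile(0.75, lam)     # about 13.86 min
```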
Example 2.12 The quartiles of a normal distribution are easily expressed in terms of μ and σ. First, consider a variable z having the standard normal distribution. Symmetry of the standard normal curve about 0 implies that μ̃ = 0. Looking for .2500 inside Appendix Table I gives the standard normal lower quartile q1 ≈ −.675 and, by symmetry, qu ≈ .675; for an arbitrary normal distribution, the quartiles are q1 = μ − .675σ and qu = μ + .675σ.
That is, for any normal distribution, the quartiles are .675 standard deviation to either side of the mean. The interquartile range is (μ + .675σ) − (μ − .675σ) = 1.35σ. A familiar example is IQ scores in the general population, where μ = 100, σ = 15, q1 = 89.875 ≈ 90, and qu ≈ 110. Roughly 25% of all people have scores below 90 and roughly 25% have scores exceeding 110.
The relation IQR = 1.35σ suggests that if the sample IQR is very different from 1.35s, it is not plausible that the underlying distribution is normal. In Example 2.10, 1.35s ≈ 2.2, which is not much greater than the IQR of 1.85. A graphical technique
for assessing the plausibility of a normal population or process distribution is pre-
sented in the next section.
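The IQR-versus-1.35s screen is easy to automate. The sketch below is our own illustration (not from the text); the quartiles are computed as medians of the lower and upper halves of the ordered sample, one common convention that differs slightly from some software packages:

```python
from statistics import median, stdev

def quartiles(data):
    """Lower and upper quartiles as medians of the lower and upper halves
    of the ordered sample (one common convention; packages differ slightly)."""
    xs = sorted(data)
    half = len(xs) // 2
    return median(xs[:half]), median(xs[len(xs) - half:])

def iqr_vs_normal(data):
    """Return (sample IQR, 1.35s); a large discrepancy between the two
    casts doubt on normality of the underlying distribution."""
    q1, qu = quartiles(data)
    return qu - q1, 1.35 * stdev(data)

# hypothetical sample, for illustration only
iqr, benchmark = iqr_vs_normal([4.2, 4.8, 5.1, 5.3, 5.5, 5.7, 6.0, 6.4, 6.8, 9.9])
```

If `iqr` and `benchmark` are far apart, a normal model for the underlying distribution is suspect.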
For our purposes, it is not necessary to discuss quartiles for a discrete distribution.
Boxplots
A boxplot is a visual display of data based on the following five-number summary:
smallest xi     lower quartile     median     upper quartile     largest xi
To create a boxplot, first draw a horizontal measurement scale. Then place a rectangle
above this axis; the left edge of the rectangle is at the lower quartile, and the right edge is
at the upper quartile (so box width 5 IQR). Place a vertical line segment or some other
symbol inside the rectangle at the location of the median; the position of the median
symbol relative to the two edges conveys information about skewness in the middle 50%
of the data. Finally, draw “whiskers” out from either end of the rectangle to the smallest
and largest observations. A boxplot with a vertical orientation can also be drawn by mak-
ing obvious modifications in the construction process.
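The five-number summary underlying the construction just described can be sketched in a few lines of Python (our illustration, not part of the text; quartiles are again medians of the lower and upper halves, one common convention):

```python
from statistics import median

def five_number_summary(data):
    """(smallest, lower quartile, median, upper quartile, largest);
    quartiles are medians of the lower and upper halves of the ordered
    sample, one common convention."""
    xs = sorted(data)
    half = len(xs) // 2
    return (xs[0], median(xs[:half]), median(xs),
            median(xs[len(xs) - half:]), xs[-1])

# hypothetical data, for illustration only
print(five_number_summary([2.1, 2.8, 3.0, 3.4, 3.6, 4.0, 4.7, 5.9]))
```

The box edges sit at the two quartiles, the median symbol inside the box, and the whiskers run out to the smallest and largest values.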
Example 2.13 Returning to the article on lightweight aggregates referenced in Example 2.1, the
researchers also reported specific gravity measurements for all 14 LWA specimens:
Largest xi = 1.62
Figure 2.9 shows the resulting boxplot.
The right edge of the box is closer to the median than is the left edge, indicating
a substantial skew in the middle half of the data. The box width (IQR) is also reason-
ably large relative to the range of the data (distance between the tips of the whiskers).
Example 2.14 The article “Compression of Single-Wall Corrugated Shipping Containers Using
Fixed and Floating Test Platens” (J. of Testing and Evaluation, 1992: 318–320) de-
scribes an experiment in which several different types of boxes were compared with
respect to compression strength. Consider the following observations on four dif-
ferent types of boxes (summary quantities for this data are in good agreement with
values given in the cited article):
Type of box     Compression strength (lb)
1               655.5   788.3   734.3   721.4   679.1   699.4
Figure 2.10 is a comparative boxplot of this data produced by the Minitab statistical
package. (Recall that Minitab uses definitions of the quartiles that differ somewhat
from ours.) The most striking feature of the comparative boxplot is that strength
values for the fourth type of box appear to be considerably smaller than those for the
three other types; this suggests that the population mean strength for type 4 boxes is
less than the mean strengths for the other three types. The differences between box
types seem pretty clear-cut because within-sample variation is small relative to the
separation between sample means and medians. When this is not the case, an infer-
ential method called single-factor analysis of variance, discussed in Chapter 9, is used
to investigate differences among three or more populations or treatments.
[Figure 2.10: comparative boxplot of compression strength (CompStr, lb; axis 500–800) for box types 1–4]
definitions Any observation farther than 1.5(IQR) from the closest quartile is an outlier. An outlier is extreme if it is more than 3(IQR) from the nearest quartile, and it is mild otherwise.
Many inferential procedures are based on the assumption that the sample came from a normal distribution. Even a single extreme outlier in the sample warns the investigator that such procedures should not be used, and the presence of several mild outliers conveys the same message.
Let’s now modify our previous construction of a boxplot by drawing a whisker out
from each end of the box to the smallest and largest observations that are not outliers. Each
mild outlier is represented by a closed circle and each extreme outlier by an open circle.
Some statistical computer packages do not distinguish between mild and extreme outliers.
Example 2.15 The National Health and Nutrition Examination Survey (NHANES), a massive
annual program conducted by the National Center for Health Statistics, is a series
of cross-sectional nationally representative surveys that include demographic,
socioeconomic, dietary, and health-related questions. The information from the
surveys is used to assess the health and nutritional status of adults and children in
the United States.
One variable measured is the high-density lipoprotein (HDL) cholesterol level
(mg/dl) of each survey participant. The following 30 HDL observations were ob-
tained from the 2009–2010 NHANES data set:
11 32 33 41 45 46 47 48 48 49
49 50 52 55 57 57 59 61 63 63
66 67 71 71 71 72 73 76 111 144
Relevant summary quantities are

x̃ = 57     lower quartile = 48     upper quartile = 71

Thus IQR = 23, so 1.5(IQR) = 34.5 and 3(IQR) = 69. Any observation below 48 − 34.5 = 13.5 or above 71 + 34.5 = 105.5 is an outlier, and an outlier is extreme if it falls below 48 − 69 = −21 or above 71 + 69 = 140. The sample therefore contains two mild outliers (11 and 111) and one extreme outlier (144).
Figure 2.11 A boxplot of the HDL cholesterol data showing mild and
extreme outliers
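The classification shown in Figure 2.11 can be reproduced computationally. The sketch below is our own illustration (not part of the text); quartiles are computed as medians of the lower and upper halves of the ordered sample, which for these data gives the values 48 and 71 quoted above:

```python
from statistics import median

def classify_outliers(data):
    """Classify observations as mild or extreme outliers using the
    1.5(IQR)/3(IQR) rule; quartiles are medians of the lower and upper
    halves of the ordered sample (conventions differ across packages)."""
    xs = sorted(data)
    half = len(xs) // 2
    q1, qu = median(xs[:half]), median(xs[len(xs) - half:])
    iqr = qu - q1
    mild, extreme = [], []
    for x in xs:
        dist = max(q1 - x, x - qu)  # how far x lies beyond the nearest quartile
        if dist > 3 * iqr:
            extreme.append(x)
        elif dist > 1.5 * iqr:
            mild.append(x)
    return mild, extreme

hdl = [11, 32, 33, 41, 45, 46, 47, 48, 48, 49, 49, 50, 52, 55, 57,
       57, 59, 61, 63, 63, 66, 67, 71, 71, 71, 72, 73, 76, 111, 144]
print(classify_outliers(hdl))  # ([11, 111], [144]): two mild, one extreme
```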
That is, p is the area under the density curve to the left of η_p. Figure 2.12 illustrates the definition.
[Figure 2.12: the shaded area p under the density curve lies to the left of the quantile η_p]
Example 2.16 Appendix Table I gives cumulative z curve areas for the standard normal distribution. To find the 90th percentile, we look for cumulative area .9000 inside the table. The entry closest to .9000 is .8997, in the 1.2 row and .08 column, so η.9 ≈ 1.28. By symmetry, the 10th z percentile (.1th quantile) is η.1 ≈ −1.28. It then follows that for the normal distribution with mean value μ and standard deviation σ,

η.9 ≈ μ + 1.28σ     η.1 ≈ μ − 1.28σ
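The table lookup in Example 2.16 can be mirrored in software. In this short sketch (ours, not part of the text), the standard library's `NormalDist` inverts the cumulative distribution, playing the role of Appendix Table I; the μ = 100, σ = 15 values are the IQ model used earlier in this section:

```python
from statistics import NormalDist

z_90 = NormalDist().inv_cdf(0.90)            # the .9th z quantile, about 1.28
eta_90 = NormalDist(100, 15).inv_cdf(0.90)   # equals mu + (z quantile) * sigma
print(round(z_90, 2), round(eta_90, 1))      # 1.28 119.2
```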
Percentiles for discrete distributions will not be needed in this book. In general, percentiles for sample data require interpolation between successive sample values. In Section 2.4, we use percentiles that correspond to the ordered sample observations. For example, if n = 10, we will regard the smallest sample observation as the fifth sample percentile, the second smallest observation as the 15th sample percentile, and so on.
2.3 Exercises
33. Reconsider the accompanying data on postsurgical range of motion introduced in Exercise 18 of this chapter:
154 142 137 133 122 126 135
135 108 120 127 134 122
a. What are the values of the quartiles? What is the value of the IQR?
b. Construct a boxplot based on the five-number summary and comment on its features.
c. How large or small does an observation have to be to qualify as an outlier? As an extreme outlier?
d. By how much could the largest observation be decreased without affecting the IQR?
U: 6.0 5.0 11.0 33.0 4.0 5.0
80.0 18.0 35.0 17.0 23.0
F: 4.0 14.0 11.0 9.0 9.0 8.0
4.0 20.0 5.0 8.9 21.0 9.2
3.0 2.0 0.3
a. Determine the medians, quartiles, and IQRs for the two samples.
b. Are there any outliers in either sample? Any extreme outliers?
c. Construct a comparative boxplot and use it as a basis for comparing and contrasting the two samples.
41. The authors of the article cited in Exercise 2 also provided endotoxin concentrations in dust from vacuum-cleaner dust bags:
U: 34.0 49.0 13.0 33.0 24.0 24.0 35.0 104.0
34.0 40.0 38.0 1.0
F: 2.0 64.0 6.0 17.0 35.0 11.0 17.0 13.0
5.0 27.0 23.0 28.0 10.0 13.0 0.2
Construct a comparative boxplot (which appeared in the cited paper), and compare and contrast the two samples.
42. The comparative boxplot (see below) of gasoline vapor coefficients for vehicles in Detroit appeared in the article “Receptor Modeling Approach to VOC Emission Inventory Validation” (J. of Envir. Engr., 1995: 483–490). Discuss any interesting features.
43. Exercise 46 from Section 1.5 suggested a Weibull distribution with α = 5 and β = 125 as a model for fracture strength of silicon nitride braze joints.
a. What are the quartiles of this distribution, and what is the value of the IQR?
b. Suppose that the value of β is changed to 12.5. Determine the values of the quartiles and the value of the IQR. Note: In essence, this amounts to dividing each observation in the population distribution by 10, because β is a “scale” parameter and changing its value stretches or compresses the x scale without changing the shape of the distribution.
44. Reconsider the lognormal distribution with μ = 9.164 and σ = .385 proposed in Exercise 44 from Section 1.5 as a model for the distribution of nonpoint source load of total dissolved solids (in kg/day/km).
a. What are the values of the quartiles?
b. What is the value of the 95th percentile of the concentration distribution?
c. If μ were 10.164 rather than 9.164, would the values of the two quartiles simply increase by an identical amount?
[Comparative boxplot for Exercise 42: gasoline vapor coefficient (0–70) at sampling times 6 A.M., 8 A.M., 12 noon, 2 P.M., 10 P.M.]
2.4 Quantile Plots
Sample Quantiles
The details involved in constructing quantile plots differ a bit from source to source. The basis for our construction is a comparison between quantiles of the sample data and the corresponding quantiles of the distribution under consideration. Recall that for any number p between 0 and 1, the pth quantile η_p is such that area p lies to the left of η_p under the density curve. For example, Appendix Table I shows that the .9th quantile (90th percentile) for the standard normal distribution is approximately 1.28, the .1th quantile is roughly −1.28, the .8th quantile is about .84, and of course the .5th quantile (the median) is 0.
Roughly speaking, sample quantiles are defined in the same way that quantiles of a
population or process distribution are defined. The .5th sample quantile should separate
the smallest 50% of the sample from the largest 50%, the .9th sample quantile should be
such that 90% of the sample lies below that value and only 10% above, and so on. Our interest here is only in the value of p corresponding to each of the sample observations when ordered from smallest to largest. Recall that when n is odd, the sample median or .5th quantile is the middle value in the ordered list; for example, the sixth smallest value when n = 11. This amounts to regarding the middle observation as being half in the lower half of the data and half in the upper half. Similarly, suppose that n = 10.
Then if we call the third smallest value the .25th quantile, we are regarding that value
as being half in the lower group (consisting of the two smallest observations) and half in
the upper group (comprising the seven largest observations). This leads to the following
general definition of sample quantiles:
definition Let x(1) denote the smallest sample observation, x(2) the second smallest sample observation, . . . , and x(n) the largest sample observation. We take x(1) to be the (.5/n)th sample quantile, x(2) to be the (1.5/n)th sample quantile, . . . , and finally x(n) to be the [(n − .5)/n]th sample quantile. That is, for i = 1, . . . , n, x(i) is the [(i − .5)/n]th sample quantile.
Thus when n = 20, x(1) is the .025th quantile, x(2) is the .075th quantile, x(3) is the .125th quantile, . . . , and x(20) is the .975th quantile (97.5th percentile).
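The definition translates directly into a few lines of Python (our illustration, not part of the text):

```python
def sample_quantile_levels(data):
    """Pair each ordered observation x(i) with its level (i - .5)/n,
    following the definition in this section."""
    n = len(data)
    return [((i - 0.5) / n, x) for i, x in enumerate(sorted(data), start=1)]

print(sample_quantile_levels([12, 7, 9, 15]))
# [(0.125, 7), (0.375, 9), (0.625, 12), (0.875, 15)]
```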
((.5/n)th quantile, x(1)),   ((1.5/n)th quantile, x(2)),   . . . ,   (((n − .5)/n)th quantile, x(n))
In other words, pair the smallest quantile with the smallest observation, the second
smallest quantile with the second smallest observation, and so on. Each such pair can
be plotted as a point on a two-dimensional coordinate system. If the first number in each
pair is close to the second number, the points in the plot will fall close to a 45° line [one
with slope 1 passing through the point (0, 0)].
For example, this program can be carried out to decide whether a normal distribution with μ = 100 and σ = 15 is plausible. First the appropriate z quantiles are determined; then the desired normal quantiles are expressed in the form μ + σ · (corresponding z quantile). However, an investigator is typically not interested in knowing whether a particular
normal distribution is plausible but instead whether some normal distribution is plausible.
It is clearly inefficient to construct a separate normal quantile plot for each of a large number of different choices of μ and σ. Fortunately, this is not necessary because there is a linear relationship between z quantiles and those for any other normal distribution:
definition A normal quantile plot is a plot of the (z quantile, observation) pairs. The linear relation between normal (μ, σ) quantiles and z quantiles implies that if the sample has come from a normal distribution with particular values of μ and σ, the points in the plot should fall close to a straight line with slope σ and vertical intercept μ. Thus a plot for which the points fall close to some straight line suggests that the assumption of a normal population or process distribution is plausible.
Note that if a straight line is fit to the points in the plot, the intercept and slope give estimates of μ and σ, respectively, though these will typically differ from the usual estimates x̄ and s.
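The ingredients of a normal quantile plot can be computed with the standard library alone. In this sketch (ours, not part of the text), `NormalDist().inv_cdf` supplies the z quantiles for the levels (i − .5)/n, and an ordinary least-squares fit gives the slope and intercept that estimate σ and μ:

```python
from statistics import NormalDist

def normal_quantile_pairs(data):
    """(z quantile, observation) pairs: x(i) is matched with the
    standard normal quantile for level (i - .5)/n."""
    xs = sorted(data)
    n = len(xs)
    z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z, xs))

def fit_line(pairs):
    """Least-squares slope and intercept; for a roughly linear plot these
    estimate sigma and mu, respectively."""
    zs = [z for z, _ in pairs]
    xs = [x for _, x in pairs]
    n = len(pairs)
    zbar, xbar = sum(zs) / n, sum(xs) / n
    sxx = sum((z - zbar) ** 2 for z in zs)
    slope = sum((z - zbar) * (x - xbar) for z, x in pairs) / sxx
    return slope, xbar - slope * zbar

# hypothetical observations, for illustration only
pairs = normal_quantile_pairs([35.2, 38.1, 40.0, 41.3, 43.5, 44.2, 46.8, 49.9])
slope, intercept = fit_line(pairs)  # rough estimates of sigma and mu
```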
Example 2.17 There has been recent increased use of augered cast-in-place (ACIP) and drilled dis-
placement (DD) piles in the foundations of buildings and transportation structures. In
the article “Design Methodology for Axially Loaded Auger Cast-in-Place and Drilled
[Normal quantile plot for Example 2.17: L/D (30–60) versus normal quantile (−2 to 2)]
The judgment as to whether a plot does or does not show a substantial linear pattern is
somewhat subjective. Particularly when n is small, normality should not be ruled out unless
the departure from linearity is very clear-cut. Figure 2.14 displays several plots that suggest a
nonnormal population or process distribution. In Section 8.4, we show how a quantitative as-
sessment of the extent to which points in a two-dimensional plot fall close to a straight line can
be used as the basis of an inferential procedure for deciding whether normality is plausible.
Figure 2.14 Quantile plots that are inconsistent with an underlying normal
distribution
Example 2.18 For many years it has been well established that the Weibull distribution is useful
in modeling the strength of fibers used in composite materials such as carbon
graphite, Kevlar, and glass. With the advent of nanotechnology where materials can
be developed at miniscule levels, scientists have questioned whether the Weibull
[Weibull quantile plot for Example 2.18: ln(x) (3.0–5.5) versus ln(−ln(1 − p)) (−4 to 1)]
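The axes of the plot for Example 2.18 come from linearizing the Weibull cdf: if F(x) = 1 − exp[−(x/β)^α], then ln x = ln β + (1/α)·ln(−ln(1 − p)), so plotting ln(x(i)) against ln(−ln(1 − p_i)) with p_i = (i − .5)/n should yield a roughly linear pattern for Weibull data. A short Python sketch (ours, not part of the text):

```python
import math

def weibull_plot_points(data):
    """Points for a Weibull quantile plot: (ln(-ln(1 - p_i)), ln(x(i)))
    with p_i = (i - .5)/n; for Weibull data these should be roughly
    linear, with slope 1/alpha and intercept ln(beta)."""
    xs = sorted(data)
    n = len(xs)
    return [(math.log(-math.log(1 - (i - 0.5) / n)), math.log(x))
            for i, x in enumerate(xs, start=1)]

# hypothetical strength observations, for illustration only
points = weibull_plot_points([78.2, 95.0, 104.7, 112.1, 123.9, 140.6])
```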
2.4 Exercises
45. The accompanying normal quantile plot was constructed from a sample of 30 readings on tension for mesh screens behind the surface of video display tubes used in computer monitors. Does it appear plausible that the tension distribution is normal?
[Normal quantile plot for Exercise 45: Tension (250–350) versus normal quantile]
46.
42.0 43.1 43.9 44.1 44.6 45.0 46.1
47.0 62.0 64.3 68.8 70.1 74.5
Use the quantiles for a sample of size 20 given in this section to construct a normal quantile plot, and comment on the plausibility of a normal population distribution.
47. A sample of 15 female collegiate golfers was selected, and the clubhead velocity (km/hr) of each golfer while swinging a driver was determined, resulting in the following data (“Hip Rotational Velocities During the Full Golf Swing,” J. of Sports Science and Medicine, 2009: 296–299):
69.0 69.7 72.7 80.3 81.0
85.0 86.0 86.3 86.7 87.7
89.3 90.7 91.0 92.5 93.0
The corresponding z percentiles are
−1.83 −1.28 −0.97 −0.73 −0.52
−0.34 −0.17 0.0 0.17 0.34
0.52 0.73 0.97 1.28 1.83
Construct a normal quantile plot and a dotplot. Is it plausible that the population distribution is normal?
48. The accompanying observations are precipitation values during March over a 30-year period in Minneapolis–St. Paul.
b. Calculate the square root of each value and then construct a quantile plot based on this transformed data. Does it seem plausible that the square root of precipitation is normally distributed?
c. Repeat part (b) after transforming by cube roots.
49. The article “A Probabilistic Model of Fracture in Concrete and Size Effects on Fracture Toughness” (Magazine of Concrete Res., 1996: 311–320) gives arguments for why fracture toughness in concrete specimens should have a Weibull distribution and presents several histograms of data that appear well fit by superimposed Weibull curves. Consider the
following sample of size n = 18 observations on toughness for high-strength concrete (consistent with one of the histograms); values of pi = (i − .5)/18 are also given:
Obs: .47 .58 .65 .69 .72 .74
pi: .0278 .0833 .1389 .1944 .2500 .3056
Obs: .77 .79 .80 .81 .82 .84
pi: .3611 .4167 .4722 .5278 .5833 .6389
Obs: .86 .89 .91 .95 1.01 1.04
pi: .6944 .7500 .8056 .8611 .9167 .9722
Construct a Weibull quantile plot and comment.
50. In the article “Weibull Parameter of Oil-Immersed Transformer to Evaluate Insulation Reliability on Temporary Overvoltage” (IEEE Trans. on Dielectrics and Elec. Insul., 2010: 1863–1868), researchers investigated the reliability of oil-immersed transformers under various conditions. In one experiment, the researchers measured the breakdown time of the transformer oil gap under various oil flow velocities and exposure to temporary overvoltage. Consider the following breakdown time data (in s) from their experiment where an oil flow at 16 cm/s and an overvoltage of 81 kV were applied.
7.2 10.0 18.0 25.0 36.0 38.0
46.0 63.0 71.0 76.0 92.0 95.0
104.0 152.0 198.0 226.0 235.0 247.0
361.0 392.0
Construct a Weibull plot and comment on the plausibility of breakdown time having a Weibull distribution.
51. The accompanying figures show (a) a normal quantile plot of the observations on cell interdivision time (IDT) given in Exercise 16 of Section 1.2 and (b) a normal quantile plot of the logarithms of the IDTs. What do these plots suggest about the distribution of cell interdivision time?
[Exercise 51(a): normal quantile plot of IDT (10–70) versus normal quantile (−2 to 2)]
[Exercise 51(b): normal quantile plot of ln(IDT) (2.5–4.5) versus normal quantile (−2 to 2)]
52. A plot to assess the plausibility of an exponential population distribution can be based on quantiles of the exponential distribution having λ = 1 (i.e., the exponential distribution with density function f(x) = e^{−x} for x > 0). This is because λ, like σ for a normal distribution, is a scale parameter. Consider the following failure time observations (1000s of hours) resulting from accelerated life testing of 16 integrated circuit chips of a certain type:
82.8 11.6 359.5 502.5 307.8 179.7
242.0 26.5 244.8 304.3 379.1
212.6 229.9 558.9 366.7 204.6
Construct a quantile plot and comment on the plausibility of failure time having an exponential distribution.
53. The article “Families of Distributions for Hourly Median Power and Instantaneous Power of Received Radio Signals” (J. of Research for the National Bureau of Standards, 1963: 753–762) suggests the lognormal distribution for x = hourly median power (decibels) of received radio signals transmitted between two cities. Consider the following sample of hourly median power readings:
2.7 5.4 9.7 22.8 30.5 55.7 66.2 97.3 186.5 240.0
a. Is it plausible that these observations were sampled from a normal distribution?
b. Is it plausible that these observations were sampled from a lognormal distribution?
Supplementary Exercises
54. Anxiety disorders and symptoms can often be effectively treated with benzodiazepine medications. It is known that animals exposed to stress exhibit a decrease in benzodiazepine receptor binding in the frontal cortex. The paper “Decreased Benzodiazepine Receptor Binding in Prefrontal Cortex in Combat-Related Posttraumatic Stress Disorder” (American J. of Psychiatry, 2000: 1120–1126) described the first study of benzodiazepine receptor binding in individuals suffering from PTSD. The accompanying data on a receptor binding measure (adjusted distribution volume) was read from a graph in the paper:
PTSD: 10 20 25 28 31 35 37 38 38
39 39 42 46
d. Construct a comparative boxplot, and comment on interesting features.
e. Would you recommend estimating the difference between the true average binding measure of PTSD individuals and the true average measure for healthy individuals using a method based on assuming that each sample was selected from a normal population distribution? Explain your reasoning.
55. A sample of 77 individuals working at a particular office was selected, and the noise level (dBA) experienced by each one was determined, yielding the following data (“Acceptable Noise Levels for Construction Site Offices,” Building Serv. Engr. Res. and Tech., 2009: 87–94).
Use various techniques discussed in this chapter to organize, summarize, and describe the data.
56. Three different C2F6 flow rates (SCCM) were considered in an experiment to investigate the effect of flow rate on the uniformity (%) of the etch on a silicon wafer used in the manufacture of integrated circuits, resulting in the following data:
125: 2.6 2.7 3.0 3.2 3.8 4.6
160: 3.6 4.2 4.2 4.6 4.9 5.0
200: 2.9 3.4 3.5 4.1 4.6 5.1
Compare and contrast the uniformity observations resulting from these three different flow rates.
57. Consider a sample x1, . . . , xn, and let x̄_k and s²_k denote the sample mean and variance, respectively, of the first k observations.
a. Show that

ks²_{k+1} = (k − 1)s²_k + [k/(k + 1)](x_{k+1} − x̄_k)²

b. Suppose that a sample of 15 strands of drapery yarn has resulted in a sample mean thread elongation of 12.58 mm and a sample standard deviation of .512 mm. A 16th strand results in an elongation value of 11.8. What are the values of the sample mean and sample standard deviation for all 16 elongation observations?
58. In 1997 a woman sued a computer keyboard manufacturer, charging that her repetitive stress injuries were caused by the keyboard (Genessy v. Digital Equipment Corp.). The jury awarded about $3.5 million for pain and suffering, but the court then set aside that award as being unreasonable compensation. In making this determination, the court identified a “normative” group of 27 similar cases and specified a reasonable award as one within 2 standard deviations of the mean of the awards in the 27 cases. The 27 awards were (in $1000s) 37, 60, 75, 115, 135, 140, 149, 150, 238, 290, 340, 410, 600, 750, 750, 750, 1050, 1100, 1139, 1150, 1200, 1200, 1250, 1576, 1700, 1825, and 2000, from which Σxi = 20,179 and Σxi² = 24,657,511. What is the maximum possible amount that could be awarded under the 2 standard deviation rule?
59. A deficiency of the trace element selenium in the diet can negatively affect growth, immunity, muscle and neuromuscular function, and fertility. The introduction of selenium supplements to dairy cows is justified when pastures have low selenium levels. Authors of the paper “Effects of Short-Term Supplementation with Selenised Yeast on Milk Production and Composition of Lactating Cows” (Australian J. of Dairy Tech., 2004: 199–203) supplied the following data on milk selenium concentration (mg/L) for a sample of cows given a selenium supplement and a control sample given no supplement, both initially and after a nine-day period.
Obs   Init Se   Init Cont   Final Se   Final Cont
1     11.4      9.1         138.3      9.3
2     9.6       8.7         104.0      8.8
3     10.1      9.7         96.4       8.8
4     8.5       10.8        89.0       10.1
5     10.3      10.9        88.0       9.6
6     10.6      10.6        103.8      8.6
7     11.8      10.1        147.3      10.4
8     9.8       12.3        97.1       12.4
9     10.9      8.8         172.6      9.3
10    10.3      10.4        146.3      9.5
11    10.2      10.9        99.0       8.4
12    11.4      10.4        122.3      8.7
13    9.2       11.6        103.0      12.5
14    10.6      10.9        117.8      9.1
15    10.8                  121.5
16    8.2                   93.0
a. Do the initial Se concentrations for the supplement and control samples appear to be similar? Use various techniques from this chapter to summarize the data and answer the question posed.
b. Again use methods from this chapter to summarize the data and then describe how the final Se concentration values in the treatment group differ from those in the control group.
60. An inequality developed by the Russian mathematician Chebyshev gives information about the percentage of values in any sample or distribution that fall within a specified number of standard deviations of the mean. Let k denote any number satisfying
k ≥ 1. Then at least 100(1 − 1/k²)% of the values are within k standard deviations of the mean.
a. What does Chebyshev’s inequality say about the percentage of values that are within 2 standard deviations of the mean? Within 3 standard deviations of the mean? Within 5 standard deviations? Within 10 standard deviations?
b. What does Chebyshev’s inequality say about the percentage of values that are more than 2 standard deviations from the mean? More than 3 standard deviations from the mean?
c. Suppose the distribution of slot width on a forging has a mean value of 1.000 in. and a standard deviation of .0025 in. What percentage of such forgings have a slot width that is between .995 in. and 1.005 in.? If specifications are 1.000 ± .005 in., what percentage of slot widths will conform to specifications?
d. Refer to part (c). What percentage of such forgings will have a slot width that is outside the interval from .995 in. to 1.005 in. (i.e., either less than .995 or greater than 1.005)? What can be said about the percentage of widths that exceed 1.005 in.?

61. Reconsider Chebyshev’s inequality as stated in the previous exercise.
a. Compare what the inequality says about the percentage within 1, 2, or 3 standard deviations of the mean value to the corresponding percentages given by the empirical rule.
b. An exponential distribution with parameter λ has both mean value and standard deviation equal to 1/λ. If component lifetime is exponentially distributed with a mean value of 100 hr, what percentage of these components have lifetimes within 1 standard deviation of the mean lifetime? Within 2 standard deviations? Within 3 standard deviations? Compare these to the percentages given by Chebyshev’s inequality.
c. Why do you think the percentages from Chebyshev’s inequality so badly understate the actual percentages in the situations of parts (a) and (b)?

62. Consider a sample x₁, . . . , xₙ with mean x̄ and standard deviation s, and let zᵢ = (xᵢ − x̄)/s. What are the mean and standard deviation of the zᵢ’s?

63. The accompanying observations are carbon monoxide levels (ppm) in air samples obtained from a certain region:
9.3 10.7 8.5 9.6 12.2 16.6 9.2 10.5 7.9 13.2 11.0 8.8 13.7 12.1 9.8
a. Calculate a trimmed mean by trimming the smallest and largest observations, and give the corresponding trimming percentage. Do the same with the two smallest and two largest values trimmed.
b. Using the results of part (a), how would you calculate a trimmed mean with a 10% trimming percentage?
c. Suppose there had been 16 sample observations. How would you go about calculating a 10% trimmed mean?

64. Specimens of three different types of rope wire were selected, and the fatigue limit (MPa) was determined for each specimen, resulting in the accompanying data:
Type 1: 350 350 350 358 370 370 370 371 371 372 372 384 391 391 392
Type 2: 350 354 359 363 365 368 369 371 373 374 376 380 383 388 392
Type 3: 350 361 362 364 364 365 366 371 377 377 377 379 380 380 392
a. Construct a comparative boxplot, and comment on similarities and differences.
b. Construct a comparative dotplot (a dotplot for each sample with a common scale). Comment on similarities and differences.
c. Does the comparative boxplot of part (a) give an informative assessment of similarities and differences? Explain your reasoning.

65. The three measures of center introduced in this chapter are the mean, median, and trimmed mean. Two additional measures of center that are occasionally used are the midrange, which is the average of the smallest and largest observations, and the midhinge, which is the average of the two quartiles. Which of these five measures of center are resistant to the effects of outliers and which are not? Explain your reasoning.
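The Chebyshev bound in Exercise 60 is easy to tabulate directly. A minimal Python sketch (standard library only; the function name is ours, not the text’s):

```python
# Chebyshev's inequality: for any k >= 1, at least 100*(1 - 1/k**2)% of the
# values in ANY sample or distribution lie within k standard deviations
# of the mean.
def chebyshev_lower_bound(k: float) -> float:
    """Minimum percentage of values within k standard deviations of the mean."""
    if k < 1:
        raise ValueError("Chebyshev's inequality requires k >= 1")
    return 100 * (1 - 1 / k**2)

for k in (2, 3, 5, 10):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1f}% within k sd of the mean")
```

For k = 2, 3, 5, and 10, the bound gives 75%, 88.9%, 96%, and 99%, respectively; note that the bound says nothing useful for k ≤ 1.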
66. The capacitance (nF) of multilayer ceramic capacitors supplied by a certain vendor is normally distributed with mean value 98 and standard deviation 2. Specifications for these capacitors are 100 ± 5 nF.
a. What proportion of these capacitors will conform to specification?
b. Suppose that these capacitors are shipped in batches of size 20. Let x denote the number of capacitors in a batch that conform to specification. Provided that capacitances of successive capacitors are independent of one another, what kind of distribution does x have? In the long run, in what proportion of batches will at least 19 of the 20 capacitors conform to specifications? Hint: Think of a capacitor that conforms to specification as a “success,” so x is the number of successes in the batch.

67. Aortic stenosis refers to a narrowing of the aortic valve in the heart. The paper “Correlation Analysis of Stenotic Aortic Valve Flow Patterns Using Phase Contrast MRI” (Annals of Biomed. Engr., 2005: 878–887) gave the following data on aortic root diameter (cm) and gender for a sample of patients having various degrees of aortic stenosis:
M: 3.7 3.4 3.7 4.0 3.9 3.8 3.4 3.6 3.1 4.0 3.4 3.8 3.5
F: 3.8 2.6 3.2 3.0 4.3 3.5 3.1 3.1 3.2 3.0
a. Compare and contrast the diameter observations for the two genders.
b. Calculate a 10% trimmed mean for each of the two samples and compare to other measures of center (for the male sample, the interpolation method mentioned in Section 2.1 must be used).

68. A study carried out to investigate the distribution of total braking time (reaction time plus accelerator-to-brake movement time, in ms) during real driving conditions at 60 km/hr gave the following summary information on the distribution of times (“A Field Study on Braking Responses during Driving,” Ergonomics, 1995: 1903–1910):
mean = 535   median = 500   mode = 500
sd = 96   minimum = 220   maximum = 925
5th percentile = 400   10th percentile = 430
90th percentile = 640   95th percentile = 720
What can you conclude about the shape of a histogram of this data? Explain your reasoning.

69. Let x denote the maximum physical stress that a unit of a certain product encounters during its lifetime. Suppose that x is normally distributed with 99th percentile = 5.33 and 10th percentile = 1.72 (suggested in the article “A Formulation of Product Reliability through Environmental Stress Testing and Screening,” J. of the Institute of Envir. Sciences, 1994: 50–56; the unit for x was unspecified). What proportion of these units have maximum stress values exceeding 5? What proportion have maximum stress values less than 2?

70. The indoor thermal climate is an important characteristic affecting the health and productivity of workers in buildings. The paper “Adaptive Comfort Temperature Model of Air-Conditioned Buildings in Hong Kong” (Building and Environment, 2003: 837–852) reported data on a number of building characteristics measured during the summer and also during the winter. Consider the accompanying values of relative humidity.
Summer: 57.18 58.11 56.53 58.61 57.40 62.64 61.72 57.26 53.43 53.71 58.64 45.12 47.52 54.47 55.88 51.08 53.69 54.37 54.36 61.01 52.66 56.20 48.40 46.99 50.63 52.40 52.20 55.95 53.77
Winter: 52.20 41.83 55.63 54.18 54.56 56.20 58.09 56.70 57.57 58.70 56.15 59.77 61.58 61.81 62.48 63.31 55.57 62.25 57.40 55.07 62.52 52.80 57.20 59.27 54.98 58.13
Use methods from this and the previous chapter to describe, summarize, compare, and contrast the summer and winter relative humidity data.
Chapter 3
Bivariate and Multivariate Data and Distributions
3.1 Scatterplots
3.2 Correlation
3.3 Fitting a Line to Bivariate Data
3.4 Nonlinear Relationships
3.5 Using More Than One Predictor
3.6 Joint Distributions
Introduction
Now that we have acquired some facility for working with univariate data and distributions, it’s time to expand our horizons. A multivariate data set consists of observations made simultaneously on two or more variables. One important special case is that of bivariate data, in which observations on only two variables, x and y, are available. In Section 3.1, we introduce the scatterplot, a picture for gaining insight into the nature of any relationship between x and y.
Next, we discuss the correlation coefficient, which is a measure of how strongly two variables are related. In many investigations, one primary objective is to predict y from the value of x—for example, to predict yield from a chemical reaction at a particular reaction temperature. If the scatterplot shows a linear pattern, the natural strategy is to fit a straight line to the data and use it as the basis for predictions, as we do in Section 3.3. If a scatterplot shows curvature, fitting a nonlinear function, such as a quadratic or an exponential function, is appropriate; we show how this can be done in Section 3.4. Multiple regression functions, in which y is related to two or more predictor variables, are the subject of Section 3.5. Finally, Section 3.6 introduces bivariate and multivariate distributions.
3.1 Scatterplots
A multivariate data set consists of measurements or observations on each of two or
more variables. One important special case, bivariate data, involves only two variables, x and y. For example, x might be the distance from a particular highway and y,
the lead content of the soil at that distance. When both x and y are numerical variables,
each observation consists of a pair of numbers, such as (14, 5.2) or (27.63, 18.9). The
first number in a pair is the value of x and the second number is the value of y.
An unorganized list of such pairs yields little information about the distribution of
either the x values or the y values separately, and even less information about whether
the two variables are related to one another. In Chapter 1, we saw how pictures could
help make sense of univariate data. The most important picture based on bivariate
numerical data is a scatterplot. Each observation (pair of numbers) is represented by
a point on a rectangular coordinate system, as shown in Figure 3.1(a). The horizontal
axis is identified with values of x and is scaled so that any x value can be easily located.
Similarly, the vertical or y axis is marked for easy location of y values. The point corresponding to any particular (x, y) pair is placed where a vertical line from the value on
the x axis intersects a horizontal line from the value on the y axis. Figure 3.1(b) shows
the point representing the observation (4.5, 15); it is above 4.5 on the horizontal axis
and to the right of 15 on the vertical axis.
[Figure 3.1 Rectangular coordinate systems: (a) axes for constructing a scatterplot; (b) the point corresponding to the observation (4.5, 15)]
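The text’s plots are produced with Minitab; the same construction is a one-liner in most software. A minimal Python sketch (assuming matplotlib is installed; the data pairs here are hypothetical, chosen only for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display window needed
import matplotlib.pyplot as plt

# Hypothetical (x, y) pairs -- each observation becomes one plotted point.
pairs = [(1.0, 12.5), (2.2, 18.3), (3.1, 22.7), (4.5, 15.0), (4.8, 31.4)]
x, y = zip(*pairs)

fig, ax = plt.subplots()
ax.scatter(x, y)                # one point per (x, y) observation
ax.set_xlabel("x")
ax.set_ylabel("y")
fig.savefig("scatterplot.png")  # write the finished plot to a file
```

Each point lands where a vertical line from its x value meets a horizontal line from its y value, exactly as described above.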
Example 3.1 Visual and musculoskeletal problems associated with the use of visual display
terminals (VDTs) have become rather common in recent years. Some researchers have focused on vertical gaze direction as a source of eye strain and irritation.
This direction is known to be closely related to ocular surface area (OSA), so a
[Figure 3.2 Scatterplot from Minitab for the data from Example 3.1 (OSA versus Palwidth), along with dotplots of x and y values]
Here are some things to notice about the data and plot:
The horizontal and vertical axes in the scatterplot of Figure 3.2 intersect at the
point (0, 0). In many data sets, the values of x or y or the values of both variables differ
considerably from zero relative to the range(s) of the values. For example, a study of
how air conditioner efficiency is related to maximum daily outdoor temperature might
involve observations for temperatures ranging from 80°F to 100°F. When this is the
case, a more informative plot would show the appropriately labeled axes intersecting at
some point other than (0, 0).
Example 3.2 Arsenic is found in many ground waters and some surface waters. Recent research on health effects has prompted the Environmental Protection Agency
to reduce allowable arsenic levels in drinking water; as a result, many water
systems are no longer compliant with standards. This has spurred interest in
the development of methods to remove arsenic. The accompanying data on
x 5 pH and y 5 arsenic removed (%) by a particular process was read from a
scatterplot in the article “Optimizing Arsenic Removal During Iron Removal:
Theoretical and Practical Considerations” (J. of Water Supply Res. and Tech.,
2005: 545–560):
Figure 3.3 shows two Minitab scatterplots of this data. In Figure 3.3(a), the software
selected the scale for both axes. We obtained Figure 3.3(b) by specifying scaling for
the axes so that they would intersect at roughly the point (0, 0). The second plot
is much more crowded than the first one; such crowding can make it difficult to
ascertain the general nature of any relationship. For example, curvature can be overlooked in a crowded plot.
[Figure 3.3 Minitab scatterplots of % removal versus pH for the data of Example 3.2: (a) axis scaling selected by the software; (b) axes scaled to intersect near (0, 0)]
Large values of arsenic removal tend to be associated with low pH, a negative or inverse relationship. Furthermore, the two variables appear to be at least approximately linearly related, although the points in the plot would spread out somewhat about any superimposed straight line (such a line appeared in the plot in the cited article).
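The effect of forcing the axes to meet at (0, 0), as in Figure 3.3(b), is easy to reproduce. A sketch assuming matplotlib; the (pH, % removal) pairs below are hypothetical stand-ins that only mimic the article’s negative trend, since the actual data values are not reproduced in this excerpt:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

# Hypothetical data mimicking a negative pH / % removal trend (NOT the
# article's values).
ph      = [7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0]
removal = [68, 62, 55, 47, 38, 26, 18]

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.scatter(ph, removal)   # axes scaled to the data, as in Figure 3.3(a)
ax2.scatter(ph, removal)
ax2.set_xlim(0, 10.5)      # force the axes to include (0, 0), as in 3.3(b);
ax2.set_ylim(0, 75)        # the points crowd into one corner of the plot
for ax in (ax1, ax2):
    ax.set_xlabel("pH")
    ax.set_ylabel("% removal")
fig.savefig("arsenic_scaling.png")
```

The crowded right panel illustrates why curvature or other structure can be overlooked when the axes are not scaled to the data.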
a. Construct stem-and-leaf displays of both floor area and cost. Comment on any interesting features.
b. Do the values of cost appear to be perfectly linearly related to the floor area values?
c. Construct a scatterplot of the data. Does it appear that cost could be accurately predicted by the value of floor area? Explain your reasoning.

3. In the article referenced in Exercise 2, the relationship between the number of beds in a barrack and the cost of the building was also investigated.

Number of Beds   Cost
 22      418,930
 40      609,386
 40      755,489
 38      660,527
 24      864,438
 54    1,003,495
 59      895,947
 98    1,461,549
106    1,899,494
142    2,331,632
190    2,833,203
 68    4,750,468
392    5,331,390

Construct a scatterplot based on this data. What appears to be the nature of the relationship between these two variables? Do you notice anything peculiar in the graph?

4. Open water oil spills, such as the Deepwater Horizon spill of 2010, can wreak terrible consequences on the environment and be expensive to clean up. Many physical and biological methods have been developed to recover oil from water surfaces. In the article “Capacity of Straw for Repeated Binding of Crude Oil from Salt Water and Its Effect on Biodegradation” (J. Hazard. Toxic Radioact. Waste, 2012: 75–78), researchers examined how wheat straw could be used to extract crude oil from a water surface. An experiment was conducted in which crude oil (0 to 16.9 g) was added to 100 mL of saltwater in separate Petri dishes. Wheat straw (2 g) was then added to each dish and all dishes were shaken at 70 rpm overnight. The following data, read from a graph, is based on the amount of oil added (in g) and the corresponding amount of oil recovered (in g) from wheat straw.

Oil Added   Oil Recovered
 1.0    0.610
 1.5    0.840
 2.1    1.512
 2.8    1.792
 3.6    2.952
 4.5    2.880
 5.5    4.400
 6.6    5.346
 7.8    6.396
 9.1    7.189
10.5    8.085
12.0    9.840
13.6   11.696
15.2   13.224
16.9   14.365

a. For each observation, determine the percentage of oil recovery by wheat straw. Is this percentage relatively constant across all observations? Was the percentage higher at certain added oil levels over others?
b. Do the values of the recovered oil appear to be perfectly linearly related to the added oil values? Why or why not?
c. Construct a scatterplot of the data. Does it appear that recovered oil could be accurately predicted by the value of added oil? Explain your reasoning.

5. The article “Objective Measurement of the Stretchability of Mozzarella Cheese” (J. of Texture Studies, 1992: 185–194) reported on an experiment to investigate how the behavior of mozzarella cheese varied with temperature. Consider the accompanying data on x = temperature and y = elongation (%) at failure of the cheese. Note: The researchers were Italian and used real mozzarella cheese, not the poor cousin widely available in the United States.

x: 59 63 68 72 74 78 83
y: 118 182 247 208 197 135 132

a. Construct a scatterplot in which the axes intersect at (0, 0). Mark 0, 20, 40, 60, 80, and
100 on the horizontal axis and 0, 50, 100, 150, 200, and 250 on the vertical axis.
b. Construct a scatterplot in which the axes intersect at (55, 100), as was done in the cited article. Does this plot seem preferable to the one in part (a)? Explain your reasoning.
c. What do the plots of parts (a) and (b) suggest about the nature of the relationship between the two variables?

6. Calcium phosphate cement is gaining increasing attention for use in bone repair applications. The article “Short-Fibre Reinforcement of Calcium Phosphate Bone Cement” (J. of Engr. in Med., 2007: 203–211) reported on a study in which polypropylene fibers were used in an attempt to improve fracture behavior. The following data on x = fiber weight (%) and y = compressive strength (MPa) was provided by the article’s authors.

x: 0.00 0.00 0.00 0.00 0.00 1.25 1.25 1.25 1.25 2.50 2.50 2.50 2.50
y: 9.94 11.67 11.00 13.44 9.20 9.92 9.79 10.99 11.32 12.29 8.69 9.91 10.45

x: 2.50 5.00 5.00 5.00 5.00 7.50 7.50 7.50 7.50 10.00 10.00 10.00 10.00
y: 10.25 7.89 7.61 8.07 9.04 6.63 6.43 7.03 7.63 7.35 6.94 7.02 7.67

Construct a scatterplot of the data. How would you describe the nature of the relationship between the two variables?

7. In surface water hydrology, a common problem is the estimation of long-term annual yield from ungauged watersheds. In the article “Generalized Mediterranean Annual Water Yield Model: Grunsky’s Equation and Long-Term Average Temperature” (J. Hydrol. Engr., 2011: 874–879), researchers propose a generalized water yield model for watersheds. One important watershed-specific component of the model is , a coefficient characterizing the watershed’s annual water yield response to annual precipitation. The article provided the following data from 16 California coastal watersheds for (in m⁻¹) and average long-term annual temperature (T in °C):

T: 8.51 8.69 9.01 9.50 10.00 10.60 11.00 11.60 11.60 12.60 12.60 13.60 14.20 15.30 17.90 17.90
: .40 .42 .40 .43 .40 .38 .40 .30 .41 .27 .28 .19 .22 .19 .13 .09

Construct a scatterplot of the data. How would you describe the nature of the relationship between the two variables?

8. Researchers considered how the construction cost of highway resurfacing projects in Kentucky was affected by that state’s asphalt price index (API) and diesel price index (DPI), among other factors. From about the mid-1990s to 2010, Kentucky’s annual average API and DPI were found to be closely related to the annual average crude oil price. Based on this, the authors suggested that crude oil price could be used to predict API and DPI (“Prices of Highway Resurfacing Projects in Economic Downturn: Lessons Learned and Strategies Forward,” J. Mgmnt. Engr., 2012: 391–397).
Consider the following monthly API and statewide crude oil index (COI) values for California during 2010–11, obtained from the California Department of Transportation.

COI     API       COI     API
385.1   415.1     474.3   477.1
408.0   377.0     483.4   488.9
400.8   402.8     504.9   586.3
426.0   427.3     616.1   634.7
437.0   436.9     656.6   667.5
384.0   360.8     606.0   592.2
393.3   372.3     579.0   565.9
402.9   417.2     588.4   570.5
404.2   376.5     536.8   589.7
399.5   424.1     585.9   559.8
438.9   432.2     592.5   637.0
447.8   450.6     650.0   625.0

Construct a scatterplot of the data. How would you describe the nature of the relationship between the two variables? Does it seem to be the case that COI and API are closely related?
3.2 Correlation
A scatterplot of bivariate numerical data gives a visual impression of how strongly x values
and y values are related. However, to make precise statements and draw reliable conclusions from data, we must go beyond pictures. A correlation coefficient (from co-relation)
is a quantitative assessment of the strength of relationship between x and y values in a set of
(x, y) pairs. In this section, we introduce the most frequently used correlation coefficient.
Figure 3.4 displays scatterplots that indicate different types of relationships between
the x and y values. The plot in Figure 3.4(a) suggests a very strong positive relationship
between x and y, that is, a strong tendency for y to increase as x increases. Figure 3.4(b)
gives evidence of a substantial negative relationship: As x increases, there is a tendency
for y to decrease (as would probably be the case for x 5 amount of time per week that
a high school student spends watching television and y 5 amount of time the student
spends studying). The plot of Figure 3.4(c) indicates no strong relationship between
the two variables; there is no tendency for y to either increase or decrease as x increases.
Finally, as illustrated in Figure 3.4(d), a scatterplot can show a strong positive (or
negative) relationship through a pattern that is curved rather than linear in appearance.
Figure 3.4 Scatterplots illustrating various types of relationships: (a) positive relationship, linear pattern;
(b) negative relationship, linear pattern; (c) no relationship or pattern; (d) positive relationship, curved pattern
[Figure 3.5 Scatterplots divided into quadrants I–IV at the point (x̄, ȳ), with the sign of the product (xᵢ − x̄)(yᵢ − ȳ) indicated in each quadrant (positive in I and III, negative in II and IV): (a) a strong positive relationship; (b) a strong negative relationship; (c) no strong relationship]
Similar reasoning for the data displayed in Figure 3.5(b), which exhibits a strong negative relationship, implies that Σ(xᵢ − x̄)(yᵢ − ȳ) will be a large negative number. When there is no evidence of a strong relationship, as in Figure 3.5(c), positive and negative products of deviations tend to counteract one another, giving a value of the sum that is close to zero. In summary, Σ(xᵢ − x̄)(yᵢ − ȳ) seems to be a reasonable measure of the degree of association between the x and y values; it will be a large positive number, a large negative number, or a number close to zero according to whether there is a strong positive, a strong negative, or no strong relationship.
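This sign behavior of the sum of products of deviations is easy to see numerically. A minimal Python sketch (the three small data sets are hypothetical, chosen only to illustrate the three cases):

```python
# Sum of products of deviations: positive for an increasing pattern,
# negative for a decreasing pattern, and near zero when there is no
# clear relationship between x and y.
def sum_of_products(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

x = [1, 2, 3, 4, 5]
print(sum_of_products(x, [2, 4, 5, 7, 9]))  # positive: y increases with x
print(sum_of_products(x, [9, 7, 5, 4, 2]))  # negative: y decreases as x increases
print(sum_of_products(x, [5, 9, 2, 7, 4]))  # near zero: no clear pattern
```

In the third call, positive and negative products largely cancel, which is exactly the counteracting behavior described above.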
Unfortunately, our proposal has a serious deficiency: Its value depends on the choice of unit of measurement for both x and y. Suppose, for example, that x is height. Each x value expressed in inches will be 12 times the corresponding value expressed in feet, and the same will then be true of x̄. It follows that the value of Σ(xᵢ − x̄)(yᵢ − ȳ) when the x unit is inches will be 12 times what it is when the unit is feet. A measure of the inherent strength of the relationship should give the same value whatever the units for the variables; otherwise our impressions may be distorted by the choice of units.
A straightforward modification of our initial proposal leads to the most popular measure of association, one that is free of the defect just alluded to and has other attractive properties.
r = Σ(xᵢ − x̄)(yᵢ − ȳ) / (√Σ(xᵢ − x̄)² · √Σ(yᵢ − ȳ)²) = Sxy / (√Sxx · √Syy)

with the computing formula

Sxy = Σxᵢyᵢ − (Σxᵢ)(Σyᵢ)/n
Use of the computing formulas makes all the subtraction needed to obtain the deviations unnecessary. Instead, the following five summary quantities are needed: Σxᵢ, Σyᵢ, Σxᵢ², Σyᵢ², Σxᵢyᵢ. The following example shows how a tabular format facilitates the calculations (we’ll get to the issue of interpretation in a moment).
Example 3.3 The catch basin in a storm-sewer system is the interface between surface runoff and
the sewer. A catch-basin insert is a device for retrofitting catch basins to improve
their pollutant removal properties. The article “An Evaluation of the Urban Storm
water Pollutant Removal Efficiency of Catch Basin Inserts” (Water Envir. Res., 2005:
500–510) reported on tests of various inserts under controlled conditions for which
inflow is close to what can be expected in the field. Consider the following data, read
from a graph in the article, for one particular type of insert on x 5 amount filtered
(1000s of liters) and y 5 % total suspended solids removed.
x: 23 45 68 91 114 136 159 182 205 228
y: 53.3 26.9 54.8 33.8 29.9 8.2 17.2 12.2 3.2 11.1
The accompanying table contains five columns for the x, y, x², y², and xy values, respectively. The sum of each column is given at the bottom of the table.

   x      y       x²        y²        xy
  23    53.3      529    2840.89    1225.9
  45    26.9     2025     723.61    1210.5
  68    54.8     4624    3003.04    3726.4
  91    33.8     8281    1142.44    3075.8
 114    29.9    12996     894.01    3408.6
 136     8.2    18496      67.24    1115.2
 159    17.2    25281     295.84    2734.8
 182    12.2    33124     148.84    2220.4
 205     3.2    42025      10.24     656.0
 228    11.1    51984     123.21    2530.8
1251   250.6  199,365   9249.36  21,904.4
(Σxᵢ)  (Σyᵢ)   (Σxᵢ²)    (Σyᵢ²)   (Σxᵢyᵢ)

Then

Sxx = 199,365 − (1251)²/10 = 42,865
Syy = 9249.36 − (250.6)²/10 = 2969.3
Sxy = 21,904.4 − (1251)(250.6)/10 = −9446

from which

r = −9446 / (√42,865 · √2969.3) = −.837
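As a mechanical check on this hand computation, a short Python sketch using the data exactly as given in the example (standard library only):

```python
import math

# Data from Example 3.3: x = amount filtered (1000s of liters),
# y = % total suspended solids removed
x = [23, 45, 68, 91, 114, 136, 159, 182, 205, 228]
y = [53.3, 26.9, 54.8, 33.8, 29.9, 8.2, 17.2, 12.2, 3.2, 11.1]
n = len(x)

# Computing formulas based on the five summary quantities
Sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
Syy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
Sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

r = Sxy / (math.sqrt(Sxx) * math.sqrt(Syy))
print(round(r, 3))  # -0.837, matching the hand computation
```

The negative value reflects the plot’s pattern: as the amount filtered increases, the percentage of solids removed tends to decrease.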
strength of relationship based on r. It may seem surprising that a value of r as extreme as −.5 or .5 should be in the “weak” category; an explanation for this is given later in the chapter.
[Scale of r values: −1 to −.8 strong; −.8 to −.5 moderate; −.5 to .5 weak; .5 to .8 moderate; .8 to 1 strong]
4. r = 1 only when all the points in a scatterplot of the data lie exactly on a straight line that slopes upward. Similarly, r = −1 only when all the points lie exactly on a downward-sloping line. Only when there is a perfect linear relationship between x and y in the sample will r take on one of its two possible extreme values.
5. The value of r is a measure of the extent to which x and y are linearly related—that
is, the extent to which the points in the scatterplot fall close to a straight line. A value
of r close to zero does not rule out any strong relationship between x and y; there could
still be a strong relationship but one that is not linear.
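Property 5 is worth seeing numerically: a perfect but curved relationship can produce r = 0. A minimal Python sketch (standard library only; the data is a deliberately symmetric hypothetical example):

```python
import math

# A perfect quadratic relationship: y is completely determined by x,
# yet the (linear) correlation coefficient is 0 by symmetry.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
den = math.sqrt(sum((xi - xbar) ** 2 for xi in x) *
                sum((yi - ybar) ** 2 for yi in y))
r = num / den
print(r)  # 0.0: no linear association despite a perfect curved relationship
```

Positive products of deviations on the right half of the parabola exactly cancel negative products on the left half, so a near-zero r should always be followed up with a scatterplot before concluding “no relationship.”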
Example 3.4 As far back as Leonardo da Vinci, height and wingspan (measured from fingertip to fingertip between outstretched hands) were known to be closely related. For the following actual measurements (in inches) from 16 students in a statistics class, notice how close the two values are.
Height: 59.0 72.0 67.0 63.5 68.0 66.0 71.0 69.0
Wingspan: 57.5 70.5 69.0 63.5 71.0 67.0 71.5 68.5
Height: 73.0 69.0 69.5 72.0 73.5 73.0 74.0 70.0
Wingspan: 74.0 69.5 71.0 71.5 75.0 75.5 74.5 73.0
The scatterplot in Figure 3.7 shows an approximately linear shape, and the point cloud is roughly elliptical. The correlation is computed to be 0.955. If the measurements were converted to centimeters, the correlation would remain unchanged.
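Both the value of r and its invariance under a linear change of units can be verified computationally; the following is a Python sketch of ours, not part of the text:

```python
height = [59.0, 72.0, 67.0, 63.5, 68.0, 66.0, 71.0, 69.0,
          73.0, 69.0, 69.5, 72.0, 73.5, 73.0, 74.0, 70.0]
wingspan = [57.5, 70.5, 69.0, 63.5, 71.0, 67.0, 71.5, 68.5,
            74.0, 69.5, 71.0, 71.5, 75.0, 75.5, 74.5, 73.0]

def corr(x, y):
    """Pearson's sample correlation coefficient."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x) - sx * sx / n
    syy = sum(v * v for v in y) - sy * sy / n
    sxy = sum(u * v for u, v in zip(x, y)) - sx * sy / n
    return sxy / (sxx * syy) ** 0.5

r_in = corr(height, wingspan)
r_cm = corr([2.54 * h for h in height], [2.54 * w for w in wingspan])
print(round(r_in, 3))            # 0.955
print(abs(r_in - r_cm) < 1e-12)  # True: rescaling both variables leaves r unchanged
```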
[Figure 3.7 Scatterplot of wingspan versus height]
3.2 Correlation 113
Example 3.5 The article “Quantitative Estimation of Clay Mineralogy in Fine-Grained Soils” (J. Geotech. Geoenviron. Engr., 2011: 997–1008) reported on various chemical properties of natural and artificial soils. Consider the accompanying data on the cation exchange capacity (CEC, in meq/100 g) and specific surface area (SSA, in m²/g) of 20 natural soils. A scatterplot appears in Figure 3.8.
CEC: 66 121 134 101 77 89 63 57 117 118
SSA: 175 324 460 288 205 210 295 161 314 265
CEC: 76 125 75 71 133 104 76 96 58 109
SSA: 236 355 240 133 431 306 132 269 158 303
[Figure 3.8 Scatterplot of SSA versus CEC]
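No value of r is reported for these data, but it is easily computed. The Python sketch below (ours, with our own variable names) gives a strongly positive correlation, consistent with the increasing pattern in the scatterplot:

```python
# Pearson's r for the 20 natural soils of Example 3.5 (CEC vs. SSA).
cec = [66, 121, 134, 101, 77, 89, 63, 57, 117, 118,
       76, 125, 75, 71, 133, 104, 76, 96, 58, 109]
ssa = [175, 324, 460, 288, 205, 210, 295, 161, 314, 265,
       236, 355, 240, 133, 431, 306, 132, 269, 158, 303]
n = len(cec)

sx, sy = sum(cec), sum(ssa)
Sxx = sum(v * v for v in cec) - sx * sx / n
Syy = sum(v * v for v in ssa) - sy * sy / n
Sxy = sum(u * v for u, v in zip(cec, ssa)) - sx * sy / n

r = Sxy / (Sxx * Syy) ** 0.5
print(r)   # a fairly strong positive correlation, consistent with Figure 3.8
```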
Example 3.6 The accompanying data on y = glucose concentration (g/L) and x = fermentation time (days) for a particular brand of malt liquor was read from a scatterplot appearing in the article “Improving Fermentation Productivity with Reverse Osmosis” (Food Tech., 1984: 92–96):

x: 1 2 3 4 5 6 7 8
y: 74 54 52 51 52 53 58 71

The scatterplot of Figure 3.9 suggests a strong relationship, but not a linear one, between x and y. With
114 chapter 3 Bivariate and Multivariate Data and Distributions
[Figure 3.9 Scatterplot of glucose concentration versus fermentation time]
Sxy = 2094 − (36)(465)/8 = 1.5000
Sxx = 204 − (36)²/8 = 42    Syy = 586.875

r = 1.500/√((42)(586.875)) = .0096 ≈ .01
This shows the importance of interpreting r as measuring the extent of any linear relationship. We should not conclude that there is no relation whatsoever just because r ≈ 0.
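The near-zero value of r for this curved pattern can be reproduced directly (a Python sketch; the variable names are ours):

```python
# Glucose data from Example 3.6: r near 0 despite a clear (curved) pattern.
x = [1, 2, 3, 4, 5, 6, 7, 8]               # fermentation time (days)
y = [74, 54, 52, 51, 52, 53, 58, 71]       # glucose concentration (g/L)
n = len(x)

sx, sy = sum(x), sum(y)
Sxx = sum(v * v for v in x) - sx * sx / n                 # 42
Syy = sum(v * v for v in y) - sy * sy / n                 # 586.875
Sxy = sum(u * v for u, v in zip(x, y)) - sx * sy / n      # 1.5

r = Sxy / (Sxx * Syy) ** 0.5
print(round(r, 4))   # 0.0096
```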
3.2 Exercises 115
In Chapter 11, we show how the sample characteristic r can be used to make an inference concerning the population characteristic ρ. In particular, r can be used to decide whether ρ = 0 (no linear relationship in the population).
3.3 Fitting a Line to Bivariate Data 117
15. A sample of automobiles traversing a certain stretch of highway is selected. Each automobile travels at a roughly constant rate of speed, though speed does vary from auto to auto. Let x = speed and y = time needed to traverse this segment of highway. Would the sample correlation coefficient be closest to .9, .3, −.3, or −.9? Explain.

16. Suppose that x and y are positive variables and that a sample of n pairs results in r ≈ 1. If the sample correlation coefficient is computed for the (x, y²) pairs, will the resulting value also be approximately 1? Explain.

17. Nine students currently taking introductory statistics are randomly selected, and both the first midterm exam score (x) and the second midterm score (y) are determined. Three of the students have the class at 8 a.m., another three have it at noon, and the remaining three have a night class. The resulting (x, y) pairs are as follows:
8 a.m.: (70, 60) (72, 83) (94, 85)
Noon: (80, 72) (60, 74) (55, 58)
Night: (45, 63) (50, 40) (35, 54)
a. Calculate the sample correlation coefficient for the nine (x, y) pairs.
b. Let x̄1 be the average score on the first midterm exam for the 8 a.m. students and ȳ1 be the average score on the second midterm for these students. Denote the two averages for the noon students by x̄2 and ȳ2, and for the night students by x̄3 and ȳ3. Calculate r for these three (x̄, ȳ) pairs.
c. Construct a scatterplot of the nine (x, y) pairs and another one of the three pairs of averages. Can you see why r in part (a) is smaller than r in part (b)? Does this suggest that a correlation coefficient based on averages (called an “ecological” correlation) might be misleading? Explain.

18. Suppose data is collected on two quantitative variables, x and y. Let r be the corresponding sample correlation coefficient for (x, y). The x and y values are then transformed as follows: x′ = a + bx, y′ = c + dy, where a, b, c, and d are constants. Let r′ be the corresponding sample correlation coefficient for (x′, y′).
a. Show that x̄′ = a + bx̄ and ȳ′ = c + dȳ.
b. Show that sx′ = bsx and sy′ = dsy.
c. Show that r = r′.
ŷ = 25 + (.30)(100) = 25 + 30 = 55
y = 25 + .30x
[Figure: scatterplot with the line y = 10 + 2x superimposed, showing the points (15, 47) and (13, 28) and their deviations from the line]
Similarly,
deviation from (13, 28) = 28 − [10 + 2(13)] = −8
A positive deviation results from a point that lies above the chosen line, and a negative
deviation from a point that lies below this line. A particular line gives a good fit if the
deviations from the line are small in magnitude, that is, reasonably close to zero.
We now need a way to combine the n deviations into a single measure of fit. The
standard approach is to square the deviations (to obtain nonnegative numbers) and sum
these squared deviations.
DEFINITIONS The most widely used criterion for assessing the goodness of fit of a line y = a + bx to bivariate data (x1, y1), . . . , (xn, yn) is the sum of the squared deviations about the line:

g(a, b) = Σ [yi − (a + bxi)]²

According to the principle of least squares, the line that gives the best fit to the data is the one that minimizes this sum; it is called the least squares line or sample regression line.
To find the equation of the least squares line, let g(a, b) = Σ [yi − (a + bxi)]². Then the intercept a and slope b of the least squares line are the values of a and b that minimize g(a, b). These minimizing values are obtained by taking the partial derivative of the g function first with respect to a and then with respect to b, and equating these two partial derivatives to zero (this is analogous to solving the single equation f′(z) = 0 to find the value of z that minimizes a function of a single variable). This results in the following two equations in two unknowns, called the normal equations:

na + (Σxi)b = Σyi    (Σxi)a + (Σxi²)b = Σxiyi

These equations are easily solved because they are linear in the unknowns (a consequence of using squared deviations in the fitting criterion).
b = [Σxiyi − (Σxi)(Σyi)/n] / [Σxi² − (Σxi)²/n] = Sxy/Sxx
a = (Σyi − bΣxi)/n = ȳ − bx̄
The equation of the least squares line is often written as ŷ = a + bx, where the “ˆ” above y emphasizes that ŷ is a prediction of y that results from the substitution of
any particular x value into the equation. Notice that the numerator and denominator
of b appeared previously in the formula for the sample correlation coefficient r.
Example 3.7 The cetane number is a critical property in specifying the ignition quality of a fuel used in a diesel engine. Determining this number for a biodiesel fuel is expensive and time consuming. The article “Relating the Cetane Number of Biodiesel Fuels to Their Fatty Acid Composition: A Critical Study” (J. of Automobile Engr., 2009: 565–583) included the following data on x = iodine value (g) and y = cetane number for a sample of 14 biofuels. The iodine value is the amount of iodine necessary to saturate a sample of 100 g of oil.
x: 132.0 129.0 120.0 113.2 105.0 92.0 84.0
y: 46.0 48.0 51.0 52.1 54.0 52.0 59.0
x: 83.2 88.4 59.0 80.0 81.5 71.0 69.2
y: 58.7 61.6 64.0 61.4 54.6 58.8 58.0
The necessary summary quantities for hand calculation can be obtained by placing the x values in one column and the y values in another and then creating columns for x², xy, and y² (the latter value is not needed at the moment but will be used shortly). Calculating the column sums gives
Σxi = 1307.5, Σyi = 779.2, Σxi² = 128,913.93,
Σxiyi = 71,347.30, Σyi² = 43,745.22

from which

x̄ = 1307.5/14 = 93.392857    ȳ = 779.2/14 = 55.657143
Sxx = 128,913.93 − (1307.5)²/14 = 6802.7693
Sxy = 71,347.30 − (1307.5)(779.2)/14 = −1424.41429

Thus

b = −1424.41429/6802.7693 = −.20938742
a = 55.657143 − (−.20938742)(93.392857) = 75.212432
and the equation of the least squares line is ŷ = 75.212 − .2094x, exactly that reported in the cited article.
Figure 3.11, generated by the statistical computer package Minitab, shows that the least squares line is a very good summary of the relationship between the two variables. A prediction of the cetane number when the iodine value is 100 is ŷ = 75.212 − .2094(100) = 54.27. The slope of the least squares line tells us that a decrease of roughly .209 in cetane number is associated with a 1-gram increase in iodine value.
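The entire fit can be reproduced in a few lines; the following Python sketch (ours, mirroring the hand calculation rather than Minitab) recovers the slope, intercept, and the prediction at an iodine value of 100:

```python
# Least squares fit for the biodiesel fuel data of Example 3.7,
# using the same column sums as the hand calculation.
x = [132.0, 129.0, 120.0, 113.2, 105.0, 92.0, 84.0,
     83.2, 88.4, 59.0, 80.0, 81.5, 71.0, 69.2]   # iodine value (g)
y = [46.0, 48.0, 51.0, 52.1, 54.0, 52.0, 59.0,
     58.7, 61.6, 64.0, 61.4, 54.6, 58.8, 58.0]   # cetane number
n = len(x)

sum_x, sum_y = sum(x), sum(y)
Sxx = sum(v * v for v in x) - sum_x ** 2 / n
Sxy = sum(u * v for u, v in zip(x, y)) - sum_x * sum_y / n

b = Sxy / Sxx                   # slope
a = sum_y / n - b * sum_x / n   # intercept: a = ybar - b*xbar

print(round(b, 4))              # -0.2094
print(round(a, 3))              # 75.212
print(round(a + b * 100, 2))    # prediction at iodine value 100: 54.27
```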
[Figure 3.11 Scatterplot from Minitab for Example 3.7 with least squares line superimposed]
The least squares line should not be used to make a prediction for an x value much beyond the range of the data, such as x = 50 or x = 250 in Example 3.7. The danger of extrapolation is that the fitted relationship (here, a line) may not be valid for such x values.
Regression
The term regression comes from the relationship between the least squares line and the sample correlation coefficient. Let sx and sy denote the sample standard deviations of the x and y values, respectively. Algebraic manipulation gives

b = r(sy/sx)    ŷ = ȳ + r(sy/sx)(x − x̄)
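This identity is easy to confirm numerically; the sketch below (our Python, using the Example 3.7 data) checks that the least squares slope equals r(sy/sx):

```python
# Verify b = r(sy/sx) on the iodine value / cetane number data of Example 3.7.
x = [132.0, 129.0, 120.0, 113.2, 105.0, 92.0, 84.0,
     83.2, 88.4, 59.0, 80.0, 81.5, 71.0, 69.2]
y = [46.0, 48.0, 51.0, 52.1, 54.0, 52.0, 59.0,
     58.7, 61.6, 64.0, 61.4, 54.6, 58.8, 58.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum((u - xbar) ** 2 for u in x)
Syy = sum((v - ybar) ** 2 for v in y)
Sxy = sum((u - xbar) * (v - ybar) for u, v in zip(x, y))

b = Sxy / Sxx                              # least squares slope
r = Sxy / (Sxx * Syy) ** 0.5               # sample correlation
s_x = (Sxx / (n - 1)) ** 0.5               # sample standard deviations
s_y = (Syy / (n - 1)) ** 0.5

print(abs(b - r * (s_y / s_x)) < 1e-12)    # True: the identity holds
```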
height; the predicted height of a son was always closer to the mean height than was
his father’s height.
SSTo = Σyi² − (Σyi)²/n
A computing formula for the residual sum of squares makes it unnecessary to calculate the residuals:

SSResid = SSTo − bSxy

Because b and Sxy have the same sign, bSxy is a positive quantity unless b = 0, so the computing formula shows that SSResid = SSTo if b = 0 and SSResid < SSTo otherwise.
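The computing formula can be checked against a direct summation of squared residuals; here is a Python sketch (ours) using the glucose data of Example 3.6:

```python
# Verify SSResid = SSTo - b*Sxy by comparing it with the residual sum of
# squares computed directly from the fitted line (glucose data, Example 3.6).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [74, 54, 52, 51, 52, 53, 58, 71]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum((u - xbar) ** 2 for u in x)
Sxy = sum((u - xbar) * (v - ybar) for u, v in zip(x, y))
SSTo = sum((v - ybar) ** 2 for v in y)

b = Sxy / Sxx
a = ybar - b * xbar

direct = sum((v - (a + b * u)) ** 2 for u, v in zip(x, y))  # Σ(residual²)
shortcut = SSTo - b * Sxy                                    # computing formula
print(abs(direct - shortcut) < 1e-9)                         # True
```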
To avoid any rounding effects, use as much decimal accuracy in b as possible when
computing SSResid.
SSResid is often referred to as a measure of “unexplained” variation; it is the amount of variation in y that cannot be attributed to the linear relationship between x and y. The more the points in the scatterplot deviate from the least squares line, the larger the value of SSResid and the greater the amount of y variation that cannot be explained by a linear relation. Similarly, SSTo is interpreted as a measure of total variation; the larger the value of SSTo, the greater the amount of variability in the observed yi’s. The ratio SSResid/SSTo is the fraction or proportion of total variation that is unexplained by a straight-line relation. Subtracting this ratio from 1.0 gives the proportion of total variation that is explained.
Example 3.8 The scatterplot of the iodine value and cetane number data in Figure 3.11 portends a reasonably high r² value. With

Sxy = −1424.41429 (the numerator of b)    b = −.20938742
r² = 1 − SSResid/SSTo, often expressed as a percentage by multiplying by 100
The symbol r was used in Section 3.2 to denote Pearson’s sample correlation coefficient. It is not coincidental that r² is used to represent the coefficient of determination. The notation suggests how these two quantities are related: the coefficient of determination is the square of the sample correlation coefficient r.
DEFINITION The standard deviation about the least squares line is given by

se = √(SSResid/(n − 2))

Roughly speaking, se is the typical amount by which an observation deviates from the least squares line. Justification for division by n − 2 and the use of the subscript e are given in Chapter 11.
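For the cetane data of Example 3.7, both summary measures can be computed together; the following Python sketch (ours) applies the computing formulas for SSResid, r², and se:

```python
# SSTo, SSResid, r^2, and se for the cetane data of Examples 3.7 and 3.8.
x = [132.0, 129.0, 120.0, 113.2, 105.0, 92.0, 84.0,
     83.2, 88.4, 59.0, 80.0, 81.5, 71.0, 69.2]
y = [46.0, 48.0, 51.0, 52.1, 54.0, 52.0, 59.0,
     58.7, 61.6, 64.0, 61.4, 54.6, 58.8, 58.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum((u - xbar) ** 2 for u in x)
Sxy = sum((u - xbar) * (v - ybar) for u, v in zip(x, y))
SSTo = sum((v - ybar) ** 2 for v in y)

b = Sxy / Sxx
SSResid = SSTo - b * Sxy            # computing formula from the text
r2 = 1 - SSResid / SSTo             # coefficient of determination
se = (SSResid / (n - 2)) ** 0.5     # standard deviation about the line

print(round(r2, 3))   # 0.791
print(round(se, 2))   # 2.56
```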
Example 3.9 The values of x 5 commuting distance and y 5 commuting time were determined
for workers in samples from three different regions. Data is presented in Table 3.1;
the three scatterplots are displayed in Figure 3.13.
For sample 1, a rather small proportion of variation in y can be attributed to
an approximate linear relationship, and a typical deviation from the least squares
line is roughly 4. The amount of variability about the line for sample 2 is the
same as for sample 1, but the value of r2 is much higher because y variation is
much greater overall in sample 2 than in sample 1. Sample 3 yields roughly the
same high value of r2 as does sample 2, but the typical deviation from the line for
sample 3 is only half that for sample 2. A complete picture of variation requires
that both r2 and se be computed.
[Figure 3.13 Scatterplots of commuting time versus distance for the three samples of Example 3.9: (a) Region 1; (b) Region 2; (c) Region 3]
DEFINITION A residual plot is a plot of the (x, residual) pairs—that is, of the pairs (x1, y1 − ŷ1), (x2, y2 − ŷ2), . . . , (xn, yn − ŷn)—or of the residuals versus predicted values—the pairs (ŷ1, y1 − ŷ1), . . . , (ŷn, yn − ŷn).
Example 3.10 Consider the accompanying data on x = height (in.) and y = average weight (lb) for American females aged 30–39 (taken from The World Almanac and Book of Facts). The scatterplot displayed in Figure 3.14(a) appears rather straight. However, when the residuals from the least squares line (ŷ = −98.23 + 3.596x) are plotted, substantial curvature is apparent (even though r² ≈ .99). It is not accurate to say that weight increases in direct proportion to height (linearly with height). Instead, average weight increases somewhat more rapidly in the range of relatively large heights than it does for relatively small heights.
x: 58 59 60 61 62 63 64 65
y: 113 115 118 121 124 128 131 134
x: 66 67 68 69 70 71 72
y: 137 141 145 150 153 159 164
[Figure 3.14 Plots of data from Example 3.10: (a) scatterplot; (b) residual plot]
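The curvature is visible even without a plot: the residuals change sign systematically along the x range. The residual calculation can be sketched in Python (ours):

```python
# Residuals from the least squares line for the height/weight data of
# Example 3.10: positive at both ends, negative in the middle -> curvature.
x = list(range(58, 73))                                  # height (in.)
y = [113, 115, 118, 121, 124, 128, 131, 134,
     137, 141, 145, 150, 153, 159, 164]                  # average weight (lb)
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum((u - xbar) ** 2 for u in x)
Sxy = sum((u - xbar) * (v - ybar) for u, v in zip(x, y))
b = Sxy / Sxx
a = ybar - b * xbar

resid = [v - (a + b * u) for u, v in zip(x, y)]

print(round(b, 3))    # 3.596
print(round(a, 2))    # -98.23
print(resid[0] > 0, resid[7] < 0, resid[-1] > 0)   # True True True
```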
We also hope that there are no unusual points in the plot. A point falling far above or below the horizontal line at height zero corresponds to a large residual, which may indicate some type of unusual behavior, such as a recording error, nonstandard experimental condition, or atypical experimental subject. A point whose x value differs greatly from others in the data set may have exerted excessive influence in determining the fitted line. One method for assessing the impact of such an isolated point on the fit is to delete it from the data set and then recompute the best-fit line and various other quantities. Substantial changes in the equation, predicted values, r², and se warn of instability in the data. More information may then be needed before reliable conclusions can be drawn.
Example 3.11 Bioaerosols are airborne particles such as bacteria or pollen that, when found in indoor environments, may cause infectious or allergic health effects. The Andersen method for determining bioaerosol concentration requires a 2–7-day incubation period. The article “Measurement of Indoor Bioaerosol Levels by a Direct Counting Method” (J. of Envir. Engr., 1996: 374–378) discussed an alternative technique, the FFDC method. Consider the accompanying data, read from a plot in the cited article:
Observation    x      y      ŷ      Residual
     1        119    239   225.1      13.9
     2        140    262   240.3      21.7
     3        150    202   247.6     −45.6
     4        157    224   252.7     −28.7
     5        171    255   262.8      −7.8
     6        200    292   283.9       8.1
     7        218    350   296.9      53.1
     8        250    298   320.2     −22.2
     9        272    313   336.2     −23.2
    10        321    415   371.7      43.3
    11        573    542   554.7     −12.7
The equation of the least squares line is ŷ = 138.68 + .726x, with r² = .901. (The slope, intercept, and r² differ very slightly from values given in the article.)
[Figure 3.15 Plots for Example 3.11: (a) scatterplot of FFDC concentration versus Andersen concentration, with observation 11 flagged as a potentially influential observation; (b) residuals versus predicted values]
Figure 3.15 shows a scatterplot and a residual plot (here, residuals versus predicted values) from R (this package has excellent graphics capabilities). There is no single residual that is much larger in magnitude than the other residuals. The most striking feature here is that x11 is much larger than any other x value in the sample, so that (x11, y11) is an observation with potentially high influence (sometimes called a
3.3 Exercises 129
high-leverage observation). This point would not in fact be highly influential if it fell close to the least squares line based on just the first ten observations. However, the equation of this line is ŷ = 115.09 + .850x with r² = .757; this r² value is much lower than the original value, and the slope and intercept have also changed substantially. Without the influential observation, evidence for a very strong linear relationship between concentrations assessed by the two methods is not nearly so compelling.
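The effect of the high-leverage point can be reproduced by refitting with and without observation 11; the following Python sketch (ours) recovers both sets of values quoted above:

```python
# Refit the Example 3.11 line with and without the high-leverage point (573, 542).
x = [119, 140, 150, 157, 171, 200, 218, 250, 272, 321, 573]
y = [239, 262, 202, 224, 255, 292, 350, 298, 313, 415, 542]

def fit(x, y):
    """Return least squares slope, intercept, and r^2."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((u - xbar) ** 2 for u in x)
    syy = sum((v - ybar) ** 2 for v in y)
    sxy = sum((u - xbar) * (v - ybar) for u, v in zip(x, y))
    b = sxy / sxx
    return b, ybar - b * xbar, sxy * sxy / (sxx * syy)

b_all, a_all, r2_all = fit(x, y)
b_10, a_10, r2_10 = fit(x[:10], y[:10])

print(round(b_all, 3), round(a_all, 2), round(r2_all, 3))   # 0.726 138.68 0.901
print(round(b_10, 2), round(a_10, 2), round(r2_10, 3))      # 0.85 115.09 0.757
```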
Resistant Lines
As Example 3.11 shows, the least squares line can be greatly affected by the presence
of even a single observation that shows a large discrepancy in the x or y direction
from the rest of the data. When the data set contains such unusual observations, it
is desirable to have a method for obtaining a summarizing line that is resistant to
the influence of these stray values. In recent years, many methods for obtaining a
resistant (or robust) line have been proposed, and various statistical packages will
fit such lines. Consult a statistician or a book on exploratory data analysis to obtain
more information.
c. Predict the value of the compressive strength when the fiber weight percentage is 6.5.
d. Would you feel comfortable using the least squares line to predict the compressive strength when the fiber weight percentage is 25? Explain. Now predict the value of y when x = 25 and interpret the result.

24. By their nature, deserts are typically exposed to large amounts of solar radiation. Thus, such regions seem to be prime locations for harvesting solar energy through the installation of photovoltaic modules. These modules rely on an optical system to collect sunlight, often through some lens, so an important factor to consider would be the effect of desert sandstorms on lens performance. The authors of “Sandblasting Durability of Acrylic and Glass Fresnel Lenses for Concentrator Photovoltaic Modules” (Solar Energy, 2012: 3021–3025) compared the performance of sandblasted acrylic and glass Fresnel lenses used in concentrator photovoltaic modules. In the experiment, the transmittance after sandblasting of acrylic polymethylmethacrylate (PMMA) and glass Fresnel lenses was measured. The experimental data, kindly provided by the authors, compares y = reduction rate of transmittance (%) and x = sandblast momentum (g·m/s) for 14 PMMA and 8 glass substrate samples:

PMMA: 10.56 20.80 15.84 31.20 48.00
PMMA: 8.56 18.93 19.35 23.65 33.05
PMMA: 21.12 41.60 64.00 16.80 33.20
PMMA: 18.53 29.21 40.39 17.21 27.21
PMMA: 51.20 13.92 27.84 42.72
PMMA: 34.74 17.40 25.89 32.82
Glass: 35.20 52.80 105.60 52.80 70.40
Glass: 5.62 8.10 31.21 13.76 15.37
Glass: 56.00 48.00 139.20
Glass: 14.76 16.55 37.08

a. In one graph, overlay the scatterplots for the PMMA and the glass data sets and comment on any interesting features. Be sure to use different symbols for each data set.
b. Determine the equations for the least squares line for the PMMA and glass data sets. Interpret the slope for each equation.
c. For the PMMA lens, predict the reduction rate of transmittance when sandblast momentum is at 50 g·m/s. Do the same for the glass lens type.
d. Based on your results, which lens type performed better in this experiment?

25. Two important properties of a soil are its initial void ratio (e0, a measure of soil porosity) and its compression index (Cc, an indicator of soil compressibility). The article “Consolidation and Hydraulic Conductivity of Zeolite-Amended Soil-Bentonite Backfills” (J. Geotech. Geoenviron. Engr., 2012: 15–25) reported the following data (read from a graph) for the Cc and e0 variables for sand–bentonite backfills with varying amounts and types of zeolites.

e0: 0.988 1.018 1.058 1.070 1.085 1.145
Cc: 0.19 0.20 0.20 0.22 0.23 0.24

a. Using Cc as the response and e0 as the explanatory variable, create the corresponding scatterplot. Do the values of Cc appear to be perfectly linearly related to the e0 values? Explain.
b. Determine the equation of the least squares line.
c. What proportion of the observed variation in the compression index can be attributed to the approximate linear relationship between the two variables?
d. Predict the value of the compression index when the initial void ratio is 1.10. Would you feel comfortable using the least squares line to predict the compression index when the initial void ratio is .80? Explain.

26. In biofiltration of wastewater, air discharged from a treatment facility is passed through a damp porous membrane that causes contaminants to dissolve in water and be transformed into harmless products. The accompanying data on x = inlet temperature (°C) and y = removal efficiency (%) was the basis for a scatterplot that appeared in the article “Treatment of Mixed Hydrogen Sulfide
and Organic Vapors in a Rock Medium Biofilter” (Water Environment Research, 2001: 426–435):

           Removal                  Removal
Obs  Temp     %        Obs  Temp       %
 1   7.68   98.09       17   8.55    98.27
 2   6.51   98.25       18   7.57    98.00
 3   6.43   97.82       19   6.94    98.09
 4   5.48   97.82       20   8.32    98.25
 5   6.57   97.82       21  10.50    98.41
 6  10.22   97.93       22  17.83    98.51
 7  15.69   98.38       23  17.83    98.71
 8  16.77   98.89       24  17.03    98.79
 9  17.13   98.96       25  16.18    98.87
10  17.63   98.90       26  16.26    98.76
11  16.72   98.68       27  14.44    98.58
12  15.45   98.69       28  12.78    98.73
13  12.06   98.51       29  12.25    98.45
14  11.44   98.09       30  11.69    98.37
15  10.17   98.25       31  11.34    98.36
16   9.64   98.36       32  10.97    98.45

Calculated summary quantities are Σxi = 384.26, Σyi = 3149.04, Σxi² = 5099.2412, Σxiyi = 37,850.7762, and Σyi² = 309,892.6548.
a. Does a scatterplot of the data suggest appropriateness of the simple linear regression model?
b. Determine the equation of the least squares line, obtain a point prediction of removal efficiency when temperature = 10.50, and calculate the value of the corresponding residual.
c. Roughly what is the size of a typical deviation of points in the scatterplot from the least squares line?
d. What proportion of observed variation in removal efficiency can be attributed to the approximate linear relationship?
e. Personal communication with the authors of the article revealed that there was one additional observation that was not included in their scatterplot: (6.53, 96.55). What impact does this additional observation have on the equation of the least squares line and the values of se and r²?

27. Consider the following four (x, y) data sets; the first three have the same x values, so these values are listed only once (from “Graphs in Statistical Analysis,” Amer. Statistician, 1973: 17–21).

Data set:  1–3     1      2      3      4      4
Variable:   x      y      y      y      x      y
          10.0   8.04   9.14   7.46    8.0   6.58
           8.0   6.95   8.14   6.77    8.0   5.76
          13.0   7.58   8.74  12.74    8.0   7.71
           9.0   8.81   8.77   7.11    8.0   8.84
          11.0   8.33   9.26   7.81    8.0   8.47
          14.0   9.96   8.10   8.84    8.0   7.04
           6.0   7.24   6.13   6.08    8.0   5.25
           4.0   4.26   3.10   5.39   19.0  12.50
          12.0  10.84   9.13   8.15    8.0   5.56
           7.0   4.82   7.26   6.42    8.0   7.91
           5.0   5.68   4.74   5.73    8.0   6.89

For each of these four data sets, the values of the summary quantities, Σxi, Σyi, and so on, are almost identical, so the equation of the least squares line (ŷ = 3 + .5x), SSResid, SSTo, r², and se will be virtually the same for all four. Based on a scatterplot and a residual plot for each data set, comment on the appropriateness of fitting a straight line; include any specific suggestions for how a “straight-line analysis” might be modified or qualified.
Power Transformations
Suppose that the general pattern in a scatterplot is curved and monotonic—either strictly
increasing or strictly decreasing. In this case, it is often possible to find a power transformation for x or y so that there is a linear pattern in a scatterplot of the transformed
3.4 Nonlinear Relationships 133
data. By a power transformation, we mean the use of exponents p and q such that the
transformed values are x′ = x^p and/or y′ = y^q; the relevant scatterplot is of the (x′, y′)
pairs. Figure 3.16 displays a “ladder” of the most frequently used transformations and
a guide for choosing an appropriate transformation, depending on the pattern in the
original scatterplot.
[Figure 3.16: guide for choosing a straightening transformation; curved segments labeled 1–4 show the possible monotone patterns in the original scatterplot]

Power transformation ladder: transformed value = (original value)^power

Power    Transformed value        Name
  3      (original value)³        Cube
  2      (original value)²        Square
  1      original value           No transformation
 1/2     √(original value)        Square root
 1/3     ∛(original value)        Cube root
  0      log(original value)      Logarithm
 −1      1/(original value)       Reciprocal
For example, suppose the pattern has the shape of segment 2 in Figure 3.16. Then to straighten the plot, we should use a transformation on x that is up the ladder from the no-transformation row, for example, x′ = x² or x³, or a transformation on y that is down the ladder, such as y′ = 1/y or ln(y) (log10 would produce equivalent results). A residual plot should be used to check that curvature has in fact been removed. Once a straightening transformation has been identified, a straight line can be fit to the (x′, y′) points using least squares. If it was not necessary to transform y, then the line provides a direct way of predicting y values: calculate x′ and substitute into the equation. When y has been transformed, the line gives predictions of y′ values. The transformation can then be reversed to obtain predictions of y. For example, if x′ = 1/x and y′ = √y, the least squares line gives

√y ≈ a + b/x

from which

y ≈ (a + b/x)²
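To make the straighten-then-reverse recipe concrete, here is a small sketch in Python (NumPy) rather than a statistics package; the data are hypothetical, generated to satisfy y = (2 + 3/x)² exactly so that the x′ = 1/x, y′ = √y rungs straighten them completely.

```python
import numpy as np

# Hypothetical data, generated to satisfy y = (2 + 3/x)^2 exactly --
# a decreasing, curved pattern that the x' = 1/x, y' = sqrt(y)
# rungs of the ladder straighten completely.
x = np.array([1.0, 2.0, 4.0, 5.0, 8.0, 10.0])
y = (2 + 3 / x) ** 2

x_prime = 1 / x          # x down the ladder
y_prime = np.sqrt(y)     # y down the ladder

# Least squares line y' = a + b x' fit to the transformed pairs
b, a = np.polyfit(x_prime, y_prime, 1)   # polyfit returns slope first

def predict_y(x_new):
    # Reverse the transformation: sqrt(y) ~ a + b/x  =>  y ~ (a + b/x)^2
    return (a + b / x_new) ** 2
```

With real data the transformed points would only be approximately linear, and a residual plot of y′ against the fitted line is the check that the chosen rung actually removed the curvature.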
Example 3.12 No tortilla chip aficionado likes soggy chips, so it is important to find characteristics of
the production process that produce chips with an appealing texture. The following
data on x = frying time (sec) and y = moisture content (%) appeared in the article
“Thermal and Physical Properties of Tortilla Chips as a Function of Frying Time”
(J. of Food Processing and Preservation, 1995: 175–189):
x: 5 10 15 20 25 30 45 60
y: 16.3 9.7 8.1 4.2 3.4 2.9 1.9 1.3
The scatterplot in Figure 3.17(a), opposite, has the pattern of segment 3 in
Figure 3.16, so we must go down the ladder for x or y. A scatterplot of the (ln(x),
ln(y)) pairs in Figure 3.17(b) is quite straight. A regression of ln(y) on ln(x) gives
a = 4.6384, b = −1.04920, and r² = .976. The residual plot of Figure 3.17(c)
shows no evidence of curvature, though there is one rather large residual.
[Figure 3.17(a): scatterplot of moisture content versus frying time; (b): scatterplot of ln(moisture content) versus ln(frying time)]
Figure 3.17 Plots of the data from Example 3.12: (c) plot of the
residuals from the transformed regression
Taking the antilog of 1.495 (the predicted value of ln(y) when frying time x = 20) gives a prediction of y itself: e^1.495 = 4.46%. In fact, taking the antilog of both sides of the linear equation gives an explicit nonlinear relationship between x and y:

y = e^ln(y) ≈ e^(4.6384 − 1.04920 ln(x)) = (e^4.6384)(e^(−1.04920 ln(x))) = 103.379x^(−1.04920)
This is often called a power function relationship between x and y.
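The fit in Example 3.12 can be recomputed from the eight (x, y) pairs with a few lines of code; below is a sketch in Python (NumPy) rather than the package used in the text, with the book's reported values in the comments.

```python
import numpy as np

# Frying time (sec) and moisture content (%) from Example 3.12
x = np.array([5.0, 10, 15, 20, 25, 30, 45, 60])
y = np.array([16.3, 9.7, 8.1, 4.2, 3.4, 2.9, 1.9, 1.3])

# Regress ln(y) on ln(x): both variables go down the ladder
b, a = np.polyfit(np.log(x), np.log(y), 1)

# r^2 for the transformed fit
resid = np.log(y) - (a + b * np.log(x))
r2 = 1 - np.sum(resid ** 2) / np.sum((np.log(y) - np.log(y).mean()) ** 2)

# Back-transform: y = e^a * x^b, the power function relationship
coef = np.exp(a)
# Text reports a = 4.6384, b = -1.04920, r2 = .976, e^a = 103.379
```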
Fitting a Quadratic Function

Sometimes the pattern in a scatterplot is curved but not monotonic: it is possible for y first to increase and then to decrease (like a bowl turned upside down) or for y first to decrease and then to increase.
In such instances, it is reasonable to fit a quadratic function a + b1x + b2x², whose graph is a parabola, to the data. If the quadratic coefficient b2 is positive, the parabola turns upward, whereas it turns downward if b2 is negative. Just as in fitting a straight line, the principle of least squares can be employed to find the best-fit quadratic. The least squares coefficients a, b1, and b2 are the values that minimize

g(a, b1, b2) = Σ[yi − (a + b1xi + b2xi²)]²

which is the sum of squared vertical deviations from the points in the scatterplot to the parabola determined by the quadratic with coefficients a, b1, and b2. Taking the partial derivative of the g function first with respect to a, then with respect to b1, and finally with respect to b2,
and equating these three expressions to zero gives three equations in three unknowns. These
normal equations are again linear in the unknowns, but because there are three rather than
just two, there is no explicit elementary expression for their solution. Instead, matrix algebra
must be used to solve the system numerically for each different data set. Fortunately, solution
procedures have been programmed into the most popular statistical computer packages, so
it is necessary only to make the appropriate request and then sit back and wait for output.
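The three-equation system can in fact be solved with a few lines of linear algebra; the following Python (NumPy) sketch does exactly that for some made-up data (the glucose data of Example 3.13 appear in the text only as a plot).

```python
import numpy as np

# Made-up illustrative data lying near an upward-turning parabola
# (not the glucose data of Example 3.13, which is shown only as a plot)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([10.1, 7.2, 6.8, 7.9, 11.2, 16.1])

# Design matrix with columns 1, x, x^2.  The normal equations are the
# 3-by-3 linear system (X'X) [a, b1, b2]' = X'y.
X = np.column_stack([np.ones_like(x), x, x ** 2])
a, b1, b2 = np.linalg.solve(X.T @ X, X.T @ y)

fitted = a + b1 * x + b2 * x ** 2
ss_resid = np.sum((y - fitted) ** 2)
```

A statistics package does the same arithmetic when a quadratic regression is requested; only the bookkeeping around the output differs.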
Example 3.13 The scatterplot of y = glucose concentration versus x = fermentation time shown in Figure 3.9 (at the end of Section 3.2) has the appearance of an upward-turning quadratic. We supplied the data to Minitab and made the appropriate regression request
to obtain the accompanying output. The fitted quadratic equation appears at the top
of the output, and the values of the least squares coefficients a, b1, b2 appear in the
Coef column just below the equation. A prediction for glucose concentration when
fermentation time is 4 hours is
ŷ = 84.482 − 15.875(4) + 1.7679(4)² = 49.27
The regression equation is
glucconc = 84.5 - 15.9 time + 1.77 timesqd
Predictor Coef Stdev t-ratio p
Constant 84.482 4.904 17.23 0.000
time -15.875 2.500 -6.35 0.001
timesqd 1.7679 0.2712 6.52 0.001
s = 3.515 R–sq = 89.5% R–sq (adj) = 85.3%
Analysis of Variance
SOURCE DF SS MS F p
Regression 2 525.11 262.55 21.25 0.004
Error 5 61.77 12.35
Total 7 586.88
Predicted or fitted values ŷ1, . . . , ŷn are obtained by substituting the successive x values x1, . . . , xn into the fitted quadratic equation (e.g., in Example 3.13, ŷ4 = 49.27), and the residuals are the vertical deviations y1 − ŷ1, . . . , yn − ŷn from the observed points to the graph of the fitted quadratic (e.g., y4 − ŷ4 = 51 − 49.27 = 1.73). Residual or error sum of squares and total sum of squares are defined exactly as they were previously:

SSResid = Σ(yi − ŷi)²    SSTo = Σ(yi − ȳ)²

The Minitab output of Example 3.13 shows that SSResid = 61.77 and SSTo = 586.88. The coefficient of multiple determination, denoted by R², is now the proportion of observed y variation that can be attributed to the approximate quadratic relationship:

R² = 1 − SSResid/SSTo
The R² value in Example 3.13 is .895, so about 89.5% of the observed variation in
glucose concentration can be attributed to the approximate quadratic relation between
concentration and fermentation time.
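Both the prediction in Example 3.13 and this R² value follow directly from quantities printed on the Minitab output; a quick check in Python:

```python
# Quantities read off the Minitab output of Example 3.13
a, b1, b2 = 84.482, -15.875, 1.7679
ss_resid, ss_to = 61.77, 586.88

# Prediction at fermentation time x = 4 hours
y_hat = a + b1 * 4 + b2 * 4 ** 2   # 49.27

# Coefficient of multiple determination
r_squared = 1 - ss_resid / ss_to   # .895
```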
The methodology employed to fit a quadratic is easily extended to fit a higher-order
polynomial. For example, using the principle of least squares to fit a cubic equation
gives a system of normal equations consisting of four equations in four unknowns. The
arithmetic is best left to a statistical computer package. In practice, a cubic equation
is rather rarely fit to data, and it is virtually never appropriate to fit anything of higher
order than this.
Smoothing a Scatterplot
Sometimes the pattern in a scatterplot is too complex for a line or curve of a particular type (e.g., exponential or parabolic) to give a good fit. Statisticians have recently developed some more flexible methods that permit a wide variety of patterns to be modeled using the same fitting procedure. One such method is LOWESS (or LOESS), short for locally weighted scatterplot smoother. Let (x*, y*) denote a particular one of the n (x, y) pairs in the sample. The ŷ value corresponding to (x*, y*) is obtained by fitting a straight line using only a specified percentage of the data (e.g., 25%) whose x values are closest to x*. Furthermore, rather than use "ordinary" least squares, which gives equal weight to all points, those with x values closer to x* are more heavily weighted than those whose x values are farther away.¹ The height of the resulting line above x* is the fitted value ŷ*. This process is repeated for each of the n points, so n different lines are fit (you surely wouldn't want to do all this by hand). Finally, the fitted points are connected to produce a LOWESS curve.
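The procedure just described is easy to express in code. The following Python (NumPy) sketch is a bare-bones LOWESS (tricube neighbourhood weights, no robustness iterations), intended only to make the "n local weighted lines" idea concrete; the kinked data are made up, loosely echoing the bear-weight pattern of Example 3.14.

```python
import numpy as np

def lowess_fit(x, y, span=0.5):
    """Minimal LOWESS sketch: for each point, fit a weighted least
    squares line to the span*n nearest-x neighbours and record the
    line's height at that point.  Tricube weights are one common
    choice; production implementations also add robustness
    iterations omitted here."""
    n = len(x)
    k = max(2, int(round(span * n)))           # neighbourhood size
    fitted = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        idx = np.argsort(d)[:k]                # k closest x values
        dmax = d[idx].max()
        w = (1 - (d[idx] / dmax) ** 3) ** 3 if dmax > 0 else np.ones(k)
        W = np.diag(w)
        # weighted least squares line through the neighbourhood:
        # minimize sum of w_j * (y_j - (a + b x_j))^2
        X = np.column_stack([np.ones(k), x[idx]])
        a, b = np.linalg.solve(X.T @ W @ X, X.T @ W @ y[idx])
        fitted[i] = a + b * x[i]               # height of local line
    return fitted

# Made-up data with a kink, loosely echoing the bear-weight pattern
x = np.linspace(1.0, 10.0, 25)
y = np.where(x < 5, x, 5 + 3 * (x - 5))
smooth = lowess_fit(x, y, span=0.4)
```

Statistical packages add refinements (robustness weights, interpolation over a grid of x values), but the core is exactly these n weighted line fits.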
Example 3.14 Weighing large deceased animals found in wilderness areas is usually not feasible,
so it is desirable to have a method for estimating weight from various characteristics
of an animal that can be easily determined. Minitab has a stored data set consisting of various characteristics for a sample of n = 143 wild bears. Figure 3.18(a), opposite, displays a scatterplot of y = weight versus x = distance around the chest (chest girth). At first glance, it looks as though a single line obtained from ordinary least squares would effectively summarize the pattern. Figure 3.18(b) shows the LOWESS curve produced by Minitab using a span of 50% (the fit at (x*, y*) is determined by the closest 50% of the sample). The curve appears to consist of two straight-line segments joined together above approximately x = 38. The steeper line is to the right of 38, indicating that weight tends to increase more rapidly as girth does for girths exceeding 38 in.
¹The weighted least squares criterion involves finding a and b to minimize Σwi[yi − (a + bxi)]², where w1, . . . , wn are nonnegative weights. For example, if we take w5 = 0, then (x5, y5) is disregarded in obtaining the fitted line. R will also fit a local quadratic in this way.
Figure 3.18 A Minitab scatterplot and LOWESS curve for the bear weight data of Example 3.14
3.4 Exercises 139
a. Would you fit a straight line to the data and use it as a basis for predicting nondimensionalized total bed load from the unsteadiness parameter? Why or why not?
b. Find a transformation that produces an approximate linear relationship between the transformed values. Then fit a line to the transformed data and use it to obtain an equation that describes approximately the relationship between the untransformed variables.

30. In the article "Sensitivity of Oklahoma Binders on Dynamic Modulus of Asphalt Mixes and Distress Functions" (J. Mater. Civ. Engr., 2012: 1076–1088), researchers measured various physical characteristics of performance grade asphalt binders commonly used in Oklahoma. One important physical characteristic is dynamic shear modulus, G (kPa), which is the ratio of maximum shear stress to the maximum shear strain and is a measure of the stiffness or resistance of the asphalt binder to deformation under load. In one experiment, the researchers measured the dynamic shear modulus of the asphalt binder samples over a range of testing temperatures (°C). The following is the corresponding data for binder type PG64-22:

Temp: 54.4  46.1  43.3   29.4
G:    9.28  32.47 46.98  344.36

Temp: 21.1      12.7      4.4
G:    1,030.38  4,870.00  18,300.00

a. Construct a scatterplot of y = dynamic shear modulus versus x = temperature. Would it be reasonable to characterize the relationship between the two variables as approximately linear?
b. Transform only the dependent variable y so that a scatterplot of the transformed data shows a substantial linear pattern. Then fit a straight line to this data, use the line to establish an approximate relationship between x and y, and predict the dynamic shear modulus when the temperature is 35°C.
c. Plot the residuals from your linear fit in part (b) and look for any patterns that might suggest an inappropriate choice of transformation. If necessary, return to part (b) and try a different transformation.

31. Failures in aircraft gas turbine engines due to high cycle fatigue are a pervasive problem. The article "Effect of Crystal Orientation on Fatigue Failure of Single Crystal Nickel Base Turbine Blade Superalloys" (J. of Engr. for Gas Turbines and Power, 2002: 161–176) gave the accompanying data and fit a nonlinear regression function in order to predict strain amplitude from cycles to failure.

Obs  Cycfail  Strampl     Obs  Cycfail  Strampl
 1     1326   .01495       11    7356   .00576
 2     1593   .01470       12    7904   .00580
 3     4414   .01100       13      79   .01212
 4     5673   .01190       14    4175   .00782
 5   29,516   .00873       15   34,676  .00596
 6       26   .01819       16  114,789  .00600
 7      843   .00810       17    2672   .00880
 8     1016   .00801       18    7532   .00883
 9     3410   .00600       19   30,220  .00676
10     7101   .00575

a. Construct scatterplots of y versus x, y versus ln(x), ln(y) versus ln(x), and 1/y versus 1/x.
b. Which transformation from part (a) does the best job of producing an approximate linear relationship?
c. Use the selected transformation to predict amplitude when cycles to failure = 5000.

32. There has been an increasing demand for open-ended steel pipe piles to be used as deep foundations for offshore and onshore structures. When an open-ended pile is driven into the ground, a soil plug often forms within the pile. The driving resistance and the base capacity of the pile are heavily influenced by this plugging effect. As an indicator of the degree of plugging, researchers often use the plug length ratio (PLR), which is the ratio of the plug length at the end of pile installation to the length of the pile. The article "Base Capacity of Open-Ended Steel Pipe Piles in Sand" (J. Geotech. Geoenviron. Engr., 2012: 1116–1128) reported the PLR and corresponding pile inner
diameter, d (mm), of nine test piles used in case studies. The data is given here:

d:   691.0 292.0  83.7  37.2  78.9
PLR:  1.00  0.82  0.76  0.44  0.76

d:   107.9  82.5 1444.0 1444.0
PLR:  0.88  0.75   1.00   1.00

a. The authors were interested in predicting PLR based on the pile inner diameter. Transform only the independent variable x so that a scatterplot of the transformed data shows a substantial linear pattern. Then fit a straight line to this data, use the line to establish an approximate relationship between x and y, and predict the plug length ratio when the pile inner diameter is 500 mm.
b. Plot the residuals from your linear fit in part (a) and look for any patterns that might suggest an inappropriate choice of transformation. If necessary, return to part (a) and try a different transformation.

33. The article "Residual Stresses and Adhesion of Thermal Spray Coatings" (Surface Engr., 2005: 35–40) considered the relationship between the thickness (mm) of NiCrAl coatings deposited on stainless steel substrate and corresponding bond strength (MPa). The following data was read from a plot in the paper:

Thickness: 220  220  220  220  370
Strength: 24.0 22.0 19.1 15.5 26.3

Thickness: 370  370  370  440  440
Strength: 24.6 23.1 21.2 25.2 24.0

Thickness: 440  440  680  680  680
Strength: 21.7 19.2 17.0 14.9 13.0

Thickness: 680  860  860  860  860
Strength: 11.8 12.2 11.2  6.6  2.8

a. Is it possible to transform this data as described in this section so that there is an approximate linear relationship between the transformed variables? Why or why not?
b. Use a statistical computer package to fit a quadratic function to this data and then predict bond strength when thickness is 500. Assess the fit of the quadratic to the data.

34. The accompanying data was extracted from the article "Effects of Cold and Warm Temperatures on Springback of Aluminum-Magnesium Alloy 5083-H111" (J. Engr. Manuf., 2009: 427–431). The response variable is yield strength (MPa), and the predictor is temperature (°C).

x:  −50    25    100    200    300
y:  91.0  120.5  136.0  133.1  120.8

Here is Minitab output from fitting the quadratic regression function (a graph in the cited paper suggests that the authors did this):

Predictor        Coef     SE Coef       T      P
Constant      111.277       2.100   52.98  0.000
temp          0.32845     0.03303    9.94  0.010
tempsqd    -0.0010050   0.0001213   -8.29  0.014

S = 3.44398   R–Sq = 98.1%   R–Sq(adj) = 96.3%

Analysis of Variance
Source           DF       SS      MS      F      P
Regression        2  1245.39  622.69  52.50  0.019
Residual Error    2    23.72   11.86
Total             4  1269.11

a. What is the equation of the best-fit quadratic? Use this quadratic to predict yield strength when temperature is 110.
b. What are the values of SSResid and SSTo? Verify that these values are consistent with the value of R-sq given on the output. Do you think the fit of the quadratic is good? Explain.
3.5 Using More Than One Predictor 141
Example 3.15 Soil and sediment adsorption, the extent to which chemicals collect in a condensed form on the surface, is an important characteristic because it influences the effectiveness of pesticides and various agricultural chemicals. The article "Adsorption of Phosphate, Arsenate, Methanearsonate, and Cacodylate by Lake and Stream Sediments: Comparison with Soils" (J. of Environ. Qual., 1984: 499–504) gave the following data on y = phosphate adsorption index, x1 = amount of extractable iron, and x2 = amount of extractable aluminum:
Observation x1 x2 y
1 61 13 4
2 175 21 18
3 111 24 14
4 124 23 18
5 130 64 26
6 173 38 26
7 169 33 21
8 169 61 30
9 160 39 28
10 244 71 36
11 257 112 65
12 333 88 62
13 199 54 40
Thus the first observation is the triple (x11, x21, y1) = (61, 13, 4), . . . , and the last observation is (x1,13, x2,13, y13) = (199, 54, 40).
[Figure 3.19 panels: pairwise scatterplots of Exiron, Exalum, and Adsorpind]
Figure 3.19 A scatterplot matrix from R of the data from Example 3.15
y ≈ a + b1x1 + b2x2 + ⋯ + bkxk

The g(·) function is the sum of squared deviations between observed y values and what would be predicted by a + b1x1 + ⋯ + bkxk. Determination of the least squares coefficients involves multivariable calculus: Take the partial derivative of g(·) with respect to each unknown, equate these to zero to obtain a system of k + 1 linear equations in the k + 1 unknowns (the normal equations), and solve the system. The arithmetic is quite tedious, but any good statistical computer package can handle the task upon request; a regression command of some sort is usually required.
Example 3.16 (Example 3.15 continued) Figure 3.20 shows partial Minitab output from a request to fit a + b1x1 + b2x2 to the phosphate adsorption data using the principle of least squares. The result is

ŷ = −7.3511 + .11273x1 + .34900x2 ≈ −7.35 + .113x1 + .349x2
[Figure 3.20: partial Minitab output for the multiple regression fit; the R-Sq value on such output is 100(1 − SSResid/SSTo)%]
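Under the hood, the regression request amounts to solving the normal equations for the three unknowns; a sketch in Python (NumPy) rather than Minitab, using the Example 3.15 data, reproduces the coefficients, with the book's reported values in the final comment.

```python
import numpy as np

# Example 3.15 data: x1 = extractable iron, x2 = extractable aluminum,
# y = phosphate adsorption index
x1 = np.array([61, 175, 111, 124, 130, 173, 169, 169, 160, 244, 257, 333, 199.0])
x2 = np.array([13, 21, 24, 23, 64, 38, 33, 61, 39, 71, 112, 88, 54.0])
y = np.array([4, 18, 14, 18, 26, 26, 21, 30, 28, 36, 65, 62, 40.0])

# Least squares fit of y ~ a + b1*x1 + b2*x2 -- the same normal
# equations Minitab solves, here via NumPy's least squares routine
X = np.column_stack([np.ones_like(x1), x1, x2])
a, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

# Text reports yhat = -7.3511 + .11273 x1 + .34900 x2
```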
The same two sums of squares calculated after fitting a line are relevant here: SSResid = Σ(yi − ŷi)² and SSTo = Σ(yi − ȳ)².
[Panels: residuals plotted against (a) extractable iron and (b) extractable aluminum]
Figure 3.21 Residual plots for the adsorption data of Examples 3.15 and 3.16
doorknob on the front door, and so on. It turns out that if 19 predictors are included
(one less than the number of observations), then it will virtually always be the case that
R² = 1. So the goal here is not simply to obtain a set of predictors for which R² is large, but to obtain a large value using relatively few predictors while excluding those of marginal significance. We will further discuss this issue in Chapter 11.
(interpretations will be given in Chapter 11); the fit with all five predictors is called
the full quadratic or complete second-order relationship. In fact, we used a quadratic
predictor in the previous section when fitting a quadratic function to bivariate data.
The two predictors there were x1 = x and x2 = x², implying that quadratic (more
generally, polynomial) regression is a special case of multiple regression.
Example 3.17 Researchers carried out a study to see how y = ultimate deflection, d (mm), of reinforced ultrahigh toughness cementitious composite beams was influenced by x1 = shear span ratio and x2 = splitting tensile strength (MPa), resulting in the
y ≈ a + f1(x1) + ⋯ + fk(xk)

where the forms of f1(·), . . . , and fk(·) are left unspecified. The statistical package R, among others, will execute this general additive fit by calculating a and the individual fi(·)'s; one method for carrying out this latter task is based on the LOWESS technique described in Section 3.4.
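One way a and the individual component functions can be computed is the classic backfitting idea: smooth the partial residuals against each predictor in turn until the components stabilize. The Python sketch below is a deliberate simplification (a crude nearest-neighbour mean smoother stands in for LOWESS, and the additive data are made up), not the algorithm R actually uses.

```python
import numpy as np

def smooth_nn(x, r, k=7):
    # Crude stand-in for LOWESS: average r over the k nearest x values
    out = np.empty(len(x))
    for i in range(len(x)):
        idx = np.argsort(np.abs(x - x[i]))[:k]
        out[i] = r[idx].mean()
    return out

def backfit(x1, x2, y, iters=20):
    """Backfitting for y ~ a + f1(x1) + f2(x2): repeatedly smooth the
    partial residuals against each predictor in turn, keeping each
    component centered so the decomposition is identifiable."""
    a = y.mean()
    f1 = np.zeros(len(y))
    f2 = np.zeros(len(y))
    for _ in range(iters):
        f1 = smooth_nn(x1, y - a - f2)
        f1 -= f1.mean()
        f2 = smooth_nn(x2, y - a - f1)
        f2 -= f2.mean()
    return a, f1, f2

# Made-up additive data: y = 1 + x1 + sin(x2)
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 3, 60)
x2 = rng.uniform(0, 6, 60)
y = 1 + x1 + np.sin(x2)

a, f1, f2 = backfit(x1, x2, y)
fitted = a + f1 + f2
```

With a better smoother in place of `smooth_nn`, this loop is essentially what an additive-model fit does for any number of predictors.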
Example 3.18 The ethanol data set stored in an R package contains 88 observations on variables x1, x2, and y obtained in an experiment in which ethanol was burned in a single-cylinder automobile test engine. The variables are
x1 = C = compression ratio of the engine
x2 = E = equivalence ratio at which the engine was run (a measure of richness of the air/ethanol mix)
y = NOx = concentration of nitric oxide and nitrogen dioxide in engine exhaust, normalized in a certain manner
Figure 3.22 shows a scatterplot matrix of the data; it appears that there is a substantial nonlinear relation between y and x2. We asked R to obtain a general additive fit using LOWESS with a span of .75 (closest 75% of the data values) for each of the two component functions f1(x1) and f2(x2). Graphs of these two functions appear in Figure 3.23. Sure enough, the second graph is highly nonlinear, and there is also some nonlinearity in the first graph.
[Figure 3.22: scatterplot matrix of the variables NOx, C, and E]
[Panels: lo(C, 0.75) plotted against C, and lo(E, 0.75) plotted against E]
Figure 3.23 R graphs of the component functions resulting from a general additive fit to the
ethanol data
The R² value for this fit was .873, whereas the value for the linear fit a + b1x1 + b2x2 was only .01. The reported value of the constant term a was 1.957, and the predicted value of NOx when C = 9.0 and E = 1.0 was given by R as

ŷ = 1.957 + f1(9.0) + f2(1.0) = 2.743
35. Recently there has been increased use of stainless steel claddings in industrial settings. Claddings are used to finish the exterior walls of a building and help weatherproof the structure. To ensure the quality of claddings, it is essential to know how welding parameters impact the cladding process. The authors of "Mathematical Modeling of Weld Bead Geometry, Quality, and Productivity for Stainless Steel Claddings Deposited by FCAW" (J. Mater. Engr. Perform., 2012: 1862–1872) investigated how y = deposition rate was influenced by x1 = wire feed rate (Wf, in m/min) and x2 = welding speed (S, in cm/min). The following 22 observations correspond to the experiment condition where applied voltage was less than 30 V:

y:  2.718 3.881 2.773 3.924 2.740 3.870
x1:  17.0  10.0   7.0  10.0   7.0  10.0
x2:    30    30    50    50    30    30

y:  2.847 3.901 2.204 4.454 3.324 3.319
x1:   7.0  10.0   5.5  11.5   8.5   8.5
x2:    50    50    40    40    40    20

y:  3.423 3.242 3.385 3.420 3.380 3.402
x1:   8.5   8.5   8.5   8.5   8.5   8.5
x2:    60    40    40    40    40    40

y:  3.382 3.388 3.398 3.404
x1:   8.5   8.5   8.5   8.5
x2:    40    40    40    40

a. A least squares fit of y = a + b1x1 + b2x2 to this data gave a = .0558, b1 = .3749, and b2 = .0028. What value of deposition rate would you predict when wire feed rate = 11.5 and welding speed = 40? What is the value of the corresponding residual?
b. Residual and total sums of squares are .03836 and 5.1109, respectively. What proportion of observed variation in deposition rate can be attributed to the stated approximate relationship between deposition rate and the two predictor variables?

36. The accompanying Minitab regression output is based on data that appeared in the article "Application of Design of Experiments for Modeling
…relationship between surface roughness and the four predictors?

          Df  Sum Sq  Mean Sq  F value    Pr(>F)
           1  321625   321625  164.011  4.04e-09
           1
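As a quick numerical check (a sketch in plain Python, not part of the original exercise), the prediction, residual, and coefficient of multiple determination asked for in Exercise 35 follow directly from the quoted coefficients and sums of squares:

```python
# Exercise 35, parts (a) and (b): plug-in prediction and R^2 computed
# from the values quoted in the exercise statement.
a, b1, b2 = 0.0558, 0.3749, 0.0028

# (a) predicted deposition rate at wire feed rate 11.5, welding speed 40
y_hat = a + b1 * 11.5 + b2 * 40
residual = 4.454 - y_hat          # 4.454 is the observed y at (11.5, 40)

# (b) coefficient of multiple determination: 1 - SSResid/SSTo
ss_resid, ss_total = 0.03836, 5.1109
r_squared = 1 - ss_resid / ss_total

print(round(y_hat, 3), round(residual, 3), round(r_squared, 4))
```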
…from Molasses by Blakeslea trispora" (J. of Chem. Tech. and Biotech., 2002: 933–943) carried out a multiple regression analysis to relate the dependent variable y = amount of β-carotene (g/dm³) to the three predictors amount of lineolic acid, amount of kerosene, and amount of antioxidant (all g/dm³).

Obs  Linoleic  Kerosene  Antiox  Betacaro
  1     30.00     30.00   10.00    0.7000
  2     30.00     30.00   10.00    0.6300
  3     30.00     30.00   18.41    0.0130
  4     40.00     40.00    5.00    0.0490
  5     30.00     30.00   10.00    0.7000
  6     13.18     30.00   10.00    0.1000
  7     20.00     40.00    5.00    0.0400
  8     20.00     40.00   15.00    0.0065
  9     40.00     20.00    5.00    0.2020
 10     30.00     30.00   10.00    0.6300
 11     30.00     30.00    1.59    0.0400
 12     40.00     20.00   15.00    0.1320
 13     40.00     40.00   15.00    0.1500
 14     30.00     30.00   10.00    0.7000
 15     30.00     46.82   10.00    0.3460
 16     30.00     30.00   10.00    0.6300
 17     30.00     13.18   10.00    0.3970
 18     20.00     20.00    5.00    0.2690
 19     20.00     20.00   15.00    0.0054
 20     46.82     30.00   10.00    0.0640

A request to the SAS package to fit a + b1x1 + b2x2 + b3x3 yielded the following output:

Dependent Variable: beta

                    Sum of         Mean       F
Source     DF      Squares       Square   Value   Pr > F
Model       3   0.02352595   0.00784198    0.09   0.9648
Error      16   1.40326270   0.08770392
C. Total   19   1.42678865

Fitting the function that also includes the three interaction predictors and three quadratic predictors gave this output:

Model       9   1.40762342   0.15640260   81.61   <.0001
Error      10   0.01916523   0.00191652
C. Total   19   1.42678865

R-Square    Coeff Var    Root MSE    beta Mean
0.986568     15.08576    0.043778     0.290195

                             Standard        t     Pr >
Parameter       Estimate        Error    Value      |t|
Intercept   -2.368673650   0.25095313    -9.44   <.0001
lino         0.115946557   0.00896686    12.93   <.0001
kero         0.048329827   0.00896686     5.39   0.0003
anti         0.125140001   0.01622284     7.71   <.0001
lino*kero    0.000116125   0.00015478     0.75   0.4704
lino*anti    0.000820250   0.00030956     2.65   0.0243
kero*anti    0.001002750   0.00030956     3.24   0.0089
lino*lino   -0.002108721   0.00011530   -18.29   <.0001
anti*anti   -0.009219578   0.00046120   -19.99   <.0001
kero*kero   -0.001085436   0.00011530    -9.41   <.0001

a. What is the coefficient of multiple determination for each fitted function?
b. For the fit using a + b1x1 + b2x2 + b3x3, what is the predicted value of β-carotene when lineolic acid = 40, kerosene = 20, and antioxidant = 5? What is the corresponding residual?
c. For the fit with predictors x1, x2, and x3 as well as quadratic and interaction predictors, what is the predicted value of β-carotene when lineolic acid = 40, kerosene = 20, and antioxidant = 5? What is the corresponding residual?
d. Note the difference in magnitude of the residuals you just computed for the two regressions. Explain how it is reasonable for one of these to have a smaller residual magnitude given the difference in coefficients of multiple determination.
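The coefficient of multiple determination asked for in part (a) can be read off either ANOVA table as SSModel/SSTotal (equivalently 1 − SSError/SSTotal). A Python sketch, using the sums of squares quoted above:

```python
# R^2 for each fitted function, computed from the two SAS ANOVA tables.
# SSModel / SSTotal, with the values quoted in the output above.
r2_linear = 0.02352595 / 1.42678865    # fit a + b1x1 + b2x2 + b3x3
r2_full   = 1.40762342 / 1.42678865    # fit with interactions and quadratics

print(round(r2_linear, 4), round(r2_full, 6))
```

Note that the second value reproduces the R-Square entry (0.986568) printed by SAS.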
40. The collapse of reinforced concrete buildings during earthquakes can result in significant loss of property and life. Often such collapses are caused by concrete column axial failure. The authors of "Rotation-Based Shear Failure Model for Lightly Confined RC Columns" (J. Struct. Engr., 2012: 1267–1278) introduced a model for the deformation at onset of shear failure for a class of reinforced concrete columns. As part of the study, the authors investigated how y = maximum sustained shear (Vmax, in kN) is influenced by x1 = transverse-reinforcement yield stress (MPa) and x2 = concrete cylinder compressive strength (MPa).

y:    314.9  359.0  300.7  271.3  266.9
x1:     469    469    469    400    400
x2:   21.10  21.10  20.90  25.60  25.60

y:    240.2  231.3  315.8  338.1  355.9
x1:     400    400    400    400    400
x2:   33.10  33.10  25.70  27.60  27.60

y:    378.1  101.9  110.8  103.2  101.9
x1:     400     46     46    365    365
x2:   25.70   4.65   4.34  23.00  20.20

y:    120.5  111.6  219.3  213.1
x1:     365    365    392    392
x2:   23.00  20.20  30.70  30.70

Use a statistical computer package to fit (a) a + b1x1 + b2x2, (b) a + b1x1 + b2x2 + b3x1x2, and (c) a + b1x1 + b2x2 + b3x1x2 + b4x1² + b5x2². Be sure to specify all function coefficients. For each function, also include the coefficient of multiple determination and interpret its value.

41. A new surface finishing method has been developed for nanofinishing flat and three-dimensional workpiece surfaces. The authors of "Parametric Analysis of an Improved Ball End Magnetorheological Finishing Process" (J. Engr. Manuf., 2012: 1550–1563) investigated how y = percent change in surface roughness was influenced by x1 = rotational speed of tool core (N, in r/min), x2 = magnetizing current (I, in A), and x3 = working gap (D, in mm).

y:    47.68  39.80  80.69  34.12  45.10
x1:     400    500    500    600    500
x2:     5.0    2.3    4.0    5.0    4.0
x3:    2.00   1.50   0.66   2.00   1.50

y:    46.51  69.63  63.62  37.18  36.75
x1:     500    500    600    668    400
x2:     4.0    5.7    5.0    4.0    3.0
x3:    1.50   1.50   1.00   1.50   2.00

y:    49.94  45.86  70.64  54.75  24.97
x1:     500    500    400    600    600
x2:     4.0    4.0    5.0    3.0    3.0
x3:    1.50   1.50   1.00   1.00   2.00

y:    49.38  59.85  55.18  32.05  44.94
x1:     500    400    332    500    500
x2:     4.0    3.0    4.0    4.0    4.0
x3:    1.50   1.00   1.50   2.34   1.50

Use a statistical computer package to fit (a) a + b1x1 + b2x2 + b3x3, (b) a + b1x1 + b2x2 + b3x3 + b4x1x2 + b5x1x3 + b6x2x3, and (c) a + b1x1 + b2x2 + b3x3 + b4x1x2 + b5x1x3 + b6x2x3 + b7x1² + b8x2² + b9x3². Be sure to specify all function coefficients. For each fit, also include the coefficient of multiple determination and interpret its value.
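Exercises 40 and 41 ask for fits from a statistical package; under the hood, each fit is just the solution of the normal equations (XᵀX)b = Xᵀy. The sketch below illustrates the computation in plain Python on a small made-up data set (not the exercise data), built so that y is exactly 2 + 3x1 − x2 and the coefficients are recovered exactly:

```python
# Least squares via the normal equations (X^T X) b = X^T y, solved by
# Gaussian elimination with partial pivoting. Synthetic data, so the
# recovered coefficients should be [2, 3, -1].

def solve(A, b):
    """Solve the small dense system A x = b by Gaussian elimination."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y  = [2 + 3*a - b for a, b in zip(x1, x2)]

X = [[1.0, a, b] for a, b in zip(x1, x2)]          # design matrix with intercept
XtX = [[sum(X[i][r] * X[i][c] for i in range(len(X))) for c in range(3)]
       for r in range(3)]
Xty = [sum(X[i][r] * y[i] for i in range(len(X))) for r in range(3)]
coef = solve(XtX, Xty)
print([round(c, 6) for c in coef])   # ≈ [2.0, 3.0, -1.0]
```

Interaction and quadratic fits such as (b) and (c) differ only in the extra columns (x1x2, x1², …) appended to the design matrix.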
3.6 Joint Distributions

f(x, y) ≥ 0        Σ f(x, y) = 1    (the sum taken over all (x, y) pairs)

Often there is no nice formula for f(x, y). When there are only a few possible values of x and y, the mass function is most conveniently displayed in a rectangular table.
Example 3.19 A certain market has both an express checkout register and a superexpress register.
Let x denote the number of customers queueing at the express register at a particular
weekday time, and let y denote the number of customers in line at the superexpress
register at that same time. Suppose that the joint mass function is as given in the accompanying table:
                 y
f(x, y)    0    1    2    3
      0  .08  .07  .04  .00
      1  .06  .15  .05  .04
x     2  .05  .04  .10  .06
      3  .00  .03  .04  .07
      4  .00  .01  .05  .06
According to the table, f(x, y) > 0 for only 17 (x, y) pairs. Just as in the case of a single variable, individual proportions from the mass function can be added to yield other proportions of interest. For example, the (x, y) pairs for which the number of customers at the express register is equal to the number of customers at the other register are (0, 0), (1, 1), (2, 2), and (3, 3), so

(long-run proportion of times for which x = y) = f(0, 0) + f(1, 1) + f(2, 2) + f(3, 3)
                                               = .08 + .15 + .10 + .07
                                               = .40

The total number of customers at these two registers will be 2 if (x, y) = (2, 0), (1, 1), or (0, 2), so

(long-run proportion of times for which x + y = 2) = f(2, 0) + f(1, 1) + f(0, 2)
                                                   = .05 + .15 + .04
                                                   = .24
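Sums like these are easy to verify programmatically. A Python sketch (not part of the text) that encodes the joint mass table of Example 3.19 and recomputes both proportions:

```python
# Joint mass function of Example 3.19 as a dict keyed by (x, y).
f = {(0,0): .08, (0,1): .07, (0,2): .04, (0,3): .00,
     (1,0): .06, (1,1): .15, (1,2): .05, (1,3): .04,
     (2,0): .05, (2,1): .04, (2,2): .10, (2,3): .06,
     (3,0): .00, (3,1): .03, (3,2): .04, (3,3): .07,
     (4,0): .00, (4,1): .01, (4,2): .05, (4,3): .06}

p_equal = sum(p for (x, y), p in f.items() if x == y)      # x = y
p_two   = sum(p for (x, y), p in f.items() if x + y == 2)  # x + y = 2
print(round(p_equal, 2), round(p_two, 2))   # 0.4 0.24
```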
Suppose we are presented with the joint distribution but are interested only in the
distribution of x alone: the marginal distribution of x. In Example 3.19, we might wish
to know f1(0), f1(1), f1(2), f1(3), and f1(4), the long-run proportions for various values of
the first variable, x. Consider x = 1, which occurs when (x, y) = (1, 0), (1, 1), (1, 2), or (1, 3). Thus
f1(1) = long-run proportion of the time that x = 1
      = f(1, 0) + f(1, 1) + f(1, 2) + f(1, 3)
      = .06 + .15 + .05 + .04 = .30
This is nothing more than the sum of proportions in the x 5 1 row of the joint mass
table. Adding proportions in the other rows gives the entire marginal distribution of x,
whereas adding proportions in the various columns gives the marginal distribution of y,
denoted by f2(y):
x: 0 1 2 3 4 y: 0 1 2 3
f1(x): .19 .30 .25 .14 .12 f2(y): .19 .30 .28 .23
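Both marginal distributions are just row and column sums of the joint table, which the following Python sketch (not part of the text) reproduces:

```python
# Marginal distributions as row sums (f1) and column sums (f2) of the
# joint mass table from Example 3.19.
table = [[.08, .07, .04, .00],
         [.06, .15, .05, .04],
         [.05, .04, .10, .06],
         [.00, .03, .04, .07],
         [.00, .01, .05, .06]]    # rows are x = 0..4, columns are y = 0..3

f1 = [round(sum(row), 2) for row in table]        # marginal of x
f2 = [round(sum(col), 2) for col in zip(*table)]  # marginal of y
print(f1)   # [0.19, 0.3, 0.25, 0.14, 0.12]
print(f2)   # [0.19, 0.3, 0.28, 0.23]
```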
Now let's consider the case of two continuous random variables. The distribution for a single continuous variable x is specified by a density function f(x) that satisfies f(x) ≥ 0 and ∫ f(x) dx = 1, the integral taken from −∞ to ∞. The graph of f(x) is the density curve, and various proportions correspond to areas under this curve that are obtained by integrating the density function. Extending these ideas to two variables requires that we use multivariate calculus, in particular multiple integration. The joint distribution of x and y is specified by a joint density function f(x, y) that satisfies

f(x, y) ≥ 0        ∫∫ f(x, y) dx dy = 1

with both integrals running from −∞ to ∞.
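To make the second condition concrete, here is a numerical sketch (using a made-up density, f(x, y) = 4xy on the unit square and zero elsewhere, not one from the text) that checks the double integral with a midpoint rule:

```python
# Midpoint-rule check that a joint density integrates to 1 over its
# support. The density f(x, y) = 4xy on 0 <= x, y <= 1 is a made-up
# example chosen because the exact integral is easy to see.
def f(x, y):
    return 4 * x * y

n = 200                      # subdivisions per axis
h = 1.0 / n
total = 0.0
for i in range(n):
    for j in range(n):
        x = (i + 0.5) * h    # midpoint of grid cell (i, j)
        y = (j + 0.5) * h
        total += f(x, y) * h * h

print(round(total, 4))   # ≈ 1.0
```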
[Figure: the proportion of (x, y) values falling in a region corresponds to the volume under the density surface z = f(x, y) above that region, here a shaded rectangle.]
Let μx and μy denote the mean values of x and y, respectively. Then the function

h(x, y) = (x − μx)(y − μy)

is a product of x and y deviations from their mean values [like (x − x̄)(y − ȳ) in our discussion of sample correlation]. The mean value of this product of deviations is called the covariance between x and y, and the population correlation coefficient ρ is

ρ = covariance(x, y) / (σx σy)

where σx and σy are the x and y standard deviations, respectively. This definition of ρ is very similar to the definition of the sample correlation coefficient r given in Section 3.2. You need not worry about calculating ρ, but we do want you to know that it exists and shares many properties with r. In particular,

1. ρ does not depend on the x or y units of measurement.
2. −1 ≤ ρ ≤ 1
3. The closer ρ is to +1 or −1, the stronger the linear relationship between the two variables.
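For a discrete joint distribution, the covariance and ρ are finite sums, so they can be computed directly. The following Python sketch does this for the joint table of Example 3.19 (an illustration; these particular values are not computed in the text):

```python
import math

# Population covariance and correlation for the joint mass table of
# Example 3.19: cov = E[(x - mu_x)(y - mu_y)], rho = cov/(sd_x * sd_y).
f = {(0,0): .08, (0,1): .07, (0,2): .04, (0,3): .00,
     (1,0): .06, (1,1): .15, (1,2): .05, (1,3): .04,
     (2,0): .05, (2,1): .04, (2,2): .10, (2,3): .06,
     (3,0): .00, (3,1): .03, (3,2): .04, (3,3): .07,
     (4,0): .00, (4,1): .01, (4,2): .05, (4,3): .06}

mu_x = sum(x * p for (x, y), p in f.items())
mu_y = sum(y * p for (x, y), p in f.items())
cov  = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in f.items())
sd_x = math.sqrt(sum((x - mu_x)**2 * p for (x, y), p in f.items()))
sd_y = math.sqrt(sum((y - mu_y)**2 * p for (x, y), p in f.items()))
rho  = cov / (sd_x * sd_y)
print(round(cov, 3), round(rho, 2))   # 0.695 0.53
```

The moderately positive ρ is consistent with the table: long express lines tend to go with long superexpress lines.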
f(x, y) = (1 / (2π σx σy √(1 − ρ²))) · exp{ −(1 / (2(1 − ρ²))) [((x − μx)/σx)² − 2ρ((x − μx)/σx)((y − μy)/σy) + ((y − μy)/σy)²] }

for −∞ < x < ∞ and −∞ < y < ∞.

One interesting example of the use of this joint distribution appears in the article "Analysis of Size-Grouped Potato Yield Data Using a Bivariate Normal Distribution of Tuber Size and Weight" (J. of Agric. Science, 1993: 193–198). Figure 3.25 is a three-dimensional graph of this function for specified parameter values. The function cannot be easily integrated, so tables or numerical methods must be employed to calculate various proportions of interest. In Chapter 11, we consider an inferential procedure for drawing conclusions about ρ based on the assumption that the sample was selected from a bivariate normal distribution.
Figure 3.25 Graph of the bivariate normal density function when μx = 10, σx = 1, μy = 25, σy = 2, and ρ = .5
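Although integrating this density requires numerical methods, evaluating it is straightforward. As a sketch (not from the text), the following evaluates the formula at the mode (x, y) = (μx, μy) for the parameter values of Figure 3.25, where the bracketed exponent vanishes:

```python
import math

# Bivariate normal density, evaluated for the parameters of Figure 3.25:
# mu_x = 10, sigma_x = 1, mu_y = 25, sigma_y = 2, rho = .5.
def bvn_density(x, y, mx, sx, my, sy, rho):
    zx = (x - mx) / sx
    zy = (y - my) / sy
    coef = 1.0 / (2 * math.pi * sx * sy * math.sqrt(1 - rho**2))
    expo = -(zx**2 - 2 * rho * zx * zy + zy**2) / (2 * (1 - rho**2))
    return coef * math.exp(expo)

peak = bvn_density(10, 25, 10, 1, 25, 2, 0.5)
print(round(peak, 4))   # 0.0919, consistent with the peak height in Figure 3.25
```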
back at the joint distribution table for x and y in Example 3.19. Notice that if x 5 0,
then y 5 0 is a possibility but not y 5 3. However, if x 5 4, then y 5 0 is excluded,
whereas y 5 3 is possible. So the distribution of one variable does depend on the value
of the other, and the variables are therefore not independent.
Let f1(x) and f2(y) denote the marginal distributions of x and y, respectively.
Frequently, an investigator has enough knowledge of the situation under study to assume independence. When this is the case, the joint mass or density function must satisfy

f(x, y) = f1(x) · f2(y)

For the variables of Example 3.19 to be independent, every entry in the joint table would have to be the product of the row and column totals. Very importantly, once independence is assumed, one has only to select appropriate distributions for x and y separately and then use this product rule to create the joint distribution.
Example 3.20 A business is planning to purchase two different new vehicles, a van and a sedan. Let x denote the number of major defects on the first vehicle, and y be the number of major defects on the second one. Because the vehicles come from different manufacturers and assembly lines, an assumption of independence is reasonable. Suppose x has a Poisson distribution with λ = 2 and y has a Poisson distribution with λ = 1.5 (the marginal distributions). Then

f(x, y) = f1(x) · f2(y) = (e⁻² 2ˣ / x!)(e⁻¹·⁵ 1.5ʸ / y!)    x = 0, 1, 2, …; y = 0, 1, 2, …

The long-run proportion of such purchases that would result in at most one major defect for the two vehicles combined (x + y ≤ 1) is then f(0, 0) + f(0, 1) + f(1, 0) = .136.
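The .136 figure can be reproduced directly from the product rule. A Python sketch (not part of the text):

```python
import math

# Example 3.20: joint mass function as the product of independent
# Poisson marginals with lambda = 2 (van) and lambda = 1.5 (sedan).
def poisson(lam, k):
    return math.exp(-lam) * lam**k / math.factorial(k)

def f(x, y):                     # joint pmf, by independence
    return poisson(2.0, x) * poisson(1.5, y)

p = f(0, 0) + f(0, 1) + f(1, 0)  # proportion with x + y <= 1
print(round(p, 3))   # 0.136
```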
42. A large insurance agency provides services to a number of customers who have purchased both a homeowner's policy and an automobile policy. For each type of policy, a deductible amount must be specified. Let x denote the homeowner's deductible amount and y denote the automobile deductible amount for a customer who has both types of policies. The joint mass function of x and y is as follows:

                     y
f(x, y)      0    250    500
     200   .20    .10    .20
x
     500   .05    .15    .30

a. What proportion of customers have $500 deductible amounts for both types of policies?
b. What proportion of customers have both deductible amounts less than $500?
c. What is the marginal mass function of x? What is the marginal mass function of y?

43. The joint distribution of the number of cars (x) and the number of buses (y) per signal cycle at a particular left turn lane is displayed in the accompanying table:

                     y
f(x, y)      0      1      2
     0    .025   .015   .010
     1    .050   .030   .020
     2    .125   .075   .050
x    3    .150   .090   .060
     4    .100   .060   .040
     5    .050   .030   .020

a. In what proportion of cycles will there be exactly one car and one bus?
b. In what proportion of cycles will there be at most one vehicle of each type?
c. In what proportion of cycles will the number of cars be the same as the number of buses?
d. What is the mean value of the number of cars per signal cycle?
e. If a bus occupies three vehicle spaces and a car occupies just one, what is the mean value of the number of vehicle spaces occupied during a signal cycle? Hint: Let h(x, y) = x + 3y.

44. Let x denote the number of major defects for a particular piece of machinery and y be the number of cosmetic flaws on this same piece. Suppose that x and y are independent variables with f1(x) = .80, .15, and .05 for x = 0, 1, and 2, respectively, and f2(y) = .50, .25, .15, .08, and .02 for y = 0, 1, …, 4, respectively.
a. What is the joint mass function of these two variables?
b. What proportion of these machines will have no major defects or cosmetic flaws? What proportion will have at least one defect or flaw?
c. For what proportion of these machines will the number of cosmetic flaws exceed the number of major defects?

45. Refer to Exercise 42. Compute the covariance between x and y and then the value of the population correlation coefficient. Do these two variables appear to be strongly related? Explain.
Supplementary Exercises
46. Orthotropic steel bridge decks with closed ribs have been widely used in suspension bridges, cable-stayed bridges, and urban elevated expressways due to their overall light weights, ease of construction, and high load-carrying capacities. In the article "Fatigue Evaluation of Rib-to-Deck Welded Joints of Orthotropic Steel Bridge Deck" (J. Bridge Engr., 2011: 492–499), researchers examine the physical properties of 22 bridge specimens. Each specimen was attached to a fatigue testing apparatus. Fatigue life was determined as the number of cycles (in millions) at the end of the fatigue test. For each specimen, the corresponding stress range (MPa) was also recorded.
Stress:    121      71     108      99      77
Cycles:  1.257  11.250   2.240   4.030   6.650

Stress:     70      79      56      89      75
Cycles:  6.970   6.430  19.140   3.950   9.000

Stress:     95      90     110      77      64
Cycles:  2.290   4.470   2.150  10.490  19.260

Stress:     90      99      91      91      82
Cycles:  4.120   1.800   2.190   3.150   5.800

Stress:     75      79
Cycles:  5.130   5.970

a. Would you fit a straight line to the data and use it as a basis for predicting y = stress range from x = number of cycles? Why or why not?
b. Find a transformation that produces an approximate linear relationship between the transformed values. Then fit a line to the transformed data and use it to obtain an equation that describes approximately the relationship between the untransformed variables.

47. An investigation of the relationship between the temperature (°F) at which a material is treated and the strength of the material involved an experiment in which four different strength observations were obtained at each of the temperatures 100, 110, 120, 130, and 140. A scatterplot of the data showed a substantial linear pattern. The least squares line fit to the data had a slope of .500 and a vertical intercept of −25.000.
a. Interpret the value of the slope.
b. The largest strength value when temperature was 120 was 40 and the smallest was 29. What value of strength would you have predicted for this temperature, and what are the values of the residuals for the two aforementioned observations? Why do these residuals have different signs?
c. The values of SSTo and SSResid were 1060.0 and 390.0, respectively. Calculate and interpret the coefficient of determination.

48. As the air temperature drops, river water becomes supercooled and ice crystals form. Such ice can significantly affect the hydraulics of a river. The article "Laboratory Study of Anchor Ice Growth" (J. of Cold Regions Engr., 2001: 60–66) described an experiment in which ice thickness (mm) was studied as a function of elapsed time (hr) under specified conditions. The following data was read from a graph in the article: n = 33; x = .17, .33, .50, .67, …, 5.50; y = .50, 1.25, 1.50, 2.75, 3.50, 4.75, 5.75, 5.60, 7.00, 8.00, 8.25, 9.50, 10.50, 11.00, 10.75, 12.50, 12.25, 13.25, 15.50, 15.00, 15.25, 16.25, 17.25, 18.00, 18.25, 18.15, 20.25, 19.50, 20.00, 20.50, 20.60, 20.50, 19.80.
a. The r² value resulting from a least squares fit is .977. Interpret this value and comment on the appropriateness of assuming an approximate linear relationship.
b. The residuals, listed in the same order as the x values, are

−1.03  −0.92  −1.35  −0.78  −0.68  −0.11   0.21
−0.59   0.13   0.45   0.06   0.62   0.94   0.80
−0.14   0.93   0.04   0.36   1.92   0.78   0.35
 0.67   1.02   1.09   0.66  −0.09   1.33  −0.10
−0.24  −0.43  −1.01  −1.75  −3.14

Plot the residuals against elapsed time. What does the plot suggest?

49. An investigation was carried out to study the relationship between speed (ft/sec) and stride rate (number of steps taken/sec) among female marathon runners. Resulting summary quantities included n = 11, Σ(speed) = 205.4, Σ(speed)² = 3880.08, Σ(rate) = 35.16, Σ(rate)² = 112.681, and Σ(speed)(rate) = 660.130.
a. Calculate the equation of the least squares line that you would use to predict stride rate from speed.
b. Calculate the equation of the least squares line that you would use to predict speed from stride rate.
c. Calculate and interpret the coefficient of determination for the regression of stride rate on speed of part (a) and for the regression of speed on stride rate of part (b). How are these two related?

50. Refer to Exercise 49. Consider predicting speed from stride rate, so that the response variable y is speed. Suppose that the values of speed in the sample are expressed in meters/second. How does this change in the unit of measurement for y affect the equation of the least squares line? More generally, if each y value in the sample is multiplied by the same number c, what happens to the slope and vertical intercept of the least squares line?

51. The relationship between x = strain (in./in.) and y = stress (ksi) for an experimental alloy tension member was investigated by making an observation on stress for each of n = 10 values of strain. A scatterplot of the resulting data suggested a quadratic relationship between the two variables. Employing the principle of least squares gave ŷ = 88.791 + 5697.0x − 328,161x² as the equation of the best-fit quadratic.
a. One observation in the sample was made when strain was .005, and the resulting value of stress was 111. What value of stress would you have predicted in this situation, and what is the value of the corresponding residual?
b. The observed values of stress were 91, 97, 108, 111, 114, 110, 112, 102, 98, and 91. Using the best-fit quadratic gave corresponding predicted values of 94.16, 98.87, 102.93, 109.07, 111.16, 113.36, 113.48, 104.22, 95.93, and 90.80, respectively. Calculate a quantitative assessment of the extent to which variation in observed stress values can be attributed to the approximate quadratic relationship between stress and strain.
c. What happens if the best-fit equation is used to predict stress when strain is .03? Note: The largest strain value in the sample was .017.

52. An experiment carried out to investigate the relationship between y = wire bond pull strength in a semiconductor product and the two predictors x1 = wire length and x2 = die height resulted in data for which the best-fit equation according to the principle of least squares was ŷ = 2.300 + 2.750x1 + .0125x2.
a. Interpret the coefficients of x1 and x2 in the given equation.
b. The observed value of pull strength was 24.35 when wire length was 9 and die height was 100. What value of pull strength would you have predicted under these circumstances, and what is the value of the corresponding residual?
c. The values of SSTo and SSResid were 6110.2 and 123.4, respectively. Can a substantial percentage of the observed variation in strength be attributed to the postulated approximate relationship between strength and the two predictors?

53. The accompanying data resulted from an investigation of the relationship between temperature (x, in °F) and viscosity (y, in poise) for specimens of bitumen removed from tar sand deposits:

x: 750 800 700 850 590 620 650 680 710 550
y:  50  16 102  10 945 818 403 151 114 1358

a. Would a straight line fit to this data give accurate predictions of viscosity?
b. Let x′ = 1/x and y′ = ln(y). Fit a straight line to the (x′, y′) data, use it as a basis for predicting viscosity when temperature is 720, and calculate a quantitative assessment of the extent to which the approximate linear relationship between x′ and y′ explains observed variation.

54. Ground motions resulting from an earthquake can be heavily influenced by the dynamic properties of the soils overlying bedrock. The authors of "Influence of Pore Fluid Viscosity on the Dynamic Properties of an Artificial Clay" (J. Geotech. Geoenviron. Engr., 2011: 1190–1201) investigated properties of an artificial soil called modified glyben to study seismic soil-structure interaction. Researchers investigated the relationship between x = fluid content by mass (%) and vane shear strength (kPa) for three types of modified glyben at different pore fluid viscosities (w/gw): y′ = vane shear strength (0% w/gw), y″ = vane shear strength (25% w/gw), y‴ = vane shear strength (50% w/gw). The data below corresponds to a graph from the article:

x:   35.0  37.5  40.0  42.5  45.0  47.5
y′:  75.0  63.0  57.0  45.0  28.5  38.0
y″:  52.0  41.5  38.0  35.0  20.0  16.0
y‴:  33.5  24.5  22.0  19.0  13.0  10.0

a. Create the scatterplots for the pairs (x, y′), (x, y″), and (x, y‴). Does each scatterplot suggest that a linear relationship holds for the respective variables?
b. Determine the least squares regression line for each pair. For each, determine the corresponding coefficient of determination.
c. Given the slope coefficients from the regression, summarize the relationship between vane shear strength and fluid content by mass as pore fluid viscosity changes from 0%, to 25%, and to 50%.

55. Failures in aircraft gas turbine engines due to high cycle fatigue are a pervasive problem. The article "Effect of Crystal Orientation on Fatigue Failure of Single Crystal Nickel Base Turbine Blade Superalloys" (J. of Engr. for Gas Turbines and Power, 2002: 161–176) gave the accompanying data and fit a nonlinear regression model in order to predict strain amplitude from cycles to failure. Fit an appropriate curve, investigate the quality of the fit, and predict amplitude when cycles to failure = 5000.
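Several of these exercises (49 in particular) can be worked from summary quantities alone, since the least squares slope and intercept depend only on n, Σx, Σy, Σx², and Σxy. A Python sketch for the regression of stride rate on speed, using the summary values quoted in Exercise 49:

```python
# Least squares line from summary statistics (Exercise 49, stride rate
# regressed on speed): b = Sxy/Sxx, a = ybar - b*xbar.
n = 11
sum_x, sum_y = 205.4, 35.16        # sum of speed, sum of rate
sum_xx, sum_xy = 3880.08, 660.130  # sum of speed^2, sum of speed*rate

Sxx = sum_xx - sum_x**2 / n
Sxy = sum_xy - sum_x * sum_y / n
b = Sxy / Sxx                      # slope
a = sum_y / n - b * (sum_x / n)    # intercept
print(round(b, 4), round(a, 3))    # fitted line: rate = a + b*(speed)
```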
Bibliography
Kutner, M., C. Nachtsheim, and J. Neter, Applied Linear Regression Models (4th ed.), McGraw-Hill/Irwin, Burr Ridge, IL, 2004. A comprehensive up-to-date exposition of regression and correlation analysis without overindulging in theory, though matrix algebra is rather frequently used. (This material is also included in Applied Linear Statistical Models, a longer book by the same authors.)

Montgomery, D. C., E. A. Peck, and G. G. Vining, Introduction to Linear Regression Analysis (5th ed.), Wiley, New York, 2012. A very nice treatment of regression written for engineers and physical scientists.
4
Obtaining Data
Digital Vision/Getty Images
4.1 Operational Definitions
4.2 Data from Sampling
4.3 Data from Experiments
4.4 Measurement Systems
Introduction
Engineering has been defined as the art of applying science and technology for the
optimal conversion of the resources of nature into the uses of humankind.1 The sci-
ences, in turn, are grounded in mathematics, so it is natural that measurements of all
kinds should play a large role in engineering and scientific practice. In this chapter,
we examine some of the ways in which data is collected as well as some approaches
to ensuring data quality.
Scientists and statisticians have long realized that some sets of data are defi-
nitely more useful than others, and that at the heart of data quality lies the
realization that external conditions can often exert a large influence on mea-
sured values. Temperature, for example, is well known to affect the physical di-
mensions (length, area, etc.) of most materials, so the measured length of a thin
strip of aluminum will necessarily vary depending on the ambient temperature.
In an effort to control or eliminate the effects of such external or “noise” fac-
tors, engineers have developed a large number of professional standards whose
purpose is to ensure the consistency and quality of scientific data. We will look
at some specific examples of such standards in Section 4.1.
Since the early 1920s, statisticians have also addressed the problems of data
quality by introducing tightly controlled data collection schemes. These schemes,
1. Encyclopedia Britannica, 1998.
called experimental designs and sampling plans, provide methods not only
for controlling or eliminating the effects of external factors but also for assess-
ing the magnitude of their combined effect on measured data. Sampling plans
also address the problem of how far we can generalize the conclusions that we
draw from data. One important feature of experimental designs is the ability to
study the effects of several factors simultaneously on the values of another factor,
called a response variable. This feature is especially well suited to research and
development activities. The main components of such designs are introduced in
Sections 4.2 and 4.3.
The process of obtaining measurements is also vital to the eventual conclu-
sions drawn from data. Numerous questions can be asked about measurement
procedures: Can we trust a particular measuring instrument’s readings? Are the
readings accurate and precise? Do repeated measurements of the same object
give similar results, or do the results exhibit large variation? If different people or
special laboratories are involved at various stages of the measuring process, does
this have an adverse effect on the quality of the data? These questions are the sub-
ject of metrology, the study of measurement, and are examined in Section 4.4.
2. Surface tension causes the top of the water to form a bowl-like surface, called a meniscus. Using the top of the meniscus leads to a different volume estimate than using the bottom of the meniscus.
4.1 Operational Definitions
As this example shows, unless you are very specific about what to measure (e.g.,
seawater at 50°F and 1 atmosphere of pressure) and how to measure it, data can be quite
unreliable. Realizing this, the quality pioneer W. Edwards Deming recommended that,
prior to collecting any set of data, one should first create an operational definition that
spells out exactly what is to be measured and exactly how the measurements should
be made. The reward for doing this is consistent, reliable data. Any two people should
be able to follow the operational definition and obtain essentially the same measure-
ments. Cognizant of the importance of operational definitions, most scientists include
a Materials and Methods or Experimental Procedure section that outlines the exact
procedures employed to collect the data used in a study.
Example 4.1 Automobile gasoline is a carefully balanced blend of 8 to 15 different hydrocarbons. The resulting blends must meet up to 15 quality and environmental requirements, including standards regarding vapor pressure, boiling point, stability, color, and
octane rating. The octane scale measures the degree to which a gasoline blend per-
forms like pure isooctane (which gives the least amount of premature firing or “knock”)
or pure normal heptane (which produces extreme knocking). If the blend performs
like a mixture of 90% isooctane and 10% heptane, it is assigned an octane rating of 90.
Because octane measurements are heavily influenced by engine speed and
temperature, an operational definition must be used when assigning octane ratings.
First, using a standard knock engine, the “research octane” level is measured under
mild conditions (600 rpm and 120°F). Second, “motor octane” is measured under
harsher conditions (900 rpm and 300°F). Finally, the “road octane” rating is calcu-
lated as the average of the research and motor octane levels. Road octane, calculated
by the (R + M)/2 method, is the one commonly reported on gasoline station pumps.
Example 4.2 Operational definitions are often created on the job. For example, when inspecting
injection-molded automobile dashboards, several types of defects can be observed,
such as pinholes, creases, burn marks, and voids (hollow areas underneath the outer
skin of the dashboard). To generate meaningful data about such defects, an opera-
tional definition must be created so that any two inspectors will report the same types
and severity of defects. For example, we might decide to classify creases longer than
1 inch as severe, whereas creases less than one-quarter inch might be called minor.
Pinholes that occur under the dashboard (not visible to passengers) could be classi-
fied differently from those that are in the passengers’ field of vision. Similarly, voids
with large diameters might be treated as major defects, whereas smaller voids are
minor defects. Once these definitions have been established, the resulting data can
be reliably used in quality control charts (Chapter 6) or other statistical methods.
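Such a classification rule can be written down as code, which makes the operational definition completely unambiguous. In the sketch below, the "moderate" band and the handling of boundary lengths are assumptions; the example only pins down "severe" above 1 inch and "minor" below one-quarter inch.

```python
def classify_crease(length_in):
    """Classify a crease defect by its length in inches (hypothetical thresholds)."""
    if length_in > 1.0:
        return "severe"
    elif length_in < 0.25:
        return "minor"
    else:
        # Lengths between 1/4 in. and 1 in.: the example leaves this band open,
        # so an intermediate category is assumed here.
        return "moderate"

print(classify_crease(1.5))   # severe
print(classify_crease(0.1))   # minor
```

Because the rule is explicit, any two inspectors running it on the same measured length must report the same severity, which is exactly the consistency an operational definition is meant to buy.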
Professional Standards
It often takes highly specialized knowledge to create operational definitions. Conse-
quently, entire professional societies have arisen to create such definitions, which are
then called professional standards or simply standards. One of the largest such groups
is the American Society for Testing and Materials (ASTM). ASTM publishes
standard test methods, specifications, practices, and guides for engineers working with
materials, products, systems, and services. Over 12,000 ASTM standards have now been
published, and these standards are commonly adopted by government agencies for use
in codes, regulations, and laws. Building codes, for example, commonly cite ASTM
standards for conducting tests on structures. In the following example, notice how each
step of a measurement process is carefully defined.
Example 4.3 Concrete used in construction must meet tight consistency standards. Consistency
refers to the fluidity of the concrete when poured. ASTM C 143 (Standard Method
for Slump of Portland Cement Concrete) is often cited in state construction codes as
the required method of testing consistency.
ASTM C 143 requires that a sample of concrete be poured into a cone shaped
like a megaphone (8-in. diameter at one end and 4-in. diameter at the other end).
The large base of the cone is on the ground during the pour. The cone is filled
one-third full and then tamped down 25 times. This procedure is repeated twice,
leaving the mold full. The cement sample must come from the middle portion of
the batch being poured. Next, the cone is lifted off the cement and quickly inverted
and placed beside the conical pile of cement. Without the support of the cone, the
height of the cement then diminishes or slumps. The distance between the top of the
cone and the top of the cement is called the slump, and, depending on the building
code used, the slump must fall within specified limits.
Other organizations, including the federal government, make extensive use of pub-
lished standards. The Code of Federal Regulations (CFR), for instance, is an important
source of engineering standards and requirements in all federally regulated industries.
Example 4.4 The Department of Transportation (DOT) oversees the testing and rating of au-
tomobile tires. Tires are rated for treadwear, traction, and temperature resistance.
These ratings are marked on the side of each tire. A treadwear rating of DOT 150,
for example, means that a tire wears about one and a half times as long as a tire rated
100 on a standard government test course. Estimating the treadwear of a given brand
of tire is done via regression analysis.
Because of the numerous factors that can affect treadwear (size of car, driving
style, road conditions, and speed), the operational definition specified by DOT is
extensive. In brief, Regulation 49CFR 575.104 (Uniform Tire Quality Grading
Standards) requires that a convoy of two or four rear-wheel-drive passenger cars be
driven over a 400-mile government test course in the vicinity of San Angelo, Texas.
One vehicle is outfitted with special government-manufactured course-monitoring
tires; the other vehicles have only test tires. Inflation pressures are specified, and
each vehicle is weight-loaded to put a required test load on the tires. Wheel align-
ments are checked, tires are broken in for two laps (800 miles), air pressure is
rechecked, and wheels are realigned. Initial tread depth, to the nearest .001 in., is
measured. The convoy is then driven for 6400 miles, rotating tires every 400 miles
in a specified pattern. A car’s position in the convoy is also rotated. In addition, tires
are also shifted from one vehicle to another every 1600 miles. Tread depth is mea-
sured every 800 miles. Finally, a regression line is fit to the nine treadwear points
(one initial reading and eight readings at 800-mile intervals). The regression line
is used to calculate a projected mileage for the test tires and the monitoring tires.
Comparisons between the projected test tire wear and monitoring tire wear are used
to assign the DOT wear rating.
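The projection step at the end of the procedure is ordinary least squares. The sketch below invents nine tread-depth readings (one initial, then every 800 miles) and a wear-out depth of 2/32 in.; both the data and the wear-out criterion are hypothetical, since the regulation's actual figures are not given here.

```python
# Fit a least-squares line to nine tread-depth readings and project the
# mileage at which tread depth reaches an assumed wear-out level.
miles = [0, 800, 1600, 2400, 3200, 4000, 4800, 5600, 6400]
depth = [0.350, 0.344, 0.339, 0.333, 0.328, 0.322, 0.317, 0.311, 0.306]  # inches

n = len(miles)
xbar = sum(miles) / n
ybar = sum(depth) / n
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(miles, depth))
sxx = sum((x - xbar) ** 2 for x in miles)
slope = sxy / sxx                 # tread loss per mile (negative)
intercept = ybar - slope * xbar   # fitted initial depth

wearout = 2 / 32                  # hypothetical minimum tread depth, inches
projected_miles = (wearout - intercept) / slope
print(round(projected_miles))
```

Projected mileages computed this way for the test tires and the monitoring tires would then be compared to assign the wear rating, as the example describes.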
Another organization that has played a major role in setting standards for various indus-
tries is the International Organization for Standardization (ISO). Founded in 1947,
the ISO has published more than 19,500 international standards covering diverse areas
such as food safety, computers, agriculture, and health care.
Example 4.5 We often assume that children’s toys, once made available on the shelves of a store,
are perfectly safe to use by children. Unfortunately, this is not always the case as evi-
denced by toy product recalls because of some hazard concern. For example, the U.S.
Consumer Product Safety Commission maintains a regularly updated website that
lists various hazardous toy recalls. In 2012, the ISO updated its series of toy safety stan-
dards that detail requirements and test methods for toys intended for use by children
under 14 years of age; it also sets age limits for various requirements. The series con-
tains four parts: Part 1—Safety aspects related to mechanical and physical properties;
Part 2—Flammability; Part 3—Migration of certain elements; and Part 4—Swings,
slides, and similar activity toys for indoor and outdoor family domestic use. Two new
parts are currently under development: Part 5—Determination of total concentration
of certain elements in toys; and Part 6—Toys and children’s products—Determination
of phthalate plasticizers in polyvinyl chloride plastics. By adopting the requirements
and recommendations of the ISO safety standards, toy manufacturers can help mini-
mize product recalls and reduce the risk of a child being injured by an unsafe toy.
Benchmarks
Operational definitions are especially appropriate for establishing industry and profes-
sional standards. However, when we want to compare several different products or pro-
cesses, another sort of standard is needed. For these applications, benchmarks are the
appropriate tools. Benchmarks are well-defined objects or processes whose character-
istics are already explicitly known. Knowing the exact value of some characteristic in
advance allows one to evaluate several products or processes by comparing how they
perform against the benchmark. For example, the National Institute of Standards and
Technology (NIST) keeps copies of standard physical units, such as the volt and the
kilogram. These standards are the benchmarks against which the precision and accu-
racy of all measuring instruments are eventually compared.
Example 4.6 Benchmarks are routinely used for comparing software products. For instance, statis-
tical software packages are evaluated for computational accuracy by using specially
designed data sets whose statistical properties are precisely known. One repository
of such benchmark data sets can be found at http://www.itl.nist.gov/div898/strd/index.html, a website maintained by the Information Technology Laboratory of
the National Institute of Standards and Technology. This website was produced as
part of the Statistical Reference Datasets Project. One of these data sets is the set of
three integers 10,000,001 to 10,000,003 that is used to evaluate a software program's computation of the sample standard deviation, s. The sample standard deviation for these three values is s = 1, the same as for the sample 1, 2, 3.
Using this data set as a benchmark, it is possible to compare the different ap-
proaches to calculating s that are used in software packages. For instance, summing
the squares of the three integers (a step used in some formulas for s) leads to inac-
curate results. However, programs that use updating formulas (in which the value of
s is updated as each data point is entered) are generally very accurate.
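The contrast between the two computational approaches can be sketched directly. One caveat: in IEEE double precision the original three 8-digit values happen to survive the naive sum-of-squares formula, so this sketch uses 9-digit values, the same construction with a true standard deviation of exactly 1, to expose the cancellation.

```python
import math

def stdev_naive(xs):
    """Textbook 'sum of squares' formula computed in floating point.
    The subtraction ss - s*s/n cancels catastrophically for large, close values."""
    n = len(xs)
    s = 0.0
    ss = 0.0
    for x in xs:
        s += x
        ss += x * x
    return math.sqrt((ss - s * s / n) / (n - 1))

def stdev_welford(xs):
    """Updating (Welford-style) formula: the running mean and spread are
    refreshed as each data point is entered."""
    n = 0
    mean = 0.0
    m2 = 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return math.sqrt(m2 / (n - 1))

# Benchmark-style data: three consecutive 9-digit integers whose true
# standard deviation is exactly 1 (same idea as the NIST set, one digit longer).
data = [100000001.0, 100000002.0, 100000003.0]
print(stdev_naive(data))    # noticeably wrong
print(stdev_welford(data))  # essentially 1.0
```

The updating version stays accurate because it never forms the two huge, nearly equal quantities whose difference the naive formula depends on.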
2. Give an operational definition for measuring the fuel efficiency of a car. In your definition, take into account factors such as the driving speed, octane rating, distance driven, tire pressure, and driving terrain.

3. Give an operational definition for measuring the daytime temperature in a city. In your definition, take into account factors such as time of day and location.

4. To test the accuracy of a new numerical algorithm, a programmer uses the algorithm to produce the first 200 digits of the number π. The programmer checks the accuracy of the 200 digits by comparing them to those in a published reference, whose accuracy has been previously verified. In this application, would the published reference more properly be considered an operational definition or a benchmark?

5. Print speed (often measured in pages per minute, ppm) is an important property to consider when buying a printer. However, printer manufacturers measure this property in different ways, making comparison of print speeds difficult. In 2009, the ISO developed an international standard for measuring print speed. The standard, known as “ISO ppm,” allows a consumer to make “apples-to-apples” comparisons of real-world print speeds under standard conditions. It is now common for the ISO ppm rating of a printer to be included in its product specifications listing. Here, would ISO ppm more properly be considered an operational definition or a benchmark?
4.2 Data from Sampling
The goal in all forms of sampling is to be able to draw conclusions about the larger
entity based solely on our analyses of the information in a sample. For this reason,
every effort is made to ensure that samples are truly representative of the thing we are
sampling. Professional standards usually provide great detail on how samples are to be
obtained. For example, ASTM C 172 (Standard Method of Sampling Freshly Mixed
Concrete) requires that samples of fresh concrete be taken “. . . at two or more regularly
spaced intervals during discharge of the middle of the batch . . .” and that the inspector
should “. . . perform sampling by passing a receptacle completely through the discharge
stream . . .” while taking care “. . . not to restrict the flow of the concrete . . . so as to
cause segregation.” Another method of assuring representative samples is based on the
concept of random sampling, described later in this section.
Example 4.7 The inspection and approval of metal welding in building construction can be based
on nondestructive test (abbreviated NDT) methods, destructive test methods, or vi-
sual inspection. There are several NDT methods available, including magnetic par-
ticle testing, radiographic inspection, penetrant inspection, ultrasonic testing (UT),
leak testing, and hardness testing. Each of these methods is based on a nondestruc-
tive examination of a sample of welded material.
Penetrant inspection, for example, involves the application of a dye (often red in
color) to the welded surface. The dye penetrates any existing cracks and holes in the
metal surface. After the excess dye is wiped away, only the dye in the cracks remains.
To reveal these cracks, another liquid, called a developer, is applied to the surface. This
causes the dye to come to the surface of the crack and creates a highly visible marking
of each crack or hole in the weld. An experienced inspector can then make an evalua-
tion of the quality of the weld from the number and location of these markings.
Random Sampling
Random sampling is a form of sampling used extensively in statistical methods. This
technique presupposes that samples are to be obtained from some well-defined popu-
lation of distinct items, and it provides a simple mechanism for randomly selecting
items from the population to be included in a sample. The advantages of using random
sampling are (1) it helps to reduce or eliminate bias in the manner in which the sam-
pled items are chosen and (2) it enables us to make precise statements about the extent
to which conclusions drawn from a sample can be applied to the entire population.
Random samples are obtained by making sure that every sample of the desired size has
the same chance of being selected. This in turn implies that each item in the population has
an equally likely chance of being chosen. One popular method for achieving this is to first
create a list (called a sampling frame) of the items in a population. Next, successive positive
integers are assigned to the items on the list, and then a random number generator is used
to select a random sample of these positive integers. Random number generators can be
in the form of tables, functions on handheld calculators, or commands in programming
languages and statistical software. Whatever method is used, the selected integers will
correspond to specific items in the sampling frame.
When sampling, we are immediately faced with a decision to sample with or with-
out replacement. Sampling with replacement means that after each successive item
(or integer) is selected for the random sample, the item is “replaced” back into the
population and may even be selected again at a later stage. Thus, sampling with re-
placement allows for the possibility of having “repeats” occur in our random sample. In
practice, sampling with replacement is rarely used. Instead, the more common notion
of sampling is to allow only distinct items from the population in the sample. That is, no
repeats are allowed. Sampling in this manner is called sampling without replacement.
Although these two forms of sampling are indeed different, in most applications (i.e.,
when the sample size is small compared to the population size) there is little practical
difference between them. Unless otherwise stated, however, we will always assume that
random sampling is done without replacement.
Example 4.8 Suppose that we want to perform some electrical tests on a random sample of
5 integrated circuit chips from a package of 20 chips. Arranging the 20 chips in a
horizontal line on a table is a rapid way of associating a unique integer from 1 to
20 with each chip (the leftmost chip would be labeled “1,” the rightmost would be
“20,” and so forth). It is important to note that the particular ordering of the chips
is completely immaterial to the sampling process. All that is needed is a method for
assigning integers to the chips, and horizontal positioning achieves that purpose.
Using a random number generator from a calculator or a statistical software
package, we next generate a random sample of five integers from the numbers 1
through 20. When doing this, we have to decide whether to sample with replace-
ment or without replacement. Suppose we choose to sample without replacement
and that the randomly chosen integers turn out to be 4, 14, 3, 18, and 15. Then,
our random sample of 5 chips would consist of the 4th, 14th, 3rd, 18th, and 15th
chips, counting from left to right.
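In code, the labeling-and-drawing procedure of this example is a one-liner per draw. The fixed seed below is only to make the sketch repeatable; it will not reproduce the 4, 14, 3, 18, 15 draw from the example.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

chips = list(range(1, 21))   # labels 1..20, leftmost chip to rightmost chip

# Without replacement (the book's default): 5 distinct labels
sample = random.sample(chips, 5)
print(sample)

# With replacement, repeats would be possible:
with_repeats = random.choices(chips, k=5)
print(with_repeats)
```

`random.sample` can never return the same label twice, while `random.choices` may; with 5 draws from 20 items the practical difference is small, exactly as the text notes for small samples from large populations.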
The sample size used in random sampling can sometimes change due to changes
in available budgets or changes in the precision of the information required from the
sample. In such cases, after already having drawn a random sample of size n from a
population of N items, we may find ourselves in the position of wanting to either reduce
or increase the sample size somewhat. A question then arises as to how to accomplish
this. Fortunately, as the following rules illustrate, adjusting the sample size does not
require that we discard the items already sampled.
The complement of any sample is the name given to those items in the population that are not included in the sample.
Example 4.9 Commercial and military aircraft are built using hundreds of thousands of specially
designed nuts and bolts, known as “fasteners.” Because these fasteners are subjected
to stress, fatigue, and a host of environmental conditions, random samples of each
type of fastener are routinely tested for strength requirements.
Suppose an inspector has drawn a random sample of size 10 from a box
of completed fasteners and conducts torque tests on them. After testing, the inspector
is informed that, in fact, a sample of size 25 is required by the customer for these fas-
teners. Since the fasteners remaining in the box are the complement of the original
3. Wright, T., and H. Tsao, “Some Useful Notes on Simple Random Sampling,” Journal of Quality Technology, 1985: 67–73.
sample of 10, then the inspector need only select a random sample of 15 fasteners from
the box to add to the original sample. Rule 4 ensures that the group of 25 fasteners
selected in this fashion qualifies as a random sample from the box.
On the other hand, suppose the inspector had originally selected a sample of
size 25 but subsequently found that a sample of only 10 was needed. By simply
selecting a random sample of 10 from the original 25 items, the inspector will have
legitimately obtained a random sample of size 10 from the box.
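Both adjustments in this example are easy to mimic with sets. The box size of 100 below is invented; only the sample sizes 10, 15, and 25 come from the example.

```python
import random

random.seed(2)

box = set(range(1, 101))                    # fastener labels (hypothetical box size)

# Original random sample of 10
sample10 = set(random.sample(sorted(box), 10))

# Enlarge to 25: draw 15 more from the complement (fasteners still in the box)
complement = box - sample10
sample25 = sample10 | set(random.sample(sorted(complement), 15))

# Shrink back to 10: subsample the existing sample of 25
reduced = set(random.sample(sorted(sample25), 10))

print(len(sample25), len(reduced))
```

Nothing already drawn is wasted in either direction: enlarging only touches the complement, and shrinking only subsamples what was already selected.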
Obtaining random samples often requires some ingenuity. This is especially the
case when it is difficult to develop a sampling frame for the population of interest.
For example, continuous processes, which are not conveniently divided into finite
numbers of discrete parts, usually pose special problems when developing sampling
frames. In such circumstances, it is helpful to remember that a sampling frame can
also be a procedure, not just a list.4
Example 4.10 Agricultural inspectors are required to select random samples of crops for testing and
evaluation. Harvested crops stored in cartons or bins, such as citrus fruit, pose special
sampling problems. Although it is easy to imagine tagging the fruit in a bin with succes-
sive integers and applying the random number scheme to generate samples, doing so
would be time-consuming and economically prohibitive. Instead, other schemes have
been developed to obtain random samples in a more economical fashion. One popular
technique is to select a bin of fruit at random (bins are generally easy to select by the
random number method) and then follow a “random corner” method for obtaining the
sample: First, one of the bin’s four corners is chosen at random (a small printed table of
random numbers is helpful here); then the fruit stacked in the selected corner are used
to form the sample. This method relies on the reasonable assumption that the fruit
were randomly mixed when packed in the bin. Choosing a corner at random has the
additional benefit of not allowing human inspectors to introduce bias into the result-
ing data by always choosing a corner in which the fruit looks especially good (or bad).
4. Kish, L., Survey Sampling, John Wiley & Sons, New York, 1965: 53.
from the data. Such methods are objective and the only “trust” involved is in assuring
that random sampling or randomization is correctly employed while gathering the data.
On the other hand, with nonrandom samples (i.e., data not gathered using some sort of
randomizing technique), no such probability assessments are possible and the informa-
tion in such data cannot, as a rule, be generalized to larger populations.
The problems with nonrandom data go even deeper. Even with the best intentions,
when trying to subjectively obtain data that we think is “representative” of a larger popula-
tion, the resulting data can be badly skewed. For example, when assessing the reliability of a
product, an engineer might try to ensure that the data includes examples of each kind of fail-
ure mode that the product experiences in the field. This practice automatically ignores the
fact that some failure modes are usually much more prevalent than others, and inferences
based on such “representative” samples may not only be unreliable, but even misleading.
So, what should you do when nonrandomly collected data arises in practice?
Although it is acceptable to apply simple descriptive statistical measures to the data (e.g.,
means, histograms, and so forth), be aware that (1) such measures can’t legitimately be
generalized, and (2) the statistical techniques presented in the following chapters may
not be valid when applied to such data.
Stratified Sampling
The method of random sampling can be extended to incorporate additional sources
of information and to handle problems that arise when sampling from populations for
which suitable sampling frames are hard to obtain. To distinguish basic random sam-
pling (as previously described in this section) from the extended sampling schemes that
rely on it, random sampling is often referred to as simple random sampling (SRS).
One method for incorporating additional information is stratified sampling. In
stratified sampling, the population of interest is first divided into several nonoverlap-
ping subsets called strata, and then the SRS method is used to select a separate ran-
dom sample from each of the strata. All of the strata samples are then combined into
one large “stratified” sample from the population. When the strata are properly spec-
ified, stratified sampling will generally produce estimates that are more precise than
SRS sampling.
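A stratified draw is just an SRS within each stratum, pooled afterward. The strata below (three hypothetical production lines) and the per-stratum sample sizes are invented for illustration.

```python
import random

random.seed(3)

# Hypothetical population divided into three nonoverlapping strata
strata = {
    "line_A": [f"A{i}" for i in range(1, 51)],   # 50 items
    "line_B": [f"B{i}" for i in range(1, 31)],   # 30 items
    "line_C": [f"C{i}" for i in range(1, 21)],   # 20 items
}
sizes = {"line_A": 5, "line_B": 3, "line_C": 2}  # chosen strata sample sizes

stratified_sample = []
for name, items in strata.items():
    # SRS within each stratum, then combine into one stratified sample
    stratified_sample.extend(random.sample(items, sizes[name]))

print(stratified_sample)
```

Each stratum is guaranteed its chosen share of the combined sample, which is the mechanism behind the precision gain the text describes.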
standard deviation σi. The selection of sample sizes can be done in two steps: (1) Decide on the total sample size n that will be used, and then (2) decide how to divide n up into the strata sample sizes n1, n2, n3, . . . , nk.
[Table: strata 1, 2, 3, . . . , k, with stratum sizes N1, N2, N3, . . . , Nk (sum = N), strata sample sizes n1, n2, n3, . . . , nk (sum = n), and stratum standard deviations σ1, σ2, σ3, . . . , σk]
Example 4.11 Companies that produce or handle hazardous chemicals are required to apply for a
National Pollutant and Discharge Elimination System (NPDES) permit from the
federal government (“Measuring, Sampling, and Analyzing Storm Water,” Pollution
Engr., Mar. 1, 1992: 50–55). The environmental concerns addressed by the NP-
DES permit involve the amounts of pollutants carried by storm water runoff from a
company’s facility to nearby public waters. Pollutant levels are estimated by taking
random samples of storm water and subjecting them to chemical analysis.
Sampling runoff water is accomplished by stratifying runoff water according to
the different point sources, usually water channels, that carry the runoff. Using various
techniques and meters, the average velocity of water flow and the cross-sectional area
of each channel are estimated. These are used to estimate total flow volumes for
each point source. The flow volume can be thought of as a measure of the size Ni of
the ith stratum. The total of all flow volumes represents the population size. Water
samples from each point source are obtained and chemically analyzed. The total pol-
lutant level is then calculated as a weighted average of the pollutants in each sample,
weighted by the flow volume from the point source where the sample was obtained.
4.2 Data from Sampling 173
Suppose that wi denotes the fraction of the total sample of n that the ith stratum sample
represents, that is, wi = ni/n for i = 1, 2, 3, . . . , k. Given the wi's, the Ni's, the σi's, a
confidence level of 95%, and B, it can be shown that the minimum necessary sample
size n for estimating the population mean to within a margin of error of ±B is

n = [Σi Ni²σi²/wi] / [N²(B/1.96)² + Σi Niσi²]

where N = N1 + N2 + ⋯ + Nk and Σi denotes summation over i = 1, 2, 3, . . . , k.
Under the Neyman allocation, the strata sample sizes are

ni = n (Niσi / Σi Niσi)  where  n = [Σi Niσi]² / [N²(B/1.96)² + Σi Niσi²]

If instead each stratum is sampled in proportion to its size Ni, then

ni = n (Ni/N)  where  n = [Σi Niσi²] / [N(B/1.96)² + (1/N) Σi Niσi²]
This is called the proportional allocation. Please consult one of the chapter references
for the case of unequal sampling costs.
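As a quick sketch, the Neyman sample-size and allocation formulas above translate directly into a few lines of Python; the strata sizes and standard deviations below are hypothetical, and 1.96 corresponds to the 95% confidence level used in the text:

```python
def neyman_sample_size(N_i, sigma_i, B, z=1.96):
    # Minimum total n for estimating the population mean to within +/- B
    # when the Neyman weights (proportional to N_i * sigma_i) are used.
    N = sum(N_i)
    numerator = sum(Ni * si for Ni, si in zip(N_i, sigma_i)) ** 2
    denominator = N**2 * (B / z) ** 2 + sum(Ni * si**2 for Ni, si in zip(N_i, sigma_i))
    return numerator / denominator

def neyman_allocation(n, N_i, sigma_i):
    # Split the total sample size n across strata in proportion to N_i * sigma_i.
    total = sum(Ni * si for Ni, si in zip(N_i, sigma_i))
    return [n * Ni * si / total for Ni, si in zip(N_i, sigma_i)]

# Hypothetical strata (sizes and standard deviations are made up for illustration)
N_i = [500, 300, 200]
sigma_i = [2.0, 3.0, 1.5]
n = round(neyman_sample_size(N_i, sigma_i, B=0.25))
print(n, [round(x) for x in neyman_allocation(n, N_i, sigma_i)])  # 226 [103, 92, 31]
```

Rounding the individual ni to integers can make their sum differ from n by one or two; in practice the final sizes are adjusted by hand.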
Regardless of the allocation used, the stratified estimate of the population mean
is given by

x̄str = x̄1(N1/N) + x̄2(N2/N) + x̄3(N3/N) + ⋯ + x̄k(Nk/N)

where x̄i denotes the mean of the ni observations from stratum Si. One of the nice
features of the proportional allocation is that the resulting data is “self-weighting”; in
other words, instead of calculating the stratified estimate we can simply combine the
data from all the strata and calculate the ordinary sample mean of the combined data,
which, only in this case, will exactly equal x̄str.
Stratified estimates of μ are usually accompanied by a measure called their standard
error (which will be discussed more fully in Chapter 7) that can be interpreted in much
the same way the sample standard deviation is interpreted. That is, if we think of all
the possible stratified samples of size n that we could have selected, about 95% of the
estimated means from such samples will be within about 2 standard errors of μ. For
stratified sampling, the standard error is approximated by
sstr = (1/N) √[ Σi Ni² (si²/ni) ((Ni − ni)/(Ni − 1)) ]
where si² is the sample variance of the ni observations from stratum Si.
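These two formulas can be sketched in Python as follows; the stratum summaries below are hypothetical, and the finite-population factor (Ni − ni)/(Ni − 1) matches the expression above:

```python
import math

def stratified_mean(N_i, xbar_i):
    # Weighted average of stratum means, with weights N_i/N.
    N = sum(N_i)
    return sum(Ni * xb for Ni, xb in zip(N_i, xbar_i)) / N

def stratified_se(N_i, n_i, s_i):
    # Approximate standard error of the stratified mean.
    N = sum(N_i)
    total = sum(Ni**2 * (si**2 / ni) * ((Ni - ni) / (Ni - 1))
                for Ni, ni, si in zip(N_i, n_i, s_i))
    return math.sqrt(total) / N

# Hypothetical stratum summaries
N_i = [500, 300, 200]       # stratum sizes
n_i = [50, 30, 20]          # stratum sample sizes
xbar_i = [10.0, 12.0, 8.0]  # stratum sample means
s_i = [2.0, 3.0, 1.5]       # stratum sample standard deviations
print(stratified_mean(N_i, xbar_i))            # 10.2
print(round(stratified_se(N_i, n_i, s_i), 3))  # 0.216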
Example 4.12 Since 1991 the USGS (U.S. Geological Survey) has conducted the National Water
Quality Assessment Program (NAWQA), whose purpose is to study natural and human
factors that affect water quality. One important measurement that NAWQA produces
is an estimate of the percentages of a region covered by various crop types. In one
study (“Validation of National Land-Cover Characteristics Data for Regional Water
Quality Assessment,” Geocarto International, vol. 10, no. 4, Dec. 1995: 69–80) of the
percentages of a region covered by corn crops, a region was divided into the following
strata: A (irrigated crops), B (small grains and mixed crops), C (grasslands and small
crops), D (wooded areas and crops), E (grasslands), and F (woods and pastures).
The region under study is first divided into smaller regions called quadrats, each
with an area of 1 km2. These subregions are then assigned to the various strata cat-
egories. Suppose that data from previous studies is used to obtain estimates of the
standard deviations σi of the percentages of corn crops within each stratum and that
this information is collected in the following table:
Stratum (Si) Stratum size (Ni) Standard deviation (σi)
A 500 .2
B 300 .2
C 100 .4
D 50 .4
E 50 .6
F 200 .8
Since aerial photographs are used to estimate the percentage of corn coverage at a
given site, the unit sampling costs will be about the same for each 1-km² subregion,
so the Neyman allocation can be used. If we specify a 90% confidence level (the
area under the z curve between −1.645 and +1.645 is .90) and a margin of error
of ±5% (i.e., B = .05), then

n = [Σi Niσi]² / [N²(B/1.645)² + Σi Niσi²]
Using the fact that Σi Niσi = 410.0, the Neyman allocation of n = 110 to the strata is
n1 = 27, n2 = 16, n3 = 11, n4 = 5, n5 = 8, and n6 = 43.
The next step in the study is to obtain random samples of size n1 = 27, n2 =
16, . . . , n6 = 43 from the respective strata and to use aerial photographs of the selected
1-km² regions to obtain estimates of the corn percentages in these regions. To illustrate, the
following table summarizes the data from such a study:
Stratum ni Ni x̄i si
A 27 500 .52 .18
B 16 300 .22 .23
C 11 100 .02 .35
D 5 50 .06 .45
E 8 50 .01 .64
F 43 200 .67 .78
From these data we estimate that the overall percentage of the entire region covered
by corn crops is

x̄str = [500(.52) + 300(.22) + 100(.02) + 50(.06) + 50(.01) + 200(.67)]/1200 ≈ .39

and the estimated standard error that accompanies this estimate is sstr ≈ .03 (or, 3%).
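As a check on this computation, the weighted average can be reproduced in a few lines of Python using the Ni and x̄i columns of the table:

```python
N_i = [500, 300, 100, 50, 50, 200]       # stratum sizes from the table
xbar_i = [.52, .22, .02, .06, .01, .67]  # stratum mean corn percentages
x_str = sum(Ni * xb for Ni, xb in zip(N_i, xbar_i)) / sum(N_i)
print(round(x_str, 3))  # 0.388
```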
Because the stratum proportions πi are not normally known exactly, there are various
possibilities for estimating them:
a. You can approximate the πi values based on pilot studies or on results from
previous studies.
b. Or, if there is no prior information about the πi values, then be pessimistic
and use πi = .5 for each i = 1, 2, 3, . . . , k (this choice maximizes
√(πi(1 − πi))).
The stratified estimate of the population proportion is then given by

pstr = p1(N1/N) + p2(N2/N) + p3(N3/N) + ⋯ + pk(Nk/N)
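A minimal sketch of this estimator, with hypothetical stratum sizes and sample proportions:

```python
def stratified_proportion(N_i, p_i):
    # Weighted average of stratum sample proportions, with weights N_i/N.
    N = sum(N_i)
    return sum(Ni * p for Ni, p in zip(N_i, p_i)) / N

# Hypothetical strata
print(round(stratified_proportion([2000, 3000, 5000], [0.10, 0.06, 0.12]), 3))  # 0.098
```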
Example 4.13 Improper handling of newly planted citrus trees can cause a defect called benchroot,
which is the tendency for the root system to grow sideways. Benchroot eventually
causes trees to be less healthy and smaller than normal, which results in smaller
crops. Because citrus trees require several years of growth before reaching maximum
production levels, the presence of benchroot is not apparent until years after planting.
By sampling young trees shortly after planting, the extent of the benchroot problem
can be estimated in time to take other measures, such as replanting selected areas.
Suppose that a citrus cooperative consists of five different farms. Using the farms as
strata should increase the precision of the final sampling results since the trees within a
given farm ought to be more similar to each other than to trees on other farms. The number
of trees on the farms are known to be N1 = 2000, N2 = 4000, N3 = 8000, N4 = 8000,
and N5 = 1000. Based on records from previous plantings, the benchroot problem has
affected no more than about 10% of all trees, so a value of πi = .10 (i = 1, 2, 3, 4, 5) is
selected for each farm. This means that σi = √(.10(1 − .10)) = .3 for each farm. Since the
unit costs ci (i = 1, 2, 3, 4, 5) of selecting and testing a tree are assumed to be equal for each
farm, the Neyman allocation can be used to find the required sample size and its allocation
to the strata (farms). Finally, suppose that a confidence level of 95% and an error bound
of B = .03 (i.e., ±3%) are chosen. Based on this information, the required sample size is
n = [Σi Niσi]² / [N²(B/1.96)² + Σi Niσi²]
which we round to n = 377. The following table shows the steps in allocating the
total sample of 377 to the five strata (farms). After sampling, the number of trees with
benchroot, xi, is recorded for each farm. Note that we have rounded all final sample
sizes to integer values.
The reader can verify that the standard error associated with this estimate is
sp = .011.
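The sample-size calculation of Example 4.13 can be reproduced directly from the formula; all values come from the example, and the code prints the unrounded n:

```python
N_i = [2000, 4000, 8000, 8000, 1000]  # farm (stratum) sizes
sigma = (0.10 * (1 - 0.10)) ** 0.5    # common sigma_i = 0.3 for every farm
N = sum(N_i)                          # 23,000 trees in all
numerator = (sigma * N) ** 2          # [sum of N_i*sigma_i]^2 when all sigmas are equal
denominator = N**2 * (0.03 / 1.96) ** 2 + sigma**2 * N
n = numerator / denominator
print(n)  # roughly 377.8, which the text rounds to 377
```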
Cluster Sampling
Stratified and SRS sampling are best when relatively complete lists of population elements
and strata sizes are known before sampling. In some applications, however, such informa-
tion is difficult or impossible to obtain. In wildlife sampling, for instance, scientists usually
do not have advance knowledge of either the size of the particular population or the size of
the various strata in the population. In such cases, some form of cluster sampling is used
instead of SRS or stratified sampling. Like stratified sampling, cluster sampling requires
that we first divide a population into nonoverlapping groups, called clusters. However, we
do not need to know the number of population elements in each cluster. Instead, we simply
take an SRS sample of the clusters and then measure all elements within the selected clus-
ters. For example, the U.S. Census relies on cluster sampling when complete lists of city in-
habitants are not known. A city is divided into blocks (clusters) using maps, then a random
sample of these blocks is selected and all residences in the sampled blocks are contacted.
Example 4.14 Biologists and ecologists frequently sample geographic areas by dividing a map of a
region into a collection of small square regions called quadrats (Ripley, B. D., Spa-
tial Statistics, New York, Wiley, 2004: 102). By making sure the quadrats do not over-
lap, we can apply the method of cluster sampling by choosing a random sample of
quadrats to investigate. In wildlife studies, for instance, the number of a given species
in each of the selected quadrats is counted. Because the area of a quadrat is known,
these counts are usually converted into a count per unit area, which is a measure of
the abundance of the particular species per unit area.
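A count-to-density conversion of the sort described here is a one-liner; the quadrat counts below are hypothetical:

```python
quadrat_area_km2 = 1.0    # each quadrat covers 1 km^2
counts = [4, 0, 7, 2, 5]  # hypothetical species counts in 5 sampled quadrats
density = sum(counts) / (len(counts) * quadrat_area_km2)
print(density)  # 3.6 individuals per km^2
```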
6. Devise a procedure for selecting a random sample of words from a dictionary. Explain why your procedure guarantees that, for any n, each collection of n words has an equally likely chance of being selected.

7. Sometimes it is difficult or impossible to determine the population size before selecting a random sample. Describe how you would go about selecting a random sample of trees from a 1-square-mile area of forest.

8. Small manufactured goods are often gathered into large batches, called lots, for purposes of handling and shipping. Random sampling is commonly used to evaluate the quality of items in a given lot. Suppose an inspector selects a random sample of 20 items from a lot of 1000 items.
a. Before evaluating the 20 items, the inspector decides that a sample of size 30 should be used instead. If the inspector obtains a second random sample of size 10 from the remaining 980 items, can the two samples combined be validly considered a random sample of 30 from the lot? Explain your reasoning.
b. Suppose the inspector decides that only 15 items must be tested. Describe a method by which a valid random sample of 15 from the lot can be formed from the 20 items already selected.

9. Citrus trees are usually grown in orderly arrangements of rows to facilitate automated farming and harvesting practices. Suppose a group of 1000 trees is laid out in 40 rows of 25 trees each. To test the sugar content of fruit from a sample of 30 trees, researcher A suggests randomly selecting five rows and then randomly selecting six trees from each sampled row. Researcher B suggests numbering a map of the trees from 1 to 1000 and selecting a random sample (without replacement) of 30 integers from the integers 1 to 1000.
a. Without performing any calculations, do you think that both methods are capable of generating random samples from the block of trees? Justify your answer using the rules for random samples listed in this section.
b. Suppose that the group of trees is grown on the top and sides of a small hill. A researcher suggests that, because growing conditions (e.g., daily amounts of sunlight) are different on the four sides of the hill, the hill should be divided into four quadrants and trees should be randomly sampled from each quadrant. What is the name for this type of sampling procedure?

10. In stratified sampling, explain why it is best to choose strata such that the objects within any stratum are relatively homogeneous.

11. Explain how to use the =RANDBETWEEN function in Excel™ to generate a random sample from the integers 1 through 1000. Does the =RANDBETWEEN function generate samples with or without replacement?

12. A population of items is partitioned into k strata of sizes N1, N2, . . . , Nk. Using proportional allocation, random samples of size n1, n2, n3, . . . , nk are selected from the strata and the numbers x1, x2, x3, . . . , xk of items having a specified characteristic are determined. Sample proportions p1, p2, p3, . . . , pk are then computed (i.e., pi = xi/ni for each i).
a. Write an expression for the weighted average of the sample proportions, using the stratum sizes as weights.
b. Show that the weighted average in part (a) simplifies to (x1 + x2 + x3 + ⋯ + xk)/(n1 + n2 + n3 + ⋯ + nk).

13. Integrated circuits (ICs) consist of thousands of small circuits, electronic subcomponents (e.g., resistors), and connections. An important factor in the manufacture of ICs is the yield, the percentage of manufactured ICs that function correctly. Stratified sampling has recently been used to estimate the number of defects of various kinds that occur throughout an IC. The area of the IC is first divided into smaller areas (i.e., strata) and then small sample areas are selected from the strata and examined for defects. A stratified estimate of the overall proportion of defects can be used to help estimate the eventual yield of the IC manufacturing process.
In one such study, to estimate the proportion of pinholes on an IC, its entire surface was first divided into 10 equal areas (strata), each of which was further subdivided into 1000 smaller rectangles that
served as the elements to be sampled. It was also assumed that the unit costs and variances of the numbers of pinholes were equal from strata to strata.
a. Calculate the population size N.
b. Using a confidence level of 90% and a bound on the error of estimation of B = .03 (i.e., ±3%), calculate the required sample size n and its allocation n1, n2, n3, . . . , n10 to the ten strata. Round all sample sizes to the nearest integer.
c. Using the sample sizes in part (b), the results of the study showed the following numbers of pinholes per sample:

Sample #: 1 2 3 4 5 6 7 8 9 10
Pinholes: 5 4 7 6 3 9 5 6 2 8

Calculate the stratified estimate of the proportion of pinholes on the entire IC.
d. Calculate the standard error associated with the estimate in part (c).

14. Of the elements of a certain population, 20% are grouped into stratum S1 and the rest of the population elements comprise stratum S2. Suppose that the variances of the characteristic being measured are the same for each stratum, but it costs twice as much to obtain a sampled item from stratum S1 as it does from stratum S2. What is the best allocation of a total sample of n = 1000 to these two strata?

15. When the per unit cost of sampling from stratum i is ci, it can be shown that the optimal weights for allocating the total sample size are given by

wi = (Niσi/√ci) / (N1σ1/√c1 + N2σ2/√c2 + N3σ3/√c3 + ⋯ + Nkσk/√ck)

a. In the case where all unit sampling costs are equal, show that the resulting weights give the formulas for n and ni specified by the Neyman allocation.
b. In the case where all unit sampling costs are equal and all strata variances are equal, show algebraically that the resulting weights give the formulas for n and ni specified by the “proportional” allocation.

16. In stratified sampling, explain why the number of strata, k, should not exceed n/2, where n = n1 + n2 + n3 + ⋯ + nk is the total sample size and ni denotes the number of sampled items selected from stratum Si (i = 1, 2, 3, . . . , k).

17. In stratified sampling, what value would you use in place of 1.96 if you wanted the confidence level to be 99% rather than 95%? What is the consequence of using the higher confidence level on the necessary sample size?

4.3 Data from Experiments
between variables and generalizing conclusions obtained from the data. Inherent in
these designs are methods for balancing the two opposing goals of comparability and
generalizability mentioned in the previous paragraph.
Example 4.15 Plastic resins used in injection molding machines are designed to meet various pro-
duction requirements (e.g., melting temperatures, hardness, color). Raw resins are
manufactured in the form of solid plastic pellets that are subsequently melted inside
an injection molding machine and then “shot” into molds.
Suppose that a company wants to test similar resins from two suppliers, A and B,
to determine which one better achieves the hardness requirements for certain molded
parts. One experimental approach is to test each resin two or more times using the same
molding machine. By combining more than one reading for each brand, we hope to
“average out” any unexpected biases that might creep into any single measurement. In
such an experiment, the average hardness measurements would be directly comparable.
That is, as long as all other experimental conditions are held constant, there would be
little doubt that differences between the average hardness measurements could be at-
tributed to differences between the two brands of resin. Figure 4.2(a) depicts this design.
[Figure 4.2 Designs for comparing brands A and B: (a) repeated tests of each brand on a single machine; (b) one test of each brand on each of two machines, giving readings x1, x2 for brand A and x3, x4 for brand B]
It is very difficult, however, to extrapolate such results to a more general setting. For
instance, would the hardness measurements be significantly affected if we used several
different molding machines? Figure 4.2(b) shows a simple experimental design that
allows us to answer this question while simultaneously allowing us to answer the origi-
nal question about differences between the two brands. The noteworthy feature of this
design is that comparability between brands is maintained [by comparing the average
hardness reading (x1 + x2)/2 for brand A to the average (x3 + x4)/2 for brand B], yet
we can also answer questions about whether different machines influence the results
[by comparing the two machine averages (x1 + x3)/2 and (x2 + x4)/2]. As this design
illustrates, the key to maintaining comparability while answering questions about gen-
eralizability is to make each measurement work more than once. Note, for instance,
that reading x1 appears in the average for brand A and again in the average for machine
1. Designs such as the one in Figure 4.2(b) can easily be extended to handle more and
more complex questions involving the effects of changing several test conditions.
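The comparisons in Example 4.15 can be made concrete with a few lines of Python; the hardness readings x1, . . . , x4 below are hypothetical:

```python
# Readings from the design of Figure 4.2(b): x1, x2 are brand A on machines 1 and 2;
# x3, x4 are brand B on machines 1 and 2. The values are made up for illustration.
x1, x2 = 71.0, 69.0
x3, x4 = 74.0, 72.0
brand_A = (x1 + x2) / 2    # brand averages: each reading is used once here...
brand_B = (x3 + x4) / 2
machine_1 = (x1 + x3) / 2  # ...and once again here, "working more than once"
machine_2 = (x2 + x4) / 2
print(brand_A, brand_B, machine_1, machine_2)  # 70.0 73.0 72.5 70.5
```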
experimenters simply observe and measure but otherwise allow all factors to vary freely.
The following list shows some of the most common applications of experimental design.
the effect of external factors, and increasing the generalizability of the conclusions.
What follows is an overview of these tools. Specific designs are presented in Chapter 10.
Perhaps the most familiar tool is that of replication, that is, making several repeated
measurements at each fixed combination of factor or treatment levels. For instance, in
Figure 4.2(b) of Example 4.15, suppose that we decide to make three measurements
of plastic hardness at each of the four combinations of factor levels: {brand A with ma-
chine 1, brand B with machine 1, brand A with machine 2, brand B with machine 2}.
The purpose of doing this is twofold: (1) Biases tend to be eliminated when several
measurements are averaged and (2) the variation between repeated measurements gives
a measure of experimental error. Experimental error is the name given to the slight
differences that we expect to find between repeated experimental tests, even when we
attempt to hold all test conditions constant.
The next tool, randomization, is somewhat less familiar than replication. Random-
ization requires that treatments be given to the experimental units in random order, or
equivalently, that we assign experimental units to the various treatments in a random
fashion. In Example 4.15, the experimental units are the individual containers of plastic
pellets (of each brand) that are used for testing. Since we decided to use three replica-
tions for each combination of factor levels, there are a total of 12 tests to conduct (three
measurements at each of the four factor combinations). Randomization requires that
these 12 tests be run in random order. This is easy to accomplish using the methods of
Section 4.2, as the next example shows.
Example 4.16 In Figure 4.2(b) of Example 4.15 (page 180), denote the four distinct treatment com-
binations by M1A, M1B, M2A, and M2B, where M1A stands for the combination
“machine 1 and brand A,” M1B stands for “machine 1 and brand B,” and so forth. To
run three replicate tests at each treatment combination, we first number these tests
from 1 to 12 (tests 1–3 for M1A, 4–6 for M2A, 7–9 for M1B, and 10–12 for M2B).
Next, a random sample of size 12 is chosen (without replacement) from the integers
1 through 12, where the ith element of the sample gives the position of test i in the
run order. Suppose, for instance, the random sample is {11, 3, 7, 1, 4, 5, 12, 2, 8,
10, 6, 9}. With this ordering, test 4 (M2A) would be the first one conducted, test 8
(M1B) would be next, and so forth. In this
way, the tests will be conducted in random order.
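A random run order of this kind is easy to generate with Python's random module; the test numbering below (tests 1–3 for M1A, and so on) is one possible assignment:

```python
import random

# One possible numbering: tests 1-3 = M1A, 4-6 = M2A, 7-9 = M1B, 10-12 = M2B
treatments = ["M1A"] * 3 + ["M2A"] * 3 + ["M1B"] * 3 + ["M2B"] * 3
run_order = list(range(1, 13))  # test numbers 1 through 12
random.shuffle(run_order)       # a random permutation = the order the tests are run
for position, test in enumerate(run_order, start=1):
    print(f"run {position}: test {test} ({treatments[test - 1]})")
```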
Randomization is used for much the same reasons that we use random sampling
(Section 4.2): to eliminate unforeseen biases from the experimental data and to lay the
groundwork for the statistical inferences that we eventually draw from the experiment.
The first reason is easy to understand when we again consider Example 4.15. To save
time, for instance, someone might decide to run all three tests involving machine 1 and
brand A sequentially, since the brand A plastic could simply be inserted in the machine
three times in a row, avoiding any downtime for cleaning the machine when switching
to the other brand. However, this might mean that all the tests with machine 1 and
brand B would have to be conducted on a different day than the brand A tests. Since it is
possible that environmental factors could change from one day to another or that differ-
ent machine operators might be used on different days, these different conditions them-
selves could be responsible for substantial differences in the hardness measurements.
In other words, we could no longer be confident about attributing differences between
hardness measurements solely to differences between the two brands. By running the 12
tests in random order, we can avoid systematic biases such as these.
The third tool used extensively in experimental design is blocking. Blocking is
used to screen out the effects of external factors that the experimenter suspects in
advance will have a large effect on the measurements. Pharmaceutical companies,
for example, use blocking when testing the effectiveness of a new drug. Because
different people often differ widely in their responses to drugs, experimenters first
divide the experimental subjects into homogeneous groups or blocks. The people
in a given block are “matched” on various characteristics (e.g., blood pressure, age,
gender) so that the people in any given block are very similar to one another but fairly
different from the people in other blocks. The goal is to maximize the similarity of
the subjects within each block and to maximize the differences between the blocks.
For instance, block 1 might consist of young females with low blood pressure, block
2 could consist of middle-aged men with high blood pressure, and so forth. After
the blocks are formed, the experimental treatments are applied within the blocks.
For example, half of the people in block 1 would be given the new drug, whereas
the other half would receive a placebo. Similarly, half the people in block 2 would
receive the new drug and half would receive the placebo. In this way, when we look
at a particular block, any differences in response between the two halves of the block
could be attributed to the different treatments (receiving the drug or receiving the
placebo), not to the differences between people. Without blocking, differences in the
response to different treatments can often be masked by large differences between
the individuals randomly selected for each treatment.
Blocking increases the sensitivity of an experiment for detecting differences be-
tween treatments. When blocking is applied in conjunction with randomization, it is
possible to design experiments that are simultaneously sensitive to differences between
the treatments studied but less sensitive to the unknown external factors that might
affect the data. One popular phrase that summarizes how these tools are to be used is
“block what you know, randomize what you don’t.”5 In other words, try to identify known
sources of variation and eliminate their effect by forming blocks. However, within each
block, remember to assign experimental units to the treatments in a random fashion.
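A sketch of "block what you know, randomize what you don't": within each block, the treatments are shuffled and assigned to that block's subjects (the block composition and names below are hypothetical):

```python
import random

def assign_within_blocks(blocks, treatments):
    # For each block, randomly assign one treatment to each unit.
    # Assumes each block has exactly len(treatments) units.
    plan = {}
    for block_name, units in blocks.items():
        shuffled = list(treatments)
        random.shuffle(shuffled)
        plan[block_name] = dict(zip(units, shuffled))
    return plan

# Hypothetical blocks of two matched subjects each
blocks = {"block 1": ["subject 1", "subject 2"],
          "block 2": ["subject 3", "subject 4"]}
print(assign_within_blocks(blocks, ["drug", "placebo"]))
```

Every block receives each treatment exactly once, so block-to-block differences cancel out of the treatment comparison.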
5 Box, G. E. P., W. G. Hunter, and J. S. Hunter, Statistics for Experimenters (2nd ed.), John Wiley & Sons, New York, 2005: 93.
Example 4.17 The strength of concrete used in commercial construction tends to vary from
one batch to another. Consequently, small test cylinders of concrete sampled
from a batch are “cured” for periods up to about 28 days in temperature- and
moisture-controlled environments before strength measurements are made.
Concrete is then “bought and sold on the basis of strength test cylinders” (ASTM
C 31 Standard Test Method for Making and Curing Concrete Test Specimens
in the Field).
Suppose that we want to compare three different methods of curing concrete
specimens. We know that batch-to-batch variation can be a significant factor in
strength measurements. One way to compare the three methods is to use different
batches of concrete as blocks in an experimental design. This is accomplished by
separating each batch into three portions and then randomly assigning the portions
to the three curing methods. Table 4.1 shows the data from one such test using ten
batches of concrete of comparable strengths.
The purpose of blocking is to allow for fair comparisons among the three test
methods. Notice, for example, that all three methods gave relatively lower values
for batch 6 and higher values for batch 5. This is evidence of a difference be-
tween batches 5 and 6. By blocking, however, any differences among the batches
are experienced by all three test methods. Consider how different things might
be if we had simply assigned entire batches at random to the three test methods.
By doing so, it is possible that batch 5 could be assigned to method C alone and
batch 6 to method A alone, which would increase the average strength measure-
ment for column C and decrease the average for column A. In other words, if
we do not use blocking, then differences among the three test methods could be
significantly influenced by the manner in which the batches of cement are as-
signed to the tests.
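The batch-as-block assignment described in this example can be sketched in code. The sketch below is illustrative only: the function name and the seed are our own choices, and only the batch count (10) and the three curing methods come from the example.

```python
import random

def randomized_block_assignment(n_blocks, treatments, seed=1):
    """Within each block (here, a batch of concrete), randomly assign
    the block's portions to the treatments (curing methods)."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    plan = {}
    for block in range(1, n_blocks + 1):
        order = treatments[:]   # one portion of the batch per treatment
        rng.shuffle(order)      # "randomize what you don't know"
        plan[block] = order     # portion i of this batch gets method order[i]
    return plan

# Ten batches (blocks), three curing methods, as in Example 4.17
plan = randomized_block_assignment(10, ["A", "B", "C"])
```

Because every method appears exactly once in every batch, batch-to-batch strength differences are experienced equally by all three methods.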
4.3 Exercises
18. Four new word processing software programs are to be compared by measuring the speed with which various standard tasks can be completed. Before conducting the tests, researchers note that the level of a person's computer experience is likely to have a large influence on the test results. Discuss how you would design an experiment that fairly compares the word processing programs while simultaneously accounting for possible differences in users' computer proficiency.

19. What primary purpose do replicated measurements serve in an experimental design?

20. In a study of factors that affect the ability of the laser in a DVD player to read the information on a DVD, a researcher decides to examine several different photoresist thicknesses used in making the plates from which plastic DVDs are stamped. As a response variable, the researcher decides to measure the average pit depth of the holes etched on the surface of the DVD. The experiment must be conducted under a fixed budget and time constraint that allows the researcher to analyze a sample of at most 20 DVDs.
a. Suppose that it is known that, for any fixed photoresist thickness, there tends to be little, if any, variation in the pit depths on a DVD. Which would be better: (1) an experiment with little or no replication and several photoresist thickness levels or (2) an experiment with more replication, but fewer photoresist thickness levels?
b. Suppose it is known that, even for a fixed photoresist thickness, pit depths can vary substantially. Answer the question posed in part (a) for this situation.

21. A researcher wants to test the effectiveness of a new fuel additive for increasing the fuel efficiency (miles per gallon, mpg) of automobiles. The researcher proposes that a car be driven for a total of 500 miles and that at the end of each 100-mile segment the fuel efficiency be measured and recorded.
a. What is the purpose of measuring efficiency every 100 miles? Why not just measure efficiency at the end of the 500-mile course?
b. What operational definitions would you suggest that the researcher incorporate into this experiment?
c. What changes would you make to the experiment to increase the generalizability of the experimental results?

22. In a study of the ratio of nitrogen, phosphoric acid, and potash in fertilizers, four different mixtures (M1, M2, M3, M4) of the three chemicals are to be tested for their effects on the rate of growth of grass seedlings. A square plot of land is subdivided into four equal-size square plots, each planted with the same amount, by weight, of seedlings. Before the fertilizers are applied, each square subplot is itself divided into four more squares. Two experimental methods are proposed for applying the fertilizers to the subplots. In experiment A, the four fertilizers are randomly assigned to the large subplots, whereas in experiment B, all four fertilizers are randomly assigned to the subplots of the four large plots. An illustration of both experimental designs follows.

Experiment A        Experiment B
M1 M1 M4 M4         M1 M4 M3 M2
M1 M1 M4 M4         M3 M2 M4 M1
M3 M3 M2 M2         M2 M1 M2 M3
M3 M3 M2 M2         M4 M3 M1 M4

a. If care were taken to ensure that there are no significant differences in the growing conditions (soil type, irrigation, drainage, sunlight, etc.) among the four large subplots, is one of these designs preferable over the other? Why?
b. If it is suspected that there could be significant differences in the growing conditions among the four main subplots, is one of the two designs preferable over the other? Why?

23. A complex chemical experiment is conducted and, because the amount of precipitate produced is expected to vary, the experiment is repeated several times. A lengthy lab equipment setup, followed by a tedious experimental procedure, allows the experiment to
be repeated up to six times in any given day. Consequently, one lab assistant is assigned to set up the lab equipment and then conduct six runs one day. The next day a second lab assistant conducts another six runs using the same lab setup from the previous day. What two basic experimental design principles are violated by this experimental procedure?

24. Refer to Example 4.15 and Figure 4.2(b). Suppose the hardness measurements (in Mohs) of plastics in four test runs are as follows:

            Brand A   Brand B
Machine 1     2.6       3.2
Machine 2     2.8       3.6

a. Calculate an estimate of how much plastic hardness is increased or decreased by switching from the brand A resin to the brand B resin.
b. Calculate an estimate of how much plastic hardness is increased or decreased by switching from machine 1 to machine 2.
c. Because this experiment does not provide any estimate of the experimental error expected in successive experimental runs, it is impossible to know whether the estimated change in part (a) is caused by switching brands or is simply due to experimental variation. Describe how you would improve this experiment to obtain an estimate of the experimental error.
4.4 Measurement Systems
some known value x, we measure the accuracy of the readings by the difference between x and the average of the n readings:

accuracy = x̄ − x

Refer to the n measurement readings displayed in the histogram in Figure 4.3. We can think of accuracy as the distance between the center of the histogram (i.e., the mean) and the true value of x.

[Figure 4.3 Histogram of the n measurements; the accuracy is the gap between the histogram's mean and the true value of x]

The precision of the readings is measured by the sample standard deviation s of the n readings:

precision = s = √[Σ(xᵢ − x̄)²/(n − 1)]
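Both definitions translate directly into code. A minimal sketch follows; the readings and the true value are made-up numbers, not data from the text.

```python
from math import sqrt

def accuracy_and_precision(readings, true_value):
    """Return (accuracy, precision): accuracy = xbar - x, and
    precision = sample standard deviation s (n - 1 divisor)."""
    n = len(readings)
    xbar = sum(readings) / n
    s = sqrt(sum((xi - xbar) ** 2 for xi in readings) / (n - 1))
    return xbar - true_value, s

# Hypothetical repeated readings of a quantity whose known value is 10.0
acc, prec = accuracy_and_precision([10.1, 9.9, 10.0, 10.2], true_value=10.0)
```

Here the mean of the four readings is 10.05, so the accuracy is 0.05, and s measures the scatter of the readings about their own mean.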
Figure 4.4 shows the various combinations of precision and accuracy that are pos-
sible in practice. The worst case occurs in Figure 4.4(a) where the measurements have
a large variation (i.e., low precision) and are biased to the left of the true value of x. The
best-case scenario is in Figure 4.4(d), where all the measurements are tightly packed
Figure 4.4 Possible combinations of precision and accuracy: (a) inaccurate and imprecise; (b) accurate, but imprecise; (c) precise, but inaccurate; (d) accurate and precise
Example 4.18 In a repeatability study, a worker selects a single manufactured part and measures its
length 25 times. The measurements (in.) and their sample mean and standard de-
viation are given in Table 4.2. The repeatability of the measuring instrument can be
reported either as the standard deviation s = .096 in. or in terms of ±3s = ±3(.096) = ±.288 in. The latter method has the intuitive interpretation that the instrument's readings generally lie within about .288 in. of the true length. For instance, if the worker
measures another part and obtains a reading of 9.98 in., then the true length of that
part should be somewhere between 9.692 and 10.268 in.
Table 4.2 Length measurements (in.) of a single part

No.   Length     No.   Length
 2    10.05      15     9.88
 3     9.99      16     9.82
 4     9.85      17     9.91
 5     9.90      18    10.05
 6    10.00      19     9.87
 7     9.99      20    10.05
 8     9.98      21     9.94
 9    10.17      22     9.75
10     9.97      23     9.89
11     9.97      24     9.85
12    10.02      25    10.12
13    10.00

x̄ = 9.95   s = .096
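The ±3s interpretation in this example amounts to simple arithmetic on the reported numbers:

```python
s = 0.096        # repeatability standard deviation from Table 4.2 (in.)
reading = 9.98   # the new measurement discussed in the example (in.)

half_width = 3 * s                                 # 3s = 0.288 in.
low, high = reading - half_width, reading + half_width
# the true length should lie between about 9.692 and 10.268 in.
```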
Unfortunately, the terms repeatability and reproducibility are not uniquely defined
in the literature, so you may encounter alternative definitions from time to time. One
popular definition of repeatability is given by the formula k√2s, which estimates the
maximum difference that, with high reliability, can be expected between any two in-
strument readings. In this formula, s is the sample standard deviation and the factor k
depends on the reliability level we specify. Tabled values of k, along with a detailed dis-
cussion of this form of repeatability, can be found in the article by Mandel and Lashof
listed in the chapter bibliography.
As we allow more and more parts of a measurement system to vary, we move from
repeatability to the concept of reproducibility. Reproducibility studies allow several fac-
tors to vary at the same time. In such studies, it is common to use several operators and
several instruments to measure several production items. The idea is to see how the mea-
surement system behaves in an environment more closely resembling a real production
environment. Reproducibility studies are usually based on simple experimental designs
that allow us to break measurement variation into distinct components that estimate the
contribution of the various noise factors (different operators, different parts, etc.) to the
overall measurement error. Examples of such designs are given in Chapter 10.
Interlaboratory Comparisons
Many measurements are done by laboratories specializing in complex measurement
procedures. This is the case, for example, for most of the nondestructive tests mentioned
in Example 4.7. For such data, our concern centers on the consistency of the results
reported by different laboratories. Practically speaking, we want some assurance that if
we submit the same sample material to laboratory A and laboratory B, then the results
reported by the two laboratories will be in close agreement.
The reliability of data from different laboratories is evaluated by means of interlabo-
ratory comparison programs. Professional organizations such as the American Society for
Testing and Materials (see Section 4.1) run several such programs each year. For example,
in the ASTM interlaboratory cross-check program for reformulated gasoline, participating
laboratories are given test samples each month for measurement. The test samples are
specially prepared under the direction of ASTM to ensure that each lab receives the same
test material. The data from all participating laboratories is then summarized and given
to the participating laboratories. In this way, each laboratory can evaluate its performance
against the others and, if necessary, make changes to its measurement system.
Youden plots, introduced in 1959, are the standard technique for compar-
ing the data from a group of laboratories (Youden, W. J., “Graphical Diagnosis of
Interlaboratory Test Results,” Industrial Quality Control, 1959: 24–28). To create
these simple scatterplots, each laboratory is given two nearly identical test samples
(labeled A and B) to measure. The two measurements from a given laboratory are
then plotted as a single point on the Youden plot. The horizontal axis is used for
the measurements of sample A and the vertical axis is used for sample B. As an aid
in interpreting the plots, horizontal and vertical lines positioned at the medians of
the sample A data and sample B data are included. Some typical Youden plots are
shown in Figure 4.5 (page 190). The points generally fall close to a 45° line because
the two samples (A and B) are similar and because each lab follows a fixed measure-
ment procedure.
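The construction just described, plotting each lab's (A, B) pair against the two medians, can be sketched as follows. The lab data here are hypothetical, and the quadrant labels are our own shorthand.

```python
def youden_summary(results):
    """results maps lab -> (measurement of sample A, measurement of sample B).
    Returns the A-median, the B-median, and each lab's quadrant."""
    def median(values):
        v = sorted(values)
        n = len(v)
        return v[n // 2] if n % 2 else (v[n // 2 - 1] + v[n // 2]) / 2

    med_a = median([a for a, _ in results.values()])
    med_b = median([b for _, b in results.values()])
    quadrants = {lab: ("high" if a > med_a else "low",
                       "high" if b > med_b else "low")
                 for lab, (a, b) in results.items()}
    return med_a, med_b, quadrants

# Four hypothetical labs, each measuring the near-identical samples A and B
labs = {1: (42.0, 41.5), 2: (46.0, 45.0), 3: (50.0, 49.0), 4: (67.0, 66.0)}
med_a, med_b, quads = youden_summary(labs)
```

Labs that land in the ("high", "high") or ("low", "low") quadrants string out along the 45° line, the signature of systematic lab-to-lab differences.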
Figure 4.5 Typical Youden plots and their interpretation: (a) ideal situation with the points evenly
scattered in all four quadrants; (b) laboratory 1 and laboratory 2 are using procedures that are
systematically different from those used at the other labs; (c) most of the labs are following slightly
different versions of the test procedure
Example 4.19 Nonsteroidal anti-inflammatory drugs (NSAIDs) are often used to reduce in-
flammation and relieve fever and pain. Examples of NSAIDs include ibuprofen,
ketoprofen, and naproxen. In “Second Interlaboratory Exercise on Non-Steroidal
Anti-Inflammatory Drug Analysis in Environmental Aqueous Samples” (Talanta,
2010: 1189–1196), researchers wanted to investigate interlaboratory comparisons
of NSAIDs in different aqueous samples. This research was conducted to ascertain
the level of interlaboratory agreement of NSAID analyses among various European
laboratories and also to determine possible sources of variation. In one investigation,
each of 12 laboratories measured the concentrations of ibuprofen (ng/L) in two test
samples of tap water. Table 4.3 shows the data from these tests as read from a graph.
The Youden plot for this data (Figure 4.6) shows many points scattered near the
45°-line, indicating that several of the laboratories are following different versions of
the chemical test procedure.
Table 4.3 Ibuprofen concentrations (ng/L) in two tap water samples

Lab   Sample A   Sample B
 3     42.74      42.46
 4     46.09      45.20
 5     46.46      46.11
 6     49.81      48.85
 7     60.96      55.24
 8     66.53      40.63
 9     67.65      47.02
10    113.36     105.46
11    172.09     172.11
12    199.97     193.11
4.4 Exercises
[Figure 4.6 Youden plot of the ibuprofen data: Sample A (horizontal) versus Sample B (vertical), both axes 0 to 200 ng/L]
26. Calibration is the process of comparing an instrument's measurements to those of a reliable reference standard. If necessary, the instrument is adjusted to bring its measurements into agreement with the reference standard. Explain what effect calibration has on the estimated precision (not the accuracy) of a measuring instrument.

27. Many instrument makers report the accuracy of their instruments in terms of relative error as well as absolute error. The relative error in a measurement is defined as (m − x)/x × 100%, where m is the measured value and x is the true value. Absolute error is given by |m − x|.
a. Calculate the relative errors for each of the five measurements in Exercise 25.
b. Relative errors are often stated in terms of the maximum relative error to be expected for any

28. After carefully controlling all the chemical reagents and conditions during an experiment, a chemist weighs the amount of reactant produced by an experiment. The chemist weighs the reactant on an electronic balance, then reweighs the reactant five times, being careful to remove and replace the reactant on the balance between weighings.
a. In the language of experimental design, can these six measurements be considered replications?
b. What type of variation is measured by calculating the sample standard deviation, s, of the six measurements?

29. The melt flow index (MFI) of a polymer is defined to be the amount of the polymer (in grams) that can flow in 10 minutes through a standard die when subjected to a specified force and temperature. MFI is widely regarded as an important characteristic for
Supplementary Exercises
30. Consult a published reference, weather bureau, or Internet site to determine the operational definition used by weather forecasters when making statements like "There will be a 30% chance of rain tomorrow."

31. A common method for selecting a random sample without replacement from the integers 1, 2, 3, . . . , N is to generate a random sample with replacement (using random number tables or a software program) and then discard any duplicate numbers that appear in the sample. Use the sampling rules in Section 4.2 to justify why this procedure will produce a valid random sample without replacement.

32. The method of capture–recapture sampling is often used to estimate the size of wildlife populations (Thompson, S. K., Sampling, John Wiley & Sons, New York, 1992: 212–233). To illustrate the method, suppose an initial sample of 100 fish from a lake are caught and tagged. After releasing the fish and allowing sufficient time for them to mix with the rest of the fish in the lake, a second sample of, say, 50 fish are caught. The number of tagged fish in the second sample is counted.
a. Suppose there are five tagged fish found in the second sample. Because the samples are assumed to be random samples from the entire population of T fish, the proportion of tagged fish in the second sample should be approximately equal to the proportion of tagged fish in the population. Use this fact to estimate T, the total number of fish in the lake.
b. Generalize your result in part (a). That is, if xtag1 is the number of fish caught and tagged in the first sample and xtag2 is the number of tagged fish found in a second sample of size y, write an equation for the estimated value of T.

33. Cr(VI) is a pollutant associated with chromite ore processing. In a study of Cr(VI) concentrations, a
sampling plan was devised to estimate ambient levels of Cr(VI) in the air ["Background Air Concentrations of Cr(VI) in Hudson County, New Jersey: Implications for Setting Health-Based Standards for Cr(VI) in Soil," J. of Air and Waste Management, 1997: 592–597]. The authors propose using such background measurements as a basis for developing health-based standards for chromite ore processing plants.
a. In the study, background samples of air were selected to be representative of land use in the vicinity of chromite ore processing sites, but not so close that these samples would be affected by emissions from the processing plants. What role would such samples play in an experiment to subsequently evaluate emissions at chromite ore plants?
b. The authors used ASTM Standard Test Method D5281–92 when measuring the concentrations of Cr(VI). What experimental purpose does using such a standard serve?
c. Air samples were taken at two different locations, an industrial area and an undeveloped commercial site. Samples were collected at each site during six 24-hour sampling periods; wet and dry days were included. What general experimental design principles are illustrated here?

34. Youden plots are frequently used to compare two different instruments or evaluation methods. In a study of lawn mower exhaust emissions ("Exhaust Emissions from Four-Stroke Lawn Mower Engines," J. of Air and Waste Management, 1997: 945–950), two methods of measuring NOx (nitrogen oxide) emission rates were compared by using both methods on several models of gas-powered lawn mowers. The following table shows NOx emission rates (grams/kWh) for two measuring methods: STC (similar to certification), which measures emissions for a 10-sec period, and an experimental method C6M, which is a weighted average of emission rates obtained under six different combinations of running speeds, times, and engine loads.

NOx emission rate estimates

Lawn mower   STC    C6M
 1           3.03   4.40
 2           4.04   4.38
 3           5.34   7.64
 4           6.42   8.28
 5           4.17   7.21
 6           1.23   1.43
 7           4.10   3.91
 8           2.21   1.89
 9           6.57   7.14
10           3.80   4.71
11           4.76   6.80
12            .49    .01
13           1.97   2.91
14           1.64   1.23
15           3.26   2.72
16           4.20   6.95
17            .32    .11
18           7.76   8.73
19           4.79   6.75
20            .98   1.12

a. Construct a Youden plot of this data.
b. Use the methods of Chapter 3 to fit a regression line to this data, with STC as y and C6M as x.
c. What conclusions can you draw from the results in parts (a) and (b) about the two NOx measuring methods?
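The proportion-matching idea behind capture–recapture sampling (Exercise 32) can be sketched in a few lines. The fish counts below are illustrative only and deliberately differ from the exercise's numbers.

```python
def estimate_population(n_tagged, sample_size, n_tagged_in_sample):
    """Capture-recapture estimate: equate the tagged proportion in the
    second sample to the tagged proportion in the population,
    n_tagged / T = n_tagged_in_sample / sample_size, and solve for T."""
    return n_tagged * sample_size / n_tagged_in_sample

# Illustrative numbers: 80 fish tagged, then 4 tagged fish found
# in a later sample of 60 fish
T_hat = estimate_population(80, 60, 4)  # -> 1200.0
```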
Bibliography
Box, G. E. P., W. G. Hunter, and J. S. Hunter, Statistics for Experimenters (2nd ed.), Wiley, New York, 2005. Written for researchers. Emphasis is on explanation and application of experimental design techniques to real data and examples.

Lohr, Sharon, Sampling Design and Analysis (2nd ed.), Duxbury, Belmont, CA, 2009. A comprehensive survey of sampling.

Mandel, N., and T. W. Lashof, "The Nature of Repeatability and Reproducibility," J. of Quality Technology, vol. 19, no. 1, 1987: 29–36. A nice explanation and comparison of two concepts that are sometimes confused in practice.

Thomas, G. G., Engineering Metrology, Wiley, New York, 1974. Explains the various methods used to obtain measurements of different physical quantities.
chapter 5
Probability and Sampling Distributions
5.1 Chance Experiments
5.2 Probability Concepts
5.3 Conditional Probability and Independence
5.4 Random Variables
5.5 Sampling Distributions
5.6 Describing Sampling Distributions
Introduction
Chapter 5 marks a transition from purely descriptive methods to the inferential
methods discussed in the remainder of this book. Beginning in this chapter, we will
refer to any numerical measure calculated from sample data as a statistic. As you
have seen in Chapters 1–3, statistics such as the sample mean, standard deviation,
and correlation coefficient are useful tools for describing sets of data. Similarly, den-
sity and mass functions provide concise descriptions of populations and ongoing
processes. One important question left unanswered in those chapters, however, is:
How do we know what parameter values to use in a mass function or density func-
tion? For example, the Weibull density is commonly used for modeling the lifetimes
of products, but how do you go about selecting numerical values for the
Weibull parameters, α and β, that best describe the lifetimes of a particular product?
One way to answer such questions is to use statistical inference, a tech-
nique that converts the information from random samples (see Section 4.2)
into reliable estimates of, and conclusions about, population or process parame-
ters. Sections 5.5 and 5.6 illustrate how statistical inference works. When reading these sections, it is important to keep in mind the crucial role played by random sampling. Without random sampling, statistics can only provide descriptive
summaries of the data itself. With random sampling, though, our conclusions can
be reliably extended beyond the data, to the population or process from which
the data arose. Figure 5.1 illustrates the difference between statistics based on
ordinary data sets and statistics based on random samples.
[Figure 5.1 (a) A statistic computed from a sample of data drawn from a population or process]
Drawing conclusions from samples necessarily involves some risk. Samples, after
all, only give approximate pictures of populations or processes. Intuition tells us that
the clarity of these pictures ought to increase as the sample size grows, but intuition
fails to be more precise than that. For example, when testing a large shipment of
parts for defective items, most people would agree that finding two defective items
in a random sample of 10 is very different from finding 200 defectives in a random
sample of 1000. Although the sample percentage (i.e., the statistic calculated from
the data) is the same in both cases, the 20% defect rate in the larger sample seems
much more credible than the 20% defect rate in the smaller sample. To quantify just
how much more credible the information in the larger sample is, we use the tools
chance experiment. Under this rather wide definition, determining whether a metal
part withstands a stress test, recording whether it rains tomorrow, measuring the yield of
a chemical reaction, assessing the potency of a pharmaceutical product, or measuring
the volume of water flowing in a drainage system all qualify as chance experiments.
Most chance experiments in the sciences arise because either (1) some natural
phenomenon is at work, causing unpredictable changes in experimental outcomes, or
(2) we purposely introduce randomness as a tool for extrapolating information from data
to conclusions about populations or processes (see Section 4.3). As an example of the
former, yields of chemical reactions often vary with each repetition of an experiment,
no matter how hard one tries to control the conditions of the experiment. Slight differ-
ences in handling (e.g., the amount of mixing, the ambient temperature, the elapsed
time of the reaction) or even in the behavior at the molecular level (e.g., Brownian mo-
tion, material flow) can induce small changes in experimental results. However chance
experiments may arise, from natural forces or by statistical methodology, probability
provides a structure for measuring and consistently handling uncertainty.
Events
Underlying the computations of probability is an organized system for describing and
working with the outcomes of chance experiments. These outcomes can be divided into
two types: (1) simple events, which are the individual outcomes of an experiment and,
more generally, (2) events, which consist of collections of simple events. For instance,
the chance experiment of conducting a series of stress tests on three metal parts has the
eight possible outcomes PPP, PPF, PFP, FPP, PFF, FPF, FFP, and FFF, where P and
F denote the test results “pass” and “fail,” and the order in which the letters appear cor-
responds to the part number tested (e.g., PPF indicates that the first two parts passed the
test, but the third part failed). Each of these eight outcomes is a simple event, which,
taken together, form the sample space of the experiment.
Events are often denoted by single uppercase letters, usually from the beginning of
the alphabet, much like we denote constants in formulas by lowercase letters. Single-letter
names for events are very useful when applying the probability formulas in Section 5.2.
Thus we might denote the event that at least two parts pass the stress test by A, the event
that exactly 1 part passes the stress test by B, and so forth. Events can also be described
by just listing, in brackets, the simple events that comprise them. For example, the
event that at least two parts pass the stress test corresponds to the set of outcomes {PPP,
PPF, PFP, FPP}. If we had also chosen to denote this event by the letter A, then we
could also write A = {PPP, PPF, PFP, FPP}.
Example 5.1 Let’s continue with our example of stress-testing metal parts. Suppose that we now se-
lect and test four parts. Using sequences of Ps (for parts that pass the test) and Fs (for
parts that fail the test), the sample space of the experiment of selecting and testing
four metal parts is somewhat larger than that of the experiment of selecting and test-
ing three metal parts, discussed previously. In particular, the sample space consists of
these 16 simple events: {PPPP, PPPF, PPFP, PFPP, FPPP, PPFF, PFPF, PFFP, FPPF,
FPFP, FFPP, PFFF, FPFF, FFPF, FFFP, FFFF}. For convenience, these events are
listed in order of decreasing numbers of Ps in each four-letter sequence.
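For readers who like to verify such enumerations by machine, the 16 outcomes can be generated programmatically. A minimal Python sketch (the variable names are ours, not the text's):

```python
from itertools import product

# All 2^4 = 16 outcomes of stress-testing four parts (P = pass, F = fail)
sample_space = ["".join(seq) for seq in product("PF", repeat=4)]

# List outcomes in order of decreasing numbers of P's, as in the example
sample_space.sort(key=lambda s: -s.count("P"))

print(len(sample_space))                  # 16
print(sample_space[0], sample_space[-1])  # PPPP FFFF
```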
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
5.1 Chance Experiments 197
Suppose we are interested in the events A = at least two parts pass the stress test and
B = at most two parts pass the stress test. In terms of simple events, we can write A and B as
A = {PPPP, PPPF, PPFP, PFPP, FPPP, PPFF, PFPF, PFFP, FPPF, FPFP, FFPP}
B = {PPFF, PFPF, PFFP, FPPF, FPFP, FFPP, PFFF, FPFF, FFPF, FFFP, FFFF}
Note that A and B have several simple events in common (the six outcomes with exactly two Ps).
Example 5.2 A reasonably large percentage of C++ programs written at a particular company
compile on the first run, but some do not (a compiler is a program that translates
source code—in this case, C++ programs—into machine language so programs can
be executed). Suppose an experiment consists of selecting and compiling C++ programs
at this location one by one until encountering a program that compiles on the
first run. Denote a program that compiles on the first run by S (for success) and one
that does not by F (for failure). Although it may not be very likely, a possible outcome
of this experiment is that the first 5 (or 10 or 20 or . . .) are F’s, and the next one is
an S. In other words, for any positive integer n, we may have to examine n programs
before seeing the first S. The sample space is {S, FS, FFS, FFFS, . . .}, which con-
tains an infinite number of possible outcomes. The same abbreviated form of the
sample space is appropriate for an experiment in which, starting at a specified time,
the gender of each newborn infant is recorded until the birth of a male is observed.
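The open-ended sample space {S, FS, FFS, . . .} is easy to explore by simulation. The sketch below assumes an illustrative first-run compile rate of 80% (the text gives no specific rate); every simulated outcome is a run of F's, possibly empty, ending in a single S:

```python
import random

def first_success_outcome(p_success=0.8):
    """Compile programs one by one until the first one succeeds;
    return the outcome string, e.g. 'S', 'FS', 'FFS', ..."""
    outcome = ""
    while random.random() >= p_success:
        outcome += "F"   # this program failed to compile on the first run
    return outcome + "S"

random.seed(1)
print([first_success_outcome() for _ in range(8)])
```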
Depicting Events
Various devices have been created to help visually describe the events in a sample space.
Tree diagrams are especially useful for depicting experiments that are conducted in a
sequence of steps, such as our example of testing three metal parts. Beginning at the left,
each step in the sequence is given its own set of branches, which themselves form the
starting points for all branches to their right. Figure 5.2 shows a tree diagram for the ex-
periment of selecting and testing three metal parts. Simple events are formed by follow-
ing any branch of the tree diagram from the leftmost point to one of the rightmost points.
Figure 5.2 Tree diagram for the experiment of selecting and testing three metal parts (each stage branches into P and F)
198 chapter 5 Probability and Sampling Distributions
Another visual device, the Venn diagram, is especially useful for depicting rela-
tionships between events. Venn diagrams are simple two-dimensional figures, often
rectangles or circles, whose enclosed regions are intended to depict a collection of
simple events, called points, in a sample space. Figure 5.3 shows a Venn diagram of
several events based on Example 5.1. Events like A and B that contain points in com-
mon are depicted as overlapping regions in the diagram. Events that do not contain
any common points, such as the events B = at most two parts pass the test and C =
exactly three parts pass the test, are shown as nonoverlapping regions. An event that
contains all the points of some other event is shown as surrounding the smaller event.
For example, the event A = at least two parts pass the test contains all of the simple
events in event C = exactly three parts pass the test, so C is shown inside of A in
Figure 5.3.
Figure 5.3 Venn diagram of the events A, B, and C within the sample space of Example 5.1
Venn diagrams and tree diagrams are indispensable tools in many parts of probabil-
ity theory, but they are not essential to conducting statistical studies. We will use these
diagrams primarily as an aid for discussing certain probability concepts, but, beyond
that, their use is not emphasized. The interested reader may consult texts on probability
for more information on working with Venn diagrams.
definitions Given two events A and B:
1. The event A or B consists of all simple events that are contained in either
A or B. A or B can also be described as the event that at least one of A or B
occurs.
2. The event A and B consists of all simple events common to both A and B. A
and B can be described as the event that both A and B occur.
3. The event A′, called the complement of A, consists of all simple events that
are not contained in A. A′ is the event that A does not occur.
Example 5.3 Refer to Example 5.1, the experiment of selecting and testing four metal parts. To
form the event A or B, we simply list all events that are in either A or B, or in both.
The easiest way to do this is to list all the events in A and then add the events in B
that are not duplicates of those in A. Thus
A or B = {PPPP, PPPF, PPFP, PFPP, FPPP, PPFF, PFPF, PFFP, FPPF, FPFP, FFPP, PFFF, FPFF, FFPF, FFFP, FFFF}
For these two events, A or B happens to contain all 16 sample space points. In a
similar fashion, the event A and B, which consists only of the simple events common to
both A and B, is given by
A and B = {PPFF, PFPF, PFFP, FPPF, FPFP, FFPP}
In this case, it is possible to give a short verbal description of the event A and B;
namely, A and B = exactly two parts pass (and, hence, two fail) the stress test. Finally,
the complement of event A is
A′ = {PFFF, FPFF, FFPF, FFFP, FFFF}
A′ can also be verbally described as the event that at most one part passes the test.
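The event operations in this example map directly onto set operations, which gives a quick way to check the listings above. A Python sketch (our construction, not the text's):

```python
from itertools import product

sample_space = {"".join(seq) for seq in product("PF", repeat=4)}
A = {s for s in sample_space if s.count("P") >= 2}   # at least two parts pass
B = {s for s in sample_space if s.count("P") <= 2}   # at most two parts pass

A_or_B = A | B                     # union: at least one of A, B occurs
A_and_B = A & B                    # intersection: both occur
A_complement = sample_space - A    # complement: A does not occur

assert A_or_B == sample_space                                     # all 16 points
assert A_and_B == {s for s in sample_space if s.count("P") == 2}  # exactly two pass
assert A_complement == {s for s in sample_space if s.count("P") <= 1}
```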
When two events A and B have no simple events in common, we say that they
are mutually exclusive or disjoint. More intuitively, mutually exclusive events are
ones that cannot occur simultaneously; the occurrence of either event precludes the
occurrence of the other. In a Venn diagram, mutually exclusive events are depicted
as nonoverlapping regions. As we will see in Section 5.2, probability calculations in-
volving disjoint events are particularly simple. For this reason, we often try to decom-
pose complex events into collections of mutually exclusive events when computing
probabilities.
Several of the previous definitions can be extended to include events formed from
more than two events. These definitions are given in the next box.
definitions Given a chance experiment and any events A1, A2, A3, . . . , Ak:
1. The event A1 or A2 or A3 or . . . or Ak consists of all the simple events that are
contained in at least one of the events A1, A2, A3, . . . , or Ak. It can also be described
as the event that at least one of the events A1, A2, A3, . . . , or Ak occurs.
2. The event A1 and A2 and A3 and . . . and Ak consists of all simple events common
to all the events A1, A2, A3, . . . , and Ak. This event can be described as
the event that all of the events A1, A2, A3, . . . , and Ak occur.
3. Several events A1, A2, A3, . . . , and Ak are said to be mutually exclusive or
disjoint if no two of them have any simple events in common.
Example 5.4 Sampling inspection is a common method for ascertaining the quality level of
batches (called lots) of finished products. Sampling inspection can be used by a
manufacturer to check the quality of products prior to shipment or by a customer
to check the quality of incoming shipments before accepting them. In either case,
sampling inspection is done by first selecting a random sample of n items from a lot
and counting the number of sampled items that do not meet quality standards.
Suppose, for example, that n = 20 items are randomly selected from a large lot.
In this situation, an event that we might be interested in is A = the sample contains at
most one item that fails to meet quality standards. As you can imagine from reading the
other examples in this section, the sample space of the experiment of randomly selecting
and testing 20 items is prohibitively large. Even a tree diagram is of no help in depicting
the simple events or the event A itself. However, relying on only verbal descriptions
of the events, it is possible to decompose A into a combination of two less complex
events: B = no items fail inspection and C = exactly one item fails inspection. In fact,
it is not hard to see that the event B or C is the same as the event A. We write this as A =
B or C. Furthermore, B and C are mutually exclusive events. In Section 5.2, we show
how to use this fact to more easily compute the probability that A occurs.
1. A random sample, without replacement, of three items is to be selected from a population of five items (labeled a, b, c, d, and e).
a. List all possible different samples.
b. List the samples that correspond to the event A = items a and c are included in the sample.
c. List the samples that correspond to the complement of the event A in part (b).
2. An engineering firm is constructing power plants at three different sites. Define the events E1, E2, and E3 as follows:
E1 = the plant at site 1 is completed by the contract date
E2 = the plant at site 2 is completed by the contract date
E3 = the plant at site 3 is completed by the contract date
Draw a Venn diagram that depicts these three events as intersecting circles. Shade the region on the Venn diagram corresponding to each of the following events (redraw the Venn diagram for each question):
a. At least one plant is completed by the contract date.
b. All plants are completed by the contract date.
c. None of the plants is completed by the contract date.
d. Only the plant at site 1 is completed by the contract date.
e. Exactly one of the three plants is completed by the contract date.
f. Either the plant at site 1 or site 2 or both of the two plants are completed by the contract date.
3. Let A and B denote the events A = there are more than three defective items in a random sample of ten items and B = there are fewer than six defectives in a random sample of ten items.
a. Describe, in words, the event A and B.
b. Describe, in words, the event A or B.
c. Describe, in words, the complement of A.
4. Draw a Venn diagram depicting two events A and B that are not disjoint. Shade in the portion of this diagram that corresponds to the event A and B′.
5. Nuts and bolts used in aircraft manufacturing are called fasteners. To ensure that they are not loosened by vibrations during flight, some fasteners are slightly crimped so that they lock more tightly. The amount of crimping, however, must meet specific standards. To test finished fasteners, an initial inspection classifies them into two groups: those that meet standards and those that do not. Of those not meeting standards, some are completely defective and must be scrapped, whereas the rest can be run through a machine that readjusts the amount of crimping. Of the recrimped fasteners, some are corrected by the recrimping operation and pass inspection, whereas the remainder cannot be salvaged and are scrapped. Draw a tree diagram that depicts the testing and rework operations.
6. Information theory is concerned with the transmission of data, usually encoded as a stream of 0s and 1s, over communication channels. Because channels are "noisy," there is a chance that some 0s sent through the channel are mistakenly received at the other end as 1s, and vice versa. The majority of digits sent, however, are not altered by the channel. Draw a tree diagram that depicts the type of bit sent (either 0 or 1) and the type of bit received at the end of the channel.
7. Use a Venn diagram to find a simple expression for {A and B}′ in terms of A′ and B′.
Assigning Probabilities
Writing in his treatise Théorie Analytique des Probabilités (1812), mathematician and theo-
retical astronomer Pierre Simon de Laplace (1749–1827) stated that “at bottom, the theory
of probability is only common sense reduced to calculation.” With this brief statement,
Laplace recognized that any rigorous definition of probability must satisfy certain com-
monsense requirements. For example, the probability of any event must lie between 0 and
1. This is another way of stating the obvious condition that, in any number of repetitions
of an experiment, no event can occur less than 0% of the time nor more frequently than
100% of the time. In practice, this requirement provides a quick check on our probability
calculations; calculated values that lie outside the interval [0, 1] are immediate signals that
a mistake has occurred somewhere in the computations. Used correctly, the probability
formulas given in this chapter will never yield probabilities outside the interval [0, 1].
A second self-evident requirement is that probabilities of events must not lead to
logical inconsistencies. For example, it does not make sense to state that 90% of metal
parts pass a stress test and that 20% fail the test. These two probabilities are inconsistent
because we know that exactly 100%, not 110%, of the parts will either pass or fail the
test. In the same vein, it would not make sense to say that 90% pass and 5% fail the test,
since this implies the illogical conclusion that only 95% of all parts pass or fail the test.
To avoid nonsensical statements like these, we demand that the probabilities associated
with the simple events always total to exactly 1. Thus any sensible assignment of prob-
abilities to events must satisfy the following two basic requirements:
Probability Axioms
1. The probability of any event must lie between 0 and 1. That is, 0 ≤ P(A) ≤ 1 for any
event A.
2. The total probability assigned to the sample space of an experiment must be 1.
Within the limits imposed by these axioms, there are several ways to determine
probabilities: (1) as frequencies of occurrence, (2) from subjective estimates, (3) by
assuming that events are equally likely, and (4) by using density and mass functions
(see Section 5.4). Depending on the circumstances, each method has its merits. For
example, when it is possible to repeat a chance experiment, the “frequentist” approach
defines the probability of an event A to be the long-run ratio
P(A) = (number of times A occurs) / (number of times the experiment is repeated)
The justification for this approach is that, as the number of trials increases, we expect
this ratio to stabilize and eventually approach a limiting value, which we take as our
definition of P(A). For example, let A be the event that a package sent within the state
of California for 2nd-day delivery actually arrives within 1 day. The results from sending
10 such packages (the first 10 replications) are as follows:
Package No. 1 2 3 4 5 6 7 8 9 10
Did A occur N Y Y Y N N Y Y N N
Relative frequency of A 0 .5 .667 .75 .6 .5 .571 .625 .556 .5
5.2 Probability Concepts 203
Figure 5.4(a) shows how the relative frequency fluctuates rather substantially over
the course of the first 50 replications. But as the number of replications continues
to increase, Figure 5.4(b) illustrates how the relative frequency stabilizes. Using
Figure 5.4(b), we would be inclined to state that P(A) is close to .60.
Figure 5.4 Behavior of relative frequency: (a) initial fluctuation over the first 50 packages; (b) long-run stabilization near .60 over 1000 packages
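The stabilization in Figure 5.4(b) is easy to reproduce by simulation. The sketch below assumes the true long-run probability is .60 (the value suggested by the figure) and tracks the running relative frequency of the event A:

```python
import random

random.seed(42)
true_p = 0.6        # assumed long-run probability that event A occurs
n_trials = 10_000

successes = 0
for i in range(1, n_trials + 1):
    successes += random.random() < true_p
    if i in (10, 100, 1000, 10_000):
        # running relative frequency of A after i replications
        print(i, round(successes / i, 3))
```

By the 10,000th replication the relative frequency is typically within a percentage point or two of .60, mirroring the long-run behavior the figure depicts.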
Of course, the frequentist approach does not work when experiments cannot
be faithfully replicated, as is the case with sports competitions. In these instances,
subjective estimates, guided by the probability axioms, can be used to arrive at
numerical probabilities that certain teams will win or lose a game. Needless to say,
entire texts can and have been written comparing the various methods for assigning
probabilities to events. It is not our purpose to compare each of these methods. Instead,
in Section 5.4, we emphasize the technique that is most often used in statistical studies,
defining probabilities by means of mass and density functions.
Probability rules, or laws, are formulas that are intended to simplify the process of cal-
culating the probabilities of complex events. They achieve this purpose by first decom-
posing some event of interest into two or more less complex events whose probabili-
ties are more easily found. The formulas then describe how to recombine the simpler
probabilities to find the probability of the original event. One of the most frequently
used laws is the addition rule for disjoint events, which states that the probability of
the event A1 or A2 or A3 or . . . or Ak is simply the sum of the individual probabilities
P(A1) + P(A2) + P(A3) + . . . + P(Ak) as long as all the events A1, A2, A3, . . . , and Ak are
mutually exclusive. The addition rule is usually applied to an event E by first finding a
collection of less complicated events A1, A2, A3, . . . , and Ak that satisfy two conditions:
(1) the events A1, A2, A3, . . . , and Ak are disjoint and (2) E 5 A1 or A2 or A3 or . . . or Ak.
The events A1, A2, A3, . . . , Ak are sometimes said to partition the event E into mutually
exclusive events.
Addition Rule for Disjoint Events If A and B are disjoint events, then
P(A or B) = P(A) + P(B)
More generally, if the events A1, A2, A3, . . . , Ak are mutually exclusive, then
P(A1 or A2 or A3 or . . . or Ak) = P(A1) + P(A2) + P(A3) + . . . + P(Ak)
Example 5.5 Suppose that you want to find the probability that at most one item fails to meet
quality standards in a random sample of n = 20 items from a large shipment of
such items. Denote the event of interest as A = at most one item fails to meet a
quality standard. In Example 5.4, we showed that A can be partitioned into the
events B = no items fail inspection and C = exactly one item fails inspection.
That is, we can write A = B or C, where B and C are disjoint events. According
to the addition rule for disjoint events, P(A) can be found by simply adding the
probabilities P(B) and P(C), both of which are easier to find than P(A). In fact,
in Section 5.4 we show that the binomial mass function can be used to find both
P(B) and P(C).
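To preview that calculation, here is a sketch using the binomial mass function of Section 5.4. The failure rate p = .05 is an assumed value for illustration; the text does not specify one:

```python
from math import comb

n = 20
p = 0.05  # assumed probability that any single item fails inspection

def binom_pmf(k, n, p):
    """Binomial mass function: P(exactly k failures among n items)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

P_B = binom_pmf(0, n, p)  # no items fail inspection
P_C = binom_pmf(1, n, p)  # exactly one item fails inspection
P_A = P_B + P_C           # addition rule: B and C are disjoint
print(round(P_A, 4))      # 0.7358
```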
Complementary Events
The complement A′ of an event A was defined in Section 5.1 to be the collection of
simple events that are not in A. In more intuitive terms, it is helpful to think of A′ as the
opposite of A when trying to express A′ in words. For example, if A is the event that at
least one metal part passes a stress test, then the opposite event must be A′ = no metal
parts pass the stress test. Notice that we did not need to write down the sample space of
the experiment to arrive at this description of A′. Consider how you might describe the
complement of A′. Since A and A′ are opposites, then the complement of A′ is simply
the event A itself, which we can write as (A′)′ = A.
Yet another way to describe the complement of an event A is to say that when
A does not occur, then, necessarily, its complement A′ has occurred. Viewed this
way, the symbol A is somewhat like a switch that is either on (A) or off (A′). The
truth-table logic you would use to describe electronic circuits can then be applied
to finding complements of complex events. For instance, consider how you might
go about finding the complement of the event A or B. If the event A or B does not
happen, then it must be true that both A and B do not happen, which we can express
by writing A′ and B′. In equation form, {A or B}′ = A′ and B′. Figure 5.5 shows how
a tree diagram can be used to demonstrate the same result. The branches of the
tree depict all possible combinations of the events A, B, A′, and B′. The top three
branches correspond to the event A or B, which implies that its complement must
be the bottom branch, A′ and B′.
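The identity {A or B}′ = A′ and B′ can also be verified with sets. Here A and B are two events from the four-part stress-test experiment, chosen (our choice, for illustration) so that neither side of the identity is empty:

```python
from itertools import product

S = {"".join(seq) for seq in product("PF", repeat=4)}
A = {s for s in S if s.count("P") >= 3}   # at least three parts pass
B = {s for s in S if s.count("P") == 2}   # exactly two parts pass

left = S - (A | B)           # {A or B}'
right = (S - A) & (S - B)    # A' and B'
assert left == right
print(sorted(left))          # the five outcomes with at most one P
```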
Figure 5.5 Tree diagram for two events A and B. The top three branches (A and B occur; A and B′ occur; A′ and B occur) correspond to the event A or B; the remaining bottom branch is A′ and B′.
definition When an event A does not occur, we say that its complement, denoted by A′,
has occurred, and vice versa. The probabilities of A and A′ are related by the
formula P(A) = 1 − P(A′).
Example 5.6 Refer to Example 5.5. Suppose you want to find the probability that, of the 20 items
randomly selected for inspection, at least one item fails to meet quality standards.
Denote this event by D = at least one item fails inspection. One approach to finding
this probability is to partition D into the events E1, E2, E3, . . . , E20, where, for
each i = 1, 2, 3, . . . , 20, the Ei denotes the event that exactly i items fail inspection.
Since E1 through E20 are disjoint, the addition rule says that P(D) = P(E1) +
P(E2) + . . . + P(E20). As mentioned in Example 5.5, the binomial mass function
could then be used to find each P(Ei) in this summation.
Although the addition rule will give the correct value for P(D), an easier
method for finding P(D) is to use the law of complementary events, P(D) =
1 − P(D′). The complement of the event D = at least one item fails inspection is
the event D′ = no items fail inspection. As we will see in Section 5.4, finding P(D′)
requires only one computation with the binomial mass function, whereas the partition
method requires 20 separate computations.
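The saving can be checked numerically with the binomial mass function; as in the earlier sketch, p = .05 is an assumed per-item failure rate, not a value from the text:

```python
from math import comb

n, p = 20, 0.05  # p is an assumed per-item failure rate, for illustration

def binom_pmf(k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Partition method: sum 20 separate binomial terms
P_D_partition = sum(binom_pmf(i) for i in range(1, n + 1))

# Complement method: a single computation, P(D) = 1 - P(D')
P_D_complement = 1 - binom_pmf(0)

assert abs(P_D_partition - P_D_complement) < 1e-12
print(round(P_D_complement, 4))  # 0.6415
```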
Figure 5.6 Venn diagram of two overlapping events A and B in a sample space; the overlap is the event A and B
The General Addition Rule For any two events A and B, P(A or B) = P(A) + P(B) − P(A and B).
Here is a simple intuitive justification for the general addition rule. Referring to the
Venn diagram in Figure 5.6, imagine that the events A and B represent circular rugs on
a floor and that we want to find the total floor area covered by these two rugs, analogous
to determining P(A or B). For the purposes of this example, think of P(A) and P(B) rep-
resenting the floor areas covered by each rug individually. To find the total area covered
5.2 Exercises 207
by both rugs we could start by adding the areas of these two rugs, but then the floor area
where the two rugs overlap has been counted twice by this simple addition. The obvi-
ous remedy is to subtract the overlapping area, represented by P(A and B), once from
the sum, giving a final result of P(A) + P(B) − P(A and B). This is in essence how the
general addition rule works.
Example 5.7 In a certain residential suburb, 60% of all households get Internet service from the
local cable company, 80% get television service from that company, and 50% get
both services from that company. If a household is randomly selected, what is the
probability that it gets at least one of these two services from the company? With A =
{gets Internet service} and B = {gets TV service}, the given information implies that
P(A) = .6, P(B) = .8, and P(A and B) = .5. The general addition rule now yields
P(subscribes to at least one of the two services)
= P(A or B) = P(A) + P(B) − P(A and B) = .6 + .8 − .5 = .9
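The arithmetic can be spelled out directly, together with the complementary fact that 10% of households get neither service:

```python
P_A = 0.6        # gets Internet service
P_B = 0.8        # gets TV service
P_A_and_B = 0.5  # gets both services

# General addition rule
P_A_or_B = P_A + P_B - P_A_and_B
print(round(P_A_or_B, 2))      # 0.9

# Law of complements: households that get neither service
print(round(1 - P_A_or_B, 2))  # 0.1
```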
11. For any collection of events A1, A2, A3, . . . , Ak, it can be shown that the inequality
P(A1 or A2 or A3 or . . . or Ak) ≤ P(A1) + P(A2) + P(A3) + . . . + P(Ak)
always holds. This inequality is most useful in cases where the events involved have relatively small probabilities. For example, suppose a system consists of five subcomponents connected in series (cf. Example 5.8) and that each component has a .01 probability of failing. Find an upper bound on the probability that the entire system fails.
12. Suppose that 55% of all adults regularly consume coffee, 45% regularly consume carbonated soda, and 70% regularly consume at least one of these two types of drinks.
a. What is the probability that a randomly selected adult regularly consumes both coffee and soda?
b. What is the probability that a randomly selected adult doesn't regularly consume at least one of these two products?
c. What is the probability that a randomly selected adult regularly consumes coffee but does not regularly consume soda?
Conditional Probability
Before shipping finished products, manufacturers routinely use automatic test equip-
ment (ATE) to assess the functionality of products and systems. In addition to giving
physical measurements of product characteristics, ATE machines can conduct a se-
quence of complex tests that eventually result in a final “thumbs up” or “thumbs down”
determination for the item being tested. Before testing, historical process data can be
used to estimate the probability that any particular item will function correctly. Sup-
pose, for example, that such records show that 95% of the items in a certain product line
perform correctly. Letting A denote the event that a randomly selected item is defect
free, we can then say that P(A) = .95. Now consider how this estimate may change
when we submit a particular item to an ATE test. Because the determinations given by
ATE are good but not perfect, we will want to give a good deal of weight, but not 100%,
to the ATE test result. Thus if the ATE test indicates that the item is defective, then we
will definitely want to reduce our estimate of P(A). Alternatively, if the item passes the
ATE test, then we will revise P(A) upward. In both cases, we want to update our estimate
of P(A) for the item being tested by factoring in the new information from the ATE test.
5.3 Conditional Probability and Independence 209
Let B = the item passes the ATE test. Then the conditional probability of A given
B is denoted by P(A | B). Conditional probabilities are computed from the following
definition:
P(A | B) = P(A and B) / P(B)
This formula can be justified by thinking of probability as the proportion of times that
an event occurs in a large number of trials N: About P(B) × N of the trials will result in
items that pass the ATE test and about P(A and B) × N of the trials will correspond to
items that not only pass the test but are truly defect-free. Thus P(A | B), the proportion
of items that are truly defect-free out of the total number passing the ATE test, should
be (P(A and B) × N) / (P(B) × N), which simplifies to P(A and B)/P(B).
Tree diagrams are very useful for summarizing problems that involve conditional
probabilities. Figure 5.7 shows such a diagram for our ATE example. Note that con-
ditional probabilities correspond to the branches on the tree. By writing the formula
P(A | B) = P(A and B)/P(B) in the form P(A and B) = P(B)P(A | B), we see that the prob-
ability of taking a particular path through the diagram (from left to right) is simply the
product of the probabilities of the branches that comprise that path.
Figure 5.7 Tree diagram for the ATE example (each branch carries a conditional probability, and each path probability is the product of its branch probabilities, e.g., P(A and B) = P(B)P(A|B))
definition Let A and B be two events with P(B) > 0. The conditional probability of A
occurring given that event B has already occurred is denoted by P(A|B) and can
be calculated from the formula P(A|B) = P(A and B)/P(B).
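The definition translates directly into code. A minimal Python sketch; the function name and the numeric inputs are illustrative, not values from the text:

```python
# Minimal sketch of the definition P(A|B) = P(A and B) / P(B).
# The numeric inputs below are illustrative, not values from the text.
def conditional_probability(p_a_and_b, p_b):
    """Return P(A | B); defined only when P(B) > 0."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b

# e.g., if P(A and B) = .91 and P(B) = .95, then P(A|B) is about .9579
print(round(conditional_probability(0.91, 0.95), 4))
```

The guard mirrors the requirement P(B) > 0 in the definition box.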
Independent Events
Conditional probability is used when the likelihood of occurrence of an event depends
on whether or not another event occurs. At the other end of the spectrum are events
that do not impose such restrictions on each other’s chances of occurring. Two events,
A and B, are said to be independent if the occurrence of either event has no effect
whatsoever on the likelihood of occurrence of the other. This definition readily extends
to any number of events.
To understand the role played by independence in probability calculations, con-
sider the following example. To filter certain harmful particles out of a given volume
of air, suppose we sequentially use two filters A and B, each of which captures a large
percentage of the particles in any air passing through it. In particular, filter A allows
only 5% of the particles to pass through, whereas filter B has about a 10% pass-through
rate. If we begin with a fixed volume of air containing V harmful particles, then after
passing through filter A, there should be (.05)V particles remaining. When this air is
screened through filter B, an additional 90% of the remaining particles are removed,
leaving a total of (.10)(.05)V particles after the two screenings. We then ask, Would it
make any difference if we changed the order in which the filtering is performed? This
is equivalent to asking, Do the two filters perform independently of one another? If the
filters are independent, then we should be able to reverse the filtering procedure without
changing the pass-through rates of the filters (see Figure 5.8). Thus filter B leaves (.10)V particles, of which filter A then leaves (.05)(.10)V particles.
Figure 5.8 Reversing the filter order: Filter A then Filter B leaves (.10)(.05)V particles; Filter B then Filter A leaves (.05)(.10)V particles
definition Two events, A and B, are independent events if the probability that either one
occurs is not affected by the occurrence of the other. In this case,

P(A and B) = P(A)P(B)

Several events, A1, A2, A3, . . . , Ak, are independent if the probability of each
event is unaltered by the occurrence of any subset of the remaining events. In
this case, the product rule can be applied to any subset of the k events. That is,
the probability that all the events in any subset occur equals the product of their
individual probabilities of occurring. In particular, for all k events,

P(A1 and A2 and A3 and . . . and Ak) = P(A1)P(A2)P(A3) . . . P(Ak)
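As a quick sanity check on the product rule, a short simulation (an assumed illustration, not from the text) of two independent events — "heads" on each of two fair coin tosses — shows the relative frequency of {A and B} settling near P(A)P(B) = (.5)(.5) = .25:

```python
import random

# Simulation sketch (illustrative, not from the text): for two independent
# events, the relative frequency of {A and B} should approach P(A) * P(B).
random.seed(1)
N = 100_000
both = 0
for _ in range(N):
    a = random.random() < 0.5   # event A occurs
    b = random.random() < 0.5   # event B occurs, independently of A
    if a and b:
        both += 1
estimate = both / N
print(round(estimate, 3))  # close to .25
```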
Determining whether two (or more) events are independent is not quite as easy as
deciding whether they are mutually exclusive. With independent events, we rely either
on our intuition or on special procedures (such as random sampling and randomization).
Intuition is what we generally employ when we assume that different tosses of a coin or
different air filters are independent. With statistical methods, on the other hand, we rely
on random sampling, not intuition, to ensure that events are independent. Practically
speaking, we often assume independence when we do not know of any strong reasons
why the events should be related. At other times, independence provides a reasonable
approximation to the truth for the application at hand, but it may not be reasonable if
the situation changes a little. In our filter example, for instance, independence may be
a good assumption when the volume of particulate matter in the air is relatively large,
but it may cease to be valid for small volumes (e.g., after being screened by one filter,
the volume of particles may have dropped below the detection limit of the other filter).
Example 5.8 One branch of reliability theory, called topological reliability, is concerned with
calculating the reliability of systems comprising several components connected
in specific patterns. One common layout for components is the series system
(Figure 5.9), in which the system operates correctly only if each of its subcompo-
nents works correctly. A familiar example of such a system is a circuit with two
switches, both of which must be closed for the circuit to conduct electricity. It is
commonly assumed that the components are independent when performing reli-
ability calculations.
Figure 5.9 A series system: Component A followed by Component B
How do we find the probability that at least one of several events will occur? For two independent events, A and B, the event that at least one of
these events occurs can be written {A or B}. As we showed in our discussion of complementary events, the complement of {A or B} is the event {A′ and B′}. Therefore, using
the additional knowledge that the complements of independent events must themselves
be independent, we can write

P(at least one of two independent events occurs)
= P(A or B) = 1 − P(A′ and B′) = 1 − P(A′)P(B′)

This formula can readily be extended to any number of independent events, A1, A2,
A3, . . . , Ak. That is,

P(at least one of k independent events occurs)
= 1 − P(A1′)P(A2′)P(A3′) . . . P(Ak′)
The “at least one” rule has numerous applications, two of which are given in the
following examples.
Example 5.9 In an example demonstrating how vendor quality affects customer quality, H. S.
Gitlow and D. A. Wiesner (“Vendor Relations: An Important Piece of the Quality
Puzzle,” Quality Progress, 1988: 19–23) considered a hypothetical product consist-
ing of 50 critical parts, any one of which, if defective, could cause the finished
product to be defective. Suppose that each of these parts is purchased from a dif-
ferent vendor. It is therefore reasonable to assume that the condition of each part,
created by a different vendor, should be independent of the conditions of the oth-
ers. Furthermore, suppose that about 99.5% of all the parts supplied by a given
vendor are good. What is the overall proportion of assembled products that can be
expected to be defective?
To answer this question, let Di denote the event that the part purchased from the
ith vendor is defective, so that P(Di) = .005 and P(Di′) = .995. Then the probability
we seek is

P(at least one of the 50 parts is defective) = 1 − P(D1′)P(D2′)P(D3′) . . . P(D50′)
= 1 − (.995)^50 = 1 − .7783 = .2217
This example demonstrates the important point that it is possible for complex
systems to have high failure rates even if the quality of their individual components
is relatively good.
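The arithmetic in Example 5.9 can be checked in a couple of lines of Python:

```python
# Example 5.9 in code: 50 independent parts, each defect-free with
# probability .995; the chance that at least one part is defective is
# 1 - (.995)**50.
p_good, n_parts = 0.995, 50
p_at_least_one_defective = 1 - p_good ** n_parts
print(round(p_at_least_one_defective, 4))  # 0.2217
```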
Example 5.10 Consider the portion of an electronic circuit diagrammed in Figure 5.10. The cir-
cuit is primarily a parallel system (i.e., either switch A or both switches B and C
must function if the current is to flow from left to right). The branch containing
switches B and C, however, forms a series system. To compute the probability that a
closed circuit is made between the left and right sides of the diagram, we must find
the probability of the event {A or {B and C}}. Assuming that the switches function
independently of one another and that they are closed with probabilities P(A) = .80,
P(B) = .70, and P(C) = .90, we proceed as follows:

P(A or (B and C)) = P(A) + P(B and C) − P(A and (B and C))   [the general addition rule applied to the events A and {B and C}]
= P(A) + P(B)P(C) − P(A)P(B)P(C)   [since A, B, and C are independent]
= .80 + (.70)(.90) − (.80)(.70)(.90) = .926
Thus the circuit is closed about 92.6% of the time. Since switch A is closed 80% of
the time, the probability that the circuit is closed must certainly exceed 80%, so our
answer makes sense.
Figure 5.10 Switch A connected in parallel with the series combination of Switch B and Switch C
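Example 5.10's calculation can likewise be verified numerically:

```python
# Example 5.10 in code: switches close independently with the given
# probabilities; the circuit is closed if A closes, or both B and C close.
pA, pB, pC = 0.80, 0.70, 0.90
p_closed = pA + pB * pC - pA * pB * pC   # general addition rule + independence
print(round(p_closed, 3))  # 0.926
```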
13. Five companies (A, B, C, D, and E) that make electrical relays compete each year to be the sole supplier of relays to a major automobile manufacturer. The auto company's records show that the probabilities of choosing a company to be the sole supplier are

Supplier chosen:  A    B    C    D    E
Probability:     .20  .25  .15  .30  .10

a. Suppose that supplier E goes out of business this year, leaving the remaining four companies to compete with one another. What are the new probabilities of companies A, B, C, and D being chosen as the sole supplier this year?
b. Suppose the auto company narrows the choice of suppliers to companies A and C. What is the probability that company A is chosen this year?

14. Refer to the tree diagram in Figure 5.7. Suppose you want to find the probability P(B | A) using the information available in the tree diagram. To do this, P(B | A) must be expressed in terms of conditional probabilities, like P(A | B) and P(A′ | B).
a. Use the addition law to show that P(A) = P(A and B) + P(A and B′).
b. Use the conditional probability formula to write P(A and B) in terms of P(A | B) and P(B). Develop a similar formula for P(A and B′) in terms of P(A | B′) and P(B′).
c. Use parts (a) and (b) to show that

P(B | A) = P(A | B)P(B) / [P(A | B)P(B) + P(A | B′)P(B′)]

This formula, known as Bayes' theorem, is used to "turn conditional probabilities around"; that is, it allows us to express P(B | A) in terms of P(A | B) and P(A | B′).
d. In Figure 5.7, the probability associated with any path from left to right through the tree is simply the product of the probabilities of the branches. Why?
e. Use the observation in part (d) and the conditional probability formula for P(B | A) to justify Bayes' theorem.
15. In Exercise 5, suppose that 95% of the fasteners pass the initial inspection. Of those that fail inspection, 20% are defective. Of the fasteners sent to the recrimping operation, 40% cannot be corrected and are scrapped; the rest are corrected by the recrimping and then pass inspection.
a. What proportion of fasteners that fail the initial inspection pass the second inspection (after the recrimping operation)?
b. What proportion of fasteners pass inspection?
c. Given that a fastener passes inspection, what is the probability that it passed the initial inspection and did not have to go through the recrimping operation?

16. In Exercise 6, suppose that there is a probability of .01 that a digit is incorrectly sent over a communication channel (i.e., that a digit sent as a 1 is received as a 0, or a digit sent as a 0 is received as a 1). Consider a message that consists of exactly 60% 1s.
a. What is the proportion of 1s received at the end of the channel?
b. If a 1 is received, what is the probability that a 1 was sent? Hint: Use the tree diagram from Exercise 6.

17. Suppose that A and B are independent events with P(A) = .5 and P(B) = .6. Can A and B be mutually exclusive events?

18. Probability calculations play an important role in modern forensic science (Aitken, C., Statistics and the Evaluation of Evidence for Forensic Scientists, John Wiley, New York, 1995). Suppose that a suspect is found whose blood type matches a rare blood type found at a crime scene. Let p denote the frequency with which people in the population have this particular blood type. Assuming that people in the population are sampled at random, answer the following questions:
a. What is the probability that a randomly chosen person from the population does not have the same blood type as that found at the crime scene?
b. What is the probability that none of n randomly chosen people will match the blood type found at the crime scene?
c. What is the probability that at least one person in a random sample of n people will match the blood type found at the crime scene?
d. Suppose that p = 10⁻⁶. What is the probability that at least one person in a sample of one million will have a blood type matching that found at the crime scene?

19. In forensic science, the probability that any two people match with respect to a given characteristic (hair color, blood type, etc.) is called the probability of a match. Suppose that the frequencies of blood phenotypes in the population are as follows:

Phenotype:   A    B    AB   O
Frequency:  .42  .10  .04  .44

a. What is the probability that two randomly chosen people both have blood type A?
b. Repeat the calculation in part (a) for the other three blood types.
c. Find the probability that two randomly chosen people have matching blood types. Note: A person can have only one phenotype.
d. The probability that two people do not match for a given characteristic is called discriminating power. What is the discriminating power for the comparison of two people's blood types in part (c)?

20. A construction firm has bid on two different contracts. Let E1 be the event that the bid on the first contract is successful, and define E2 analogously for the second contract. Suppose that P(E1) = .4 and P(E2) = .3 and that E1 and E2 are independent.
a. Find the probability that both bids are successful.
b. Find the probability that neither bid is successful.
c. Find the probability that at least one of the bids is successful.

21. Consider a system of components connected as shown in the following figure.

[Figure: components 1 and 2 connected in parallel, followed by components 3 and 4 connected in series.]

Components 1 and 2 are connected in parallel, so that their subsystem functions correctly if either component 1 or 2 functions. Components 3 and 4 are connected in series, so their subsystem works
only if both components work correctly. If all components work independently of one another and P(a given component works) = .9, calculate the probability that the entire system works correctly.

22. The reviews editor for a certain scientific journal decides whether the review for any particular book should be short (1–2 pages), medium (3–4 pages), or long (5–6 pages). Data on recent reviews indicates that 60% of them are short, 30% are medium, and the other 10% are long. Reviews are submitted in either Word or a typesetting program called LaTeX. For short reviews, 80% are in Word, whereas 50% of medium reviews are in Word and 30% of long reviews are in Word. Suppose a recent review is randomly selected.
a. What is the probability that the selected review was submitted in Word format?
b. Suppose you are told the selected review was submitted in Word format. What is the probability that the review was medium in length?

23. In a certain population, 1% of all individuals are carriers of a particular disease. A diagnostic test for this disease has a 90% detection rate for carriers and a 5% detection rate for noncarriers. Suppose that the diagnostic test is applied independently to two different samples from the same randomly selected individual.
a. What is the probability that both tests yield the same result?
b. If both tests are positive, what is the probability that the selected individual is a carrier?

24. One of the assumptions underlying the theory of control charts (see Chapter 6) is that the successive points plotted on a chart are independent of one another. Each point plotted on a control chart can signal either that a manufacturing process is operating correctly or that it is not operating correctly. However, even when a process is running correctly, there is a small probability, say, 1%, that a charted point will mistakenly signal that there is a problem with the process.
a. What is the probability that at least one of ten points on a control chart signals a problem with a manufacturing process when in fact the process is running correctly?
b. What is the probability that at least 1 of 25 points on a control chart signals a problem with a manufacturing process when in fact the process is running correctly?

25. If A and B are independent events, show that A′ and B are also independent. Hint: Use a Venn diagram to show that P(A′ and B) = P(B) − P(A and B).

26. In October 1994, a flaw in a certain Pentium chip installed in computers was discovered that could result in a wrong answer when performing a division. The manufacturer initially claimed that the chance of any particular division being incorrect was only 1 in 9 billion, so that it would take thousands of years before a typical user encountered a mistake. However, statisticians are not typical users; some modern statistical techniques are so computationally intensive that a billion divisions over a short time period is not outside the realm of possibility. Assuming that the 1 in 9 billion figure is correct and that results of different divisions are independent of one another, what is the probability that at least 1 error occurs in 1 billion divisions with this chip?
5.4 Random Variables
When the same numerical characteristic can conceivably be measured on any out-
come of a chance experiment, we say that this quantity is a random variable. For
instance, the measured yield of a chemical reaction is a random variable. Random-
ness enters the picture because we expect there to be slight unpredictable differenc-
es between each repetition of the reaction, which, in turn, will be reflected in the
measured yields. There can be any number of random variables associated with a
chance experiment. In a chemical reaction, any quantifiable feature associated with
the reaction is a random variable (e.g., yield, density, weight, viscosity, volume, and
translucence of the material produced). To make them easier to work with, random
variables are usually denoted by single letters near the end of the alphabet. The yield
of a chemical reaction might simply be denoted by the letter x, the density of the ma-
terial by w, and so forth. The assignment of a letter to a random variable is sometimes
written in the form of an equation, such as x 5 yield of a chemical reaction or w 5
density of the material produced in the reaction.
Technically speaking, the numerical values of a random variable are not the simple
events of a chance experiment. Instead, a random variable is a function that assigns
numerical values to the possible outcomes of a chance experiment, as illustrated in
Figure 5.11. Notice that it is possible for more than one point in the sample space to be
assigned the same real number. For instance, the random variable y = number of metal
parts that pass a stress test out of three randomly selected parts assigns the number y = 2
to each of the sample space points PPF, PFP, and FPP.
Figure 5.11 A random variable assigns a numerical measurement to each outcome in the sample space

Figure 5.12 The event that at least two parts pass a stress test, shown on the number line (0, 1, 2, 3, 4) for the random variable y = number of parts passing the stress test
For example, suppose x denotes the length of a randomly selected manufactured part. Then an event {18 ≤ x ≤ 21} can, if desired, be partitioned
into the disjoint events {18 ≤ x ≤ 21} = {18 ≤ x < 19} or {19 ≤ x < 20} or {20 ≤ x ≤ 21}. Notice that the particular choice of strict and inclusive inequality signs is what
causes these events to be disjoint. The addition rule for disjoint events then states that
P(18 ≤ x ≤ 21) = P(18 ≤ x < 19) + P(19 ≤ x < 20) + P(20 ≤ x ≤ 21). Similarly, because
the event {x > 18} is the complement of the event {x ≤ 18}, the law of complementary
events allows us to write P(x ≤ 18) = 1 − P(x > 18).
Probability Distributions
The mechanism for assigning probabilities to events defined by random variables is
to use either a mass function (for discrete random variables) or a density function (for
continuous variables). In either case, we first envision an event of interest as a particular
subset of the real number line. For discrete variables, the probability of the event is
defined to be the sum of the mass function values that lie within the event subset. For
continuous variables, the probability of an event is defined to be the area under the
portion of the density curve that lies over the event on the number line. Figure 5.13
shows how a mass or density function assigns a probability to any event of interest on the
real number line. When used to describe random variables, mass functions and density
functions are both called probability distributions.
Figure 5.13 Using mass or density functions to assign probabilities to events (left: for a discrete random variable, P(2 ≤ x ≤ 5) is the sum p(2) + p(3) + p(4) + p(5) of mass-function values over the event; right: for a continuous random variable, P(2.3 ≤ x ≤ 5.4) is the area under the density curve over that interval)
When a probability distribution has one of the familiar distributional forms de-
scribed in Chapter 1, the methods described in that chapter can be used to find event
probabilities. For example, if we believe that the length, x, of a randomly selected part
can be described by a normal distribution with a mean of 20 cm and a standard devia-
tion of 1.8 cm, then probabilities associated with x are found by standardizing, as shown
in Chapter 1. Thus
P(18 ≤ x ≤ 21) = P((18 − 20)/1.8 ≤ z ≤ (21 − 20)/1.8)
= P(−1.11 ≤ z ≤ .56) = .5788
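The same probability can be computed without tables, using the error function from Python's standard library (a sketch; the small discrepancy from .5788 comes from the text's rounding of z to two decimal places):

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """Normal CDF evaluated via the standard library's error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# P(18 <= x <= 21) for x normally distributed with mean 20 cm, sd 1.8 cm
p = normal_cdf(21, 20, 1.8) - normal_cdf(18, 20, 1.8)
print(round(p, 4))  # about .578
```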
There are several ways to choose an appropriate probability distribution for describing
a random variable. In the upcoming examples and chapters, we will use the following
methods to justify our choices of probability distributions:
1. Examine a histogram of data, and select a familiar density or mass function
whose shape approximately matches that of the histogram.
2. Use a density or mass function recommended by previous studies or profes-
sional practice.
3. Verify conditions that are known to give rise to certain mass or density func-
tions (see binomial distributions in Section 1.6, normal distributions in
Section 5.6).
Example 5.11 Examples 5.4–5.6 describe several events related to the chance experiment of randomly sampling and testing 20 items from a large shipment. These events can be recast in terms of the random variable x = number of items that
fail to meet quality standards, as follows:
A = {x ≤ 1}
B = {x = 0}
C = {x = 1}
D = {x ≥ 1}
Because random sampling ensures that each of the 20 selections is independent of
the others, a binomial mass function is a good choice for describing probabilities
associated with x (see Section 1.6). Suppose that it is known from manufacturing
records that about 2% of all such items do not conform to quality standards. Using
π = .02 and n = 20 in the formula for the binomial mass function, we calculate the
probabilities of the previously described events as

P(x ≤ 1) = P(x = 0) + P(x = 1)
= [20!/(0! 20!)](.02)^0(.98)^20 + [20!/(1! 19!)](.02)^1(.98)^19
= 1(.98)^20 + 20(.02)(.98)^19
= .6676 + .2725 = .9401

where the first and second terms in the last sum are P(B) and P(C), respectively.
Thus if groups of 20 items are repeatedly selected, in the long run about 94% of all
groups should have at most one item failing to meet standards.
The probability that at least one item fails to meet quality standards is

P(x ≥ 1) = 1 − P(x = 0) = 1 − .6676 = .3324

Notice that the addition rule and the law of complementary events were used to
simplify the computations of P(x ≤ 1) and P(x ≥ 1).
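The binomial computation in Example 5.11 can be reproduced with Python's standard library:

```python
from math import comb

# Binomial mass function used in Example 5.11: n = 20 independent selections,
# each failing quality standards with probability .02.
def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_at_most_one = binom_pmf(0, 20, 0.02) + binom_pmf(1, 20, 0.02)
p_at_least_one = 1 - binom_pmf(0, 20, 0.02)
print(round(p_at_most_one, 4), round(p_at_least_one, 4))  # 0.9401 0.3324
```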
The mean, μ, of a random variable is computed from the familiar formulas of Chapter 2: for a discrete random variable with mass function p(x) and for a continuous random variable with density function f(x), respectively,

μ = Σ_x x p(x)   or   μ = ∫ x f(x) dx

Similarly, the variance σ² of a random variable is calculated from the familiar formulas
in Chapter 2. The standard deviation of a random variable is defined to be the square
root of its variance:

σ² = Σ_x (x − μ)² p(x)   or   σ² = ∫ (x − μ)² f(x) dx

The mean and standard deviation of a random variable frequently appear as parameters in the defining formulas for a mass or density function. For this reason, it is often
necessary to obtain estimates of μ and σ before probability calculations are possible. As
discussed later in the chapter, statistics such as the sample mean, x̄, and sample standard
deviation, s, are frequently used to provide such estimates.
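These formulas are straightforward to apply in code. A short Python sketch for the discrete case, using an illustrative mass function (not one from the text):

```python
from math import sqrt

# Mean and standard deviation of a discrete random variable from its mass
# function p(x). The distribution below is illustrative (number of heads in
# two fair coin tosses), not an example from the text.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

mu = sum(x * p for x, p in pmf.items())               # mu = sum of x * p(x)
var = sum((x - mu) ** 2 * p for x, p in pmf.items())  # sigma^2
sigma = sqrt(var)
print(mu, round(sigma, 4))  # 1.0 0.7071
```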
Example 5.12 The reliability of a product at time t, denoted by R(t), is defined as the probability
that the product is still working correctly after t units of time (see Section 6.6).
For complex products consisting of several parts and subassemblies, the time x
until a product fails often follows an exponential distribution with parameter λ.
In such applications, the mean of the distribution, μ = 1/λ, is called either the
mean time between (or before) failures (MTBF) or the mean time to failure
(MTTF). According to the definition of R(t), the reliability can be calculated
from the formula

R(t) = P(x > t) = e^(−λt)
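Under the exponential model, the standard reliability function R(t) = e^(−λt) with λ = 1/MTBF is easy to evaluate; the MTBF and time values below are illustrative only:

```python
from math import exp

# Reliability under an exponential failure-time model:
# R(t) = P(x > t) = exp(-lambda * t), where lambda = 1 / MTBF.
# The MTBF and t values below are illustrative only.
def reliability(t, mtbf):
    lam = 1.0 / mtbf
    return exp(-lam * t)

# e.g., with an MTBF of 1000 hours, reliability at t = 100 hours:
print(round(reliability(100, 1000), 4))  # 0.9048
```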
Example 5.13 Resistors come in two varieties, general purpose (with tolerances of ±5% or greater)
and precision (with tolerances of ±2% or less). The tolerance is the amount by which
the true resistance can deviate from the stated resistance. For example, a 6.0-kilohm
(kΩ) resistor with a tolerance of ±10% can be expected to have a measured resistance of 6.0 ± (.10)(6.0), that is, from 5.4 kΩ to 6.6 kΩ. Assuming that a uniform
density adequately describes the possible values of x, the true resistance x of a
randomly selected 6.0-kΩ resistor is a random variable described by the density function (see Chapter 1 for the definition of uniform densities):

f(x) = 1/1.2   for 5.4 < x < 6.6
f(x) = 0       otherwise
Suppose we want to find the probability that the conductance (defined as the reciprocal of resistance) is greater than a specified amount, say, .16 siemens (S). Writing
this probability statement, we can take reciprocals of both sides to find

P(conductance > .16) = P(1/x > .16) = P(x < 6.25) = .7083
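Example 5.13's uniform-density calculation in code:

```python
# Example 5.13 in code: true resistance x ~ uniform on (5.4, 6.6), and
# P(conductance > .16) = P(1/x > .16) = P(x < 1/.16) = P(x < 6.25).
a, b = 5.4, 6.6
threshold = 1 / 0.16                 # = 6.25
p = (threshold - a) / (b - a)        # uniform probability of {x < threshold}
print(round(p, 4))  # 0.7083
```

For a uniform density, the probability of an interval is just its length divided by the length of the support, which is all the code computes.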
Example 5.14 Images displayed on computer screens consist of thousands of small regions
called picture elements, or pixels for short. The intensity of the electron beam
focused at a given point (x0, y0) on a flat screen is usually described by two in-
dependent normal random variables x and y, with means x0 and y0, respectively.
That is, we represent the intensity of the beam by a joint density function of two
independent random variables. For example, Figure 5.14 shows a graph of the
joint density function describing an electron beam focused on the point (x0, y0) =
(30, 50). The standard deviations of the two normal distributions are σx = .2 and
σy = .2. Because x and y are independent, we can write the joint density as the product

f(x, y) = f1(x)f2(y)
        = [1/(σx√(2π))] e^(−(1/2)((x − 30)/.2)²) · [1/(σy√(2π))] e^(−(1/2)((y − 50)/.2)²)
        = [1/(2π σx σy)] e^(−(1/2)[((x − 30)/.2)² + ((y − 50)/.2)²])
[Figure 5.14: graph of the joint density function of the electron beam centered at (x0, y0) = (30, 50)]
The volume under this density that sits over a given region B in the x–y
plane describes the proportion of time that the electron beam spends in region
B. Although the joint density can be used to find the probability associated with
any set B of points near (30, 50) on the screen, the probability of some sets can
be found in an easier way. For example, if we want to find the proportion of time
that the beam spends in the region where x < 29.5 and y < 49.6, we can simply
use the independence of x and y to obtain

P(x < 29.5 and y < 49.6) = P(x < 29.5) · P(y < 49.6)

instead of integrating the density over the region B = {(x, y) | x < 29.5, y < 49.6}. Thus

P(x < 29.5 and y < 49.6) = P(z < (29.5 − 30)/.2) · P(z < (49.6 − 50)/.2)
                         = P(z < −2.5) · P(z < −2.0) = (.0062)(.0228) ≈ .0001
The proportion of time that the beam spends in this region is very small.
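A short Python sketch (an addition, not from the text) confirms the product calculation, using the standard normal CDF built from math.erf:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# P(x < 29.5 and y < 49.6) for independent x ~ N(30, .2) and y ~ N(50, .2):
# by independence this is the product of two one-dimensional normal probabilities.
p_x = phi((29.5 - 30) / 0.2)   # P(x < 29.5) = P(z < -2.5)
p_y = phi((49.6 - 50) / 0.2)   # P(y < 49.6) = P(z < -2.0)
print(p_x * p_y)               # about 1.4e-04 -- a very small proportion of time
```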
Example 5.15 In Sections 5.5 and 5.6, we will be concerned with sums and averages of indepen-
dent random variables. Suppose, for example, that two printed circuit boards are ran-
domly selected and tested. Let x be the number of defective computer chips found
on one board; let y be the number of defectives found on the other board. Suppose
the following mass functions describe x and y:
x: 1 2 3 4                 y: 1 2 3 4
p1(x): .25 .25 .25 .25     p2(y): .25 .25 .25 .25

To find the mass function associated with the average number of defectives on two
boards, w = (x + y)/2, we can use mutually exclusive events and independence to
simplify each probability. For example, to find P(w = 2.5), first break up the event
{(x + y)/2 = 2.5} into the disjoint events {x = 1 and y = 4}, {x = 2 and y = 3},
{x = 3 and y = 2}, and {x = 4 and y = 1}. Next, find the probabilities of these events by
multiplying mass function values: for instance, P(x = 1 and y = 4) = p1(1)p2(4) =
(.25)(.25) = .0625, and likewise for each of the other three events.
Finally, add the probabilities of these disjoint events to find P(w = 2.5) = .2500.
Proceeding in this manner gives the mass function of the average, w:

w:    1     1.5   2     2.5   3     3.5   4
p(w): .0625 .1250 .1875 .2500 .1875 .1250 .0625
The graphs of all three mass functions are shown in Figure 5.15. Notice that the
mass function of the average tends to bunch more closely around its mean than do
either of its constituent mass functions, p1(x) and p2( y).
[Figure 5.15: graphs of the mass functions p1(x), p2(y), and of the average w = (x + y)/2]
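The enumeration described above can be sketched in Python (an illustrative addition, not from the text; exact fractions avoid rounding error):

```python
from fractions import Fraction
from collections import defaultdict

# Mass function of the average w = (x + y)/2 for independent x and y,
# each uniform on {1, 2, 3, 4} with probability .25. Enumerate all (x, y)
# pairs, multiply mass-function values, and accumulate over disjoint events.
p1 = {k: Fraction(1, 4) for k in (1, 2, 3, 4)}
p2 = dict(p1)

pw = defaultdict(Fraction)
for x, px in p1.items():
    for y, py in p2.items():
        pw[(x + y) / 2] += px * py   # disjoint (x, y) events for each w value

print(float(pw[2.5]))   # 0.25, matching P(w = 2.5) in the example
```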
5.4 Exercises 225
34. Suppose a dilute suspension of bacteria is divided into several different test tubes.
The number of bacteria x in a test tube has a Poisson mass function with a parameter λ
that represents the mean number of bacterial cells contained in the different test tubes.
a. Express the probability that a particular test tube contains no bacteria, in terms of λ.
b. In terms of λ, what is the probability that a test tube contains at least one bacterial cell?
c. After a certain period of time, all of the test tubes are examined, and it is found
that 40% of the tubes contain at least one bacterial cell. Use your answer from
part (b) to estimate λ, the mean number of cells per test tube.

35. A standard procedure for testing safety glass is to drop a 1/2-lb iron ball onto a
12-in. square of glass supported on a frame ("Statistical Methods in Plastics Research
and Development," Quality Engr., 1989: 81–89). The height from which the ball is
dropped is determined so that there is a 50% chance of breaking through the glass.
A breakthrough is considered to be a failure, whereas a ball that is stopped by the
glass (even if the glass cracks) is considered to be a success. Suppose that 100 sheets
of safety glass are randomly selected and tested, and that no change has been made
in the resin used to manufacture the glass.
a. What is the expected number of sheets that will experience a breakthrough?
b. What is the probability that 60 or more sheets will have a breakthrough?

36. The normal distribution is commonly used to model the variability expected when
making measurements (Taylor, J. R., An Introduction to Error Analysis: The Study of
Uncertainties in Physical Measurements, University Science Books, Sausalito, CA,
1997). In this context, a measured quantity x is assumed to have a normal distribution
whose mean is assumed to be the "true" value of the object being measured. The
precision of the measuring instrument determines the standard deviation of the
distribution.
a. If the measurements of the length of an object have a normal probability
distribution with a standard deviation of 1 mm, what is the probability that a single
measurement will lie within 2 mm of the true length of the object?
b. Suppose the measuring instrument in part (a) is replaced with a more precise
measuring instrument having a standard deviation of .5 mm. What is the probability
that a measurement from the new instrument lies within 2 mm of the true length of
an object?

37. Acceptance sampling is a method that uses small random samples from incoming
shipments of products to assess the quality of the entire shipment. Typically, a
random sample of size n is selected from a shipment, and each sampled item is tested
to see whether it meets quality specifications. The number of sampled items that do
not meet specifications is denoted by x. As long as x does not exceed a prespecified
integer c, called the acceptance number, then the entire shipment is accepted for use.
If x exceeds c, then the shipment is returned to the vendor. In practice, because n is
usually small in comparison to the number of items in a shipment, a binomial
distribution is used to describe the random variable x.
a. Suppose a company uses samples of size n = 10 and an acceptance number of
c = 1 to evaluate shipments. If 10% of the items in a certain shipment are defective,
what is the probability that this shipment will be returned to the vendor?
b. Suppose that a certain shipment contains no defective items. What is the
probability that the shipment will be accepted by the sampling plan in part (a)?
c. Rework part (a) for shipments that are 5%, 20%, and 50% defective.
d. Let π denote the proportion of defective items in a given shipment. Use your
answers to parts (a)–(c) to plot the probability of accepting a shipment (on the
vertical axis) against π = 0, .05, .10, .20, and .50 (on the horizontal axis). Connect
the points on the graph with a smooth curve. The resulting curve is called the
operating characteristic (OC) curve of the sampling plan. It gives a visual summary
of how the plan performs for shipments of differing quality.

38. Refer to Exercise 37. Acceptance sampling plans that use an acceptance number
of c = 0 are given the name zero acceptance plans. Zero acceptance plans are not
frequently used because, although they protect against accepting shipments of
inferior quality, they also tend to reject many shipments of good quality.
a. Let π denote the proportion of defective items in a shipment. Develop a general
formula for the probability of accepting a shipment having π × 100% defective items.
b. Plot the OC curve for the zero acceptance plan that uses sample sizes of n = 10.
c. For what value of π is the probability of accepting a shipment about .05?

39. Qualification exams for becoming a state-certified welding inspector are based on
multiple-choice tests. As in any multiple-choice test, there is a possibility that
someone who is simply guessing the answers to each question might pass the test.
Let x denote the number of correct answers given by a person who is guessing each
answer on a 25-question exam, with each question having five possible answers (for
each question, assume only one of the five choices is correct).
a. What type of probability distribution does x have?
b. For the 25-question test, what are the mean and standard deviation of x?
c. The exam administrators want to make sure that there is a very small chance,
say, 1%, that a person who is guessing will pass the test. What minimum passing
score should they allow on the exam to meet this requirement?

40. When used to model lifetimes of components, a probability distribution is said to
be "memoryless" if, for a component that has already lasted (without failure) for
t hours, the probability that it lasts for another s hours does not depend on t. That is,
P(x ≥ t + s | x ≥ t) = P(x ≥ s). Show that the exponential distribution is memoryless.

41. The concept of the median of a set of data can also be applied to the probability
distribution of a random variable. If x is a random variable with density function f(x),
then the median μ̃ of this distribution is defined to be the value for which half the
area under the density curve lies to the left of μ̃. That is, μ̃ is the solution to the
equation ∫ from −∞ to μ̃ of f(x) dx = 1/2.
a. Suppose the lifetime x of an electronic assembly follows an exponential
distribution with an MTBF of 500 hours (see Example 5.12 for the definition of
MTBF). Find the median of this distribution. The median is the time by which half
of all such assemblies will break down.
b. Is the median time to failure from part (a) larger or smaller than the mean time
before failure (MTBF)?
c. From your answer to part (a), find a general formula (for any value of MTBF)
for expressing the median time to failure in terms of the mean time before failure.

42. On a construction site, subcontractor A is responsible for completing the
structural frame of a building. When this task is complete, subcontractor B then
begins the task of installing electrical wiring and outlets. The following tables show
estimated probabilities of completing each task in x days:

Framing time (days), x: 10  15  20  25  30
Probability, p1(x):     .10 .20 .30 .30 .10

Wiring time (days), y: 5   10  15  20
Probability, p2(y):    .20 .50 .20 .10

a. Calculate the expected completion time for each task.
b. Find the probability distribution of the total time for completing both tasks
(assume that the framing and wiring tasks are independent).
c. What is the probability that the total time to complete both tasks is less than 35 days?
d. What is the expected time for completing both tasks?

43. Let x be the cost ($) of an appetizer and y be the cost of a main course at a certain
restaurant for a customer who orders both courses. Suppose that x and y have the
following joint distribution:

            y
        10   15   20
x   5  .20  .15  .05
    6  .10  .15  .10
    7  .10  .10  .05

a. Find the probability mass function of x.
b. Find the probability mass function of y.
c. Find the probability that x + y ≤ 21.
d. Are x and y independent?
definition The sampling distribution of a statistic is a mass or density function that char-
acterizes all the possible values that the statistic can assume in repeated random
samples from a population or process.
Example 5.16 Suppose that we draw 1000 random samples, each of size n = 25, from a normal pop-
ulation with a mean of 50 and a standard deviation of 2. If we calculate the mean x
of each sample, then the distribution of all 1000 x values gives a good approximation
to the sampling distribution of x. Figure 5.16 shows a histogram of the results of such
an experiment. Notice that the 1000 sample means stack up around the population
mean (μ = 50) and that variation among the sample means is smaller than variation
in the population. In particular, none of the sample means fall outside the range of
48.5 to 51.5 (i.e., none are more than 1.5 units away from μ). In fact, it also appears
that very few sample means fall outside the interval 49 to 51; that is, they are gener-
ally within 1 unit of μ.
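A sampling experiment of this kind is easy to mimic in Python (an added re-creation, not the experiment the book ran; the seed is an arbitrary choice):

```python
import random
import statistics

# In the spirit of Example 5.16: 1000 random samples of size n = 25 from a
# normal population with mean 50 and standard deviation 2. The sample means
# should cluster around 50 with spread sigma/sqrt(n) = 2/5 = 0.4.
random.seed(1)
means = [statistics.fmean(random.gauss(50, 2) for _ in range(25))
         for _ in range(1000)]

print(round(statistics.fmean(means), 2))   # close to 50
print(round(statistics.stdev(means), 2))   # close to 0.4
```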
5.5 Sampling Distributions 229
From the shape and location of the sampling distribution, we can begin to see
which values of the sample statistic are more likely to occur than others. In this sense,
the information in a sampling distribution provides a template for evaluating any sam-
ple, even future samples, from a population or process. In Figure 5.16, for instance, we
can use the tails of the sampling distribution to place bounds on the values that x can
assume whenever we take random samples of size 25 from a normal population with a
mean of 50 and a standard deviation of 2. Going a step further, we can reasonably say
that the mean μ affects only the location of the histogram and that the value of σ affects
only the spread of the sample results. If this is so, then we now know a lot about what to
expect when sampling from any normal population whose standard deviation is σ = 2.
Example 5.17 Refer to Example 5.16. Suppose that next week we select a single sample of size 25
from a normal population whose standard deviation is known to be σ = 2, but whose
mean μ is unknown to us. If x = 70 for this sample, then the results in Example 5.16
indicate that 70 is almost certainly no farther than 1.5 units away from the population
mean μ and it is fairly likely that it is within 1 unit of μ. That is, we can infer that
the unknown population mean is almost certainly between 68.5 and 71.5, and we
can be reasonably confident that it is between 69 and 71. In this way, by using our
knowledge of what the sampling distribution of x looks like, we can begin to make
inferences about the likely values of the unknown population parameter μ.
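To put numbers on "almost certainly" and "fairly likely" (an added sketch, assuming the same σ = 2 and n = 25 as Example 5.16):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# With sigma = 2 and n = 25, the standard error of the mean is 2/sqrt(25) = 0.4.
n, sigma = 25, 2
se = sigma / sqrt(n)
within_1  = 2 * phi(1.0 / se) - 1   # P(|xbar - mu| <= 1)
within_15 = 2 * phi(1.5 / se) - 1   # P(|xbar - mu| <= 1.5)
print(round(within_1, 4), round(within_15, 4))   # about .9876 and .9998
```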
x: 0 1
p(x): .80 .20
[Figure 5.17: approximate sampling distributions of several statistics, including the sample mean and the sample median]
The two-point mass function shown above describes such a process in which the proportion of defective
items is 20%. By calculating the sample proportion defective p for each of 1000 random samples of size n = 25,
an approximate sampling distribution for the statistic p can be formed (Figure 5.18).
Note that this distribution has many more possible values than just the values x = 0
and x = 1 in the population (each of the values 0/25, 1/25, 2/25, 3/25, . . . , 25/25 is
a possible value of p). The shape of this sampling distribution is similar to the one in
Figure 5.16, although it contains some gaps because only the values of p shown previ-
ously are possible to attain in a sample of size 25.
[Figure 5.18: histogram of the 1000 sample proportions (frequency versus proportion, 0 to .5)]
Table 5.1 lists the means and standard deviations of the sampling
distributions in Figure 5.17 along with the actual values of the corresponding population
parameters (for a normal population with μ = 50, σ = 2). The similarity between the
column of population parameters and the column of means of the sampling distribu-
tions leads us to conjecture that the center (i.e., the mean) of the sampling distribution
of a statistic may, in fact, coincide with the corresponding population parameter. When
this happens, we say that the statistic is unbiased, or that it is an unbiased estimator of
the population parameter. As we shall see in Section 5.6, some of the most important
statistics we have encountered so far are unbiased.
Table 5.1 Means and standard deviations of sampling distributions in Figure 5.17
Columns: Population parameter | Actual value | Sample mean of sampling distribution | Sample standard deviation of sampling distribution
[table entries not reproduced]
Finally, and perhaps most importantly, do we really have to conduct a lengthy sam-
pling experiment every time we want to make inferences based on a statistic generated
from a single sample? As we shall see in Section 5.6, the surprising answer is “no.” In
fact, the approximate shape of the sampling distribution is often known in advance, be-
fore taking even a single sample! Furthermore, knowing the specific shape of a sampling
distribution also enables us to calculate probabilities, which allow us to quantify exactly
what we mean by saying that, for example, the sample mean is highly likely to be within
1 unit of the population mean.
5.6 Describing Sampling Distributions 233
Sampling Distribution of x
The sampling distribution of x, also called the sampling distribution of the mean, is
the probability distribution that describes the behavior of x in repeated random samples
from a population or process. Like any distribution, the sampling distribution of x has
its own unique mean and standard deviation, which we denote by μx̄ and σx̄, respec-
tively. The next general result relates μx̄ and σx̄ to the population or process mean μ and
standard deviation σ: for a random sample of size n,

μx̄ = μ    and    σx̄ = σ/√n

These equations hold regardless of the particular form of the population distribution. To
emphasize the fact that it describes a sampling distribution, not a population, σx̄ is also
called the standard error of x, or the standard error of the mean.
One of the key features of the standard error of the mean σx̄ is that it decreases as
the sample size increases. In fact, many statistics have this property (see Section 5.5).
This makes intuitive sense, since we expect that more information ought to provide
better estimates (i.e., smaller standard errors). As a result, increasing the size of a ran-
dom sample has the desirable effect of increasing the probability that the estimate x will
lie close to the population mean μ.
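As an added illustration of this property (assuming σ = 2, as in Example 5.16, purely for concreteness), the standard error for the sample sizes pictured in Figure 5.19 can be tabulated:

```python
from math import sqrt

# Standard error of the mean, sigma_xbar = sigma / sqrt(n), shrinks as n grows.
sigma = 2
for n in (10, 30, 100):
    print(n, round(sigma / sqrt(n), 3))
# 10 -> 0.632, 30 -> 0.365, 100 -> 0.2
```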
[Figure 5.19: sampling distributions of x for samples of size n = 10, n = 30, and n = 100.
The probability that x falls within a fixed distance from μ increases as n increases.]
Example 5.18 Physical characteristics of manufactured products are often well described by
normal distributions. Suppose, for example, that we want to evaluate the length
(in cm) of certain parts in a production process based on the information in a
random sample of five such parts. The parts are required to have a nominal length
of 20 cm; past experience with this process indicates that the standard deviation
is known to be σ = 1.8 cm. If we assume that the lengths can be described by a
normal distribution, what is the probability that the mean of this sample will be
within 2 cm of the current process mean μ? That is, what is the probability that x
will lie between μ − 2 and μ + 2?
The solution to this type of problem lies in recognizing that the sam-
pling distribution of x is normal with a mean of μx̄ = μ and standard error of
σx̄ = σ/√n = 1.8/√5 = .805. To find the probability P(μ − 2 < x < μ + 2), we
standardize, making sure to use the mean and standard error of x while doing this:

P(μ − 2 < x < μ + 2) = P( (μ − 2 − μ)/(σ/√n) < z < (μ + 2 − μ)/(σ/√n) )
                     = P(−2/.805 < z < 2/.805) = .9868

That is, there is a 98.68% chance that the mean of a random sample of size n = 5
will be within 2 units of the population mean μ. Notice how the unknown mean μ
cancels itself during the standardization. In other words, we do not need to know (or
assume) a value for μ. Instead, when we select our sample of five parts, we can be
relatively confident that the sample mean will be no farther than 2 cm from the true
(unknown) process mean.
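The standardization above can be verified numerically (an added sketch using math.erf for the normal CDF; the small difference from .9868 comes from rounding in normal tables):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# P(mu - 2 < xbar < mu + 2) for sigma = 1.8 and n = 5; mu cancels in the
# standardization, leaving a symmetric interval of +/- 2 / (sigma/sqrt(n)).
sigma, n = 1.8, 5
se = sigma / sqrt(n)          # about .805
prob = 2 * phi(2 / se) - 1
print(round(prob, 4))         # about .987, matching the .9868 in the text
```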
Although the sampling distribution of x must reflect certain features of the popula-
tion being sampled (especially its location, μ), the shape of the sampling distribution is
primarily influenced by n. That is, as n increases, the particular shape of the population
(e.g., uniform, exponential, normal, Weibull) exerts less and less influence on the shape
of the sampling distribution, which becomes more and more normal in appearance.
Figure 5.20 illustrates this effect for several different populations. The closer the popu-
lation is to being normal, the more rapidly the sampling distribution of x approaches
normality. For instance, we saw this behavior emerging in Figure 5.15, where even
small samples of size n = 2 from a uniform population result in a sampling distribution
that is already beginning to take on the characteristic normal shape.
[Figure 5.20: sampling distributions of x for uniform (n = 2, n = 10), exponential (n = 2, n = 50),
and normal (n = 2, n = 10) populations]
Many authors use n $ 30 as a rough guide for what constitutes a “large enough”
sample size for invoking the Central Limit Theorem. This is not a bad rule in general,
but there are cases where substantially smaller values of n will suffice (e.g., with sym-
metric populations like the uniform and the normal), as well as cases where larger
sample sizes are needed (especially for highly skewed populations). As a rule, the less
symmetric a population is, the larger the sample size will have to be to ensure normality
of x. For example, in the case of an exponential population, sample sizes of 40 to 50 are
often required to achieve normality.
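This rule of thumb can be probed by simulation (an added sketch, not from the text; the sample sizes, replication count, and seed are arbitrary choices):

```python
import random
import statistics

# Means of samples from a skewed (exponential) population look more symmetric
# as n grows. Compare the sample skewness of 20000 simulated means for
# n = 5 versus n = 50; theory predicts skewness about 2/sqrt(n).
def skewness(data):
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return statistics.fmean((d - m) ** 3 for d in data) / s ** 3

def mean_skew(n, reps=20000):
    means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
             for _ in range(reps)]
    return skewness(means)

random.seed(7)
s5, s50 = mean_skew(5), mean_skew(50)
print(round(s5, 2), round(s50, 2))
# skewness drops toward 0 (normality) as n increases
```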
Example 5.19 Consider the distribution shown in Figure 5.21 for the amount purchased (rounded
to the nearest dollar) by a randomly selected customer at a particular gas station (a
similar distribution for purchases in Britain (in pounds) appeared in the article “Data
Mining for Fun and Profit,” Statistical Science, 2000: 111–131; there were big spikes at
the values 10, 15, 20, 25, and 30). The distribution is obviously quite nonnormal. We
asked Minitab to select 1000 different samples, each consisting of n = 15 observations,
[Figure 5.21: probability distribution of the amount purchased ($5 to $60), with probabilities up to about .16]
and calculate the value of the sample mean x for each one. Figure 5.22 is a histogram
of the resulting 1000 values; this is the approximate sampling distribution of x under
the specified circumstances. This distribution is clearly approximately normal even
though the sample size is not very large. A normal quantile plot based on the 1000 x
values exhibits a very prominent linear pattern.
[Figure 5.22: histogram (density scale) of the 1000 sample means, ranging from about 18 to 36]
Example 5.20 Printed circuit boards (PCBs), used in electronic equipment such as computers and
appliances, are laminated cards (usually green) upon which various electronic compo-
nents are mounted. One step in the manufacture of PCBs uses machines to automati-
cally insert the metal connecting pins on the components into the appropriate hole
patterns on a PCB. Components of each type (e.g., resistors, capacitors) are adhesively
mounted on large paper-tape rolls and fed into the machines, which then insert them
into a PCB. The amount of time it takes to insert all the components on a given PCB
varies somewhat from board to board because of machine downtime for replenish-
ing tape rolls and replacing components with broken pins. Suppose that an insertion
machine can complete a certain type of PCB in an average time of 3 minutes with a
standard deviation of .5 minute. If an order of 100 PCBs is run on this machine, what
is the probability that the average time to complete all the boards exceeds 3.1 minutes?
Viewing the completion times as a random sample from a population with
μ = 3 and σ = .5, we can calculate the mean and standard error of the sampling
distribution of the average completion time (of the 100 boards) as follows:

μx̄ = μ = 3    and    σx̄ = σ/√n = .5/√100 = .05

Because the sample size n = 100 is large, the Central Limit Theorem allows us to
use the normal distribution to calculate the desired probability:

P(x > 3.1) ≈ P(z > (3.1 − 3)/.05)
           = P(z > 2) = 1 − P(z ≤ 2) = 1 − .9772 = .0228
That is, there is only a 2.28% chance that the average completion time will exceed
3.1 minutes. Since x > 3.1 is equivalent to 100x > 100(3.1), we can also state that
there is a 2.28% chance that the total time for completing the 100 boards will exceed
310 minutes (5 hours, 10 minutes).
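A brief numerical check of this calculation (an addition, not part of the text):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# P(xbar > 3.1) for the PCB insertion times: mu = 3, sigma = .5, n = 100.
mu, sigma, n = 3, 0.5, 100
se = sigma / sqrt(n)                 # 0.05
prob = 1 - phi((3.1 - mu) / se)      # P(z > 2)
print(round(prob, 4))                # 0.0228
```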
For a population consisting of 0s and 1s in which π denotes the proportion of 1s, the
mass function is p(1) = π and p(0) = 1 − π, so that

μ = Σ x p(x) = 0 · (1 − π) + 1 · π = π
σ² = Σ (x − μ)² p(x) = (0 − π)²(1 − π) + (1 − π)² π = π(1 − π)
σ = √(π(1 − π))
[Figure: mass function of a 0–1 population, with p(0) = 1 − π and p(1) = π]
Every random sample drawn from such a population will consist entirely of 0s and
1s. Suppose, for instance, that a particular sample of size 10 contains the observations
{0, 0, 1, 1, 0, 1, 0, 0, 1, 0}. Then the sample mean is (0 + 0 + 1 + 1 + 0 + 1 + 0 + 0 +
1 + 0)/10 = .40. That is, the sample mean is simply the proportion of 1s in the sample.
We use the notation p to denote the proportion of successes, also called the sample
proportion, in a random sample of size n.
Since p is actually a sample mean, we can use the earlier results in this section to
determine its sampling distribution. For example, the mean and standard error of the
sampling distribution of p are given by
μ_p = μ = π   and   σ_p = σ/√n = √(π(1 − π))/√n = √(π(1 − π)/n)
Furthermore, for a sufficiently large sample size n, the Central Limit Theorem in-
dicates that the sampling distribution of p will be approximately normal. Because
we record only whether each sampled item has a certain characteristic or not,
large samples are often easy to come by when estimating a population proportion
π. As a general rule, the accuracy of the normal approximation is best when both
nπ ≥ 5 and n(1 − π) ≥ 5.
Sampling Distribution of p
The mean and standard error of the sampling distribution of p are given by
μ_p = π   and   σ_p = √(π(1 − π)/n)
In addition, for a large enough n, the sampling distribution of p is approximately normal. In
general, the normal approximation is best when nπ ≥ 5 and n(1 − π) ≥ 5.
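As a quick check on these formulas, a simulation can be run: repeatedly draw samples of n 0–1 observations with success probability π and compare the mean and standard deviation of the resulting sample proportions with π and √(π(1 − π)/n). The values π = .3 and n = 100 below are illustrative, not from the text:

```python
import random
from statistics import mean, stdev

random.seed(1)
pi, n = 0.3, 100                    # illustrative true proportion and sample size
# draw 20,000 samples of n Bernoulli(pi) observations; record each sample proportion
props = [sum(random.random() < pi for _ in range(n)) / n
         for _ in range(20_000)]

print(round(mean(props), 3))        # close to pi = 0.3
print(round(stdev(props), 3))       # close to sqrt(pi*(1 - pi)/n) = 0.046
```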
The fact that the formulas for μ_p and σ_p both contain the unknown parameter π
might at first appear to negate the usefulness of the sampling distribution of p. After all,
if the population proportion π is unknown, how can we possibly find √(π(1 − π)/n)?
In practice, there are two relatively simple solutions to this problem: (1) Use a prede-
termined value of π that describes some hypothetical value of π against which the
sample data is to be compared or (2) use π = 1/2 in the formula for σ_p, which results
in a conservatively large value of σ_p.
The second approach is based on the observation that π(1 − π) ≤ .25 for any value
of π between 0 and 1.¹ This means that
σ_p = √(π(1 − π)/n) ≤ √(.25/n) = 1/(2√n)
no matter what the true value of π. Thus, by choosing the sample size n large enough,
1/(2√n) (and hence σ_p) can be made as small as desired. This approach is commonly
used in all forms of survey sampling.
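The conservative bound can be turned into a one-line sample-size tool; this sketch (the function name is ours, not the text's) evaluates 1/(2√n) for a couple of values of n:

```python
import math

def conservative_se(n):
    """Worst-case standard error of a sample proportion (attained at pi = 1/2)."""
    return 1 / (2 * math.sqrt(n))

print(conservative_se(100))         # 0.05
print(conservative_se(2500))        # 0.01: n = 2500 caps the standard error at .01 for any pi
```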
Example 5.21 Control charts are graphs that monitor the movements in a sample statistic (such
as x̄ or p) in periodic samples taken from an ongoing process. Using the sampling
distribution of the statistic as a yardstick, values of the statistic “too far” away from the
center of the sampling distribution are taken to be signals of possible problems with
the process. For example, a p chart is often used to monitor the proportion of non-
conforming products in a manufacturing process. Using past data from the process, a
value of π is selected as being representative of the long-run behavior of the process.
Suppose, for example, that a certain process constantly generates an average of about
5% nonconforming products and that samples of size 100 are taken each day to test
whether the 5% nonconformance rate has changed. On one particular day, 12 non-
conforming products appear in the sample. How do we interpret this information?
Assuming that the process is behaving as it has in the past, we set π = .05. For
this value of π, nπ = 100(.05) = 5 and n(1 − π) = 100(.95) = 95, so the condition
for applying the normal approximation is met. Furthermore, the mean and standard
deviation of the sampling distribution of p can be calculated:
μ_p = .05   and   σ_p = √(π(1 − π)/n) = √((.05)(1 − .05)/100) = .0218
Because the sampling distribution of p is approximately normal, we can evaluate the
sample proportion of p = 12/100 = .12 by determining how far away it is from the mean
of .05. Since (.12 − .05)/.0218 = 3.21, we see that the value of .12 is 3.21 standard
deviations above the process mean. In other words, this sample result has a very small
probability of occurring if the process is running as usual. Our conclusion is that it is
more likely that something has caused an increase in the process nonconformance rate.
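The arithmetic of this example can be reproduced as follows (a sketch using Python's standard-library NormalDist; the numbers are those of the example):

```python
from statistics import NormalDist

pi, n = 0.05, 100                     # assumed process nonconformance rate and sample size
se = (pi * (1 - pi) / n) ** 0.5       # standard error of the sample proportion
p_hat = 12 / 100                      # observed sample proportion

z = (p_hat - pi) / se
print(round(se, 4))                   # 0.0218
print(round(z, 2))                    # 3.21
# chance of a sample proportion this large if the process is still running at 5%
print(round(1 - NormalDist().cdf(z), 5))
```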
¹ Writing π(1 − π) as 1/4 − (1/2 − π)², you can see that the maximum value 1/4 occurs when π = 1/2.
Alternatively, you could use calculus, setting the derivative of π(1 − π) equal to 0, to find that π = 1/2
maximizes the quantity π(1 − π).
5.6 Exercises 241
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
www.ebook3000.com
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
242 chapter 5 Probability and Sampling Distributions
a. If a random sample of 25 such specimens is selected, what is the probability
that the sample average sediment density is at most 3.00? Between 2.65
and 3.00?
b. How large a sample would be required to ensure that the first probability in
part (a) is at least .99?

56. The number of flaws x on an electroplated automobile grill is known to have
the following probability mass function:

x:      0    1    2    3
p(x):  .8   .1   .05  .05

a. Calculate the mean and standard deviation of x.
b. What are the mean and standard deviation of the sampling distribution of
the average number of flaws per grill in a random sample of 64 grills?
c. For a random sample of 64 grills, calculate the approximate probability that
the average number of flaws per grill exceeds 1.

57. Only 2% of a large population of 100-ohm gold-band resistors have resistances
that exceed 105 ohms.
a. For samples of size 100 from this population, describe the sampling
distribution of the sample proportion of resistors that have resistances in
excess of 105 ohms.
b. What is the probability that the proportion of resistors with resistances
exceeding 105 ohms in a random sample of 100 will be less than 3%?

58. In Exercise 36, what is the probability that the average of two measurements
will lie within 2 mm of the true length of the object?

59. Roughly speaking, the Central Limit Theorem says that sums of independent
random variables tend to have (approximately) normal distributions. Similarly,
it can be shown that products of independent positive random variables tend to
have lognormal distributions. Recall from Section 1.5 that a random variable x
is said to have a lognormal distribution with parameters μ and σ if the random
variable y = ln(x) is normal with mean μ and standard deviation σ. The successive
breaking of particles into finer and finer pieces, a process that can be modeled as
a product of positive random variables, leads to lognormal particle size distribu-
tions. In particular, small particles suspended in the atmosphere (called aerosols)
have radii that can be described by a lognormal distribution with parameters
μ = −2.62 and σ = .788 (Crow, E. L., and K. Shimizu, Lognormal Distributions:
Theory and Applications, Marcel Dekker, New York, 1988: 337).
a. Find the mean radius (in μm) of the atmospheric particles.
b. What is the probability that an atmospheric particle will have a radius
exceeding .12 μm?
Supplementary Exercises
60. Figure 5.5 shows how a tree diagram can be used to verify that {A or B}′ =
{A′ and B′}. Use a Venn diagram to prove this fact.

61. A large farming area is divided into five parcels of land of different sizes,
as follows:

Parcel:        B1  B2  B3  B4  B5
Size (acres):  15  20  25  10  20

Because crop-bearing trees are uniformly planted within each parcel, the
probability that a randomly sampled tree from the farm comes from a particular
parcel is assumed to be proportional to the size of the parcel.
a. What is the probability that a randomly chosen tree comes from one of the
first three parcels of land?
b. What is the probability that a randomly chosen tree does not come from
parcel 5?

62. A complex assembly contains 20 critical components (labeled C1, C2, . . .),
each having a probability of .95 of functioning correctly. Each component
must function correctly for the entire assembly to function. Let A denote the
event that the assembly fails to function correctly and let B denote the event
that component C1 fails to function correctly.
72. An electrical appliance uses four 1.5-volt batteries. The batteries are connected
in series so that the total voltage supplied to the appliance is the sum of the
voltages in the four batteries. Suppose that the actual voltage of all 1.5-volt
batteries is known to have a mean of 1.5 volts and a standard deviation of .2 volt.
a. What are the mean and standard error of the sampling distribution of the
average voltage in four randomly selected 1.5-volt batteries?
b. What is the mean of the sampling distribution of the total voltage in four
randomly selected 1.5-volt batteries?

73. Five randomly selected 100-ohm resistors are connected in a series circuit.
Suppose that it is known that the population of all such resistors has a mean
resistance of exactly 100 ohms with a standard deviation of 1.7 ohms.
a. What is the probability that the average resistance in the circuit exceeds
105 ohms?
b. What is the probability that the total resistance in the circuit differs from
500 ohms by more than 11 ohms?
c. Find the number of resistors, n, for which P(490 ≤ T ≤ 510) = .95, where
T denotes the total resistance in the circuit.

74. The article “Three Sisters Give Birth on the Same Day” (Chance, Spring 2001,
23–25) used the fact that three Utah sisters had all given birth on March 11,
1998 as a basis for posing some interesting questions regarding birth coincidences.
a. Disregarding leap year and assuming that the other 365 days are equally
likely, what is the probability that three randomly selected births all occur
on March 11? Be sure to indicate what, if any, extra assumptions you
are making.
b. With the assumptions used in part (a), what is the probability that three
randomly selected births all occur on the same day?
c. The author suggested that, based on extensive data, the length of gestation
(time between conception and birth) could be modeled as having a normal
distribution with mean value 280 days and standard deviation 19.88 days.
The due dates for the three Utah sisters were March 15, April 1, and April 4,
respectively. Assuming that all three due dates are at the mean of the
distribution, what is the probability that all births occurred on March 11?
Hint: The deviation of birth date from due date is normally distributed
with mean 0.
d. Explain how you would use the information in part (c) to calculate the
probability of a common birth date.

75. A friend who lives in Los Angeles makes frequent consulting trips to
Washington, DC; 50% of the time she travels on airline #1, 30% of the time on
airline #2, and the remaining 20% of the time on airline #3. For airline #1,
flights are late into DC 30% of the time and late into LA 10% of the time. For
airline #2, these percentages are 25% and 20%, whereas for airline #3 the
percentages are 40% and 25%. If we learn that on a particular trip she arrived
late at exactly one of the two destinations, what are the posterior probabilities
of having flown on airlines #1, #2, and #3? Hint: From the tip of each
first-generation branch on a tree diagram, draw three second-generation
branches labeled, respectively, 0 late, 1 late, and 2 late.

76. A factory uses three production lines to manufacture cans of a certain type.
The accompanying table gives percentages of nonconforming cans, categorized
by type of nonconformance, for each of the three lines during a particular
time period:

                    Line 1  Line 2  Line 3
Blemish               15      12      20
Crack                 50      44      40
Pull-tab problem      21      28      24
Surface defect        10       8      15
Other                  4       8       2

During this period, line 1 produced 500 nonconforming cans, line 2 produced
400 such cans, and line 3 was responsible for 600 nonconforming cans. Suppose
that one of these 1500 cans is randomly selected.
a. What is the probability that the can was produced by line 1? That the reason
for nonconformance is a crack?
b. If the selected can came from line 1, what is the probability that it had
a blemish?
c. Given that the selected can had a surface defect, what is the probability that
it came from line 1?
77. One satellite is scheduled to be launched from Cape Canaveral in Florida, and
another launching is scheduled for Vandenberg Air Force Base in California.
Let A denote the event that the Vandenberg launch goes off on schedule, and
let B represent the event that the Cape Canaveral launch goes off on schedule.
If A and B are independent events with P(A) > P(B), P(A or B) = .626, and
P(A and B) = .144, determine the values of P(A) and P(B).

78. A message is transmitted using a binary code of 0s and 1s. Each transmitted
bit (0 or 1) must pass through three relays before reaching a receiver. At each
relay, the probability is .20 that the bit sent is different from the bit received
(a reversal). Assume that relays operate independently of one another.

Transmitter → Relay 1 → Relay 2 → Relay 3 → Receiver

a. If a 1 is sent from the transmitter, what is the probability that a 1 is sent by
all three relays?
b. If a 1 is sent from the transmitter, what is the probability that a 1 is received
by the receiver? Hint: Use a tree diagram.
c. Suppose that 70% of all bits sent from the transmitter are 1s. If a 1 is
received by the receiver, what is the probability that a 1 was sent? Hint: Use
a tree diagram.
Bibliography
Devore, J. L., and K. N. Berk, Modern Mathematical Statistics with Applications
(2nd ed.), Springer, New York, 2012. A more mathematical treatment than given
in this text, but still readable, with good examples and problems.
Olofsson, P., Probabilities: The Little Numbers That Rule Our Lives, Wiley,
New York, 2007. An outstanding non-mathematical exposition, with great insights.
Ross, S., A First Course in Probability (8th ed.), Wiley, New York, 2009.
A succinct mathematical treatment with good examples and problems.
Sachs, L., Applied Statistics: A Handbook of Techniques (2nd ed.), Springer,
New York, 1984. A one-volume summary of statistical methods that emphasizes
short summaries of essentials, easy examples, tables, notes, and detailed
references for further reading.
6
Quality and Reliability
6.1 Terminology
6.2 How Control Charts Work
6.3 Control Charts for Mean and Variation
6.4 Process Capability Analysis
6.5 Control Charts for Attributes Data
6.6 Reliability
Introduction
Statistical methods for monitoring and improving the quality of manufactured goods
have been around since the early 1920s when Bell Laboratories engineer W. A.
Shewhart introduced the graphical control chart method for detecting possible
problems in manufacturing processes (Sections 6.2, 6.3, and 6.5). Current applica-
tions of statistical methods of quality assurance have widened to include service
industries as well as traditional manufacturing applications. Since the 1980s, there
has also been a greatly increased emphasis on the use of experimental design
techniques that seek to identify the key factors that lead to improvements in pro-
cesses and products. Experimental design methods, which were briefly described
in Section 4.3, are discussed in detail in Chapters 9 and 10. Although the focus in
Chapter 6 is on the various control charts that have been developed to monitor
existing production systems, we also include a discussion of the important topic of
evaluating the reliability of finished products (Section 6.6).
The statistical tools underlying the methods of this chapter are fairly basic.
Calculations of tail areas of normal distributions are used in Section 6.4 to estimate the
capability of a production process to produce acceptable products. Control chart
methods in the remaining sections are based on knowing the sampling distribution
(Sections 5.5 and 5.6) of the various statistics used to describe the output of a
6.1 Terminology
Applying statistics to a specific field, such as quality control, requires some knowledge of
the jargon used in that field. The terminology introduced subsequently is used through-
out the remaining sections of this chapter. In some cases, familiar statistical terms (such
as discrete and continuous measurements) are given different names by quality practitio-
ners, making it necessary to know both names when working in this field.
Specification Limits
When product designs are translated into tangible entities, it becomes necessary to
precisely define the key characteristics of a product and each of its subcomponents. For
manufactured products, this is done by specifying the exact physical dimensions and
other quality characteristics that finished products should have. For services, specifica-
tions often take the form of rules for processing transactions or guidelines for interacting
with customers. In many cases, especially in manufacturing, a single value corresponds
to the most desired quality level for a given product characteristic. We refer to this value
as the nominal or target value of the quality characteristic.
Practically speaking, it is almost impossible to make each unit of product identical
to the next, so some flexibility is required in achieving target values. This is done by
choosing specification limits or tolerances that delineate the range of measured values
that we will accept as “close enough” to the target value, in the sense that products that
are within the specification range should be fit for their intended use.¹ For example,
car doors are made with a certain nominal width, but specification limits are neces-
sary because doors cannot be too wide (or they may not close properly) or too narrow
(or they may fail to latch correctly). Quality characteristics that have both upper and
lower specification limits are said to have a two-sided tolerance. Those with only one
specification limit have a one-sided tolerance. Examples of characteristics with one-
sided tolerances include breaking strengths of materials, which have lower specifica-
tion limits, and the level of contaminants in a water supply, which have only upper
specification limits.
Nominal values and their associated specification limits are generally stated
together in an abbreviated form such as 1 in. ± .005 in., which describes a character-
istic with a nominal value of 1 in., a lower specification limit of .995 in., and an upper
specification limit of 1.005 in. Together, the nominal value and specification limits are
called the specifications or, more simply, the “specs.” When data do not exceed the
specification limits placed on them, we say that the particular process giving rise to the
data is “within specifications.” Otherwise, the process is said to “fail the specifications”
or to be “out of spec.”
¹ In the 1970s, quality was defined to be “fitness for use.” Around 1983, the American Society for Quality
Control (ASQC) expanded the definition to “quality is the totality of features and characteristics of a product
or service that bear on its ability to satisfy given needs.”
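A conformance check against a two-sided tolerance is a one-line comparison. The helper below is a hypothetical illustration of the 1 in. ± .005 in. spec just described (the function name and defaults are ours, not the text's):

```python
def within_spec(x, nominal=1.0, tol=0.005):
    """True if measurement x lies in [nominal - tol, nominal + tol]."""
    return nominal - tol <= x <= nominal + tol

print(within_spec(0.997))   # True: inside 1 in. +/- .005 in.
print(within_spec(1.006))   # False: above the USL of 1.005 in.
```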
definitions The largest allowable value that a quality characteristic can have is called the
upper specification limit (USL); the smallest allowable value is called the lower
specification limit (LSL).
Continuous data, those that arise by measuring things, is called variables
data in the quality professions. Discrete data, those that arise by counting things, is called
attributes data. These names are commonly used to describe the various statistical tech-
niques used in quality control. For example, control charts are classified as variables
control charts or attributes control charts, depending on the kind of data used to form
the charts (see Sections 6.2, 6.3, and 6.5).
Histograms
As shown in Figures 6.2 and 6.3, histograms are very effective tools for understanding
processes that generate variables data. Because many processes tend to produce vari-
ables data that follow normal distributions, normal curves are often superimposed over
such histograms. This technique is so commonly used that it is standard practice to
describe a process’s output by drawing a normal curve centered at the sample mean of
the data, sometimes without even including the histogram of the data. When specifica-
tion limits are included as well, we get a visual picture of how a process is behaving with
respect to its specifications (Figure 6.2). From this figure, it is easy to see how much of
the process data is nonconforming, that is, outside of the specification range.
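The fraction of process output that is nonconforming can be estimated from the two normal tail areas outside the specs. The sketch below assumes a hypothetical process (mean 10.00, SD .02, specs 10.00 ± .05); none of these numbers come from the text:

```python
from statistics import NormalDist

# hypothetical normally distributed process output
process = NormalDist(mu=10.00, sigma=0.02)
lsl, usl = 9.95, 10.05                             # specs: 10.00 +/- 0.05

frac = process.cdf(lsl) + (1 - process.cdf(usl))   # area in both tails
print(round(frac, 4))                              # 0.0124
```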
Figure 6.2 A normal curve centered at the process average, with the lower (LSL) and
upper (USL) specification limits marked on the measurement axis
Histograms are often used to give warnings of possible process problems. A smoothly
running process usually generates data whose histogram appears similar to that in Fig-
ure 6.2. Irregularities in a process are evidenced by histogram shapes that differ from
a normal curve. Figure 6.3 shows some of the typical histogram shapes that can occur
along with the most likely reasons for their appearance.
6.1 Exercises 251
and 3.1 cm (so the lower specification limit is LSL = 2.9 cm, and the upper
specification limit is USL = 3.1 cm).
a. If cork diameter is a normally distributed variable with mean value 3.04 cm
and standard deviation .02 cm, what is the probability that a randomly
selected cork will conform to specification?
b. If instead the mean value is 3.00 and the standard deviation is .05, is the
probability of conforming to specification smaller or larger than it was in
part (a)?
Control Charts
Control charts are constructed by taking successive samples from the output of a pro-
cess, making measurements on the sampled items, and then plotting summary statistics
of these results. Figure 6.4 shows a typical control chart. The samples, also called sub-
groups, of size n are taken at regular intervals of time. For each subgroup, a summary
statistic is calculated and plotted (on the vertical axis) versus the subgroup number (on
the horizontal axis). Any statistic of interest can be calculated, but the most commonly
used are x̄ (subgroup mean), R (subgroup range), s (subgroup standard deviation), p
(proportion nonconforming), c (number of nonconformities), and u (nonconformities
per unit). A control chart derives its name from the name of the particular statistic
Figure 6.4 A typical control chart: the subgroup statistic is plotted against subgroup
number (1, 2, 3, 4, . . .), with a centerline, an upper control limit (UCL), and a lower
control limit (LCL); points between the limits reflect common cause variation, while
points beyond either limit signal possible assignable causes
calculated in the subgroups. For example, an x̄ chart (read “x bar chart”) is one that
monitors successive subgroup means, an R chart monitors subgroup ranges, and so forth.
The control limits and centerline of a control chart are based on the sampling
distribution (see Sections 5.5 and 5.6) of the chart statistic. The smaller of the two con-
trol limits is called the lower control limit (LCL) and the larger one is called the upper
control limit (UCL). In the United States, control limits are set at a distance of 3 stan-
dard errors (i.e., 3 standard deviations of the subgroup statistic) from the mean of the
sampling distribution. This is based on the fact that many sampling distributions closely
approximate normal distributions, the majority of whose probability, about 99.73%, lies
within 3 standard deviations of the mean. For example, in an x̄ chart, the standard error
σ/√n of the sampling distribution of x̄ is used to establish the control limits, which in
theory would be set at μ ± 3σ/√n. In practice, of course, estimates of the process mean
and process standard deviation must be used in this formula. In England and other
countries, control limits are set by specifying the probability, typically around 99%, that
lies under the sampling distribution curve between the control limits.
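With estimates substituted for μ and σ, the 3-standard-error limits reduce to a tiny computation. A minimal sketch (the process estimates below are illustrative, not from the text):

```python
import math

def xbar_limits(mu_hat, sigma_hat, n):
    """3-sigma x-bar chart limits: mu +/- 3*sigma/sqrt(n), with estimates plugged in."""
    se = sigma_hat / math.sqrt(n)
    return mu_hat - 3 * se, mu_hat + 3 * se

# illustrative estimates: process mean 50, process SD 2, subgroups of size 4
lcl, ucl = xbar_limits(50.0, 2.0, 4)
print(lcl, ucl)    # 47.0 53.0
```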
Plotted points that fall outside (i.e., above the UCL or below the LCL) are inter-
preted as signals of possible special causes, whereas points within the control limits are
usually (but not always) associated with common cause variation, that is, the absence
of special causes. It is also important to remember that control limits are different from
specification limits, which are not plotted on a control chart.
Statistical Control
When all the points on a control chart lie between the control limits and when there are no
other anomalous patterns in the charted points, a process is said to be in a state of statistical
control or, more briefly, “in control.” Otherwise, the process is said to be “out of control.”
The phrase out of control, which can sometimes be misinterpreted, is only a way of indicat-
ing that control chart points are behaving in a nonrandom fashion. It does not imply that
the process itself is bad nor does it necessarily imply that any nonconforming products are
being made. “Out of control” simply means that assignable causes are likely to be present.
When control charts were first introduced, the primary signal of an “out-of-control”
condition was when one or more points were outside one of the control limits. If the
sampling distribution of the subgroup statistic is approximately normal, this means that
there is a probability of about .0027 (or .27%) that a control chart point will fall outside
one of the control limits when no assignable causes are present. That is, when a process is
running smoothly and no special causes are operating, there is a relatively small chance
(.27%) that a control chart point will give a false positive—mistakenly signaling the pres-
ence of a special cause. On the other hand, when special causes are present, there is also
a chance that the chart will fail to detect them. To increase the sensitivity of a chart for
detecting special causes, while still maintaining the false positive rate at .27%, an extended
set of “out-of-control” rules is often used. The “out-of-control” rules in Figure 6.5 are
commonly used by quality control software to help detect the presence of special causes.
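The .27% figure is simply the two-sided tail area beyond 3 standard deviations of a normal distribution. As a quick check, this probability can be computed with the Python standard library (a sketch, not part of the text's own examples):

```python
import math

def normal_tail_beyond(z):
    """Two-sided tail probability P(|Z| > z) for a standard normal Z."""
    return math.erfc(z / math.sqrt(2))

# Chance that a point falls outside the 3-sigma limits when no special
# causes are present (the false-positive rate quoted in the text)
false_alarm = normal_tail_beyond(3.0)
print(round(false_alarm, 4))  # -> 0.0027
```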
Rational Subgroups
Selecting rational subgroups is key to the proper use of control charts. The name
rational subgroup is intended to remind us that the subgroups are chosen in a thoughtful
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
254 chapter 6 Quality and Reliability
[Figure 6.5 Out-of-control tests. Each panel shows a control chart divided into zones A, B, and C on either side of the centerline, bounded by the UCL and LCL. Among the panel titles: Test 5, two out of three points in a row in Zone A or beyond; Test 6, four out of five points in a row in Zone B or beyond.]
manner and are usually not random samples. Instead, rational subgroups should be
chosen in a way that maximizes the ability of the chart to detect special causes. The goal
is to have the variation within any rational subgroup represent the common cause varia-
tion in the process. In this way, any significant variation between subgroups can be
attributed to possible special causes. Randomness and sampling distributions enter the
picture when we make the assumption that there are no special causes at work, in which
case each rational subgroup can be considered to be a random sample from the process.
That is, if a process is in control, then successive items (and subgroups of such items)
should vary according to a system of random causes, which then permits us to use the
properties of a sampling distribution to form control limits.
One commonly used method for forming rational subgroups is to choose subgroup
elements over a fairly short span of time. The time span should be short enough so that it
is unlikely for the occurrence of a special cause to overlap two subgroups. For example,
if differences between raw materials are a potential source of process problems, then
subgroups should be formed such that all elements in each subgroup correspond to only
one type of raw material. Then, if a problem occurs when raw materials are changed,
the data in all subgroups occurring after the change of materials will differ from the data
in the subgroups taken before the change, and the control chart points calculated from
such subgroups will have a good chance of detecting the problem.
A general strategy for deciding how to form rational subgroups is (1) to decide which
causes are important to detect and which are not, then (2) to design subgroups that
maximize the chance of detecting the important causes and relegate the unimportant
causes to the within-subgroup variation. For instance, suppose that daily changes in
temperature are known to have a small, inconsequential effect on the lengths of
plastic parts, whereas impurities in batches of raw plastic pellets are known to have a
serious effect on part lengths. If each batch of pellets lasts, say, for 4 hours of produc-
tion, then subgroups of size 6 might be formed once an hour by selecting one part about
every 10 minutes after a new batch of pellets is opened. In this way, each subgroup of
6 would represent a specific batch, but several different temperatures would be repre-
sented over each 1-hour collection period.
9. Two identical machines are used to make a particular metal part. The finished parts from both machines are mixed together on a conveyor system that moves the parts to a subsequent assembly operation. Consider the following two methods for generating rational subgroups for a control chart of this process:
a. Method 1: Five parts are sampled from the finished parts on the conveyor system each hour.
b. Method 2: Before reaching the conveyor system, a sample of five parts is taken from the output of machine 1; an hour later, five parts are taken from the output of machine 2; an hour later, five parts are sampled from machine 1; and so forth.
Which method of choosing rational subgroups would be better able to detect when one of the machines is not in statistical control?

10. When a process is in a state of statistical control, all of the points on a control chart should fall within the control limits. However, it is undesirable that all of the points should fall extremely near, or exactly on, the centerline of the control chart. Why?
6.3 Control Charts for Mean and Variation
Theoretically, the control limits for the x̄ chart are based on 3-sigma limits of the
sampling distribution of the statistic x̄:

UCL = μ + 3σ/√n   and   LCL = μ − 3σ/√n

where μ and σ denote, respectively, the long-run process mean and standard deviation
of the process. Of course, these formulas cannot be used directly since both μ and σ
must first be estimated from the available process data. The process average is estimated
by the average of k successive subgroup means:

x̿ = (x̄1 + x̄2 + · · · + x̄k)/k

The estimate is denoted by x̿ (read "x double bar") because it is an average of several
averages; x̿ is also called the grand mean of the subgroup means. To obtain a reasonable
estimate of σ, the following two-stage procedure is used. First, the chart for process
variation (the R chart) is brought into statistical control. This ensures that the process
variation is stable and, therefore, that the centerline of the R chart is a reliable estimate
of the average range of subgroups of size n from the process. Second, this centerline
is converted into an estimate of the process standard deviation σ, which is then put
into the expression x̿ ± 3σ̂/√n to obtain the approximate control limits for the x̄ chart.
Fortunately, the control limits of the R chart also turn out to be simple functions of the
centerline of the R chart.
The R Chart
To construct an R chart, we use the data from some number, k, of successive subgroups
of process measurements. It is usually recommended that about 20 to 25 subgroups be
used. If possible, the same sample size n is used to form each subgroup. The centerline
of the R chart is denoted by R̄ and is calculated by averaging the sample ranges R1, R2,
R3, . . . , Rk of the k subgroups:

R̄ = (R1 + R2 + · · · + Rk)/k

R̄ serves as an estimate of μR, the mean of the sampling distribution of the ranges (for
samples of size n) from the process. Let σR denote the standard deviation of this sampling
distribution; the 3-sigma limits, μR ± 3σR, are used to form the control limits for
the R chart. Assuming that the process measurements can be adequately described by
a normal distribution, it can be shown that the control limits for the R chart are given by

UCL = D4R̄   and   LCL = D3R̄

where D3 and D4 are constants that depend on the subgroup size, n. Values of D3 and
D4 are found in Appendix Table XI, which lists such constants for a variety of different
types of control charts.

After finding the centerline and control limits, the R chart is constructed by simply
plotting the k subgroup ranges Ri (i = 1, 2, . . . , k) versus the subgroup index, i, and then
drawing horizontal lines to represent the centerline R̄ and control limits. Using the
“out-of-control” rules listed in Section 6.2, we examine the R chart to see whether these
k ranges seem to be in statistical control. If any out-of-control conditions are found, it is
recommended that the subgroup(s) associated with these problems be eliminated and
that the centerline and control limits be recalculated based on the reduced number of
subgroups. When doing this, subgroups should be eliminated only if definite assign-
able causes can be found for the out-of-control signal associated with these subgroups.
Out-of-control subgroups for which no assignable cause can be found should not be
eliminated.
When the R chart is deemed to be in a state of statistical control, the centerline
R̄ can then be considered to be a reliable estimate of the average range (of samples of
size n) from a normal population. This estimate can then be converted into an estimate
for the process standard deviation σ by means of the formula

σ̂ = R̄/d2

where d2 is found in the table of control chart constants (Appendix Table XI). The
estimate σ̂ of σ is used to calculate the control limits of the x̄ chart and to assess the
capability of the process to meet the specification limits (see Section 6.4).
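As a sketch of how these pieces fit together, the Python snippet below computes the centerline R̄, the D3/D4 control limits, and the estimate σ̂ = R̄/d2 from a list of subgroup ranges. The ranges here are hypothetical illustration values; the constants are the standard tabulated values for subgroups of size n = 5 (the same values listed in Appendix Table XI):

```python
# Standard control chart constants for subgroup size n = 5
D3, D4, d2 = 0.0, 2.114, 2.326

def r_chart(ranges):
    """Return (centerline, LCL, UCL, sigma_hat) for an R chart."""
    r_bar = sum(ranges) / len(ranges)
    return r_bar, D3 * r_bar, D4 * r_bar, r_bar / d2

# Hypothetical subgroup ranges (k = 5 subgroups, for illustration only)
ranges = [0.0021, 0.0030, 0.0018, 0.0027, 0.0024]
r_bar, lcl, ucl, sigma_hat = r_chart(ranges)
print(r_bar, lcl, ucl, sigma_hat)
```

In practice about 20 to 25 subgroups would be used, as the text recommends.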
The x̄ Chart
Once the R chart is in control, the x̄ chart is then constructed. Any subgroups that were
eliminated during the construction of the R chart should automatically be eliminated
from the x̄ chart calculations. Given that we have k valid subgroups of data, whose
subgroup means are denoted by x̄1, x̄2, x̄3, . . . , x̄k, the centerline of the x̄ chart is just the
average of the subgroup means,

x̿ = (x̄1 + x̄2 + · · · + x̄k)/k

as mentioned previously. The control limits are found by replacing μ and σ by the estimates
x̿ and R̄/d2 in the control limit formulas:

UCL = x̿ + 3(R̄/d2)/√n   and   LCL = x̿ − 3(R̄/d2)/√n

Letting A2 = 3/(d2√n), we can write these estimated limits more simply as

UCL = x̿ + A2R̄   and   LCL = x̿ − A2R̄

where the constant A2 depends on the particular subgroup size, n, and is found in
Appendix Table XI. These formulas show how the centerline R̄ of the R chart directly
affects the control limits of the x̄ chart.
Example 6.1 The process of making ignition keys for automobiles consists of trimming and pressing
raw key blanks, cutting grooves, cutting notches, and plating. Some of the dimensions,
such as the depth of grooves and notches, are critical to the proper functioning of the
keys. Table 6.1 contains measurements (in inches) of a particular groove depth on the
side of each key. Due to the high volume of keys processed per hour, the sampling
frequency is chosen to be five keys every 20 minutes. For convenience, the subgroup
means and standard deviations are also given in Table 6.1, along with the grand mean
x̿ = .007966 and the average range R̄ = .002400. The relevant control chart constants
for subgroups of size n = 5 are D4 = 2.114, D3 = 0, and A2 = .577 (Appendix Table XI).
The initial estimates of the control limits for the R chart are

UCL = D4R̄ = (2.114)(.002400) = .005074
LCL = D3R̄ = (0)(.002400) = 0

The corresponding control chart is shown in Figure 6.6. Because there do not appear
to be any out-of-control points in the chart, no subgroups need be dropped, and we
can proceed immediately to the construction of the x̄ chart.
[Figure 6.6 R chart for the groove depth data: sample range plotted against sample number (1–20), with centerline R̄ = .002400 and LCL = 0.]
The x̄ chart is shown in Figure 6.7. None of the points is outside the control limits,
although there is a run of eight consecutive points above the centerline (subgroups
8–15). According to the extended list of "out-of-control" rules in Section 6.2, this run
of points is not quite long enough to signal an out-of-control condition.
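The control limits quoted in Figure 6.7 can be reproduced from the example's summary statistics with the A2 shortcut formula (a quick Python check; A2 = .577 for n = 5, as in Appendix Table XI):

```python
A2 = 0.577               # control chart constant for subgroup size n = 5
x_double_bar = 0.007966  # grand mean from Table 6.1
r_bar = 0.002400         # average range from Table 6.1

ucl = x_double_bar + A2 * r_bar
lcl = x_double_bar - A2 * r_bar
print(round(ucl, 6), round(lcl, 6))  # -> 0.009351 0.006581
```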
[Figure 6.7 x̄ chart for the groove depth data: sample mean plotted against sample number, with centerline x̿ = .007966 and UCL = .009351.]
x̄ and s Charts
Various alternatives to x̄ and R charts have been proposed over the years. Because there
are many different statistics available for measuring central tendency, along with several
measures for variation, just about any combination of the two can be used to monitor
a process average and variation. One combination that is frequently used is the x̄ and
s chart. The procedure for constructing x̄ and s charts parallels that for x̄ and R charts:
The variation chart (i.e., the s chart) is first brought into statistical control, then the x̄
chart is constructed using control limits formed from the centerline of the s chart.
Starting with k subgroups, each of size n, we denote the individual subgroup standard
deviations by s1, s2, s3, . . . , sk. Their average, s̄, forms the centerline of the s chart:

s̄ = (s1 + s2 + · · · + sk)/k

s̄ is an estimate of μs, the mean of the sampling distribution of the sample standard
deviation based on samples of size n. Following the usual 3-sigma procedure, control
limits for the s chart can be shown to have the form

UCL = B4s̄   and   LCL = B3s̄

where B3 and B4 depend on the subgroup size, n, and are found in Appendix Table XI.
In addition, to calculate the capability of a process, the standard deviation of the process
measurements can be estimated by

σ̂ = s̄/c4

where c4 is yet another control chart constant found in Appendix Table XI. The same
extended list of "out-of-control" rules used for x̄ and R charts can be applied to x̄ and s
charts (see Figure 6.5 on page 254).

For the x̄ chart, the grand average of the subgroup means forms the centerline of
the chart, as follows:

x̿ = (x̄1 + x̄2 + · · · + x̄k)/k

Following the same procedure as with the x̄ and R charts, we form the control limits for
the x̄ chart by substituting an estimate of σ into the theoretical 3-sigma limits. In this
case, the estimate is s̄/c4, which is based on the s chart:

UCL = x̿ + 3(s̄/c4)/√n   and   LCL = x̿ − 3(s̄/c4)/√n

By letting A3 = 3/(c4√n), we can write these control limits in the simpler form

UCL = x̿ + A3s̄   and   LCL = x̿ − A3s̄
Example 6.2  In this example, we reanalyze the key groove data of Table 6.1, this time using
x̄ and s charts. Using the average of the 20 subgroup standard deviations,
s̄ = .0009672, along with the control chart constants B3 = 0 and B4 = 2.089 from
Appendix Table XI (for subgroups of size n = 5), we calculate the control limits
for the s chart to be

UCL = B4s̄ = (2.089)(.0009672) = .002020

and

LCL = B3s̄ = (0)(.0009672) = 0
The s chart, shown in Figure 6.8, does not exhibit any out-of-control conditions.
With respect to the x̄ chart, the centerline is still calculated as the average of the
subgroup averages:

x̿ = (x̄1 + x̄2 + · · · + x̄k)/k = .007966

as in Example 6.1. For subgroups of size n = 5, the factor A3 = 1.427 is found from
Appendix Table XI. This gives control limits of

UCL = x̿ + A3s̄ = .007966 + (1.427)(.0009672) = .009346
LCL = x̿ − A3s̄ = .007966 − (1.427)(.0009672) = .006586

Note that these limits are very close to the limits obtained from the R chart (UCL =
.009351 and LCL = .006581). Consequently, the x̄ chart is almost identical to that
of Example 6.1, and, in particular, it gives no out-of-control signals (see Figure 6.9
below).
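The s chart and x̄ chart limits of this example follow directly from the B3, B4, and A3 formulas; a small Python check using the example's values:

```python
B3, B4, A3 = 0.0, 2.089, 1.427  # constants for subgroup size n = 5
x_double_bar = 0.007966
s_bar = 0.0009672

ucl_s, lcl_s = B4 * s_bar, B3 * s_bar
ucl_x = x_double_bar + A3 * s_bar
lcl_x = x_double_bar - A3 * s_bar
print(ucl_s, ucl_x, lcl_x)  # about .002020, .009346, and .006586
```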
[Figure 6.8 s chart for the groove depth data: standard deviation plotted against sample number (1–20), with centerline s̄ = .000967 and LCL = 0. Figure 6.9 x̄ chart based on the s chart: sample mean plotted against sample number, with centerline x̿ = .007966 and UCL = .009346.]
6.3 Exercises

14. The control limits on x̄ charts become closer together as the subgroup size n is increased (i.e., the A2 factor decreases as n increases). For a process that is in statistical control, does this imply that a control chart point is more likely to fall outside the control limits of an x̄ chart based on a larger subgroup size rather than a smaller subgroup size?

15. Subgroups of four power units are selected once each hour from an assembly line, and the high-voltage output of each unit is measured. Suppose that the sum of the ranges of 30 such subgroups is 85.2. Calculate the centerline and control limits of an R chart for this data.

16. Hourly samples of size 3 are taken from a process that produces molded plastic containers, and a critical dimension is measured. Data from the most recent 20 samples is given here:

Hour   x1    x2    x3      Hour   x1    x2    x3
 1    .36   .39   .36       11    .36   .32   .36
 2    .33   .35   .30       12    .38   .47   .35
 3    .51   .41   .42       13    .29   .45   .39
 4    .42   .37   .34       14    .44   .38   .43
 5    .39   .38   .38       15    .38   .37   .37
 6    .33   .41   .45       16    .31   .43   .38
 7    .43   .39   .41       17    .39   .49   .35
 8    .41   .32   .32       18    .43   .36   .38
 9    .37   .42   .36       19    .40   .45   .32
10    .26   .42   .32       20    .40   .40   .32

a. Construct an R chart for this data. Are any out-of-control signals indicated by this chart?
b. Construct an x̄ chart for this data, and check for signs of special causes.

17. Refer to the data of Exercise 16.
a. Construct an s chart for this data, and check for special causes.
b. Construct an x̄ chart for this data. Why are the control limits of this chart different from those in Exercise 16(b)?

18. When installing a bath faucet, it is important to properly fasten the threaded end of the faucet stem to the water-supply line. The threaded stem dimensions must meet product specifications, otherwise malfunction and leakage may occur. Authors of "Improving the Process Capability of a Boring Operation by the Application of Statistical Techniques" (Intl. J. Sci. Engr. Research, Vol. 3, Issue 5, May 2012) investigated the production process of a particular bath faucet manufactured in India. The article reported the threaded stem diameter (target value being 13 mm) of each faucet in 25 samples of size 4 as shown here:

Subgroup    x1      x2      x3      x4
  1       13.02   12.95   12.92   12.99
  2       13.02   13.10   12.96   12.96
  3       13.04   13.08   13.05   13.10
  4       13.04   12.96   12.96   12.97
  5       12.96   12.97   12.90   13.05
  6       12.90   12.88   13.00   13.05
  7       12.97   12.96   12.96   12.99
  8       13.04   13.02   13.05   12.97
  9       13.05   13.10   12.98   12.96
 10       12.96   13.00   12.96   12.99
 11       12.90   13.05   12.98   12.88
 12       12.96   12.98   12.97   13.02
6.4 Process Capability Analysis
made about the type of probability distribution that the process is thought to follow. Since
many process characteristics tend to follow normal distributions, the majority of capability
calculations are based on this distribution. In recent years, capability indexes have also been
developed for nonnormal process data. We do not discuss the calculations required for non-
normal data, which are much more laborious than the relatively simple computations for
normal processes, but we do provide references on this material for the interested reader.
Nonconformance Rates
The proportions of the process measurements that fall above the upper specification
limit or below the lower specification limit are called nonconformance rates or
nonconformance proportions.
[Figure: process distribution showing LSL, μ̂, and USL.]

Nonconformance rate (%)    Parts per million
10.0                       100,000
5.0                        50,000
1.0                        10,000
.1                         1,000
.01                        100
.001                       10
.0001                      1
Example 6.3  Because the control charts of the ignition key process in Example 6.1 do not indicate
any out-of-control conditions, the process appears to be in statistical control. Suppose
that the specification limits for the groove depth of the keys are .0072 ± .0020 inch.
Assuming that the process data is normally distributed, we can estimate the process
standard deviation using the centerline of the R chart,

σ̂ = R̄/d2 = .002400/2.326 = .00103

Alternatively, the variation can be estimated by s̄/c4 from the centerline of the s chart.
The nonconformance rates can then be estimated by

P(x > USL) ≈ P(z > (USL − μ̂)/σ̂)
           = P(z > (.0092 − .007966)/.00103) = P(z > 1.20) = .1151

P(x < LSL) ≈ P(z < (LSL − μ̂)/σ̂)
           = P(z < (.0052 − .007966)/.00103) = P(z < −2.69) = .0036

In percentage terms, we estimate that about 11.51% of the output of this process exceeds
the upper specification limit, whereas only .36% is below the lower limit. This
gives rise to a total percentage of 11.51% + .36% = 11.87%, which is unacceptably
high. Thus statistical control alone does not necessarily guarantee that a process will
successfully meet its specification limits.
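These tail areas are easy to verify with the standard library's complementary error function. The sketch below uses the example's estimates; the results agree with .1151 and .0036 up to the example's rounding of the z values to two decimals:

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

mu_hat, sigma_hat = 0.007966, 0.00103
usl, lsl = 0.0092, 0.0052

p_above = 1 - phi((usl - mu_hat) / sigma_hat)  # P(x > USL)
p_below = phi((lsl - mu_hat) / sigma_hat)      # P(x < LSL)
print(round(p_above, 4), round(p_below, 4), round(p_above + p_below, 4))
```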
Capability Indexes
Process spread, as defined by the interval μ̂ ± 3σ̂, gives a measure of how a process is
currently performing. The width of this interval is 6σ̂. Alternatively, the distance between
the specification limits, USL − LSL, provides a measure of the maximum process spread
we are willing to tolerate. By comparing the two measures, it is possible to give a very
succinct summary of the capability of a process to meet its specification limits. We refer
to the process spread as the actual process spread and to USL − LSL as the allowable
process spread.

The process capability index, denoted by Cp, is defined by the ratio

Cp = allowable spread/actual spread = (USL − LSL)/(6σ̂)

where σ̂ is an estimate of the process standard deviation.

The Cp index is interpreted as follows. If Cp = 1, then the process is said to be
marginally capable of meeting its specification limits. This occurs when the process is
exactly centered midway between its specification limits (i.e., when μ̂ = (USL + LSL)/2)
and the actual process spread uses all of the allowable spread. As you can see from
Figure 6.11, this is a fairly tenuous situation since even the slightest movement of the
process mean will lead to an increase in the overall nonconformance rate of the process.
Normally, we would like the Cp to exceed 1, since then there is a higher likelihood that
the process measurements will be able to stay within the specification limits, even if
the mean wanders a little. A Cp that exceeds 1.33 (i.e., an 8σ̂ spread that fits within the
[Figure 6.11 Interpreting the Cp index. Three panels show the process distribution relative to LSL and USL: Cp < 1.0, process is not capable; Cp = 1.0, process is marginally capable; Cp > 1.0, process is capable.]
specification limits) is usually considered fairly good and is commonly used as a goal by
many companies. On the other hand, Cp values that are less than 1 imply that a process
is not capable of meeting the specification limits.
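For the groove depth process of Example 6.3 (USL = .0092, LSL = .0052, σ̂ = .00103), this ratio is easy to compute; as a sketch:

```python
def cp_index(usl, lsl, sigma_hat):
    """Process capability index: allowable spread over actual spread."""
    return (usl - lsl) / (6 * sigma_hat)

# Groove depth process from Example 6.3
cp = cp_index(0.0092, 0.0052, 0.00103)
print(round(cp, 3))  # about .647, well below the marginal value of 1
```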
The Cp is one of four commonly used indexes, originally invented in Japan, which
are routinely used in modern quality improvement programs. The indexes derive their
usefulness from the fact that they convey much information in a very simple fashion.
Capability indexes also have the advantage of being unitless measures, making them
useful for comparing related and unrelated processes alike. For example, if the copper
plating thickness (in inches) from a chemical plating process has a Cp of .81, whereas
the resistance (in ohms) of certain electronic components has a Cp of 2.30, then we can
conclude that the electronic process is the more capable of the two, even though their
measurements are in completely different units.
[Figure: two process distributions, each with an index value of 2.0, shown relative to LSL and USL.]
For normally distributed data, μ̂ is taken to be the centerline of the x̄ chart and σ̂ is
chosen to be R̄/d2, s̄/c4, or perhaps the combined-subgroup estimate s mentioned previously.
The k in the subscript of Cpk refers to the so-called k factor:

k = |(USL + LSL)/2 − μ̂| / ((USL − LSL)/2)

which measures the extent to which the process location μ̂ differs from the midpoint of
the specification region. It can be shown that k lies between 0 and 1 and that Cp and Cpk
are related by the formula

Cpk = (1 − k)Cp

Since 0 ≤ k ≤ 1, this formula shows that Cpk never exceeds Cp and that Cpk = Cp precisely
when the process is centered midway between its specification limits. When used
together, Cp and Cpk give a clear picture of process performance as well as process
potential.
Example 6.4  Nonconformance rates for the groove dimension data (Table 6.1) are calculated in
Example 6.3, where we concluded that the process had poor capability. The reason for the
poor capability can be found by comparing the Cp and Cpk indexes. Using the estimates

μ̂ = .007966   and   σ̂ = .00103

from Example 6.3 along with the specification limits USL = .0092 and LSL = .0052,
we calculate the k factor as follows:

k = |(USL + LSL)/2 − μ̂| / ((USL − LSL)/2) = |(.0092 + .0052)/2 − .007966| / ((.0092 − .0052)/2) = .383
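Combining this k with the Cp of the process shows how off-center operation degrades capability, via Cpk = (1 − k)Cp. A quick Python check with the example's numbers (the Cp value is computed here from the formula rather than quoted from the text):

```python
usl, lsl = 0.0092, 0.0052
mu_hat, sigma_hat = 0.007966, 0.00103

midpoint = (usl + lsl) / 2
k = abs(midpoint - mu_hat) / ((usl - lsl) / 2)
cp = (usl - lsl) / (6 * sigma_hat)
cpk = (1 - k) * cp
print(round(k, 3), round(cpk, 3))  # k = .383, Cpk about .399
```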
The Cp and Cpk indexes are used for quality characteristics that have two-sided
tolerances, that is, processes with both upper and lower specification limits. Some char-
acteristics, however, can have one-sided tolerances. The breaking strength of a material,
for instance, usually has a lower specification limit, but no upper specification, since
we normally want materials to have a certain minimum strength but we do not care by
how much they exceed that minimum. One-sided capability indexes are used for such
processes. In fact, the definitions of upper and lower capability indexes are contained
within the definition of the Cpk. For processes having only a lower specification limit,
LSL, the lower capability index Cpl is defined by

Cpl = (μ̂ − LSL)/(3σ̂)
Similarly, for processes having only an upper specification USL, the upper capability
index Cpu is given by the formula
Cpu = (USL − μ̂)/(3σ̂)
The reason 3σ̂ rather than 6σ̂ appears in the denominators is that one-sided capability
indexes compare only one side of the process distribution, the upper or lower, to the
corresponding upper or lower specification limit.
Even when a process has both upper and lower specification limits, calculating
Cpu and Cpl is worthwhile because the smaller of the statistics indicates the direction in
which the process average has shifted away from the nominal value. In fact, from the
formulas it is apparent that Cpk is equal to the smaller of Cpl and Cpu:
Cpk = min[Cpl, Cpu]
For data that is normally distributed, it is convenient to transform Cpu and Cpl into
their corresponding nonconformance rates using the following relationships:
P(x > USL) = P(z > (USL − μ̂)/σ̂) = P(z > 3Cpu)

P(x < LSL) = P(z < (LSL − μ̂)/σ̂) = P(z < −3Cpl)
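These conversions can be sketched with only the Python standard library; `phi` below is the standard normal CDF built from `math.erf` (an assumption of this sketch, not a routine from the text).

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via math.erf."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def nonconformance_rates(cpu, cpl):
    """Upper- and lower-tail rates from the relationships above."""
    return 1 - phi(3 * cpu), phi(-3 * cpl)   # P(z > 3Cpu), P(z < -3Cpl)

# Cpu = Cpl = 1 gives the familiar 3-sigma tail rates of about .00135 per side
upper, lower = nonconformance_rates(1.0, 1.0)
print(round(upper, 5), round(lower, 5))  # → 0.00135 0.00135
```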
Example 6.5 Using the results of Examples 6.3 and 6.4, we calculate the Cpu and Cpl indexes for
the groove depth data of Table 6.1 as follows:
Cpl = (μ̂ − LSL)/(3σ̂) = (.007966 − .0052)/(3(.00103)) = .895

Cpu = (USL − μ̂)/(3σ̂) = (.0092 − .007966)/(3(.00103)) = .399

This information could be used, if desired, to calculate Cpk:

Cpk = min[Cpl, Cpu] = min[.895, .399] = .399

Because both Cpu and Cpl are less than 1.0, we can conclude that the process is not
performing well with respect to meeting either of its specification limits. Furthermore, the fact that Cpu is the smaller of the two indexes means that the process average has shifted to the right of the midpoint of the specification region.
6.5 Control Charts for Attributes Data 273
25. Why must a process be in a state of statistical control before its capability can be measured?

26. A process has a Cp index of 1.2 and is centered on its nominal value. What proportion of the specification range is used by the process measurements?

27. A computer printout shows that a certain process has a Cp of 1.6 and a Cpk of .9. Assuming that the process is in control, what do these indexes say about the capability of this process?

28. A process with specification limits of 5 ± .01 has a Cp of 1.2 and a Cpk of 1.0. What is the estimated process average x̄ from which these indexes are calculated?

29. It can be shown that the following equation always holds for processes that can be described by a normal distribution (Farnum, N. R., Modern Statistical Quality Control and Improvement, Duxbury, Belmont, CA, 1994: 235):

proportion out of specification = P(z ≥ 3Cpk) + P(z ≥ 6Cp − 3Cpk)

Use this equation with the Cp and Cpk from Exercise 27 to estimate the proportion of the process that is not within the specification limits.

30. Use the formula given in Exercise 29 to calculate the proportion of the process that is out of specification in Exercise 28.

31. Use the data in Exercise 6 to calculate the Cp, Cpu, Cpl, and Cpk indexes. What do the indexes indicate about the capability of the process?

32. Using the data of Exercise 16, we estimated the process standard deviation in Exercise 23. If the specification limits for the process are .40 ± .08, calculate the Cp and Cpk indexes. What conclusions can you draw about the capability of the process?

33. The data of Exercise 21 was analyzed by first transforming it into deviations from the nominal value and then running x̄ and R charts on the transformed data. Suppose the specification limits on the process are .254 ± .01 inch.
a. Describe a procedure for calculating capability indexes from the transformed data.
b. Calculate the Cp, Cpu, Cpl, and Cpk indexes from the transformed data.

34. Based on your analysis in Exercise 18, if the specification limits for the process are 13 ± .2, calculate the Cp and Cpk indexes. What conclusions can you draw about the capability of the process?

35. Using the data of Exercise 22, if the specification limits for the process are 60 ± .4, calculate the Cp and Cpk indexes. What conclusions can you draw about the capability of the process?
with the np chart. For products that are created in distinct units, such as components or
appliances, the c chart is used to track the number of nonconformities in subgroups of
such items. When products are not made in distinct units, such as reels of wire, fabric,
or paper, then the u chart is used to monitor the number of nonconformities in speci-
fied “units” of such products.
p and np Charts
The proportions of nonconforming items in successive subgroups of size n are plotted
on p charts. If we assume that all the subgroups come from a stable process in which
the true proportion of nonconforming items is π, each of the subgroup proportions p1,
p2, p3, . . . , pk is a statistic whose sampling distribution (see Section 5.5) has a mean and
standard deviation of

μp = π   and   σp = √(π(1 − π)/n)

In theory, the 3-sigma control limits for the p chart are formed by π ± 3σp. In
practice, of course, we must first estimate π and then substitute this estimate into
the formulas.
To estimate π, the k subgroup proportions are averaged. This average is denoted by

p̄ = (1/k)(p1 + p2 + ⋯ + pk)

and is used as the centerline of the chart. Substituting p̄ for π, we find the control limits
for the p chart as

UCL = p̄ + 3√(p̄(1 − p̄)/n)
LCL = p̄ − 3√(p̄(1 − p̄)/n)
Sometimes, because of the small values of p that are encountered in practice, the LCL
can be negative. When this happens, we replace the LCL by 0 since it is impossible to
have negative nonconformance rates.
In the frequently occurring case where the subgroup sizes n1, n2, n3, . . . , nk are not
all equal, the calculations for the centerline and control limits are modified as follows.
Letting x1, x2, x3, . . . , xk denote the numbers of nonconforming items in each subgroup,
we estimate the centerline of the p chart by
p̄ = (x1 + x2 + x3 + ⋯ + xk)/(n1 + n2 + n3 + ⋯ + nk)
This formula is conveniently remembered as “the total number of nonconforming items
over the total sample size.” The formula is more general than the equal-samples formula
and, for that reason, it is sometimes the only formula cited by some texts for estimating π.
When the subgroup sizes are unequal, the control limits are calculated separately for each
subgroup. That is, the control limits for the ith subgroup are
UCL = p̄ + 3√(p̄(1 − p̄)/ni)
LCL = p̄ − 3√(p̄(1 − p̄)/ni)
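The unequal-subgroup calculations above can be sketched in a few lines (a Python illustration, not software from the text); the sample inputs below are the first three subgroups of Table 6.3 from Example 6.6.

```python
from math import sqrt

def p_chart_limits(x, n):
    """p_bar and per-subgroup 3-sigma limits; x[i] nonconforming out of n[i]."""
    p_bar = sum(x) / sum(n)               # total nonconforming / total inspected
    limits = []
    for ni in n:
        half_width = 3 * sqrt(p_bar * (1 - p_bar) / ni)
        limits.append((max(0.0, p_bar - half_width),   # negative LCL reset to 0
                       p_bar + half_width))
    return p_bar, limits

# First three subgroups of Table 6.3: 14/286, 22/281, 9/310
p_bar, limits = p_chart_limits(x=[14, 22, 9], n=[286, 281, 310])
```

Each subgroup gets its own (LCL, UCL) pair, so smaller subgroups receive wider limits, exactly as seen in Figure 6.13.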
Example 6.6 Aerospace contractors and subcontractors must often demonstrate, using control charts,
that their manufacturing processes are capable of meeting ever-increasing quality stan-
dards for military systems and hardware (“Department of Defense Renews Emphasis
on Quality,” Quality Progress, March 1988: 19–21). Many such systems include printed
circuit board (PCB) assemblies with various electronic components soldered to them.
Components are soldered in place by means of a wave solder machine, which passes
the PCBs on a conveyor over a surface of liquid solder. Soldered PCBs are then con-
nected to test stations, which electronically test the circuits and classify each board as
either conforming or nonconforming. Table 6.3 contains records of the daily numbers
of rejected (nonconforming) PCBs for a 30-day period. For this data,
p̄ = (14 + 22 + 9 + ⋯ + 12)/(286 + 281 + 310 + ⋯ + 289) = .054
Table 6.3 Daily records of numbers of tested and rejected circuit board assemblies
Day Rejects Tested Proportion Day Rejects Tested Proportion
1 14 286 .049 16 15 297 .051
Since different numbers of PCBs are tested each day, the control limits for a p chart
of the data are calculated separately for each subgroup:
UCL = p̄ + 3√(p̄(1 − p̄)/ni) = .054 + 3√((.054)(1 − .054)/ni) = .054 + .6781/√ni

LCL = p̄ − 3√(p̄(1 − p̄)/ni) = .054 − 3√((.054)(1 − .054)/ni) = .054 − .6781/√ni
Figure 6.13 shows the p chart using these control limits. Note that the smaller the
subgroup size ni, the wider the control limits. Since the chart shows no signs of any
out-of-control conditions, we conclude that the process is in control and currently
operating at about a 5.4% nonconforming rate.
Figure 6.13  p chart of the data in Table 6.3 (centerline p̄ = .05385; UCL shown at .09368)
If you have the ability to choose constant subgroup sizes in your particular
application, then the p chart calculations can be further simplified. In fact, with
a constant base of comparison (i.e., constant subgroup size n), there is no need to
even convert the numbers of nonconforming items into the subgroup proportions
p1, p2, p3, . . . , pk. Instead, we can simply plot the numbers of nonconforming items
x1, x2, x3, . . . , xk on the chart. This chart is called an np chart because the number
of nonconforming items in a subgroup is simply n times the proportion of noncon-
forming items.
If x1, x2, x3, . . . , xk denote the numbers of nonconforming items in k subgroups,
then the centerline of the np chart is simply np̄, where p̄ is calculated by either of the
formulas for p̄ used with the p chart. Similarly, the 3-sigma control limits for the np
chart are found by multiplying each of the control limits of the p chart by n:

UCL = np̄ + 3√(np̄(1 − p̄))
LCL = np̄ − 3√(np̄(1 − p̄))
Example 6.7 In complex systems, items are routed through a succession of different processes be-
fore emerging as finished products or completed services. In “build to order” systems,
for example, individual orders are routed through slightly different paths from other
orders, according to a customer’s specific design requirements. A common method for
tracking an item’s progress during this journey is to attach paperwork to each order that
describes the requirements for every step of the production process. These documents,
often called travelers, are created before an order is processed. It is imperative that they
be correct, since incorrect travelers are essentially recipes for nonconforming products!
To monitor the quality of such paperwork, suppose that periodic samples
of 100 travelers are examined for errors, where a nonconforming document
is defined to be one that contains at least one error. Table 6.4 shows data from
25 daily samples of size 100 travelers and the corresponding numbers of noncon-
forming ones. The total number of nonconformities in the 25 samples is 272, so
p̄ = 272/[(25)(100)] = .1088 and, therefore, np̄ = 100(.1088) = 10.88. The control
limits are

UCL = np̄ + 3√(np̄(1 − p̄)) = 10.88 + 3√(10.88(1 − .1088)) = 20.22
LCL = np̄ − 3√(np̄(1 − p̄)) = 10.88 − 3√(10.88(1 − .1088)) = 1.54
Day Nonconforming Sample size Day Nonconforming Sample size
4 11 100 17 11 100
5 6 100 18 6 100
6 7 100 19 10 100
7 12 100 20 10 100
8 10 100 21 11 100
9 6 100 22 11 100
10 11 100 23 11 100
11 9 100 24 6 100
12 14 100 25 9 100
13 16 100
The np chart (Figure 6.14) shows one point (subgroup 14) above the UCL. Production
records for day 14 should be examined for a possible assignable cause. If one is found,
then subgroup 14 should be eliminated from the calculations, and an np chart with a
revised centerline and control limits should be used to monitor subsequent data.
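A minimal sketch of these np chart calculations (Python, not the Minitab output shown in the text): p̄ = .1088 and n = 100 come from Example 6.7, while the list of daily counts is hypothetical illustration data whose last value exceeds the UCL.

```python
from math import sqrt

def np_chart(counts, n, p_bar):
    """Centerline and 3-sigma limits for an np chart; flags out-of-control points."""
    center = n * p_bar
    half_width = 3 * sqrt(n * p_bar * (1 - p_bar))
    lcl, ucl = center - half_width, center + half_width
    out = [i for i, x in enumerate(counts, start=1) if x < lcl or x > ucl]
    return center, lcl, ucl, out

# Hypothetical daily counts; the final count of 22 plots above the UCL.
center, lcl, ucl, out = np_chart([11, 6, 7, 12, 22], n=100, p_bar=0.1088)
print(round(lcl, 2), round(ucl, 2), out)  # → 1.54 20.22 [5]
```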
Figure 6.14  np chart from Minitab for the data of Table 6.4 (centerline np̄ = 10.88, UCL = 20.22, LCL = 1.538; Minitab labels the first out-of-control point with a "1")
c and u Charts
Because an object can have any number of flaws, or nonconformities, it is important
to establish an inspection unit when working with c and u charts. The inspection unit
defines the fixed unit of output that will be regularly sampled and examined for non-
conformities. Inspection units are often single units of product, such as a single printed
circuit board or a single television. Inspection units can also be collections of items,
which might be used when one examines accounting records for errors by looking at
batches of 100 accounting records per day. The inspection unit is then 100 records, and
the number of nonconformities for such an inspection unit is the total number of errors
found in each such batch. Products are usually grouped in batches like this when the
nonconformance rate is small and large samples are needed to detect nonconformities.
Choosing an inspection unit is especially important with continuous processes, such
as the production of long rolls of paper, wire, fabric, or metal. To count the number of
surface flaws in long rolls of metal, for example, it would not be practical to look at every
square foot of the metal surface. Instead, we decide on a fixed-size inspection unit, say, a
2-square-foot section of metal, and count the number of nonconformities found therein.
The number of nonconformities per unit (i.e., per inspection unit) is denoted by c.
To create a c chart, a sample of k successive inspection units is examined, and the num-
bers of nonconformities c1, c2, c3, . . . , ck found in these units are counted. The centerline
of the chart, denoted by c̄, is the average

c̄ = (1/k)(c1 + c2 + ⋯ + ck)

Because counts of nonconformities are well modeled by the Poisson distribution, whose standard deviation is the square root of its mean, the 3-sigma control limits of the c chart are

UCL = c̄ + 3√c̄
LCL = c̄ − 3√c̄
Example 6.8 One measure of software quality is the number of coding errors made by program-
mers per 1000 lines of computer code. Using K to denote 1000, the inspection unit
“a thousand lines of code” is usually abbreviated as KLOC (i.e., K Lines Of Code).
The data in Table 6.5 shows the defects per KLOC obtained from weekly test logs in
a software company. The average number of errors per KLOC is c̄ = 134/30 = 4.467.
The upper and lower control limits of the chart are then

UCL = c̄ + 3√c̄ = 4.467 + 3√4.467 = 10.81
LCL = c̄ − 3√c̄ = 4.467 − 3√4.467 = −1.87
Week Errors per KLOC Week Errors per KLOC
11 2 26 1
12 5 27 5
13 5 28 5
14 4 29 8
15 3 30 8
Because the LCL is negative, we reset it to 0 and then construct the c chart shown in
Figure 6.15. Note that the two points at weeks 18 and 19 are touching the lower con-
trol limit and that there are several runs of points on the same side of the centerline.
According to the extended “out-of-control” rules in Section 6.2, these observations
do not quite qualify as out-of-control signals, but they are close. It might therefore
be rewarding to conduct a small search for reasons why the error rate was so low in
weeks 18 and 19.
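The limit arithmetic of Example 6.8 can be sketched directly (a Python illustration, not the text's software):

```python
from math import sqrt

# c chart for Example 6.8: 134 total errors over 30 weekly inspection units (KLOC)
c_bar = 134 / 30                          # average errors per KLOC
ucl = c_bar + 3 * sqrt(c_bar)             # Poisson-based 3-sigma limits
lcl = max(0.0, c_bar - 3 * sqrt(c_bar))   # negative LCL is reset to 0
print(round(c_bar, 3), round(ucl, 2), lcl)  # → 4.467 10.81 0.0
```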
Figure 6.15  c chart of the weekly error counts per KLOC (centerline c̄ = 4.467, UCL = 10.81, LCL = 0)
Note that the number of inspection units, ni, represented in a sample does not have to
be an integer.
For k subgroups of such data, the statistics u1, u2, u3, . . . , uk are plotted on the u
chart. The centerline on the chart is
ū = (total nonconformities in the k subgroups)/(total number of inspection units)
  = (c1 + c2 + c3 + ⋯ + ck)/(n1 + n2 + n3 + ⋯ + nk)
Because the subgroup size ni usually varies from sample to sample, control limits for
the u chart are computed separately for each subgroup:
UCL = ū + 3√(ū/ni)
LCL = ū − 3√(ū/ni)
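A sketch of these per-subgroup u chart limits (Python; the flaw counts c and the fractional inspection-unit counts n below are hypothetical, chosen to show that ni need not be an integer):

```python
from math import sqrt

def u_chart(c, n):
    """Centerline and per-subgroup 3-sigma limits for a u chart."""
    u_bar = sum(c) / sum(n)               # total nonconformities / total units
    limits = [(max(0.0, u_bar - 3 * sqrt(u_bar / ni)),   # LCL floored at 0
               u_bar + 3 * sqrt(u_bar / ni)) for ni in n]
    return u_bar, limits

# Hypothetical data; the middle subgroup covers 1.5 inspection units
u_bar, limits = u_chart(c=[12, 9, 15], n=[2.0, 1.5, 2.5])
print(u_bar)  # → 6.0
```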
Example 6.9 The data in Table 6.6 shows the number of flaws found in 30 samples of fabric
and corresponding sizes of the samples examined (in square feet). Suppose
an inspection unit of 2 square feet is used to monitor the quality of this fabric.
Table 6.6 also shows the conversion of the raw nonconformity rates into the per unit
rates, ui. The u chart of this data (Figure 6.16) reveals several out-of-control points,
some bad (above the UCL) and some good (below the LCL). Before this control
chart can be used to monitor subsequent production, a search should be made
for possible assignable causes and then appropriate actions taken. A revised chart,
after eliminating out-of-control points, would then be used to monitor subsequent
samples from the process.
Figure 6.16  u chart of the data in Table 6.6 (centerline ū = 6.496)
36. Explain the difference in the actions taken on a process when a point on a p chart exceeds the upper control limit versus the actions taken when a point falls below the lower control limit.

37. For a fixed subgroup size n, find the smallest value of p̄ that will give a positive lower control limit on a p chart.

38. Control limits for attributes charts are never negative, and it is desirable that they be positive. For a c chart, what values of the centerline c̄ will ensure that the lower control limit is positive?

39. The following data shows the number of nonconforming items found in 30 successive lots, each of size 50, of a finished product:

4 3 0 2 2 2 0 1 1 0
3 2 1 1 0 0 2 4 2 5
0 0 1 1 0 3 2 1 2 4

a. Construct a control chart for the proportion of nonconforming items per lot.
b. Interpret the chart in part (a).

40. On each of 25 days, 100 printed circuit boards are subjected to thermal cycling; that is, they are subjected to large changes in temperature, a procedure known to cause failures in boards with weak circuit connections. Of the boards tested, a total of 578 fail to work properly after the thermal cycling test.
a. From this information, calculate the centerline and control limits for a p chart.
b. The highest number of failures on a given day was 39 and the lowest number was 13. Would either of these points indicate an out-of-control condition?
c. If your answer to part (b) is "yes," then eliminate the out-of-control point(s) from the data and recompute the centerline and control limits of the p chart.

41. After assembly and wiring of the individual keys, computer keyboards are tested by an automated test station that pushes each key several times. Daily records are kept of the number of keyboards inspected and the number that fail the inspection. Data from 25 successive manufacturing days is given here.

Day Tested Failed   Day Tested Failed
1  2186 28          11 2141 31
2  2131 21          12 2019 18
3  2158 22          13 2027 27
4  2307 14          14 2376 25
5  2262 17          15 2118 27
6  2379 27          16 2251 14
7  2069 18          17 2068 31
8  2264 20          18 2242 23
9  2383 18          19 2089 23
10 2350 19          20 2387 36
6.6 Reliability 283
6.6 Reliability
Implicit in our understanding of the term quality is a product’s ability to perform its
intended function for a reasonable period of time. Unless expressly designed for short-
term or one-time jobs, products that fail after only a brief period of use are not normally
Failure Laws
The length of time that a product lasts until it fails, or ceases to operate correctly, is
called its lifetime. Lifetimes are measured in terms of how a product is used. Many
product lifetimes are simply measured in units of time (minutes, hours, etc.), as, for
example, in a wall clock battery that begins its useful life when installed in a clock and
fails sometime later when the clock stops. For items such as lightbulbs that usually do
not operate continuously, lifetimes refer to the accumulated operating time a product
experiences before failure (i.e., the total number of hours during which the bulb was
on). With tires, the number of miles driven is usually a better indicator of product life
than simply the time that the tires have been on the car. Mechanical devices, such as
springs, have lifetimes measured in cycles of operation, where, for example, a cycle
might be defined to be one compression and release of the spring. Whatever units are
used, time or cycles, we define a product’s lifetime to be a measure of the total accu-
mulated exposure to failure, often called the time on test, that the product experiences
prior to failure.
Lifetimes are modeled as continuous random variables and, as such, their prob-
ability distributions are described by probability density functions (pdf’s). Lifetimes can
take on nonnegative numerical values, even zero (e.g., products that fail immediately),
so density functions such as the exponential, Weibull, and lognormal are frequently
used to model lifetimes. Distributions that allow negative values, such as the normal
distribution, can also be used as long as their parameters are chosen in a manner that
gives negligible probability to negative lifetimes. When used to model lifetimes, density
functions are also called failure laws.
Choosing an appropriate failure law for a particular product or set of data can be
done in several ways:
1. There may be a physical or mathematical reason that justifies the use of a
particular density (e.g., the Central Limit Theorem justifies using the normal
distribution for sums and averages).
2. Quantile plots (see Section 2.4) may show that a particular density provides a
good fit to available data.
3. A failure law may have already been used by others and found to work well.
Because of the vast amount of research that has already been done on many products
and materials, item (3) in the preceding list often leads to a good failure law choice. It
is also useful to keep in mind the following brief list of situations that may provide the
necessary justification needed in item (1):
Normal failure laws often apply in situations where lifetimes are the result of a
sum of many other variable quantities.
Exponential failure laws apply to products whose current ages do not have
much effect on their remaining lifetimes. This is the “memoryless” property of
exponential distributions (see Exercise 61). Typical applications: fuse lifetimes,
interarrival times, alpha ray arrivals, Geiger counter ticks.
Lognormal failure laws work well when the degradation in lifetime is propor-
tional to the previous amount of degradation (typical applications: corrosion,
crack growth, diffusion, metal migration, mechanical wear).
Weibull failure laws are good models for the failure time of the weakest compo-
nent of a system (e.g., capacitor, bearing, relay, and pipe joint failures).
Example 6.10 The lognormal distribution is often used to model tread wear of tires. To fit a log-
normal distribution to such data, suppose a tire manufacturer uses warranty data to
estimate that the mean time to failure (measured in total miles driven) for a certain
tire model is 40,000 miles with a standard deviation of 7500 miles. Denoting tire life-
times (in miles) by a random variable x, the parameters of the lognormal distribution
can be calculated using the formulas (see pages 69 and 77)

E(x) = e^(μ + σ²/2)   and   V(x) = e^(2μ + σ²)(e^(σ²) − 1)

which can be solved for the lognormal parameters μ and σ:

σ² = ln(1 + V(x)/[E(x)]²) = ln(1 + 7500²/[40,000]²) = .034552

so σ = .185882 and

μ = ln(E(x)) − σ²/2 = ln(40,000) − .034552/2 = 10.57936
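This moment-matching calculation is easy to reproduce (a Python sketch under the stated lognormal moment formulas, not code from the text):

```python
from math import log, sqrt

def lognormal_params(mean, sd):
    """Recover (mu, sigma) from E(x) and SD(x) of a lognormal, as in Example 6.10."""
    sigma2 = log(1 + (sd / mean) ** 2)    # sigma^2 = ln(1 + V(x)/[E(x)]^2)
    mu = log(mean) - sigma2 / 2           # mu = ln(E(x)) - sigma^2/2
    return mu, sqrt(sigma2)

mu, sigma = lognormal_params(40000, 7500)
print(round(mu, 5), round(sigma, 5))  # → 10.57936 0.18588
```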
The reliability function R(t) = P(x > t) gives the probability that a product's lifetime x exceeds t. Directly related to R(t) is a function Z(t) called the failure rate or hazard function:

failure rate at time t = Z(t) = f(t)/R(t)
Z(t) is interpreted as the instantaneous rate of failure at time t, meaning that of those
items that have not failed before time t, the proportion that will fail in the small interval
of time from t to t + Δt is approximately Δt · Z(t). The failure rate function is very useful for describing the manner in which failures occur.
The normal and lognormal distributions do not have closed-form expressions for
either the reliability or the hazard functions; however, the exponential and Weibull
distributions do have simple closed-form expressions for R(t) and Z(t):
Density                                             R(t)           Z(t)
Exponential: λe^(−λx)  (λ > 0)                      e^(−λt)        λ (a constant)
Weibull: (α/β^α)x^(α−1) e^(−(x/β)^α)  (α, β > 0)    e^(−(t/β)^α)   (α/β^α)t^(α−1)

(recall that the exponential is a special case of the Weibull when λ = 1/β and α = 1).
Figure 6.17 shows graphs of Z(t) for various values of α (the "shape" parameter) for
the Weibull distribution. Notice that for 0 < α < 1 the failure rate decreases with time, for
α = 1 (i.e., the exponential distribution) the failure rate is constant, and for α > 1 the failure
rate increases with time. In the case of the exponential distribution (α = 1), the fact that the
failure rate is constant is often interpreted as saying that products that have an exponential failure law are "memoryless." That is, no matter how old such products are, their failure rates
are always the same. This means, after any time t, such products are essentially "as good as
new." In fact, this may be a good approximation to the behavior of items such as fuses—if
a fuse has not burned out by time t, then it is probably very nearly as good as a new fuse.
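The three regimes of the Weibull hazard can be checked numerically (a Python sketch of the Z(t) formula above, with β = 1 for simplicity):

```python
def weibull_hazard(t, alpha, beta):
    """Weibull failure rate Z(t) = (alpha / beta**alpha) * t**(alpha - 1)."""
    return (alpha / beta ** alpha) * t ** (alpha - 1)

# Decreasing for alpha < 1, constant for alpha = 1, increasing for alpha > 1
for alpha in (0.5, 1.0, 2.0):
    print(alpha, weibull_hazard(1.0, alpha, 1.0), weibull_hazard(2.0, alpha, 1.0))
```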
Figure 6.17 Failure rates Z(t) of Weibull distributions for various values of the shape parameter β
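The three failure-rate regimes can be checked numerically. The following is an illustrative sketch, not from the text; the scale parameter α = 2 and the three choices of β are arbitrary values chosen for the demonstration.

```python
# Weibull hazard function Z(t) = (beta / alpha**beta) * t**(beta - 1), t > 0.
# Illustrative values only: alpha = 2.0 and beta in {0.5, 1.0, 2.0}.

def weibull_hazard(t, alpha, beta):
    """Failure rate of a Weibull(alpha, beta) distribution at time t."""
    return (beta / alpha**beta) * t**(beta - 1)

# beta < 1: decreasing failure rate; beta = 1: constant (the exponential
# special case, with lambda = 1/alpha); beta > 1: increasing failure rate.
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: Z(1)={weibull_hazard(1.0, 2.0, beta):.4f}, "
          f"Z(10)={weibull_hazard(10.0, 2.0, beta):.4f}")
```

With β = 1 the hazard stays at 1/α = .5 for every t, matching the constant failure rate of the exponential distribution.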
Unless otherwise noted, all content on this page is © Cengage Learning.
Example 6.11 In Example 6.10, warranty data on tire failures was used to estimate the parameters μ = 10.57936 and σ = .185882 of a lognormal distribution that describes tread wear (in miles). Denoting tread life by x, the reliability function for x can be calculated using the fact that ln(x) follows a normal distribution with mean μ and standard deviation σ:

R(t) = P(x > t) = P(ln(x) > ln(t)) = P(z > (ln(t) − μ)/σ) = 1 − Φ((ln(t) − μ)/σ)
where Φ(z) denotes the cumulative probability for the standard normal distribution (see Appendix Table I). Although there is no closed-form expression for R(t), it is easy
to use Table I or statistical software to create a graph of R(t), as shown in Figure 6.18.
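Since Φ has no closed form, software (or Table I) is needed. A short sketch using the standard library's error function reproduces R(t), with μ and σ taken from the example:

```python
import math

def std_normal_cdf(z):
    """Phi(z), computed from the error function: Phi(z) = (1 + erf(z/sqrt 2))/2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_reliability(t, mu=10.57936, sigma=0.185882):
    """R(t) = 1 - Phi((ln t - mu)/sigma) for the lognormal tread-wear model."""
    return 1.0 - std_normal_cdf((math.log(t) - mu) / sigma)

# R(t) at the median tread life exp(mu) is exactly .5, and R decreases in t.
print(round(lognormal_reliability(math.exp(10.57936)), 4))
for miles in (20000, 30000, 40000, 50000):
    print(miles, round(lognormal_reliability(miles), 4))
```

Evaluating the function over a grid of mileages like this is all that is needed to draw the curve in Figure 6.18.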
6.6 Reliability 287
Figure 6.18 Graph of the reliability function R(t) for tread life (miles)
Similarly, the hazard function Z(t) can be computed and plotted, as shown in Figure 6.19.
Notice that the failure rate is an increasing function for the lognormal distribution.
Figure 6.19 Graph of the hazard function Z(t) for tread life (miles)
System Reliability
Products that consist of large assemblies of components can be at risk of failure if one or
more of their individual parts fails. Studying how a product’s components are connected
and how this affects product lifetime is referred to as topological or system reliability.
Systems or assemblies are usually comprised of successive levels of subsystems
whose individual reliabilities are easy to estimate. By finding the subsystem reliabilities
first, one can often combine these estimates into an overall estimate of product reliabil-
ity. The particular combination depends on how the subsystems are connected.
Series systems are defined to be systems whose individual components are connect-
ed end-to-end in a “series.” Figure 6.20 shows a diagram of the typical series system. The
main aspect of such systems is that they can only function as long as every component
of the system functions correctly. Examples of series systems would be the tires on a
vehicle, batteries in a flashlight, and the power supply and CPU in a computer.
If we denote the reliability at time t of the ith component by Ri(t), then the fundamental
theorem of series systems can be summarized as follows:
If all components in a series system function independently of one another, then the reliability function R(t) for the entire system is simply the product of the reliability functions of the components. That is:

R(t) = R1(t) · R2(t) · R3(t) ⋯ Rn(t)
Parallel systems are ones whose components function in parallel, that is, those
systems that will function as long as at least one of their components functions correctly.
Figure 6.21 shows a diagram of a typical parallel system comprised of n components.
Parallel systems are often used to build redundancy into a product; that is, the com-
ponents in parallel systems serve as “backups” for each other so that if one component
fails, then the entire system will not necessarily fail. Such systems are often used to
increase the reliability of a product. Examples of parallel systems include computer
routing systems, pacemakers, and safety systems on airplanes.
The fundamental result for computing the reliability R(t) of a parallel system in
terms of the reliabilities Ri(t) of its n components is
If all components in a parallel system function independently of one another, then the reliability function R(t) for the entire system is given as

R(t) = 1 − [1 − R1(t)] · [1 − R2(t)] · [1 − R3(t)] ⋯ [1 − Rn(t)]
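The two boxed results can be written as one-line functions. This is an illustrative sketch (the function names and the .9 component reliabilities are assumptions, not from the text), evaluating each formula at a fixed time t:

```python
from math import prod

def series_reliability(rels):
    """Series system: R(t) = R1(t) * R2(t) * ... * Rn(t)."""
    return prod(rels)

def parallel_reliability(rels):
    """Parallel system: R(t) = 1 - [1 - R1(t)] * ... * [1 - Rn(t)]."""
    return 1.0 - prod(1.0 - r for r in rels)

# Two independent components, each with reliability .9 at time t:
print(series_reliability([0.9, 0.9]))    # below .9: a series system is weaker
print(parallel_reliability([0.9, 0.9]))  # above .9: redundancy helps
```

Note the qualitative difference: a series system is never more reliable than its weakest component, while a parallel system is always at least as reliable as its best component.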
Figure 6.21 A parallel system of n components
The concepts of series and parallel systems can be used either separately or in
combination when analyzing the reliability of complex systems. The basic method is,
when possible, to break down a complex system into various combinations of series and/
or parallel subsystems. The reliabilities of such subsystems can then be calculated from
the theorems in this section and they, in turn, can often be combined to calculate the
overall product or system reliability.
Example 6.12 Routers are used in the telecommunications industry to transmit data (in the form of
digitized electronic signals) from one location to another. Because many important
business and scientific organizations depend upon the continuous availability of data,
routing systems must be highly reliable. The usual way of increasing reliability in rout-
ing systems is to include various sources of redundancy in the form of parallel subsys-
tems. For example, Figure 6.22 shows a routing system that uses two identical routers
in parallel. In addition, each router contains four different power sources (arranged in
parallel) and two “supervisor cards” (also in parallel) that direct the router’s actions.
Assuming all power sources are of the same kind, each with reliability function
Rp(t), and that they act independently of one another, the reliability of each set of
four power sources is 1 − [1 − Rp(t)]^4. Making the same assumptions for the supervisor cards [cards are independent and have a common reliability function Rs(t)], the reliability of each set of two cards is 1 − [1 − Rs(t)]^2. Since the power sources in each router are connected in series to the supervisor cards, the reliability of a single router must be the product of the power source reliability and the supervisor card reliability: {1 − [1 − Rp(t)]^4} · {1 − [1 − Rs(t)]^2}.
Since both routers are connected in parallel, the overall reliability for the routing system is 1 − [1 − {1 − [1 − Rp(t)]^4} · {1 − [1 − Rs(t)]^2}]^2. The final step would
be to determine the particular form of the failure laws for the power sources and
supervisor cards (e.g., exponential or Weibull) and substitute these numerical expres-
sions into the overall reliability formula.
Figure 6.22 Routing system with two routers in parallel
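The overall formula from Example 6.12 can be checked numerically. The following sketch uses hypothetical values Rp(t) = Rs(t) = .9 at some fixed t, since the text leaves the component failure laws unspecified:

```python
def router_system_reliability(rp, rs):
    """Two parallel routers; each router is four parallel power sources
    in series with two parallel supervisor cards."""
    power = 1 - (1 - rp) ** 4            # four power sources in parallel
    cards = 1 - (1 - rs) ** 2            # two supervisor cards in parallel
    single_router = power * cards        # series connection inside a router
    return 1 - (1 - single_router) ** 2  # two routers in parallel

rp = rs = 0.9  # hypothetical component reliabilities, for illustration only
print(router_system_reliability(rp, rs))
```

Even with modest .9 components, the layered redundancy pushes the system reliability very close to 1.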
corresponding mirror disk and that the three such pairs of disks are connected in series.
a. Draw a diagram of this RAID system.
b. Suppose any disk has an exponential lifetime (in months) with parameter λ = .025. Calculate the reliability of this system.
Supplementary Exercises
52. When affixed to an object, each piece of paper in a pad of adhesive notepaper must stay in place but must also be easily removable. The strength of the adhesive used is a critical quality characteristic of such pads. For this type of product, does adhesive strength have a one- or a two-sided tolerance?

53. In a bottling process, a beam of light is passed through the necks of bottles passing by on a conveyor system. Underfilled bottles, which allow the beam of light to pass through, trip a sensor that routes the bottles off the conveyor system. Bottles with liquid levels above the level of the light beam do not trigger the sensor, thereby meeting the required fill specification; these bottles are then shipped to customers. Describe the shape of the distribution of fill volumes for the bottles that pass this inspection.

54. Instead of constructing x̄ and R charts for 30 subgroups of size 4, a friend suggests the simpler alternative of calculating the standard deviation s of the 30 means to establish 3-sigma limits for a control chart. That is, it is suggested that the 30 means be plotted on a chart with control limits x̄ ± 3s. Sample means that fall outside these control limits would indicate process problems. Explain what is wrong with this procedure.

55. A tool that drills holes in metal parts eventually wears out and periodically must be replaced. If the hole diameters drilled by this machine are monitored on a control chart, describe the type of pattern you would expect to see on the chart as the drill wears out.

56. A manufacturer of dustless chalk monitors the consistency of chalk by running an s chart on the density of chalk in subgroups of size 8. The most recent 24 such subgroups had the accompanying sample standard deviations (read across):

.204 .315 .096 .184 .230 .212 .322 .287
.145 .211 .053 .145 .272 .351 .159 .214
.388 .187 .150 .229 .276 .118 .091 .056

a. Construct an s chart based on this data.
b. Check the chart in part (a) for any out-of-control points. If there are any, eliminate them from the data and reconstruct the s chart. Repeat this process, if necessary, until there are no out-of-control signals in the s chart.

57. The deviations from nominal transformation in Exercise 21 can be used in so-called short-run processes. Even though small numbers of different-size parts are created by such processes, the deviations from the various nominal values of these parts provide information about the particular process, not the parts, that is common to all the parts. For example, consider a milling process in which metal bars of various sizes are machined to specified lengths. The size of the bars submitted to the machining process may vary from hour to hour, so there may be insufficient data to create control charts on any particular bar size. However, by subtracting the nominal value from each batch of bars, the resulting subgroups of data are sufficient to create a control chart for the milling process itself. The following table shows the raw length measurements of milled steel bars of various sizes, denoted P1, P2, P3, and P4. The nominal length for bars of type P1 is .125; for bars of type P2, .250; for P3, .375; and for P4, .500.

Subgroup   x1     x2     x3     x4     Part type
 1        .251   .252   .250   .249    P2
 2        .372   .378   .379   .375    P3
 3        .247   .249   .254   .251    P2
 4        .248   .247   .250   .252    P2
 5        .249   .249   .250   .249    P2
 6        .125   .127   .125   .126    P1
 7        .372   .374   .375   .376    P3
 8        .499   .502   .495   .503    P4
 9        .124   .121   .123   .126    P1
10        .126   .126   .130   .122    P1
11        .375   .374   .378   .379    P3
12        .249   .249   .250   .247    P2
13        .250   .253   .251   .248    P2
14        .249   .250   .249   .249    P2
Bibliography
DeVor, R. E., T. Chang, and J. W. Sutherland, Statistical Quality Design and Control (2nd ed.), Prentice Hall, New Jersey, 2006. Good discussion of several of the more advanced techniques of quality control.
Farnum, N. R., Modern Statistical Quality Control and Improvement, Duxbury Press, Belmont, CA, 1994. A comprehensive overview of control charts, acceptance sampling, experimental design, metrology, and the modern approach to quality.
Lloyd, D. K., and M. Lipow, Reliability: Management, Methods, and Mathematics, ASQC Press, Milwaukee, 1984. Classic text covering all aspects of reliability. Good explanations, with many examples.
Meeker, W. Q., and L. A. Escobar, Statistical Methods for Reliability Data, Wiley, New York, 1998. Complete, modern presentation of estimation, evaluation, and graphing of reliability functions.
Montgomery, D. C., Introduction to Statistical Quality Control (6th ed.), Wiley, New York, 2012. Comprehensive and easy to read, with good examples and problems.
7
Estimation and
Statistical Intervals
(Chapter opener photo: Max Earey/Shutterstock.com)
7.1 Point Estimation
7.2 Large-Sample Confidence Intervals for
a Population Mean
7.3 More Large-Sample Confidence Intervals
7.4 Small-Sample Intervals Based on a Normal
Population Distribution
7.5 Intervals for μ1 − μ2 Based on Normal
Population Distributions
7.6 Other Topics in Estimation (Optional)
Introduction
The general objective of statistical inference is to use sample information as
a basis for drawing various types of conclusions. In an estimation problem, we
want to make an educated guess about the value of some population characteristic or parameter, such as the population mean battery lifetime μ, the proportion π of all components of a certain type that need service while under warranty, or the difference μ1 − μ2 between the population mean lifetimes for two different types of batteries. The simplest type of estimate is a point estimate, a single number that represents our best guess for the value of the parameter. Thus we might report a point estimate of 758 hours for the population mean lifetime of all brand X 100-watt lightbulbs; we are not saying that μ = 758, only that sample data suggests 758 as a very plausible value for μ. Point estimation
is discussed in Section 7.1.
μ̂ = x̄ = 32.5

says that the point estimate of the population mean μ is 32.5 and that this estimate was calculated using the sample mean x̄ as the estimator.
Example 7.1 A commonly used method of estimating the size of a wildlife population is to perform
a capture/recapture experiment. Suppose a biologist wishes to estimate the number
of fish in a certain lake; that is, the parameter to be estimated is the population
size N. An initial sample of 100 fish is selected, each one is tagged, and the tagged
fish are returned to the lake. After a time period sufficient to allow the tagged fish to
mix with the other fish in the lake, a second sample of 250 fish is selected. If 25 of the
fish in the recapture sample are tagged, what is a sensible estimate for N? Because
10% of the fish in the recapture sample are tagged, it is reasonable to estimate that
10% of all fish in the lake are tagged. Since we know that a total of 100 fish were
initially tagged, this suggests that we use 1000 as a point estimate of N.
More generally, if M denotes the number of fish initially tagged, n the size of the
recapture sample, and x the number of tagged fish in the recapture sample (so x is a
random variable), the proposed estimator of N is N̂ = [Mn/x]. (The square bracket notation [c] denotes the largest whole number that is at most c; this takes care of cases where Mn/x is not a whole number.)
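A minimal sketch of the capture/recapture estimator, using the values from the example (the function name is an invention for illustration):

```python
def estimate_population_size(M, n, x):
    """N_hat = [M * n / x]: M fish tagged initially, n fish recaptured,
    x tagged fish found among the recaptured."""
    return (M * n) // x  # // gives the largest whole number at most Mn/x

print(estimate_population_size(M=100, n=250, x=25))  # 1000, as in Example 7.1
```

The floor division handles exactly the case the text notes: when Mn/x is not a whole number, the estimate is rounded down to one.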
7.1 Point Estimation 295
Frequently, there is more than one estimator that can sensibly be used to calculate
an estimate, as the following example shows.
Example 7.2 Consider a population of N = 5000 invoices. Associated with each invoice is its “book value,” the recorded amount of that invoice. Let T = $1,761,300 denote the known
total book value. Unfortunately, some of the book values are erroneous. An audit will
be carried out by randomly selecting n invoices and determining the audited (i.e.,
correct) value for each one. Suppose the sample gives the following results:
Invoice:          1     2     3     4     5
Book value:     300   720   526   200   127
Audited value:  300   520   526   200   157
Error:            0   200     0     0   −30
Let ȳ = sample mean book value = $374.60, x̄ = sample mean audited value = $340.60, and ē = sample mean error = $34.00. Each of the following estimators for the total audited (i.e., correct) value and resulting estimates is sensible:

mean per unit statistic = N x̄;  estimate = 5000(340.60) = $1,703,000
difference statistic = T − N ē;  estimate = 1,761,300 − (5000)(34) = $1,591,300
ratio statistic = T(x̄/ȳ);  estimate = (1,761,300)(340.6/374.6) = $1,601,438
The choice among these estimates is not clear-cut. In fact, all three of the estimators
have been advocated by those employing statistical methodology in auditing.
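The three estimates can be reproduced directly from the sample; a sketch using the figures given in the example:

```python
from statistics import mean

N, T = 5000, 1_761_300  # population size and known total book value
book  = [300, 720, 526, 200, 127]  # sampled book values
audit = [300, 520, 526, 200, 157]  # corresponding audited values
error = [b - a for b, a in zip(book, audit)]

y_bar, x_bar, e_bar = mean(book), mean(audit), mean(error)

mean_per_unit = N * x_bar        # 5000(340.60)
difference    = T - N * e_bar    # 1,761,300 - 5000(34)
ratio         = T * (x_bar / y_bar)

print(round(mean_per_unit), round(difference), round(ratio))
```

Running this recovers the three dollar amounts quoted in the example, which makes the disagreement among perfectly sensible estimators concrete.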
In situations where there is more than one sensible estimator available, criteria for
selecting an estimator are needed. We now turn to a brief discussion of desirable proper-
ties of estimators.
Properties of Estimators
One desirable property that a good estimator should possess is that it be unbiased. An
estimator is unbiased if, in repeated random samples, the numerical values of the es-
timator stack up around the population parameter that we are trying to estimate. An
often-used analogy is to think of each value of an estimator as a shot fired at a target, the
target being the population parameter of interest. As long as all the shots fall in a pattern
with the target value in the middle, we say that the shots are unbiased. Notice that we
do not require that any of the individual shots actually hit the target; we require only
that they be centered around the target value. If the majority of the shots are centered
somewhere else, then we say that they exhibit a certain amount of bias.
In terms of sampling distributions, an estimator is said to be unbiased if the mean
of its sampling distribution coincides with the parameter that is being estimated. For
instance, we know from Section 5.5 that the sampling distribution of the statistic x̄ has a mean value of μ_x̄, which equals the mean μ of the population from which the samples are taken. Then x̄ is said to be an estimator of the parameter μ and, because μ_x̄ = μ,
x̄ is also an unbiased estimator of μ. In general, for any population parameter θ and any estimator θ̂ of that parameter, Figure 7.1 illustrates what it means for θ̂ to be unbiased or biased.
Figure 7.1 Sampling distributions of (a) an unbiased estimator θ̂ and (b) a biased estimator θ̂, with the bias indicated
definitions  Denote a population parameter generically by the letter θ and denote any estimator of this parameter by θ̂. Then θ̂ is an unbiased estimator if μ_θ̂ = θ. Otherwise, θ̂ is said to be biased, and the quantity μ_θ̂ − θ is called the bias of θ̂.
Some of the most important statistics we have studied are unbiased estimators of certain population parameters. For example, it can be shown that the sample mean x̄ is an unbiased estimator of the population mean μ.
7.1 Exercises 297
the sample, p̂ = x/25 ≠ .7 for any possible value of x. That is, even though p̂ is unbiased for estimating π, the value of the estimate calculated from any particular sample will inevitably differ from π. Nevertheless, if sample after sample is selected and the value of p̂ calculated for each one, unbiasedness implies that the long-run average of these estimates will be the correct value, .7.
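This long-run behavior is easy to see in a small simulation. The following is an illustrative sketch, not from the text, with π = .7 and n = 25:

```python
import random

random.seed(1)  # reproducible illustration
pi, n, reps = 0.7, 25, 20_000

p_hats = []
for _ in range(reps):
    x = sum(random.random() < pi for _ in range(n))  # a binomial(25, .7) count
    p_hats.append(x / n)

# No single p_hat can equal .7 (x/25 is always a multiple of .04), yet the
# long-run average of the p_hat values is very close to .7.
print(round(sum(p_hats) / reps, 3))
```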
A second desirable property that estimators often possess is consistency. If θ̂ denotes an estimator of some population parameter θ, then θ̂ is said to be consistent if the probability that it lies close to θ increases to 1 as the sample size increases. Simply stated, consistent estimators become more and more accurate as the sample size increases. That is, as you increase n, it becomes more and more likely that such estimators will be very close to the parameter they are intended to estimate. The most common method for showing that an estimator is consistent is to show that its standard error decreases as the sample size increases. For instance, because the standard error of x̄ is σ_x̄ = σ/√n, which must necessarily decrease as n increases, the sample mean qualifies as a consistent estimator of μ. This means that for any interval around μ, no matter how small the interval, we can eventually select n large enough so that the sampling distribution lies almost entirely within the interval. This property is illustrated in Figure 5.19. Although there are some estimators that are not consistent, such examples are fairly rare. In fact, all of the statistical applications in this text involve consistent estimators.
definition  If the probability that an estimator θ̂ falls close to a population parameter θ can be made as near to 1 as desired by increasing the sample size n, then θ̂ is said to be a consistent estimator of θ.
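Consistency of x̄ can likewise be sketched by simulation: the spread of the x̄ values shrinks like σ/√n as n grows. The normal population with μ = 10 and σ = 2 below is an arbitrary choice for illustration, not from the text:

```python
import random

random.seed(2)
mu, sigma = 10.0, 2.0  # hypothetical population parameters

def spread_of_xbar(n, reps=1000):
    """Sample standard deviation of `reps` simulated values of x_bar."""
    xbars = [sum(random.gauss(mu, sigma) for _ in range(n)) / n
             for _ in range(reps)]
    m = sum(xbars) / reps
    return (sum((xb - m) ** 2 for xb in xbars) / (reps - 1)) ** 0.5

for n in (10, 100, 1000):
    print(n, round(spread_of_xbar(n), 4))  # tracks sigma / sqrt(n)
```

The printed spreads fall roughly tenfold as n goes from 10 to 1000, mirroring the σ/√n standard error that drives consistency.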
each sample is determined. The actual proportion of diseased trees, π, is unknown.
a. For random samples of size n = 10, calculate the area under the sampling distribution curve for p̂ between the points π − .10 and π + .10. That is, find the probability that the sample proportion lies within ±.10 (i.e., 10%) of the population proportion. Use the formula for the upper bound on the standard error of p̂ (see Section 5.6) in your calculations.
b. Repeat the probability calculation in part (a) for samples of size n = 50, n = 100, and n = 1000. (Use the normal approximation to the binomial.)
c. Graph the probabilities you found in parts (a) and (b) versus their corresponding sample sizes, n. What can you conclude from this graph?

5. Random samples of size n are selected from a normal population whose standard deviation σ is known to be 2.
a. Suppose you want 90% of the area under the sampling distribution of x̄ to lie within ±1 unit of a population mean μ. Find the minimum sample size n that satisfies this requirement.
b. Repeat the calculations in part (a) for areas of 80%, 95%, and 99%.
c. Plot the sample sizes found in parts (a) and (b) versus their corresponding probabilities. What can you conclude from this graph?

6. Each of 150 newly manufactured items is examined, and the number of surface flaws per item is recorded, yielding the following data:

Number of flaws:      0   1   2   3   4   5   6   7
Observed frequency:  18  37  42  30  13   7   2   1

Let x denote the number of flaws on a randomly chosen item, and assume that x has a Poisson distribution with parameter λ.
a. Find an unbiased estimator for λ and compute the estimate using the data. Hint: The mean of a Poisson random variable equals λ.
b. What is the standard error of the estimator in part (a)? Hint: The variance of a Poisson random variable also equals λ.
7.2 Large-Sample Confidence Intervals for a Population Mean 299
knowledge of the value of the parameter is reasonably precise. A very wide confidence in-
terval, however, gives the message that there is a great deal of uncertainty concerning the
value of what we are estimating. Figure 7.2 shows 95% confidence intervals for true average
breaking strengths of two different brands of paper towels. One of these intervals suggests precise knowledge about μ, whereas the other suggests a very wide range of plausible values.
Figure 7.2 95% confidence intervals for the true average breaking strengths of two brands of paper towels
Because there is sampling variability in this second standardized variable both in the numerator (because of x̄) and in the denominator (the value of s will also vary from sample to sample), it would seem as though its distribution should be more spread out than the z curve. But appearances are deceiving! It turns out that when n is large, replacement of σ by s does not add much variability; in this case, the variable z = (x̄ − μ)/(s/√n) also has approximately a standard normal distribution.
A confidence interval with a 95% confidence level is obtained by starting with a
central z curve area of .95. As Figure 7.3 illustrates, the z critical values 1.96 and −1.96 capture this area (consult Appendix Table I).
The foregoing facts justify the following probability statement:
P(−1.96 < (x̄ − μ)/(s/√n) < 1.96) ≈ .95
Figure 7.3 The z critical values −1.96 and 1.96 capture a central area of .95, leaving lower-tail and upper-tail areas of .025
Now let’s manipulate the inequalities inside the parentheses to isolate μ in the middle and move everything else to the two extremes. This is achieved as follows:
1. Multiply all three terms by s/√n.
2. Subtract x̄ from all three terms (leaving only −μ in the middle).
3. Multiply by −1 (causing the direction of each inequality to reverse).
The result is x̄ + 1.96(s/√n) > μ > x̄ − 1.96(s/√n), or, rewriting the terms in reverse order,

x̄ − 1.96(s/√n) < μ < x̄ + 1.96(s/√n)
These new inequalities are algebraically equivalent to those we started with, so
the probability associated with the new inequalities is also (approximately) .95. That is,
think of x̄ − 1.96(s/√n) as the lower limit and x̄ + 1.96(s/√n) as the upper limit of an interval. Both of these limits involve x̄ and s, so the values of both limits will vary from sample to sample. With a probability of approximately .95, the selected sample will be such that the value of μ is captured between these two interval limits. Substituting the values of n, x̄, and s from any particular sample into these expressions gives a confidence interval for μ with a confidence level of approximately 95%.
The interval is centered at x̄ and extends out the same distance, 1.96 s/√n, to each side, so it can be written in abbreviated form as

x̄ ± 1.96 s/√n
The two limits x̄ ± (1.96)s/√n can also be obtained by replacing each < inside the parentheses in the probability statement by = and solving the two resulting equations for μ.
Example 7.3 The alternating-current (AC) breakdown voltage of an insulating liquid indicates
its dielectric strength. The article “Testing Practices for the AC Breakdown Voltage
Testing of Insulation Liquids” (IEEE Electrical Insulation Magazine, 1995: 21–26)
gave the accompanying sample observations on breakdown voltage (kV) of a particu-
lar circuit under certain conditions:
62 50 53 57 41 53 55 61 59 64 50 53 64 62 50 68
54 55 57 50 55 50 56 55 46 55 53 54 52 47 47 55
57 48 63 57 57 55 53 59 53 52 50 55 60 50 56 58
Figure 7.4 shows the output from the JMP software’s Analyze/Distribution com-
mand. The boxplot of the data shows a high concentration in the middle half of the
data (narrow box width). There is a single outlier at the upper end, but this value
is actually a bit closer to the median (55) than is the smallest sample observation.
Distributions: Voltage

Quantiles                            Moments
100.0% maximum   68                  Mean            54.708333
 99.5%           68                  Std Dev          5.230672
 97.5%           67.1                Std Err Mean     0.7549825
 90.0%           62.1                Upper 95% Mean  56.227162
 75.0% quartile  57                  Lower 95% Mean  53.189505
 50.0% median    55                  N               48
 25.0% quartile  50.5
 10.0%           47.9
  2.5%           42.125
  0.5%           41
  0.0% minimum   41

Figure 7.4 Output from JMP for the breakdown voltage data from Example 7.3
Unless otherwise noted, all content on this page is © Cengage Learning.
Summary quantities include n = 48, x̄ = 54.7, and s = 5.23. The 95% confidence
interval is then

54.7 ± 1.96·(5.23/√48) = 54.7 ± 1.5 = (53.2, 56.2)

That is,

53.2 < μ < 56.2
with a confidence level of approximately 95%. The interval is reasonably narrow,
indicating that we have precisely estimated μ. Note that our lower and upper interval
endpoints match JMP's "Lower 95% Mean" and "Upper 95% Mean," respectively.
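The interval computation is mechanical, so it is easy to reproduce. A minimal Python sketch (not from the text; the function name is ours), using only the summary statistics quoted above:

```python
import math

def mean_ci(xbar, s, n, z=1.96):
    """Large-sample CI for a population mean: xbar +/- z*s/sqrt(n)."""
    half = z * s / math.sqrt(n)
    return xbar - half, xbar + half

# Summary statistics from Example 7.3 (breakdown voltage, kV)
lo, hi = mean_ci(54.7, 5.23, 48)
print(round(lo, 1), round(hi, 1))  # 53.2 56.2
```

Changing `z` to 1.645 or 2.576 gives the 90% and 99% intervals from the same summary quantities.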
The 95% confidence interval for μ in the foregoing example is (53.2, 56.2). It is
tempting to say that there is a 95% chance that μ is between 53.2 and 56.2. Do not yield
to this temptation! The 95% refers to the long-run percentage of all possible samples re-
sulting in an interval that includes μ. That is, if we consider taking sample after sample
from the population and use each one separately to compute a 95% confidence interval,
in the long run roughly 95% of these intervals will capture μ. Figure 7.5 illustrates this
for 100 samples; 93 of the resulting intervals include μ, whereas 7 do not. Without
knowing the value of μ, we cannot tell whether our interval (53.2, 56.2) is one of the
good 95% or the bad 5% of all intervals that might result. The confidence level refers to
the method used to construct the interval rather than to any particular calculated interval.
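This long-run interpretation can be checked by simulation, much as Figure 7.5 illustrates it. A sketch, assuming a hypothetical normal population with μ = 55 and σ = 5 (illustrative values only, not from the text):

```python
import math
import random

random.seed(1)
mu, sigma, n, z = 55.0, 5.0, 48, 1.96

hits = 0
trials = 1000
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    half = z * s / math.sqrt(n)
    if xbar - half < mu < xbar + half:  # did this interval capture mu?
        hits += 1

print(hits / trials)  # close to .95 in the long run
```

Roughly 95% of the 1000 simulated intervals should capture μ; any single interval either does or does not.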
Figure 7.6 Finding the z critical value for a 99% confidence level: the z curve with area .01/2 = .005 captured in each tail
row and .08 column. Thus 2.576 (or 2.58, to be conservative) should be used in the CI
formula in place of 1.96 to obtain the higher confidence level.
It should be clear at this point that any confidence level can be achieved simply by
finding the z critical value that captures the corresponding z curve area. For example, it is
easily verified that the interval from −1.28 to 1.28 captures about 80% of the area
under the z curve, so using 1.28 in place of 1.96 gives a CI with confidence level 80%.
x̄ ± (z critical value)·s/√n

As a general rule, this interval is appropriate when the sample size exceeds 30. The three
most commonly used confidence levels, 90%, 95%, and 99%, use critical values of 1.645,
1.96, and 2.576, respectively.
Why settle for 95% confidence when 99% confidence is possible? The price of a
higher confidence level is that the resulting interval is wider. The width of the 95%
interval is 2(1.96s/√n), whereas the 99% interval has a width of 2(2.576s/√n). The
higher reliability of the 99% interval entails a loss in precision (as indicated by the wider
interval). Many investigators think that a 95% confidence level gives a reasonable com-
promise between reliability and precision.
is achieved. For example, with μ representing the average fuel efficiency (mpg) for all
cars of a certain type, the objective of an investigation may be to estimate μ to within
1 mpg with 95% confidence. More generally, suppose we wish to estimate μ to within
an amount B (the specified bound on the error of estimation) with 95% confidence.
This implies that B = 1.96σ/√n, from which

n = [1.96σ/B]²

The difficulty with this formula is that calculating the value of n requires having σ,
which is of course not available until a sample has been selected. Instead, prior infor-
mation about the population may be used as a basis for a reasonable guess for σ. Alternatively, for a
population distribution that is not too skewed, dividing the range (difference between
the largest and smallest values) by 4 often gives a rough idea of what σ might be.
Example 7.4 Refer to Example 7.3 on breakdown voltage. Suppose that the investigator believes
that almost all values in the population distribution are between 40 and 70. Then
(70 − 40)/4 = 7.5 gives a reasonable value for σ. The appropriate sample size for
estimating true average breakdown voltage to within 1 kV with confidence level 95%
is now

n = [(1.96)(7.5)/1]² = 216.09 ≈ 217
The sample size associated with an error bound B for any other confidence level, such
as 99%, results from replacing 1.96 in the formula for n by the corresponding critical
value, for example, 2.576.
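The calculation always rounds up to the next whole number of observations. A brief sketch (the function name is ours):

```python
import math

def sample_size_mean(sigma_guess, bound, z=1.96):
    """n needed to estimate mu to within `bound`: n = (z*sigma/B)^2, rounded up."""
    return math.ceil((z * sigma_guess / bound) ** 2)

# Example 7.4: sigma guessed as (70 - 40)/4 = 7.5, bound B = 1 kV
print(sample_size_mean(7.5, 1))  # 217
```

Passing `z=2.576` gives the (larger) sample size needed for 99% confidence with the same bound.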
P( (x̄ − μ)/(s/√n) < 1.645 ) ≈ .95

Manipulating the inequality inside the parentheses to isolate μ on one side gives the
equivalent inequality μ > x̄ − 1.645s/√n; the expression on the right is the desired
lower confidence bound. Starting with P(−1.645 < z) ≈ .95 and manipulating the
inequality results in the upper confidence bound μ < x̄ + 1.645s/√n. A similar argument
gives a one-sided bound associated with any other confidence level.
μ < x̄ + (z critical value)·s/√n

and a large-sample lower confidence bound for μ is

μ > x̄ − (z critical value)·s/√n

The three most commonly used confidence levels, 90%, 95%, and 99%, use critical values
of 1.28, 1.645, and 2.33, respectively.
Example 7.5 Recently there has been increased use of titanium and its alloys in aerospace
and automotive applications. These alloys are highly durable and have a high
strength-to-weight ratio. However, machining of titanium is difficult due to its low
thermal conductivity. The authors of "Modelling and Multi-Objective Optimiza-
tion of Process Parameters of Wire Electrical Discharge Machining Using Non-
Dominated Sorting Genetic Algorithm-II" (J. of Engr. Manuf., 2012: 1186–2001)
investigated different settings that impact wire electrical discharge machining of
titanium 6-2-4-2. A characteristic of interest was surface roughness (in μm) of
the metal after machining. In one particular investigation a sample of 54 surface
roughness observations gave a sample mean of 1.9042 μm and a sample stan-
dard deviation of .1455 μm. An upper confidence bound for true average surface
roughness with confidence level 95% is

1.9042 + 1.645·(.1455/√54) = 1.9042 + .0326 = 1.9368

That is, with a confidence level of 95%, the value of μ lies in the interval (−∞, 1.9368).
Since negative values for surface roughness are not possible, we revise this interval
to (0, 1.9368).
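The bound is a one-line calculation. A sketch in Python (the function name is ours):

```python
import math

def upper_bound_mean(xbar, s, n, z=1.645):
    """Large-sample 95% upper confidence bound: xbar + z*s/sqrt(n)."""
    return xbar + z * s / math.sqrt(n)

# Example 7.5: surface roughness (micrometers), n = 54
ub = upper_bound_mean(1.9042, 0.1455, 54)
print(round(ub, 4))  # 1.9368
```

A lower bound uses a minus sign instead; replacing 1.645 by 1.28 or 2.33 gives the 90% and 99% bounds.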
frequency (Hz) for all tennis rackets of a certain type:

(114.4, 115.6)  (114.1, 115.9)

a. What is the value of the sample mean resonance frequency?
b. Both intervals were calculated from the same sample data. The confidence level for one of these intervals is 90% and for the other is 99%. Which of the intervals has the 90% confidence level, and why?

11. Suppose that a random sample of 50 bottles of a particular brand of cough syrup is selected, and the alcohol content of each bottle is determined. Let μ denote the average alcohol content for the population of all bottles of the brand under study. Suppose that the resulting 95% confidence interval is (7.8, 9.4).
a. Would a 90% confidence interval calculated from this same sample have been narrower or wider than the given interval? Explain your reasoning.
b. Consider the following statement: There is a 95% chance that μ is between 7.8 and 9.4. Is this statement correct? Why or why not?
c. Consider the following statement: We can be highly confident that 95% of all bottles of this type of cough syrup have an alcohol content that is between 7.8 and 9.4. Is this statement correct? Why or why not?
d. Consider the following statement: If the process of selecting a sample of size 50 and then computing the corresponding 95% interval is repeated 100 times, 95 of the resulting intervals will include μ. Is this statement correct? Why or why not?

12. Heavy-metal pollution of various ecosystems is a serious environmental threat, in part because of the potential transference of hazardous substances to humans via food. The article "Cadmium, Zinc, and Total Mercury Levels in the Tissues of Several Fish Species from La Plata River Estuary, Argentina" (Environmental Monitoring and Assessment, 1993: 119–130) reported the following summary data on zinc concentration (μg/g) in the liver of fish:

Species           n    x̄     s
Mugil liza       56   9.15  1.27
Pogonias cromis  61   3.08  1.71

a. Calculate a 95% two-sided confidence interval for population mean concentration for the Mugil liza species.
b. Calculate a 99% two-sided confidence interval for population mean concentration for the Pogonias cromis species. Why is this interval wider than the interval of part (a) even though it is based on a somewhat larger sample size?

13. Young people may feel they are carrying the weight of the world on their shoulders when in reality they are too often carrying an excessively heavy backpack. The article "Effectiveness of a School-Based Backpack Health Promotion Program" (Work, 2003: 113–123) reported the following data for a sample of 131 sixth graders: for backpack weight (lbs), x̄ = 13.83, s = 5.05; for backpack weight as a percentage of body weight, a 95% CI for the population mean was (13.62, 15.89).
a. Calculate and interpret a 99% CI for population mean backpack weight.
b. Obtain a 99% CI for population mean weight as a percentage of body weight.
c. The American Academy of Orthopedic Surgeons recommends that backpack weight be at most 10% of body weight. What does your calculation of part (b) suggest and why?

14. The article "Extravisual Damage Detection? Defining the Standard Normal Tree" (Photogrammetric Engr. and Remote Sensing, 1981: 515–522) discusses the use of color infrared photography in identification of normal trees in Douglas fir stands. Among data reported were summary statistics for green-filter analytic optical densitometric measurements on samples of both healthy and diseased trees. For a sample of 69 healthy trees, the sample mean dye-layer density was 1.028, and the sample standard deviation was .163.
a. Calculate a 95% two-sided CI for the true average dye-layer density for all such trees.
b. Suppose the investigators had made a rough guess of .16 for the value of s before collecting data. What sample size would be necessary to obtain an interval width of .05 for a confidence level of 95%?

15. The negative effects of ambient air pollution on children's lung function have been well established, but less research is available about the effects of
indoor air pollution. The authors of "Indoor Air Pollution and Lung Function Growth Among Children in Four Chinese Cities" (Indoor Air, 2012: 3–11) investigated the relationship between indoor air pollution metrics and lung function growth among children ages 6–13 years living in four Chinese cities. For each subject in the study, the authors measured an important lung-capacity index known as FEV1, the forced volume (in ml) of air that is exhaled in 1 second. Higher FEV1 values are associated with greater lung capacity.
Burning coal inside houses can lead to increased levels of indoor air toxins that may have negative effects on lung function. Among the children in the study, 514 came from households that use coal for cooking or heating or both. Their FEV1 mean was 1427 with standard deviation 325. (Using a complex statistical procedure the authors went on to show that burning coal had a clear negative effect on mean FEV1 levels.)
a. Calculate and interpret a 95% (two-sided) confidence interval for true average FEV1 level in the population of all children from which the sample was selected.
b. Suppose the investigators had made a rough guess of 320 for the value of s before collecting data. What sample size would be necessary to obtain an interval width of 50 ml for a confidence level of 95%?

16. The article "Evaluating Tunnel Kiln Performance" (Amer. Ceramic Soc. Bull., August 1997: 59–63) gave the following summary information for fracture strengths (MPa) of n = 169 ceramic bars fired in a particular kiln: x̄ = 89.10, s = 3.73.
a. Calculate a two-sided confidence interval for true average fracture strength using a confidence level of 95%. Does it appear that true average fracture strength has been precisely estimated?
b. Suppose the investigators had believed a priori that the population standard deviation was about 4 MPa. Based on this supposition, how large a sample would have been required to estimate μ to within .5 MPa with 95% confidence?

17. When the population distribution is normal and n is large, the statistic s has approximately a normal distribution with μs ≈ σ and σs ≈ σ/√(2n). Use this fact to develop a large-sample two-sided confidence interval formula for σ. Then calculate a 95% confidence interval for the true standard deviation of the fracture strength distribution based on the data given in Exercise 16 (the cited paper gave compelling evidence in support of assuming normality).

18. Determine the confidence level for each of the following large-sample one-sided confidence bounds:
a. Upper bound: x̄ + .84s/√n
b. Lower bound: x̄ − 2.05s/√n
c. Upper bound: x̄ + .67s/√n

19. The charge-to-tap time (min) for a carbon steel in one type of open hearth furnace was determined for each heat in a sample of size 36, resulting in a sample mean time of 382.1 and a sample standard deviation of 31.5. Calculate a 95% upper confidence bound for true average charge-to-tap time.

20. A Brinell hardness test involves measuring the diameter of the indentation made when a hardened steel ball is pressed into material under a standard test load. Suppose that the Brinell hardness is determined for each specimen in a sample of size 32, resulting in a sample mean hardness of 64.3 and a sample standard deviation of 6.0. Calculate a 99% lower confidence bound for true average Brinell hardness for material specimens of this type.

21. The article "Ultimate Load Capacities of Expansion Anchor Bolts" (J. of Energy Engr., 1993: 139–158) gave the following summary data on shear strength (kip) for a sample of 3/8-in. anchor bolts: n = 78, x̄ = 4.25, s = 1.30. Calculate a lower confidence bound using a confidence level of 90% for true average shear strength.
or objects in a population or process that possess a particular characteristic, and also for
μ1 − μ2, the difference between two population or process means. These intervals are
based on sampling distribution properties of appropriate statistics.
7.3 More Large-Sample Confidence Intervals
Because n is in the denominator under the square root in the expression for σp, the
standard deviation decreases and the sampling distribution becomes more and more
concentrated about π as the sample size increases. The two inequality conditions in the
third property are designed to ensure that there is enough symmetry in the sampling dis-
tribution so that a normal curve with mean value π and standard deviation σp provides
a good approximation to a histogram of the actual distribution. For example, if n = 100
but π = .02, there is too much (positive) skewness for the approximation to work well
(much of the distribution is concentrated on the values 0, .01, .02, .03, and .04, and the
rest trails out to 1, so there is almost no lower tail).
The foregoing properties allow us to form a variable having approximately a stan-
dard normal distribution when n is large:

z = (p − π)/√(π(1 − π)/n)
Using z* to denote an appropriate z critical value (1.96, 1.645, etc.), we have that

P( −z* < (p − π)/√(π(1 − π)/n) < z* ) ≈ 1 − α

As suggested earlier in the derivation of our first confidence interval for μ, consider re-
placing each < inside the parentheses by = and solving the two resulting equations for
π to obtain the confidence limits. Unfortunately, these equations are not as easy to solve
as were the earlier ones. This is because π appears both in the numerator and in the
denominator. The equations are therefore both quadratic. Using the general formula for
the solution to a quadratic equation gives the following confidence interval:

[ p + z*²/(2n) ± z*·√( p(1 − p)/n + z*²/(4n²) ) ] / (1 + z*²/n)

where z* denotes an appropriate critical value, the − sign in the numerator gives the lower
confidence limit, and the + sign gives the upper confidence limit. The critical values cor-
responding to the most frequently used confidence levels, 90%, 95%, and 99%, are 1.645,
1.96, and 2.576, respectively. A lower confidence bound for π results from using only
the − sign in the formula (along with the appropriate z*), and using only the + sign gives
an upper confidence bound.
Although the preceding interval was derived from the large-sample distribution
of p, recent research has shown that it performs well even when n is quite small.
Additionally, the actual confidence level achieved by the interval is almost always
quite close to the desired level corresponding to the choice of any particular z critical
value. For example, using 1.96 as the z critical value implies a desired confidence
level of 95%, and the actual confidence level (long-run capture percentage if the
formula is used repeatedly on different samples) will almost always be roughly 95%.
When n is quite large, the three terms in the CI formula involving z* are negligible
compared to the three remaining terms. In this case, the CI reduces to the traditional
interval

p ± z*·√(p(1 − p)/n)

This latter interval has the same general form as our earlier large-sample interval
for μ.
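Both intervals can be sketched in a few lines of Python (the function names are ours): `score_ci` implements the quadratic-solution limits described above, and `wald_ci` the traditional large-n simplification. The illustrative data, 16 "successes" in 48 trials, come from the ignition example in the text.

```python
import math

def score_ci(x, n, z=1.96):
    """Score CI for a proportion: the exact solutions of the quadratic in pi."""
    p = x / n
    center = p + z * z / (2 * n)
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return (center - half) / denom, (center + half) / denom

def wald_ci(x, n, z=1.96):
    """Traditional interval: p +/- z*sqrt(p(1-p)/n)."""
    p = x / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

print([round(v, 3) for v in score_ci(16, 48)])  # [0.217, 0.475]
print([round(v, 3) for v in wald_ci(16, 48)])
```

For this n the two intervals already nearly agree; the score interval's better small-sample behavior shows up for small n or p near 0 or 1.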
Example 7.6 The article "Repeatability and Reproducibility for Pass/Fail Data" (J. of Testing and
Eval., 1997: 151–153) reported that in n = 48 trials in a particular laboratory, 16
resulted in ignition of a particular type of substrate by a lighted cigarette. Let π de-
note the long-run proportion of all such trials that would result in ignition. A point
estimate for π is p = 16/48 = .333. A confidence interval for π with a confidence
level of approximately 95% is

[ .333 + 1.96²/(2·48) ± 1.96·√( .333(.667)/48 + 1.96²/(4·48²) ) ] / (1 + 1.96²/48) = (.373 ± .139)/1.080 = (.217, .475)

The sample size n required to estimate π to within an amount B with 95% confidence
comes from setting B = 1.96√(π(1 − π)/n) and solving for n:

n = π(1 − π)[1.96/B]²

If some other confidence level is desired, the corresponding z critical value replaces
1.96. The difficulty with using this formula is that it involves the unknown π. A conser-
vative approach utilizes the fact that π(1 − π) is largest when π = .5. The sample size
resulting from this choice of π will be large enough so that the bound B is achieved with
the desired confidence level no matter what the value of π.
Example 7.7 A survey is to be carried out to estimate the proportion of all registered voters in a
particular state who favor certain term limits for their state legislators. How many
people should be included in a random sample to estimate this proportion to within
the amount .05 with 95% confidence? Substituting π = .5 in the formula for n gives

n = .5(1 − .5)(1.96/.05)² = 384.16

so a sample size of 385 should be used. The resulting 95% confidence interval for π
will have a half-width of at most .05 regardless of the value of p. Notice that this
sample size is far larger than what appeared in the previous example, which explains
why that interval was so wide.
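The conservative sample-size calculation is a one-liner. A sketch (the function name is ours):

```python
import math

def sample_size_prop(bound, pi_guess=0.5, z=1.96):
    """Conservative n for estimating pi to within `bound`: pi(1-pi)(z/B)^2, rounded up."""
    return math.ceil(pi_guess * (1 - pi_guess) * (z / bound) ** 2)

# Example 7.7: bound B = .05, conservative pi = .5
print(sample_size_prop(0.05))  # 385
```

Supplying a rough prior guess for π smaller than .5 (or larger, when π is known to be near 1) reduces the required n.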
Notation

                                        Mean value   Variance   Standard deviation
Population, process, or treatment 1         μ1          σ1²            σ1
Population, process, or treatment 2         μ2          σ2²            σ2

                                        Sample size   Sample mean   Sample variance   Sample standard deviation
Sample from population, process,
  or treatment 1                            n1            x̄1            s1²                 s1
Sample from population, process,
  or treatment 2                            n2            x̄2            s2²                 s2
It is assumed that the observations in the first sample were obtained completely inde-
pendently from those in the second sample. Notice that our notation allows for the
possibility that the two sample sizes might be different. This might happen because
one population, process, or treatment is more expensive to sample than the other, or
perhaps because observations are “lost” in the course of obtaining data; for example,
several animals receiving a first diet die (hopefully for reasons unrelated to the diet).
Example 7.8 A study was carried out to compare population mean lifetimes (hr) for two different
brands of AA alkaline batteries used in a particular manner. Here, μ1 is the mean
lifetime of all brand 1 batteries and σ1 is the population standard deviation of brand 1
lifetimes; μ2 and σ2 are the mean value and standard deviation for the distribution of
brand 2 lifetimes. Values of the summary quantities calculated from the two resulting
samples are as follows:

Brand 1: n1 = 50   x̄1 = 4.15   s1 = 1.79
Brand 2: n2 = 45   x̄2 = 4.53   s2 = 1.64

Consider estimating the difference μ1 − μ2. The natural statistic for estimating
μ1 is x̄1, and the statistic x̄2 gives an estimate for μ2. The difference between the two
x̄'s then gives an estimate of the difference between the two μ's. The point estimate
from the data is 4.15 − 4.53 = −.38. That is, we estimate that, on average, brand 2
batteries last .38 hr longer than do brand 1 batteries. If the labels 1 and 2 on the two
brands had been reversed, the point estimate would be .38, and the interpretation
would be the same as with the original labeling.
Both x̄1 and x̄2 vary in value from sample to sample, and this will also be true of their
difference. For example, repeating the study described in Example 7.8 with the same
sample sizes might result in x̄1 = 4.02 and x̄2 = 4.75, giving the estimate −.73. Just as a
confidence interval for a single μ was based on properties of the x̄ sampling distribution,
a confidence interval for μ1 − μ2 is derived from properties of the sampling distribution
of the statistic x̄1 − x̄2. These properties follow from the following general results:
1. For any two random variables x and y,

μx−y = mean value of the difference = μx − μy = difference between the two means

2. If x and y are two independent random variables, then

σ²x−y = variance of the difference = σ²x + σ²y = sum of the variances

3. If x and y are independent random variables, each with a normal distribution, then
the difference x − y also has a normal distribution. If each variable is approximately
normal, then the distribution of the difference is also approximately normal.
σx̄1−x̄2 = √( σ1²/n1 + σ2²/n2 )

3. If both population distributions are normal, the sampling distribution of x̄1 − x̄2 is normal.
4. If both the sample sizes are large, then the sampling distribution of x̄1 − x̄2 will be
approximately normal irrespective of the shapes of the two population distributions
(a consequence of the Central Limit Theorem).
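The standard-deviation expression above can be checked empirically. A simulation sketch with two hypothetical normal populations (all parameter values illustrative, not from the text):

```python
import math
import random

random.seed(2)
n1, mu1, sd1 = 50, 4.0, 1.8   # hypothetical population 1
n2, mu2, sd2 = 45, 4.5, 1.6   # hypothetical population 2

# Simulate many values of the statistic x1bar - x2bar
diffs = []
for _ in range(5000):
    x1bar = sum(random.gauss(mu1, sd1) for _ in range(n1)) / n1
    x2bar = sum(random.gauss(mu2, sd2) for _ in range(n2)) / n2
    diffs.append(x1bar - x2bar)

m = sum(diffs) / len(diffs)
sd_sim = math.sqrt(sum((d - m) ** 2 for d in diffs) / (len(diffs) - 1))
sd_formula = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
print(round(sd_sim, 3), round(sd_formula, 3))  # the two values nearly agree
```

The simulated standard deviation of the 5000 differences matches √(σ1²/n1 + σ2²/n2) to within sampling error.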
has approximately a standard normal distribution (the z curve). Using this variable in the
same way that z variables were used earlier to obtain confidence intervals for μ and for π
gives the following large-sample confidence interval formula for estimating μ1 − μ2:

x̄1 − x̄2 ± (z critical value)·√( s1²/n1 + s2²/n2 )

This formula is valid irrespective of the shapes of the two underlying distributions. The
three most frequently used confidence levels of 95%, 99%, and 90% are achieved by using
the critical values 1.96, 2.576, and 1.645, respectively.
Example 7.9 An experiment carried out to study various characteristics of anchor bolts resulted in
78 observations on shear strength (kip) of 3/8-in. diameter bolts and 88 observations
on strength of 1/2-in. diameter bolts. Summary quantities from Minitab follow, and
a comparative boxplot appears in Figure 7.7. The sample sizes, sample means, and
sample standard deviations agree with values given in the article “Ultimate Load
Capacities of Expansion Anchor Bolts” (J. Energy Engr., 1993: 139–158). The sum-
maries suggest that the main difference between the two samples is in where they are
centered. Let's now calculate a confidence interval for the difference between true
average shear strength for 3/8-in. bolts (μ1) and true average shear strength for 1/2-in.
bolts (μ2) using a confidence level of 95%:

4.25 − 7.14 ± (1.96)·√( (1.30)²/78 + (1.68)²/88 ) = −2.89 ± (1.96)(.2318)
                                                  = −2.89 ± .45 = (−3.34, −2.44)
Figure 7.7 Comparative boxplot of Strength (kip) by bolt Type
That is, with 95% confidence, −3.34 < μ1 − μ2 < −2.44. We can therefore be highly
confident that the true average shear strength for the 1/2-in. bolts exceeds that for
the 3/8-in. bolts by between 2.44 kip and 3.34 kip. Notice that if we relabel so that 1
refers to 1/2-in. bolts and 2 to 3/8-in. bolts, the confidence interval is now centered
at +2.89 and the value .45 is still subtracted and added to obtain the confidence
limits. The resulting interval is (2.44, 3.34), and the interpretation is identical to that
for the interval previously calculated.
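The two-sample computation is again mechanical. A Python sketch (the function name is ours) using the summary quantities from the article:

```python
import math

def two_sample_mean_ci(x1, s1, n1, x2, s2, n2, z=1.96):
    """Large-sample CI for mu1 - mu2: (x1-x2) +/- z*sqrt(s1^2/n1 + s2^2/n2)."""
    half = z * math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (x1 - x2) - half, (x1 - x2) + half

# Example 7.9: shear strength (kip), 3/8-in. vs 1/2-in. anchor bolts
lo, hi = two_sample_mean_ci(4.25, 1.30, 78, 7.14, 1.68, 88)
print(round(lo, 2), round(hi, 2))  # -3.34 -2.44
```

Swapping the two samples' arguments flips the signs of the endpoints, just as the relabeling discussion above describes.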
7.3 Exercises
… of a 95% confidence level: Add 2 to both the number of successes and the number of failures and then use the traditional formula. Do this for the data described in this exercise, and compare the resulting interval to the one you calculated in part (a).

27. Let π1 and π2 denote the proportion of successes in population 1 and population 2, respectively. An investigator sometimes wishes to calculate a confidence interval for the difference π1 − π2 between these two population proportions. Suppose random samples of size n1 and n2, respectively, are independently selected from the two populations, and let p1 and p2 denote the resulting sample proportions of successes. If the sample sizes are sufficiently large (apply the rule of thumb appropriate for a single proportion to each sample separately), the statistic p1 − p2 has approximately a normal sampling distribution with mean value π1 − π2 and standard deviation √[π1(1 − π1)/n1 + π2(1 − π2)/n2]. The estimated standard deviation of this statistic results from replacing each π under the square root by the corresponding p.
a. Use the foregoing facts to obtain a large-sample two-sided 95% confidence interval formula for estimating π1 − π2.
b. Is the response rate for questionnaires affected by including some sort of incentive to respond along with the questionnaire? In one experiment, 110 questionnaires with no incentive resulted in 75 being returned, whereas 98 questionnaires that included a chance to win a lottery yielded 66 responses (“Charities, No; Lotteries, No; Cash, Yes,” Public Opinion Quarterly, 1996: 542–562). Calculate a two-sided 95% CI for the difference between the true response proportions under these circumstances. Does the interval suggest that, in fact, the values of π1 and π2 are different? Explain your reasoning.
c. Recent research has shown that “coverage probability” and small-sample behavior are improved by adding one success and one failure to each sample and then using the formula you obtained in part (a). Do this for the data of part (b).

28. The article “The Effects of Cigarette Smoking and Gestational Weight Change on Birth Outcomes in Obese and Normal-Weight Women” (Amer. J. of Public Health, 1997: 591–596) reported on a random sample of 487 nonsmoking women of normal weight (body mass index between 19.8 and 26.0) who had given birth at a large metropolitan medical center. It was determined that 7.2% of these births resulted in children of low birth weight (less than 2500 g). The article also reported that 6.8% of a sample of 503 nonsmoking obese women (body mass index > 29) gave birth to children of low birth weight. Calculate a 95% lower confidence bound for the difference between the population proportion of normal-weight nonsmoking women and the population proportion of obese nonsmoking women who give birth to children of low birth weight. Hint: Refer to the previous problem.

29. Let π1 and π2 denote the proportions of successes in two different populations. Rather than estimate the difference π1 − π2 as described in Exercise 27, an investigator will often wish to estimate the ratio of the two π’s. If, for example, π1/π2 = 3, then successes occur three times as frequently in population 1 as they do in population 2. Alternatively, if the π’s refer to success proportions for two different treatments, then a ratio of 3 implies that the first treatment is three times as likely to result in a success as is the second treatment. Consider independent random samples of sizes n1 and n2 from the two different populations, which result in sample proportions p1 and p2, respectively. Also let u = number of successes in the first sample and v = number of successes in the second sample. When the n’s are both large, the statistic ln(p1/p2) has approximately a normal sampling distribution with approximate mean value and standard deviation ln(π1/π2) and √[(n1 − u)/(un1) + (n2 − v)/(vn2)], respectively.
a. Use these facts to obtain a large-sample two-sided 95% CI for ln(π1/π2) and a CI for π1/π2 itself.
b. The article cited in Exercise 27 stated that in addition to 75 of 110 questionnaires without an incentive to respond being returned, 78 of 100 questionnaires that included a prepaid cash amount of $5 were returned. Calculate a 95% confidence interval for the ratio of the proportion of questionnaires returned when such a cash incentive is included to the proportion returned in the absence of any incentive. Does the interval suggest that such an incentive may not increase the likelihood of response?

30. A manufacturer of small appliances purchases plastic handles for coffeepots from an outside vendor. If a handle is cracked, it is considered defective and must be discarded. A very large shipment of handles is received. The proportion of defective handles, π, is of interest. How many handles from the shipment should be inspected to estimate π to within .1 with 99% confidence?

31. A manufacturer of exercise equipment is interested in estimating the proportion of all purchasers of one of its products who still own the product two years after purchase. What sample size is required to estimate this proportion to within .05 with a confidence level of 90%?

32. Use the accompanying data to estimate with a 95% confidence interval the difference between true average compressive strength (N/mm²) for 7-day-old concrete specimens and true average strength for 28-day-old specimens (“A Study of Twenty-Five-Year-Old Pulverized Fuel Ash Concrete Used in Foundation Structures,” Proc. Inst. Civil Engrs., 1985: 149–165):

7-day old:  n1 = 68  x̄1 = 26.99  s1 = 4.89
28-day old: n2 = 74  x̄2 = 35.76  s2 = 6.43

33. Relative density was determined for one sample of second-growth Douglas fir 2 × 4s with a low percentage of juvenile wood and another sample with a moderate percentage of juvenile wood, resulting in the following data (“Bending Strength and Stiffness of Second-Growth Douglas Fir Dimension Lumber,” Forest Products J., 1991: 35–43):

Type       n    x̄     s
Low        35   .523  .0543
Moderate   54   .489  .0450

Estimate the difference between true average densities for the two types of wood in a way that conveys information about reliability and precision.

34. Is there any systematic tendency for part-time college faculty to hold their students to different standards than full-time faculty do? The article “Are There Instructional Differences Between Full-Time and Part-Time Faculty?” (College Teaching, 2009: 23–26) reported that for a sample of 125 courses taught by full-time faculty, the mean course GPA was 2.7186 and the standard deviation was .63342, whereas for a sample of 88 courses taught by part-timers, the mean and standard deviation were 2.8639 and .49241, respectively.
Calculate a confidence interval at the 99% level to estimate the true mean GPA difference between full-time and part-time faculty. Does it appear that true average course GPA for part-time faculty differs from that for faculty teaching full-time? Explain your reasoning.

35. An experiment was performed to compare the fracture toughness of high-purity Ni-maraging steel with commercial-purity steel of the same type. For 32 high-purity specimens, the sample mean toughness and sample standard deviation of toughness were 65.6 and 1.4, respectively, whereas for 32 commercial-purity specimens, the sample mean and sample standard deviation were 59.2 and 1.1, respectively. Estimate the difference between true average toughness for the high-purity steel and that for the commercial steel using a lower 95% confidence bound. Does your estimate demonstrate conclusively that this difference exceeds 5? Explain your reasoning.

36. An investigator wishes to estimate the difference between population mean lifetimes of two different brands of batteries under specified conditions. If the population standard deviations are both roughly 2 hr and equal sample sizes are to be selected, what value of the common sample size n will be necessary to estimate the difference to within .5 hr with 95% confidence?
1. An interval of plausible values for the population mean lifetime, that is, a confidence interval for μ
2. An interval of plausible values for the lifetime of a single component of this type
that you are planning to buy at some time in the near future, that is, a prediction
interval for a single x value
3. An interval of values that includes a specified percentage, for example, 90%, of
the lifetime values for components in the population, that is, a tolerance interval
for a chosen percentage of x values in the population distribution
We have already seen how to calculate a z confidence interval for μ when n is large.
In this section, we assume that the sample has been selected from a normal population
distribution, and show how each of the three types of intervals can be obtained.
Proposition  Let x1, x2, . . . , xn be a random sample from a normal distribution. Then the standardized variable

t = (x̄ − μ) / (s/√n)

has a t distribution with n − 1 degrees of freedom (df).
7.4 Small-Sample Intervals Based on a Normal Population Distribution
Properties of t Distributions

1. Any particular t distribution is specified by the value of a parameter called the number of degrees of freedom, abbreviated df. There is one t distribution with 1 df, another with 2 df, yet another one with 3 df, and so on. The number of df for a t distribution can be any positive integer.
2. The density curve corresponding to any particular t distribution is bell-shaped and centered at 0, just like the z curve.
3. Any t curve is more spread out than the z curve.
4. As the number of df increases, the spread of the corresponding t curve decreases. Thus the most spread out of all t curves is the one with 1 df, the next most spread out is the one with 2 df, and so on.
5. As the number of df increases, the sequence of t curves approaches the z curve. (The z curve is sometimes referred to as the t curve with df = ∞.)

[Figure: the z curve and several t curves, plotted over −3 to 3]
Formulas for the large-sample z intervals utilized z critical values, numbers like
1.96 and 2.33, that captured certain central or cumulative areas under the z curve.
Formulas for t intervals require t critical values, which play the same role for various t
curves. Appendix Table IV gives a tabulation of such values. Each row of the table cor-
responds to a different number of df, and each column gives critical values that capture
a particular central area and the corresponding cumulative area. For example, the t criti-
cal value at the intersection of the 12 df row and the .95 central area column is 2.179,
so the area under the 12 df t curve between −2.179 and 2.179 is .95. The cumulative
area under this t curve all the way to the left of 2.179 is the central area .95 plus the
lower tail area .025, or .975. This is illustrated in Figure 7.9. The critical value 2.179
can then be used to calculate a two-sided confidence interval with a confidence level
of 95%. A one-sided interval, which gives either an upper confidence bound or a lower
confidence bound, with confidence level 95% necessitates going to the .95 cumulative
area column; for 12 df, the required critical value is 1.782.
[Figure 7.9: t curve for 12 df, showing the shaded central area .95 and the corresponding shaded cumulative area]
As we move from left to right in any particular row of the table, the critical values in-
crease. This is because capturing a larger central or cumulative area requires going farther
out into the tail of the t curve. Starting with 1 df, the rows increase by 1 df until reaching
30 df, and then they jump to 40, 60, 120, and finally to ∞; this last row contains z critical
values. Once past 30 df, there is little difference between the t curves and the z curve as far
as the areas of interest to us are concerned. Rather than using the 30, 40, 60, and 120 rows
or trying to interpolate, we recommend that z critical values be used whenever df > 30.
The large-sample z CI for μ was obtained by using the (approximate) standard
normal variable z = (x̄ − μ)/(s/√n) as the basis for a probability statement and then
manipulating inequalities to isolate μ. An analogous derivation, based on the fact
that t = (x̄ − μ)/(s/√n) has a t distribution with n − 1 df, gives the following one-sample t CI.
x̄ ± (t critical value) · s/√n
t critical values for the most frequently used confidence levels, corresponding to particular
central t curve areas, are given in Appendix Table IV. An upper confidence bound results
from replacing ± in the given formula by +, whereas a lower confidence bound uses − in
place of ±. For such a one-sided interval, a t critical value in the cumulative area column
corresponding to the desired confidence level is used.
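The one-sample t interval is simple enough to compute by hand, but a short script makes the arithmetic explicit. Here is a minimal Python sketch (not part of the original text; the t critical value must still be looked up, e.g. in Appendix Table IV), applied to the modulus of elasticity summary values used in Examples 7.10–7.12:

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Two-sided one-sample t CI for mu: xbar +/- (t critical value) * s / sqrt(n)."""
    half_width = t_crit * s / sqrt(n)
    return (xbar - half_width, xbar + half_width)

# Modulus of elasticity summary values: n = 16, xbar = 14,532.5, s = 2055.67;
# the t critical value for 15 df and central area .95 is 2.131.
lo, hi = t_interval(14532.5, 2055.67, 16, 2.131)
print(round(lo, 1), round(hi, 1))  # 13437.3 15627.7
```

A one-sided bound replaces ± with + or − and takes its critical value from the appropriate cumulative area column instead.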
Example 7.10 As part of a larger project to study the behavior of stressed-skin panels, a structural
component being used extensively in North America, the article “Time-Dependent
Bending Properties of Lumber” (J. of Testing and Eval., 1996: 187–193) reported
on various mechanical properties of Scotch pine lumber specimens. Consider the
[Figure: normal quantile plot of the modulus of elasticity observations; Modulus (10,000 to 17,000) versus N quantile (−2 to 2)]
μ(x̄−x) = μx̄ − μx = μ − μ = 0

Since the xi values in the sample are assumed independent of the “future” x value, the
variance of the prediction error is the sum of the variance of x̄ and the variance of x:

σ²(x̄−x) = σ²x̄ + σ²x = σ²/n + σ² = σ²(1/n + 1)

Statistical theory then says that if we use these results to standardize the prediction error
(with s² used in place of σ²), we obtain a t variable based on n − 1 df.

t = (x̄ − x) / (s·√(1 + 1/n))

has a t distribution based on n − 1 df. This implies that a two-sided prediction interval for
x has the form

x̄ ± (t critical value) · s·√(1 + 1/n)

An upper prediction bound and a lower prediction bound result from using + and −,
respectively, in place of ± and selecting the appropriate t critical value from the corresponding cumulative area column of the table rather than the central area column.
Example 7.11  Reconsider the modulus of elasticity data introduced in the previous example. Suppose that one more specimen of lumber is to be selected for testing. A 95% prediction interval for the modulus of elasticity of this single specimen uses the same t critical value and values of n, x̄, and s used in the confidence interval calculation:

14,532.5 ± (2.131)(2055.67)√(1 + 1/16) = 14,532.5 ± 4515.5 = (10,017.0, 19,048.0)

This interval is extremely wide, indicating that there is great uncertainty as to what the modulus of elasticity for the next lumber specimen will be. Notice that the ± factor for the confidence interval is 1095.2, so the prediction interval is roughly four times as wide as the confidence interval.
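The prediction interval differs from the confidence interval only in its standard error factor, √(1 + 1/n) versus 1/√n. A minimal Python sketch (for illustration only, not from the original text) reproduces the Example 7.11 computation:

```python
from math import sqrt

def t_prediction_interval(xbar, s, n, t_crit):
    """Two-sided prediction interval for a single future x:
    xbar +/- (t critical value) * s * sqrt(1 + 1/n)."""
    half_width = t_crit * s * sqrt(1 + 1 / n)
    return (xbar - half_width, xbar + half_width)

# Example 7.11 values: n = 16, xbar = 14,532.5, s = 2055.67, t = 2.131 (15 df).
lo, hi = t_prediction_interval(14532.5, 2055.67, 16, 2.131)
print(round(lo, 1), round(hi, 1))  # 10017.0 19048.0
```

Because sqrt(1 + 1/16) ≈ 1.031 while 1/sqrt(16) = .25, the half-width grows from 1095.2 to 4515.5 even though the same t critical value is used.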
Tolerance Intervals

Consider a population of automobiles of a certain type, and suppose that under specified
conditions, fuel efficiency (mpg) has a normal distribution with μ = 30 and σ = 2. Then
since the interval from −1.645 to 1.645 captures 90% of the area under the z curve, 90%
of all these automobiles will have fuel efficiency values between μ − 1.645σ = 26.71
and μ + 1.645σ = 33.29. But what if the values of μ and σ are not known? We can
take a sample of size n, determine the fuel efficiencies, x̄, and s, and form the interval
whose lower limit is x̄ − 1.645s and whose upper limit is x̄ + 1.645s. However, because
of sampling variability in the estimates of μ and σ, there is a good chance that the resulting interval will include less than 90% of the population values. Intuitively, to have an
a priori 95% chance of the resulting interval including at least 90% of the population
values, when x̄ and s are used in place of μ and σ, we should also replace 1.645 by some
larger number. For example, when n = 20, the value 2.310 is such that we can be 95%
confident that the interval x̄ ± 2.310s will include at least 90% of the fuel efficiency
values in the population.

Let k be a number between 0 and 100. A tolerance interval for capturing at least
k% of the values in a normal population distribution with a confidence level 95% has
the form

x̄ ± (tolerance critical value) · s

Tolerance critical values for k = 90, 95, and 99 in combination with various sample sizes
are given in Appendix Table V. This table also includes critical values for a confidence level
of 99% (these values are larger than the corresponding 95% values). Replacing ± by +
gives an upper tolerance bound, and using − in place of ± results in a lower tolerance
bound. Critical values for obtaining these one-sided bounds also appear in Appendix
Table V.
Example 7.12  Let’s return to the modulus of elasticity data discussed in Examples 7.10 and 7.11, where n = 16, x̄ = 14,532.5, s = 2055.67, and a normal quantile plot of the data indicated that population normality was quite plausible. For a confidence level of 95%, a two-sided tolerance interval for capturing at least 95% of the modulus of elasticity values for specimens of lumber in the population sampled uses the tolerance critical value of 2.903. The resulting interval is

14,532.5 ± (2.903)(2055.67) = 14,532.5 ± 5967.6 = (8564.9, 20,500.1)

We can be highly confident that at least 95% of all lumber specimens have modulus of elasticity values between 8564.9 and 20,500.1.

The 95% CI for μ was (13,437.3, 15,627.7), and the 95% prediction interval for the modulus of elasticity of a single lumber specimen was (10,017.0, 19,048.0). Both the prediction interval and the tolerance interval are substantially wider than the confidence interval.
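All three intervals in Examples 7.10–7.12 differ only in the multiplier applied to s. The following Python sketch (illustration only; the tolerance critical value 2.903 comes from Appendix Table V) computes the tolerance interval:

```python
def tolerance_interval(xbar, s, tol_crit):
    """Tolerance interval: xbar +/- (tolerance critical value) * s."""
    return (xbar - tol_crit * s, xbar + tol_crit * s)

# Example 7.12: xbar = 14,532.5, s = 2055.67; for 95% confidence and capture
# of at least 95% of the population values (n = 16), the critical value is 2.903.
lo, hi = tolerance_interval(14532.5, 2055.67, 2.903)
print(round(lo, 1), round(hi, 1))  # 8564.9 20500.1
```

The multiplier 2.903 exceeds both 2.131·(1/√16) and 2.131·√(1 + 1/16), which is why the tolerance interval is the widest of the three.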
7.4 Exercises
39. Determine the t critical value for a lower or an upper confidence bound for each of the situations described in Exercise 38.

40. According to the article “Fatigue Testing of Condoms” (Polymer Testing, 2009: 567–571), “tests currently used for condoms are surrogates for the challenges they face in use,” including a test for holes, an inflation test, a package seal test, and tests of dimensions and lubricant quality. The investigators developed a new test that adds cyclic strain to a level well below breakage and determines the number of cycles to break. A sample of 20 condoms of one particular type resulted in a sample mean number of 1584 and a sample standard deviation of 607. Calculate and interpret a confidence interval at the 99% confidence level for the true average number of cycles to break. (Note: The article presented the results of hypothesis tests based on the t distribution; the validity of these depends on assuming normal population distributions.)

41. Ultra high performance concrete (UHPC) is a relatively new construction material that offers strong adhesive properties with other materials. The authors of “Adhesive Power of Ultra High Performance Concrete from a Thermodynamic Point of View” (J. of Materials in Civil Engr., 2012: 1050–1058) investigated the intermolecular forces for UHPC connected to various substrates. As reported in the article, here are the work of adhesion measurements (in mJ/m²) for five samples of UHPC adhered to steel:

107.1 109.5 107.4 106.8 108.1

a. Is it plausible that the given sample observations …

42. … contained the following observations on degree of polymerization for paper specimens for which viscosity times concentration fell in a certain range:

418 421 421 422 425 427
431 434 437 439 446 447
448 453 454 463 465

a. Construct a boxplot of the data and comment on any interesting features.
b. Is it plausible that the given sample observations were selected from a normal distribution?
c. Calculate a two-sided 95% confidence interval for true average degree of polymerization (as did the authors of the article). Does the interval suggest that 440 is a plausible value for true average degree of polymerization? What about 450?

43. Haven’t you always wanted to own a Porsche? We investigated the Boxster (their cheapest model) and performed an online search at www.cars.com on December 30, 2012. Asking prices were well beyond our meager professorial salaries, so instead we focused on odometer readings (mileage). Here are reported readings for a sample of 16 Boxsters:

1445    25,822   26,892   29,860
35,285  47,874   49,544   64,763
72,698  75,732   84,457   91,577
93,000  109,538  113,399  137,652

A normal quantile plot supports the assumption that mileage is at least approximately normally distributed. The R software reports the following summary statistics for this data:

> summary(odometer, digits=6)
    Min  1st Qu  Median    Mean
 1445.0 33928.8 68730.5 66221.1
44. A new concrete structure that experiences cracking within the first seven days after setting is often said to have experienced “early-age cracking.” This is usually a precursor to later-age cracking and other problems that lead to an overall weakening of the structure. According to the article “Early-Age Cracking Tendency and Ultimate Degree of Hydration of Internally Cured Concrete” (J. of Materials in Civil Engr., 2012: 1025–1033), more than 60% of surveyed transportation agencies regard early-age transverse cracking to be problematic. The authors investigated the effectiveness of a process known as internal curing to mitigate early-age cracking of bridge deck concretes.
One important mechanical property of concrete is its modulus of elasticity (in GPa), which is the material’s tendency to be deformed elastically when subjected to an applied force. A higher modulus of elasticity indicates a stiffer material. As reported in the article, the following are modulus of elasticity measurements for seven specimens of internally cured concrete that have been set for one week:

27.0 25.5 28.5 34.0 31.0 34.5 32.5

a. Is it plausible that this sample was selected from a normal population distribution?
b. Estimate true average modulus of elasticity for these mixtures in a way that conveys information about precision and reliability.
c. Predict the modulus of elasticity for a single mixture in a way that conveys information about precision and reliability. How does the prediction compare to the estimate calculated in part (b)?

45. The article “Concrete Pressure on Formwork” (Mag. of Concrete Res., 2009: 407–417) gave the following observations on maximum concrete pressure (kN/m²):

33.2 41.8 37.3 40.2 36.7
39.1 36.2 41.8 36.0 35.2
36.7 38.9 35.8 35.2 40.1

a. Is it plausible that this sample was selected from a normal population distribution?
b. SAS reports the following summary information for this data:

The MEANS Procedure
Analysis Variable : pressure
Lower 95%     Upper 95%
CL for Mean   CL for Mean   Mean         Std Error
36.1892782    39.0373884    37.6133333   0.6639612

Calculate a two-sided 95% confidence interval for the population mean of maximum pressure and confirm the lower and upper endpoints reported by SAS.
c. Calculate an upper confidence bound with confidence level 95% for the population mean of maximum pressure.
d. Calculate an upper prediction bound with level 95% for the maximum pressure of a single observation. How does the prediction compare to the estimate calculated in part (b)?

46. A study of the ability of individuals to walk in a straight line (“Can We Really Walk Straight?” Amer. J. of Physical Anthro., 1992: 19–27) reported the accompanying data on cadence (strides per second) for a sample of n = 20 randomly selected healthy men:

.95 .85 .92 .95 .93 .86 1.00 .92 .85 .81
.78 .93 .93 1.05 .93 1.06 1.06 .96 .81 .96

A normal quantile plot gives substantial support to the assumption that the population distribution of cadence is approximately normal. A descriptive summary of the data from Minitab follows:

Variable  N   Mean    Median  TrMean  StDev   SEMean
cadence   20  0.9255  0.9300  0.9261  0.0809  0.0181
Variable  Min     Max     Q1      Q3
cadence   0.7800  1.0600  0.8525  0.9600

a. Calculate and interpret a 95% confidence interval for population mean cadence.
b. Calculate and interpret a 95% prediction interval for the cadence of a single individual randomly selected from this population.
c. Calculate an interval that includes at least 99% of the cadences in the population distribution using a confidence level of 95%.

47. A sample of 25 pieces of laminate used in the manufacture of circuit boards was selected and the amount of warpage (in.) under particular conditions was determined for each piece, resulting in a sample mean warpage of .0635 and a sample standard deviation of .0065.
a. Calculate a prediction for the amount of warpage of a single piece of laminate in a way that provides information about precision and reliability.
b. Calculate an interval for which you can have a high degree of confidence that at least 95% of all pieces of laminate result in amounts of warpage that are between the two limits of the interval.

48. A more extensive tabulation of t critical values than what appears in this book shows that for the t distribution with 20 df, the areas to the right of the values .687, .860, and 1.064 are .25, .20, and .15, respectively. What is the confidence level for each of the following three confidence intervals for the mean μ of a normal population distribution? Which of the three intervals would you recommend be used, and why?
a. (x̄ − .687s/√21, x̄ + 1.725s/√21)
b. (x̄ − .860s/√21, x̄ + 1.325s/√21)
c. (x̄ − 1.064s/√21, x̄ + 1.064s/√21)

7.5 Intervals for μ1 − μ2 Based on Normal Population Distributions
Proposition  Consider two normal distributions with mean values μ1 and μ2, respectively. Suppose
a random sample of size n1 is selected from the first distribution, resulting in a sample
mean of x̄1 and a sample standard deviation of s1. A random sample from the second
distribution, selected independently of that from the first one, yields sample mean x̄2
and sample standard deviation s2. Then the standardized variable

t = (x̄1 − x̄2 − (μ1 − μ2)) / √(s1²/n1 + s2²/n2)

has approximately a t distribution with df estimated from the sample by the following
formula:

df = [(se1)² + (se2)²]² / [ (se1)⁴/(n1 − 1) + (se2)⁴/(n2 − 1) ]

where se1 = s1/√n1 and se2 = s2/√n2.
t critical values corresponding to the most frequently used confidence levels appear in
Appendix Table IV.
The standardized variable in the box is identical to the one used in our previous de-
velopment of the large-sample interval; it is labeled t here simply to emphasize that it
now has approximately a t rather than a z distribution. The only difference between the
formulas for the two intervals is that the formula here uses a t critical value instead of a z
critical value. Separate normal quantile plots of the observations in the two samples can
be used as a basis for checking that the normality assumption is plausible.
Example 7.13 Which way of dispensing champagne, the traditional vertical method or a tilted beer-
like pour, preserves more of the tiny gas bubbles that improve flavor and aroma? The
following data was reported in the article “On the Losses of Dissolved CO2 during
Champagne Serving” (J. Agr. Food Chem., 2010: 8768–8775).
Temp (°C) Type of Pour n Mean (g/L) SD
18 Traditional 4 4.0 .5
18 Slanted 4 3.7 .3
12 Traditional 4 3.3 .2
12 Slanted 4 2.0 .3
Assuming the sampled distributions are normal, let’s calculate confidence intervals
for the difference between true average dissolved CO2 loss for the traditional pour
and that for the slanted pour at each of the two temperatures. For the 18°C temperature, the number of degrees of freedom for the interval is

df = (.5²/4 + .3²/4)² / [(.5²/4)²/3 + (.3²/4)²/3] = .007225/.00147083 = 4.91

Rounding down, the CI will be based on 4 df. For a confidence level of 99%, Appendix Table IV gives t critical value = 4.604. The desired interval is

4.0 − 3.7 ± (4.604)√(.5²/4 + .3²/4) = .3 ± (4.604)(.2915) = .3 ± 1.3 = (−1.0, 1.6)
Thus, we can be highly confident that −1.0 < μ₁ − μ₂ < 1.6, where μ₁ and μ₂ are the true average losses for the traditional and slant methods, respectively. Notice that this CI contains 0; so at the 99% confidence level, it is plausible that μ₁ − μ₂ = 0, that is, that μ₁ = μ₂. Note that if the μ₁ and μ₂ labels had been reversed, the resulting interval would have been (−1.6, 1.0), with exactly the same interpretation.
7.5 Intervals for μ₁ − μ₂ Based on Normal Population Distributions 329
The df formula for the 12°C comparison yields df = .00105625/.00020208 = 5.23. The required df is 5, and Appendix Table IV gives t critical value = 4.032 for a 99% CI. The resulting interval is (.6, 2.0). Thus, 0 is not a plausible value for this difference. It appears from the CI that the true average loss when the slant method is
used is smaller than that when the traditional method is used, so the slant method is
better at this temperature. This, in fact, was the conclusion reported in the popular
media.
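The two intervals computed in this example can be checked numerically. The sketch below is not from the text: welch_ci is a hypothetical helper name, and SciPy is assumed to be available for the t critical values (the text uses Appendix Table IV instead).

```python
import math
from scipy import stats  # assumed available for t critical values

def welch_ci(m1, s1, n1, m2, s2, n2, conf=0.99):
    """Two-sample t CI for mu1 - mu2 with estimated df, rounded down."""
    v1, v2 = s1**2 / n1, s2**2 / n2          # squared standard errors
    df = math.floor((v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1)))
    t_crit = stats.t.ppf(1 - (1 - conf) / 2, df)
    margin = t_crit * math.sqrt(v1 + v2)
    return (m1 - m2) - margin, (m1 - m2) + margin, df, t_crit

# 18 deg C: traditional vs. slanted pour
lo, hi, df, tc = welch_ci(4.0, 0.5, 4, 3.7, 0.3, 4)
print(df, round(tc, 3), round(lo, 1), round(hi, 1))   # 4 4.604 -1.0 1.6

# 12 deg C
lo, hi, df, tc = welch_ci(3.3, 0.2, 4, 2.0, 0.3, 4)
print(df, round(tc, 3), round(lo, 1), round(hi, 1))   # 5 4.032 0.6 2.0
```

Both runs reproduce the df, critical value, and interval endpoints shown above.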
There is a special confidence interval formula for the case of normal population distributions having σ₁ = σ₂. It is called the pooled t confidence interval; "pooled"
refers to the fact that s1 and s2 are combined to estimate the common population stan-
dard deviation. Recent studies have shown that the behavior of this interval is rather
sensitive to the assumption of equal population standard deviations. If they are not in
fact the same, the actual confidence level may be quite different from the nominal
level (e.g., the actual level may deviate substantially from an assumed 95% level). For
this reason we recommend the use of the two-sample t interval we have described un-
less there is compelling evidence for at least approximate equality of the population
standard deviations.
μd = μ₁ − μ₂

where μ₁ is the population mean value of all first numbers within pairs and μ₂ is defined similarly for all second numbers. The importance of this relationship is that if we can obtain a CI for μd, it will also be a CI for μ₁ − μ₂. A CI for μd can be calculated from the differences for pairs in the sample. In particular, if the population distribution of
the differences can be assumed to be normal, then a one-sample t interval based on the
sample differences is appropriate.
d̄ ± (t critical value) · s_d/√n

where d̄ and s_d are the sample mean and standard deviation of the n differences. The t critical value is based on n − 1 df. If n is large, the Central Limit Theorem ensures the validity of the interval without the normality assumption.
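The interval in the box can be sketched in a few lines. The data below is hypothetical (not from the text), paired_t_ci is our own helper name, and SciPy is assumed for the t critical value.

```python
import math
from statistics import fmean, stdev
from scipy import stats  # assumed available for the t critical value

def paired_t_ci(before, after, conf=0.95):
    """One-sample t interval applied to the within-pair differences."""
    d = [b - a for b, a in zip(before, after)]
    n, dbar, sd = len(d), fmean(d), stdev(d)
    t_crit = stats.t.ppf(1 - (1 - conf) / 2, n - 1)
    margin = t_crit * sd / math.sqrt(n)
    return dbar - margin, dbar + margin

# hypothetical paired measurements (not from the text)
before = [12.1, 13.4, 11.8, 14.2, 12.9, 13.7]
after = [11.8, 13.0, 11.9, 13.6, 12.4, 13.1]
lo, hi = paired_t_ci(before, after)
print(round(lo, 3), round(hi, 3))
```

Because the differences, not the raw pairs, drive the interval, only their mean and standard deviation enter the computation.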
Example 7.14 Example 7.10 in the previous section gave data on the modulus of elasticity obtained
1 minute after loading in a certain configuration. The cited article also gave the
values of modulus of elasticity obtained 4 weeks after loading for the same lumber
specimens. The data is presented here.
The normal quantile plot of the differences shown in Figure 7.11 appears to be
reasonably straight, though the point on the far left deviates somewhat from a line
determined by the other points. (Use of a formal inferential procedure presented in
Chapter 8 indicates that it is reasonable to assume that the population distribution of
the differences is approximately normal.)
Figure 7.11 Normal quantile plot of the differences from Example 7.14
7.5 Exercises 333
output for comparing a laboratory method to a new relatively quick and inexpensive field method (from the article "Evaluation of a New Field Measurement Method for Arsenic in Drinking Water Samples," J. of Envir. Engr., 2008: 382–388).

Two-Sample T-Test and CI
Sample N Mean StDev SE Mean
1 3 19.70 1.10 0.64
2 3 10.90 0.60 0.35
Estimate for difference: 8.800
95% CI for difference: (6.498, 11.102)

Calculate a two-sided 95% confidence interval for the difference in population means and confirm the lower and upper endpoints reported by Minitab. Based on the interval, what conclusion can you draw about the two methods? Why?

54. Suppose not only that the two population or treatment response distributions are normal but also that they have equal variances. Let σ² denote the common variance. This variance can be estimated by a "pooled" (i.e., combined) sample variance as follows:

s_p² = [(n₁ − 1)/(n₁ + n₂ − 2)] s₁² + [(n₂ − 1)/(n₁ + n₂ − 2)] s₂²

(n₁ + n₂ − 2 is the sum of the df's contributed by the two samples). It can then be shown that the standardized variable

t = [(x̄₁ − x̄₂) − (μ₁ − μ₂)] / [s_p √(1/n₁ + 1/n₂)]

has a t distribution with n₁ + n₂ − 2 df.
a. Use the t variable above to obtain a pooled t confidence interval formula for μ₁ − μ₂.
b. A sample of ultrasonic humidifiers of one particular brand was selected for which the observations on maximum output of moisture (oz) in a controlled chamber were 14.0, 14.3, 12.2, and 15.1. A sample of the second brand gave output values 12.1, 13.6, 11.9, and 11.2 ("Multiple Comparisons of Means Using Simultaneous Confidence Intervals," J. of Quality Technology, 1989: 232–241). Use the pooled t formula from part (a) to estimate the difference between true average outputs for the two brands with a 95% confidence interval.
c. Estimate the difference between the two μ's using the two-sample t interval discussed in this section, and compare it to the interval of part (b).

55. Along any major freeway we often encounter service (or logo) signs that give information on attractions, camping, lodging, food, and gas services in advance of the off-ramp that leads to such services. These signs typically do not provide information on distances. Researchers in Virginia, with cooperation from the Virginia Department of Transportation, performed an experiment to see if the addition of distance information on the service signs would affect drivers. The results of this experiment were reported in "Evaluation of Adding Distance Information to Freeway-Specific Service (Logo) Signs" (J. of Transp. Engr., 2011: 782–788).
In one investigation, the authors selected six sites along Virginia interstate highways where service signs are posted. For each site, crash data was obtained for a three-year period before distance information was added to the service signs and for a one-year period afterward. The number of crashes per year before and after the sign changes were made are given here:

Before: 15 26 66 115 62 64
After: 16 24 42 80 78 73

a. Calculate a confidence interval for the population mean difference in the number of crashes per year before and after the sign changes were made. Provide an interpretation for this interval.
b. If a seventh site were to be randomly selected among locations bearing service signs, between what values would you predict the difference in number of crashes to lie?

56. Lactation promotes a temporary loss of bone mass to provide adequate amounts of calcium for milk production. The paper "Bone Mass Is Recovered from Lactation to Postweaning in Adolescent Mothers with Low Calcium Intakes" (Amer. J. of Clinical Nutr., 2004: 1322–1326) gave the following data on total body bone mineral content (TBBMC) (g) for
a sample both during lactation (L) and in the postweaning period (P).

Subject L P
1 1928 2126
2 2549 2885
3 2825 2895
4 1924 1942
5 1628 1750
6 2175 2184
7 2114 2164
8 2621 2626
9 1843 2006
10 2541 2627

a. Construct a comparative boxplot of TBBMC for the lactation and postweaning periods and comment on any interesting features.
b. Estimate the difference between true average TBBMC for the two periods in a way that conveys information about precision and reliability. Does it appear plausible that the true average TBBMCs for the two periods are identical? Why or why not?

57. The paper "Quantitative Assessment of Glenohumeral Translation in Baseball Players" (Amer. J. of Sports Med., 2004: 1711–1715) considered various aspects of shoulder motion for a sample of pitchers and another sample of position players [glenohumeral refers to the articulation between the humerus (ball) and the glenoid (socket)]. The authors kindly supplied the following data (for 19 position players and 17 pitchers) on anteroposterior translation (mm), a measure of the extent of anterior and posterior motion, for both dominant and nondominant arms.

Subject Pos Dom Tr Pos ND Tr Pit Dom Tr Pit ND Tr
1 30.31 32.54 27.63 24.33
2 44.86 40.95 30.57 26.36
3 22.09 23.48 32.62 30.62
4 31.26 31.11 39.79 33.74
5 28.07 28.75 28.50 29.84
6 31.93 29.32 26.70 26.71
7 34.68 34.79 30.34 26.45
8 29.10 28.87 28.96 21.49
9 25.51 27.59 31.19 20.82
10 22.49 21.01 36.00 21.75
11 28.74 30.31 31.58 28.32
12 27.89 27.92 32.55 27.22
13 28.48 27.85 29.56 28.86
14 25.60 21.95 28.64 28.58
15 20.21 21.59 28.58 27.15
16 33.77 32.48 31.99 29.46
17 32.59 32.48 27.16 21.26
18 32.60 31.61
19 29.30 27.46

a. Estimate the true average difference in translation between dominant and nondominant arms for pitchers in a way that conveys information about reliability and precision. Interpret the resulting estimate.
b. Repeat part (a) for position players.
c. The authors asserted that "pitchers have greater difference in side-to-side anteroposterior translation of their shoulders compared with position players." Do you agree? Explain.

58. Dentists make many people nervous (even more so than statisticians!). To assess any effect of such nervousness on blood pressure, the systolic blood pressure of each of 60 subjects was measured both in a dental setting and in a medical setting ("The Effect of the Dental Setting on Blood Pressure Measurement," Amer. J. of Public Health, 1983: 1210–1214). For each subject, the difference between dental setting pressure and medical setting pressure was computed; the resulting sample mean difference and sample standard deviation of the differences were 4.47 and 8.77, respectively. Estimate the true average difference between blood pressures for these two settings using a 99% confidence interval. Does it appear that the true average pressure is different in a dental setting than in a medical setting?

59. Antipsychotic drugs are widely prescribed for conditions such as schizophrenia and bipolar disease. The article "Cardiometabolic Risk of Second-Generation Antipsychotic Medications During First-Time Use in Children and Adolescents" (J. of the Amer. Med. Assoc., 2009: 1765–1773) reported on body composition and metabolic changes for individuals who had taken various antipsychotic drugs for short periods of time.
a. The sample of 41 individuals who had taken aripiprazole had a mean change in total cholesterol (mg/dL) of 3.75, and the estimated standard error s_d/√n was 3.878. Calculate a confidence interval with confidence level approximately 95% for the true average increase in total cholesterol under these circumstances (the cited article included this CI).
b. For the sample of 45 individuals who had taken olanzapine, the article reported (7.38, 9.69) as a 95% CI for true average weight gain (kg). What is a 99% CI?

7.6 Other Topics in Estimation (Optional)
Example 7.15 In a random sample of ten electronic components, suppose that the first, third, and tenth components fail to function correctly when tested. Using the 0–1 coding scheme introduced in Section 5.6, we can write the data in this sample as x₁ = 1, x₂ = 0, x₃ = 1, x₄ = 0, . . . , x₁₀ = 1, where a "0" indicates that the component functioned correctly and a "1" indicates that it did not work correctly.
Since this data comes from a random sample, we can assume that the outcome involving the first item sampled is independent of the outcome involving the second component sampled, and so forth. Therefore, if π denotes the unknown proportion of defective components in the manufacturing process from which the sample was obtained, then the probability of getting the particular sample can be written as

P(x₁ = 1 and x₂ = 0 and x₃ = 1 and … and x₁₀ = 1)
= P(x₁ = 1) · P(x₂ = 0) · P(x₃ = 1) ··· P(x₁₀ = 1)
= π(1 − π)π ··· π = π³(1 − π)⁷

The expression π³(1 − π)⁷ represents the likelihood of our sample result occurring, and it is abbreviated as L(π) = π³(1 − π)⁷. We now ask, For what value of π is the observed sample most likely to have occurred? That is, we want to find the value of π that maximizes the probability π³(1 − π)⁷. This requires setting the derivative of L(π) equal to 0 and solving for π. However, to simplify the calculations, we first take the natural logarithm of L(π) = π³(1 − π)⁷:

ln(L(π)) = ln[π³(1 − π)⁷] = 3 ln(π) + 7 ln(1 − π)

and then take the derivative¹:

(d/dπ) ln(L(π)) = 3/π − 7/(1 − π)

¹ Since ln(x) is an increasing function of x, the value of π that maximizes ln(L(π)) will be the same value that maximizes L(π).
Setting this expression equal to 0 and solving for π, we find that the solution equals 3/10 = .30. The value .30 is said to be the maximum likelihood estimate of the process proportion defective π. Notice that this estimate happens to be the ratio of the number of defective components in the sample divided by the sample size, that is, the sample proportion, p. In fact, this is true in general, regardless of the particular sample data, so we can also say that the sample proportion is a maximum likelihood estimator for a population or process proportion.
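The maximization in this example can also be done numerically instead of by calculus. A minimal sketch (the numeric approach itself is not in the text), assuming SciPy's minimize_scalar is available:

```python
import math
from scipy.optimize import minimize_scalar  # assumed available

def neg_log_lik(p):
    # negative of ln(L(p)) = 3 ln(p) + 7 ln(1 - p) for 3 defectives in 10
    return -(3 * math.log(p) + 7 * math.log(1 - p))

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(round(res.x, 4))   # 0.3
```

The numeric maximizer agrees with the calculus solution 3/10 = .30.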
The technique in the previous example can be put into a general form that applies to any mass or density function. Let f(x) denote either a mass or density function that is defined by a set of parameters θ₁, θ₂, …, θ_k. Given the data x₁, x₂, x₃, . . . , xₙ in any random sample from a population whose distribution is described by f(x), we form the likelihood function

L(θ₁, θ₂, …, θ_k) = f(x₁) · f(x₂) · f(x₃) ··· f(xₙ)

where each f(xᵢ) is formed by simply substituting the ith data point xᵢ into the function f(x). When f(x) is a mass function, L can be interpreted as the probability that the sample result occurs. When f(x) is a density, L is not a probability and, in this case, we simply call it a likelihood function.
The maximum likelihood estimators of the parameters θ₁, θ₂, …, θ_k are the particular values of θ₁, θ₂, …, θ_k that maximize the function L(θ₁, θ₂, …, θ_k). The usual method for finding these parameter values is to treat L(θ₁, θ₂, …, θ_k) as a function of k variables and use calculus to find the extreme points of the function. For k = 1, ordinary differentiation is required; for k ≥ 2, partial derivatives are needed. Because L(θ₁, θ₂, …, θ_k) is a product of several functions, it is usually easier to work with its natural logarithm ln(L(θ₁, θ₂, …, θ_k)), which facilitates differentiation by converting L into a sum of functions:

ln(L(θ₁, θ₂, …, θ_k)) = ln(f(x₁)) + ln(f(x₂)) + ln(f(x₃)) + … + ln(f(xₙ))

Because ln(x) is an increasing function of x, the values of θ₁, θ₂, …, θ_k that maximize ln(L(θ₁, θ₂, …, θ_k)) are the same ones that maximize L(θ₁, θ₂, …, θ_k).
Example 7.16 The exponential distribution is commonly used to describe the lifetimes of certain products (see Example 5.12). Suppose that a sample of n = 12 electric appliances is tested continuously until each ceases to function. The length of time that each appliance lasted (in hours) follows:
10,502 9560 11,671 12,825 8987 7924
9508 8875 14,439 11,320 6549 10,654
To use maximum likelihood estimation to find the parameter λ of the exponential distribution that describes this data, we proceed as follows. Suppose x₁, x₂, x₃, …, xₙ
is any random sample from an exponential distribution with parameter λ. Since the exponential density function is of the form f(x) = λe^(−λx), the likelihood function associated with the sample data is

L(λ) = (λe^(−λx₁))(λe^(−λx₂)) ··· (λe^(−λxₙ)) = λⁿ e^(−λ Σxᵢ)

Taking logarithms,

ln(L(λ)) = n ln(λ) − λ Σxᵢ

Setting the derivative equal to 0 and solving,

(d/dλ) ln(L(λ)) = n/λ − Σxᵢ = 0    so    λ̂ = n/Σxᵢ = 1/x̄

Thus the maximum likelihood estimator of λ is λ̂ = 1/x̄. For the lifetime of the appliances, this estimate is λ̂ = 1/10,234.5 = .0000977 = 9.77 × 10⁻⁵.
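The closed-form estimate λ̂ = 1/x̄ is a one-liner to verify on the appliance data (a sketch, not from the text):

```python
# MLE of the exponential rate: lambda-hat = n / sum(x) = 1 / xbar
lifetimes = [10502, 9560, 11671, 12825, 8987, 7924,
             9508, 8875, 14439, 11320, 6549, 10654]
xbar = sum(lifetimes) / len(lifetimes)
lam_hat = 1 / xbar
print(xbar, lam_hat)   # 10234.5 and about 9.77e-05
```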
Example 7.17 Suppose x₁, x₂, x₃, …, xₙ is a random sample from a normal distribution with mean μ and variance σ². The likelihood function is

L(μ, σ²) = [(1/√(2πσ²)) e^(−½((x₁ − μ)/σ)²)] [(1/√(2πσ²)) e^(−½((x₂ − μ)/σ)²)] ··· [(1/√(2πσ²)) e^(−½((xₙ − μ)/σ)²)]
= (1/(2πσ²))^(n/2) e^(−½ Σ((xᵢ − μ)/σ)²)

Taking logarithms,

ln(L(μ, σ²)) = −(n/2) ln(2πσ²) − (1/(2σ²)) Σ(xᵢ − μ)²
Since this is a function of two variables, the partial derivatives with respect to μ and σ² must be set to 0 and the resulting two equations solved. Omitting the details, we find the maximum likelihood estimators to be

μ̂ = (1/n) Σxᵢ = x̄        σ̂² = (1/n) Σ(xᵢ − x̄)²
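These estimators, and the familiar bias-correction factor n/(n − 1), can be illustrated on a toy sample (a sketch with made-up data; normal_mles is our own helper name):

```python
def normal_mles(data):
    """MLEs for a normal sample: mu-hat = xbar, sigma2-hat uses divisor n."""
    n = len(data)
    mu_hat = sum(data) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n  # n, not n - 1
    return mu_hat, sigma2_hat

data = [2.0, 4.0, 6.0]                 # toy data, not from the text
mu_hat, sigma2_hat = normal_mles(data)
n = len(data)
s2 = sigma2_hat * n / (n - 1)          # bias-corrected by the factor n/(n - 1)
print(mu_hat, sigma2_hat, s2)          # 4.0 2.666... 4.0
```

Multiplying σ̂² by n/(n − 1) recovers the usual unbiased sample variance s².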
Example 7.18 In Example 2.18, the following data on tensile strength for multi-wall carbon nanotubes was thought to follow a Weibull distribution:

17.4 22.3 23.7 30.0 44.2 49.3 52.7 54.8 62.1
66.2 84.9 90.1 90.3 91.1 99.5 101.6 108.5 109.5
119.1 127.0 132.9 140.8 141.0 175.0 231.8 259.7

In general, let x₁, x₂, x₃, …, xₙ be a random sample from a Weibull distribution with parameters α and β and density function

f(x) = (α/βᵅ) x^(α−1) e^(−(x/β)^α)

Setting the partial derivatives of the log likelihood with respect to α and β equal to 0 yields the two equations

Σxᵢᵅ ln(xᵢ) / Σxᵢᵅ − 1/α = (1/n) Σ ln(xᵢ)        β = [(1/n) Σxᵢᵅ]^(1/α)
These two equations cannot be solved explicitly for the maximum likelihood estimates α̂ and β̂. Instead, for each sample x₁, x₂, x₃, …, xₙ, the equations must be solved using an iterative numerical procedure. For the tensile strength data, the maximum likelihood estimates are α̂ = 1.727 and β̂ = 109.304. These estimates can be obtained by using the survival package in R or by using the optimization procedure PROC NLP in SAS.
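The iterative solution can also be sketched with a standard root finder. This is not the R or SAS code the text mentions; it assumes SciPy's brentq is available, solves the first equation for α, and then plugs the result into the second.

```python
import math
from scipy.optimize import brentq  # assumed available for root finding

strengths = [17.4, 22.3, 23.7, 30.0, 44.2, 49.3, 52.7, 54.8, 62.1,
             66.2, 84.9, 90.1, 90.3, 91.1, 99.5, 101.6, 108.5, 109.5,
             119.1, 127.0, 132.9, 140.8, 141.0, 175.0, 231.8, 259.7]
n = len(strengths)
mean_log = sum(math.log(x) for x in strengths) / n

def shape_eq(a):
    # first ML equation: sum(x^a ln x)/sum(x^a) - 1/a - (1/n) sum(ln x) = 0
    xa = [x ** a for x in strengths]
    num = sum(v * math.log(x) for v, x in zip(xa, strengths))
    return num / sum(xa) - 1 / a - mean_log

alpha_hat = brentq(shape_eq, 0.1, 10)                       # iterative root find
beta_hat = (sum(x ** alpha_hat for x in strengths) / n) ** (1 / alpha_hat)
print(round(alpha_hat, 3), round(beta_hat, 3))   # approximately 1.727 and 109.304
```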
As you can see from these examples, maximum likelihood estimators are not always
unbiased. In many cases, however, this bias can be removed by using a simple mul-
tiplicative correction factor. In Example 7.17, for instance, the maximum likelihood
estimator of σ² in a normal distribution is slightly biased, but that bias can be corrected by simply multiplying the estimator by the factor n/(n − 1). Note that as n increases,
the bias becomes negligible and the correction factor is essentially equal to 1. Beyond
some slight problems with unbiasedness, maximum likelihood estimators have several
properties that make them highly useful in practice. The two most important properties
are listed in the following box.
Example 7.19 In Example 7.16, we showed that the MLE of λ in an exponential distribution is λ̂ = 1/x̄. Since the mean μ of an exponential distribution is related to λ by the equation μ = 1/λ, the MLE of μ is simply 1/λ̂ = x̄. That is, given g(λ) = 1/λ, then since λ̂ is the MLE for λ, g(λ̂) is the MLE for g(λ).
Density Estimation
In many applications, populations or processes can be described by normal density curves. Given a random sample of size n from a normal population, the density curve can be approximated by simply using the sample statistics x̄ and s in place of the parameters μ and σ in the formula for the density curve:

f(x) = (1/√(2πs²)) e^(−½((x − x̄)/s)²)
Although this function can be graphed by itself, it is often good practice to superimpose a plot of f(x) over a histogram of the sample data from which x̄ and s were calculated.
When the bars in the histogram represent densities (see p. 19), the graph of f (x) will
be of the same scale as the histogram, because both will have a total area of 1. When
the histogram bars are simply frequencies, then f (x) must be multiplied by an appropri-
ate factor so that its area coincides with the area under the histogram. If w denotes the
width of each histogram bar and there are n data points in the sample, then the total
area encompassed by a frequency histogram is w · n. Therefore, to make the approximate density function plot correctly over such a histogram, we must plot the function w · n · f(x) instead of f(x).
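The w · n scaling can be checked numerically: the scaled curve should enclose the same total area, w · n, as the frequency histogram. A sketch with made-up data (scaled_normal_density is our own helper name):

```python
import math

def scaled_normal_density(data, w):
    """Return g(x) = w * n * f(x), where f is the normal density with the
    sample mean and standard deviation plugged in."""
    n = len(data)
    xbar = sum(data) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))
    def g(x):
        f = math.exp(-0.5 * ((x - xbar) / s) ** 2) / math.sqrt(2 * math.pi * s * s)
        return w * n * f
    return g

# toy sample with histogram bar width w = 2; area under g should be w * n = 14
data = [3.1, 4.5, 5.2, 5.9, 6.4, 7.0, 8.3]
g = scaled_normal_density(data, w=2)
area = sum(g(i * 0.01) * 0.01 for i in range(-2000, 2001))   # integrate -20..20
print(round(area, 2))   # close to 14.0
```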
To put a normal density curve around each point in a set of data x₁, x₂, x₃, …, xₙ, we must determine the appropriate mean and variance to use. Let

s² = (1/(n − 1)) Σᵢ₌₁ⁿ (xᵢ − x̄)²

denote the sample variance of the n data points; the normal density centered at xᵢ has a mean and standard deviation of

μ = xᵢ        σ = λs

where λ is a positive number called the smoothing parameter or window width. The smoothing parameter controls the spread of each of the normal distributions centered at the data points. These distributions have densities defined by

fᵢ(x) = (1/(λs√(2π))) e^(−½((x − xᵢ)/(λs))²)        for −∞ < x < ∞
Averaging these n densities produces the kernel function

k(x) = (1/n) Σᵢ₌₁ⁿ fᵢ(x)        for −∞ < x < ∞
Example 7.20 The tragedy that befell the space shuttle Challenger and its astronauts in 1986 led to
a number of studies to investigate the reasons for mission failure. Attention quickly
focused on the behavior of the rocket engine's O-rings. Here is data consisting of observations on x = O-ring temperature (°F) for each test firing or actual launch of the
shuttle rocket engine (Presidential Commission on the Space Shuttle Challenger
Accident, Vol. 1, 1986: 129–131).
31 40 45 49 52 53 57 58 58
60 61 61 63 66 67 67 67 67
68 69 70 70 70 70 72 73 75
75 76 76 78 79 80 81 83 84
For the O-ring data, the sample standard deviation is s = 12.159, and with smoothing parameter λ = .5 each density curve has standard deviation λs = 6.0795. Starting with the smallest observation, x₁ = 31, we create a normal density curve with mean μ = 31 and σ = λs = 6.0795:

f₁(x) = (1/(6.0795√(2π))) e^(−½((x − 31)/6.0795)²)

Proceeding to the next largest data point, x₂ = 40, we create a density curve with mean μ = 40 and σ = λs = 6.0795:

f₂(x) = (1/(6.0795√(2π))) e^(−½((x − 40)/6.0795)²)
After continuing in this manner through all n 5 36 data points, we take the
average of all 36 density functions to form the kernel function k(x). Fig-
ure 7.13(a) shows the plot of k(x) along with a histogram of the O-ring data. For
comparison, Figure 7.13(b) shows a kernel function based on a value of λ = .2. Although the choice of λ is subjective, the value of λ = .5 provides a smoother fit to the data.
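The kernel function k(x) can be computed directly from its definition (a sketch, not from the text; kernel is our own name):

```python
import math

temps = [31, 40, 45, 49, 52, 53, 57, 58, 58, 60, 61, 61, 63, 66, 67, 67,
         67, 67, 68, 69, 70, 70, 70, 70, 72, 73, 75, 75, 76, 76, 78, 79,
         80, 81, 83, 84]
n = len(temps)
xbar = sum(temps) / n
s = math.sqrt(sum((x - xbar) ** 2 for x in temps) / (n - 1))

def kernel(x, lam=0.5):
    """k(x): the average of n normal densities, each centered at a data
    point with standard deviation lam * s."""
    sd = lam * s
    total = sum(math.exp(-0.5 * ((x - xi) / sd) ** 2) for xi in temps)
    return total / (n * sd * math.sqrt(2 * math.pi))

print(round(0.5 * s, 4))   # about 6.0795, as in the text
```

Since k(x) averages n proper densities, it integrates to 1 and can be plotted over a density-scale histogram without further scaling.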
Figure 7.13 Kernel functions fit to the O-ring data of Example 7.20: (a) λ = .5; (b) λ = .2
Example 7.21 In Example 7.3 (Section 7.2), the large-sample confidence interval formula x̄ ± 1.96s/√n was used to find a 95% confidence interval for the mean breakdown voltage (in kV) for a particular electronic circuit. Using the sample of n = 48 observations, this interval was determined to be (53.2, 56.2). For comparison, we now use this data to find a 95% bootstrap interval for the mean.
A histogram of B = 1000 samples, drawn with replacement from the original sample, is shown in Figure 7.14. Since 1 − α = .95, the upper and lower endpoints of the confidence interval are found by counting in B(α/2) = 1000(.05/2) = 25
units from each end of the sorted list of 1000 sample means. For the sample
means shown in Figure 7.14, the 25th largest value is 53.21 and the 975th largest
value is 56.13, giving a confidence interval of (53.2, 56.1). Note how close the
bootstrap interval is to the earlier interval (53.2, 56.2). This is not an accident.
Bootstrap intervals usually agree closely with traditional confidence intervals
when all the assumptions necessary for the traditional interval are met.
Figure 7.14 B = 1000 bootstrapped sample means from the data in Example 7.3
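The percentile counting described above can be sketched as follows. Because the Example 7.3 data is not listed in this section, the sample below is hypothetical (generated to resemble breakdown-voltage readings); bootstrap_mean_ci is our own helper name.

```python
import random
import statistics

def bootstrap_mean_ci(sample, b=1000, conf=0.95, seed=1):
    """Percentile bootstrap CI for the mean: resample with replacement,
    sort the b resampled means, and count in b*(alpha/2) from each end."""
    rng = random.Random(seed)
    n = len(sample)
    means = sorted(statistics.fmean(rng.choices(sample, k=n)) for _ in range(b))
    cut = int(b * (1 - conf) / 2)        # e.g. 1000 * .025 = 25
    return means[cut], means[b - cut - 1]

# hypothetical data resembling breakdown voltages (the Example 7.3
# sample itself is not listed in this section)
gen = random.Random(7)
sample = [gen.gauss(54.7, 5.2) for _ in range(48)]
lo, hi = bootstrap_mean_ci(sample)
print(round(lo, 2), round(hi, 2))
```

Off-by-one conventions for the percentile cut vary slightly across texts; the counting here mirrors the "25 units from each end" description above.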
7.6 Exercises 345
Comments
Since its inception in 1979, the bootstrap method has been successfully applied to many
different situations, including regression and correlation analysis, as well as other ad-
vanced statistical procedures. During this time, computer power and availability have
also dramatically increased, making the bootstrap a realistic option for data analysis. It
is now relatively easy to write macros in any statistical or spreadsheet software program
to carry out bootstrap computations.
As a rule, bootstrap intervals agree fairly well with traditional confidence
interval results when the assumptions necessary for the traditional interval are met. In
those cases where the assumptions are not met (e.g., when populations are not normally
distributed), bootstrap intervals offer the additional advantage of giving more realistic
results than traditional confidence intervals. For further reading on this subject, the
book by Efron and Tibshirani entitled An Introduction to the Bootstrap offers a useful
guide to applying the bootstrap (Efron, B., and R. J. Tibshirani, An Introduction to the
Bootstrap, Chapman and Hall, New York, 1993).
60. Refer to Exercise 42 of Section 7.4.
 a. Use the bootstrap method to find a 95% bootstrap interval for the mean of the population from which the data of Exercise 42 was obtained.
 b. Compare your result in part (a) to the 95% confidence interval found in Exercise 42(c).

61. Refer to Exercise 46 of Section 7.4.
 a. Use the bootstrap method to find a 95% bootstrap interval for the mean of the population from which this data was obtained.
 b. Compare your result in part (a) to the 95% confidence interval found in Exercise 46(a).

62. In Exercise 14 (Section 7.2), the sample mean and standard deviation of the dye-layer density of aerial photographs of 69 forest trees were found to be 1.028 and .163, respectively. Because the raw data is not available, a researcher suggests using a computer to generate a random sample of 69 observations from a normal distribution whose mean and standard deviation are 1.028 and .163, respectively. If necessary, after obtaining the sample, the data are adjusted so that their sample mean and standard deviation coincide exactly with 1.028 and .163. A 95% bootstrap interval is then generated using this simulated data.
 a. Under what conditions will this procedure provide a reliable interval estimate?
 b. Use the procedure outlined in this exercise to generate a 95% bootstrap interval for the average dye-layer density.
 c. Compare your result in part (b) to the 95% confidence interval found in Exercise 14(a).

63. A random sample of n electronic assemblies is selected from a large shipment, and each assembly is tested on an automatic test station. The number x of assemblies that do not perform correctly is determined. Let π denote the proportion of assemblies in the entire shipment that are defective.
 a. In terms of x, what is the maximum likelihood estimator of π?
 b. Is the estimator in part (a) unbiased?
 c. What is the MLE of (1 − π)^5, the probability that none of the next five assemblies tested is defective?

64. Let x denote the proportion of an allotted time frame that a randomly selected worker spends performing a manufacturing task. Suppose the probability density function of x is

 f(x) = (θ + 1)x^θ for 0 ≤ x ≤ 1, and f(x) = 0 otherwise

where the value of θ must be larger than −1.
 a. Derive the maximum likelihood estimator of θ for a random sample of size n.
 b. A random sample of ten workers yielded the following data on x: .92, .79, .90, .65, .86, .47, .73, .97, .94, .77. Use this data to obtain an estimate of θ.

65. The shear strength x of a random sample of spot welds is measured. Shear strengths (in psi) are assumed to follow a normal distribution.
 a. Find the maximum likelihood estimator of the strength that is exceeded by 5% of the population of welds. That is, find a maximum likelihood estimator for the 95th percentile of the normal distribution based on a random sample of size n. Hint: Determine the relationship between the 95th percentile and μ and σ, then use the invariance property of MLEs.
 b. A random sample of ten spot-weld strengths yields the following data (in psi): 392, 376, 401, 367, 389, 362, 409, 415, 358, 375. Use the result in part (a) to find an estimate of the 95th percentile of the distribution of all weld strengths.

66. Refer to Exercise 65. Suppose the strength x of another randomly selected spot weld is measured.
 a. Find a maximum likelihood estimator of the probability that x is less than 400. That is, find the MLE for P(x < 400).
 b. Use the result in part (a) with the data from Exercise 65(b) to estimate P(x < 400).

67. A random sample x1, x2, x3,…, xn is selected from a shifted exponential distribution whose probability density function is given by

 f(x) = λe^(−λ(x − θ)) for x ≥ θ, and f(x) = 0 otherwise

When θ = 0, this probability density function reduces to the probability density function of the exponential distribution.
 a. Obtain maximum likelihood estimators of both θ and λ.
 b. In traffic flow research, time headway is defined to be the elapsed time between the moment that one car finishes passing a fixed point and the instant that the next car begins to pass that point. The random variable x = time headway has been modeled by a shifted exponential distribution. For a random sample of ten headway times—3.11, .64, 2.55, 2.20, 5.44, 3.42, 10.39, 8.93, 17.82, and 1.30—use the results from part (a) to find estimates of θ and λ.

68. A specimen is weighed twice on the same scale. Let x and y denote the two measurements. Suppose x and y are independent of one another and are assumed to follow normal distributions with the same mean μ (the true weight of the specimen) and the same variance σ².
 a. For a random sample of n specimens, show that the maximum likelihood estimator of σ² is given by (1/(4n)) Σ(xi − yi)², where (x1, y1), (x2, y2),…, (xn, yn) denote the n pairs of scale measurements. Hint: The sample variance of two measurements z1 and z2 equals (z1 − z2)²/2.
 b. Five randomly chosen specimens are weighed, yielding the following data: (3.10, 3.12), (3.52, 3.45), (4.22, 4.30), (2.98, 3.06), and (5.43, 5.38). Use the result in part (a) to find an estimate of σ².

69. Suppose someone suggests using a smoothing parameter of λ = 2 to create a kernel density graph. Do you expect the graph to provide a useful picture of the data? Why?

70. Refer to the data in Exercise 46 of Section 7.4.
 a. Use a smoothing parameter of λ = .5 to create a kernel density plot for this data.
 b. Repeat part (a) using a smoothing parameter of λ = .3.
 c. Which of the plots in parts (a) and (b) appears to fit the data better?

71. Suppose the smallest distance d between any two successive measurements in an ordered set of data (i.e., measurements sorted from smallest to largest) is 3 units.
 a. If s denotes the sample standard deviation of the measurements in a sample of size n, would λ = d/(3s) lead to a kernel density graph with a choppy appearance or a smooth appearance? Why?
 b. Will values of λ that are greater than d/(3s) lead to choppier- or smoother-looking kernel density estimates?
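The invariance property invoked in the hint to Exercise 65 says that the MLE of any function of μ and σ is that same function of the MLEs x̄ and σ̂. A minimal sketch of that calculation (the function name and the data are made-up placeholders, and z = 1.645 is the standard normal 95th percentile):

```python
import math

def mle_normal_percentile(data, z=1.645):
    """MLE of the normal percentile mu + z*sigma via the invariance property.

    Uses the MLE of sigma, which divides by n rather than n - 1.
    """
    n = len(data)
    xbar = sum(data) / n
    sigma_hat = math.sqrt(sum((x - xbar) ** 2 for x in data) / n)
    return xbar + z * sigma_hat

# Placeholder data for illustration (not the exercise data)
est = mle_normal_percentile([10.0, 12.0, 11.0, 13.0, 9.0])
```

Here x̄ = 11.0 and σ̂ = √2, so the estimated 95th percentile is 11.0 + 1.645·√2.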
Supplementary Exercises 347
72. Refer to the data in Exercise 42 of Section 7.4.
 a. Use a smoothing parameter of λ = .5 to create a kernel density plot for this data.
 b. Repeat part (a) using a smoothing parameter of λ = .3.
 c. Which of the plots in parts (a) and (b) appears to fit the data better?

73. A kernel function is fit to the data in a sample of size n. Later, a researcher realizes that the largest observation in the sample was actually a typographical error and, because the original lab data no longer exists, this data point is removed from the sample, leaving a sample of n − 1 measurements. The researcher wants to fit a new kernel function to the reduced sample of n − 1 data points. To produce a graph that has about the same smoothness as the original kernel function, will the value of λ have to be raised or lowered?

74. In Example 1.8 (Chapter 1), a histogram was fit to the energy consumption data (in BTUs) from a sample of 90 homes. Using this data, experiment with different values of λ until you find a value that gives a kernel density estimate that approximates the shape of the histogram of this data shown in Figure 1.7.
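Exercises 69 through 74 concern how the smoothing parameter controls the look of a kernel density estimate. The sketch below uses a Gaussian kernel, f̂(x) = (1/(nλ)) Σ φ((x − xi)/λ); the data values are placeholders invented for the demonstration, and the use of λ as the smoothing-parameter symbol follows the exercises above.

```python
import math

def kde(x, data, lam):
    """Gaussian kernel density estimate at x with smoothing parameter lam."""
    n = len(data)
    total = sum(math.exp(-0.5 * ((x - xi) / lam) ** 2) for xi in data)
    return total / (n * lam * math.sqrt(2 * math.pi))

# Placeholder data for illustration
data = [2.1, 2.4, 2.5, 3.0, 3.1, 3.2, 4.8, 5.0, 5.1]
grid = [i * 0.1 for i in range(71)]             # evaluation points 0.0 to 7.0
smooth = [kde(x, data, lam=0.5) for x in grid]  # larger lam: smoother curve
choppy = [kde(x, data, lam=0.1) for x in grid]  # smaller lam: choppier curve
```

Plotting smooth and choppy against grid shows the trade-off: a large λ blurs the two clusters together, while a very small λ puts a narrow spike at each observation.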
Supplementary Exercises
75. Exercise 4 of Chapter 1 presented a sample of n = 153 observations on ultimate tensile strength.
 a. Obtain a lower confidence bound for population mean strength. Does the validity of the bound require any assumptions about the population distribution? Explain.
 b. Is any assumption about the tensile strength distribution required prior to calculating a lower prediction bound for the tensile strength of the next specimen selected using the method described in this section? Explain.
 c. Use a statistical software package to investigate the plausibility of a normal population distribution.
 d. Calculate a lower prediction bound with a prediction level of 95% for the ultimate tensile strength of the next specimen selected.

76. Anxiety disorders and symptoms can often be effectively treated with benzodiazepine medications. It is known that animals exposed to stress exhibit a decrease in benzodiazepine receptor binding in the frontal cortex. The paper “Decreased Benzodiazepine Receptor Binding in Prefrontal Cortex in Combat-Related Posttraumatic Stress Disorder” (Amer. J. of Psychiatry, 2000: 1120–1126) described the first study of benzodiazepine receptor binding in individuals suffering from PTSD. The accompanying data on a receptor binding measure (adjusted distribution volume) was read from a graph in the paper.

 PTSD: 10, 20, 25, 28, 31, 35, 37, 38, 38, 39, 39, 42, 46
 Healthy: 23, 39, 40, 41, 43, 47, 51, 58, 63, 66, 67, 69, 72

 a. Is it plausible that the population distributions from which these samples were selected are normal?
 b. Calculate an interval for which you can be 95% confident that at least 95% of all healthy individuals in the population have adjusted distribution volumes lying between the limits of the interval.
 c. Predict the adjusted distribution volume of a single healthy individual by calculating a 95% prediction interval. How does this interval’s width compare to the width of the interval calculated in part (b)?
 d. Estimate the difference between the true average measures in a way that conveys information about reliability and precision.

77. The article “Quantitative MRI and Electrophysiology of Preoperative Carpal Tunnel Syndrome in a Female Population” (Ergonomics, 1997: 642–649) reported that (−473.3, 1691.9) was a large-sample 95% confidence interval for the difference between true average
thenar muscle volume (mm³) for sufferers of carpal tunnel syndrome and true average volume for nonsufferers. Calculate a 90% confidence interval for this difference.

78. Acrylic bone cement is commonly used in total joint arthroplasty as a grout that allows for the smooth transfer of loads from a metal prosthesis to bone structure. The paper “Validation of the Small-Punch Test as a Technique for Characterizing the Mechanical Properties of Acrylic Bone Cement” (J. of Engr. in Med., 2006: 11–21) gave the following data on breaking force (N):

 Temp  Medium  n  x̄       s
 37°   Dry     6  325.73  34.97
 37°   Wet     6  306.09  41.97

 Assume that all population distributions are normal.
 a. Estimate true average breaking force in a dry medium at 37° in a way that conveys information about reliability and precision. Interpret your estimate.
 b. Estimate the difference between true average breaking force in a dry medium at 37° and true average force at the same temperature in a wet medium, and do so in a way that conveys information about precision and reliability. Then interpret your estimate.

79. An experiment was carried out to compare various properties of cotton/polyester spun yarn finished with softener only and yarn finished with softener plus 5% DP-resin (“Properties of a Fabric Made with Tandem Spun Yarns,” Textile Res. J., 1996: 607–611). One particularly important characteristic of fabric is its durability, that is, its ability to resist wear. For a sample of 40 softener-only specimens, the sample mean stoll-flex abrasion resistance (cycles) in the filling direction of the yarn was 3975.0, with a sample standard deviation of 245.1. Another sample of 40 softener-plus specimens gave a sample mean and sample standard deviation of 2795.0 and 293.7, respectively. Calculate a confidence interval with confidence level 99% for the difference between true average abrasion resistances for the two types of fabric. Does your interval provide convincing evidence that true average resistances differ for the two types of fabric? Why or why not?

80. As reported by the Pew Research Center’s Social and Demographic Trends Project in September 2012, a survey of 6500 American households revealed that a record 19% owed student loan debt in 2010 (a sharp increase from the 15% that owed such debt in 2007).
 a. Calculate and interpret a 95% CI for the proportion of all American households in 2010 that owed student loan debt.
 b. What sample size is required if the desired width of the 95% CI is to be at most .04, irrespective of the sample results?
 c. Does the upper limit of the interval in part (a) specify a 95% upper confidence bound for the proportion being estimated? Explain.

81. Torsion during hip external rotation (ER) and extension may be responsible for certain kinds of injuries in golfers and other athletes. The article “Hip Rotational Velocities During the Full Golf Swing” (J. of Sports Sci. and Med., 2009: 296–299) reported on a study in which peak ER velocity and peak IR (internal rotation) velocity (both in deg·sec⁻¹) were determined for a sample of 15 female collegiate golfers during their swings. The following data was supplied by the article’s authors:

 Golfer      ER       IR     diff   z quan
  1       −130.6    −98.9   −31.7   −1.28
  2       −125.1   −115.9    −9.2   −0.97
  3        −51.7   −161.6   109.9    0.34
  4       −179.7   −196.9    17.2   −0.73
  5       −130.5   −170.7    40.2   −0.34
  6       −101.0   −274.9   173.9    0.97
  7        −24.4   −275.0   250.6    1.83
  8       −231.1   −275.7    44.6   −0.17
  9       −186.8   −214.6    27.8   −0.52
 10        −58.5   −117.8    59.3    0.00
 11       −219.3   −326.7   107.4    0.17
 12       −113.1   −272.9   159.8    0.73
 13       −244.3   −429.1   184.8    1.28
 14       −184.4   −140.6   −43.8   −1.83
 15       −199.2   −345.6   146.4    0.52

 a. Is it plausible that the differences came from a normally distributed population?
 b. Estimate the true average difference in peak ER and IR velocities in a way that conveys
information about reliability and precision. Interpret the resulting estimate.

82. It is important that face masks used by firefighters be able to withstand high temperatures. In a test of one type of mask, the lenses in 11 of the 35 masks popped out at a temperature of 250°F. Calculate a lower confidence bound for the proportion of all such masks whose lenses would pop out at this temperature using both the method suggested in Section 7.3 and the method suggested in Exercise 26(b).

83. Suppose an investigator wants a confidence interval for the median μ̃ of a continuous distribution based on a random sample x1,…, xn without assuming anything about the shape of the distribution.
 a. What is P(x1 < μ̃), the probability that the first observation is smaller than the median?
 b. What is the probability that both the first and the second observations are smaller than the median?
 c. Let yn = max{x1,…, xn}. What is P(yn < μ̃)? Hint: The condition that yn is less than μ̃ is equivalent to what about x1,…, xn?
 d. With y1 = min{x1,…, xn}, what is P(μ̃ < y1)?
 e. Using the results of parts (c) and (d), what is P(y1 < μ̃ < yn)? Regarding (y1, yn) as a confidence interval for μ̃, what is the associated confidence level?
 f. An experiment carried out to study the curing time (hr) for a particular experimental adhesive yielded the following observations:

  31.2 36.0 31.5 28.7 37.2
  35.4 33.3 39.3 42.0 29.9

 Referring back to part (e), determine the confidence interval and the associated confidence level.
 g. Assuming that the data in part (f) was selected from a normal distribution (is this assumption justified?), calculate a confidence interval for μ (which for a normal distribution is identical to μ̃) using the same confidence level as in part (f), and compare the two intervals.

84. Consider the situation described in Exercise 83.
 a. What is P(x1 < μ̃, x2 > μ̃, x3 > μ̃,…, xn > μ̃), that is, the probability that only the first observation is smaller than the median and all others exceed the median?
 b. What is the probability that only x2 is smaller than the median and all other n − 1 observations exceed the median?
 c. What is the probability that exactly one of the xi’s is less than μ̃?
 d. What is P(μ̃ < y2), where y2 denotes the second smallest xi? Hint: μ̃ < y2 occurs if either all n of the observations exceed the median or all but one of the xi’s does.
 e. With yn−1 denoting the second largest xi, what is P(μ̃ > yn−1)?
 f. Using the results of parts (d) and (e), what is P(y2 < μ̃ < yn−1)? What does this imply about the confidence level associated with the interval (y2, yn−1)? Determine the interval and associated confidence level for the data given in Exercise 83.

85. Suppose we have obtained a random sample x1,…, xn from a continuous distribution and wish to use it as a basis for predicting a single new observation xn+1 without assuming anything about the shape of the distribution. Let y1 and yn denote the smallest and largest, respectively, of the n sample observations.
 a. What is P(xn+1 < x1)?
 b. What is P(xn+1 < x1 and xn+1 < x2), that is, the probability that xn+1 is the smallest of these three observations?
 c. What is P(xn+1 < y1)? What is P(xn+1 > yn)?
 d. What is P(y1 < xn+1 < yn), and what does this say about the prediction level associated with the interval (y1, yn)? Determine the interval and associated prediction level for the curing time data given in Exercise 83.

86. The derailment of a freight train due to the catastrophic failure of a traction motor armature bearing provided the impetus for a study reported in the article “Locomotive Traction Motor Armature Bearing Life Study” (Lubrication Engr., Aug. 1997: 12–19). A sample of 17 high-mileage traction motors was selected and the amount of cone penetration (mm/10) was determined both for the pinion bearing and
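The interval (y1, yn) in Exercises 83 and 85 is distribution-free: its confidence level depends only on n, not on the population's shape. A quick Monte Carlo check of that claim (the choice of an exponential population here is arbitrary, and the function name is invented for the sketch):

```python
import math
import random

def minmax_coverage(n, reps=20000, seed=2):
    """Fraction of samples of size n whose (min, max) interval contains the
    population median; the population here is Exp(1), whose median is ln 2."""
    rng = random.Random(seed)
    med = math.log(2)
    hits = 0
    for _ in range(reps):
        xs = [rng.expovariate(1.0) for _ in range(n)]
        if min(xs) < med < max(xs):
            hits += 1
    return hits / reps

cov = minmax_coverage(n=10)
```

For n = 10 the simulated coverage comes out very close to 1; deriving its exact value, and seeing why the population shape is irrelevant, is the point of Exercise 83(e).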
for the commutator armature bearing, resulting in the following data:

 Motor:       1    2    3    4    5    6
 Commutator:  211  273  305  258  270  209
 Pinion:      226  278  259  244  273  236

 Motor:       7    8    9    10   11   12
 Commutator:  223  288  296  233  262  291
 Pinion:      290  287  315  242  288  242

 Motor:       13   14   15   16   17
 Commutator:  278  275  210  272  264
 Pinion:      278  208  281  274  268

Calculate an estimate of the population mean difference between penetration for the commutator armature bearing and penetration for the pinion bearing, and do so in a way that conveys information about the reliability and precision of the estimate. (Note: A normal quantile plot validates the necessary normality assumption.) Would you say that the population mean difference has been precisely estimated? Does it look as though population mean penetration differs for the two types of bearings? Explain.

87. The article cited in Exercise 86 also included the following data on percentage of oil remaining for the commutator bearings:

 71.02 86.49 81.14 84.89 87.42
 84.49 82.09 80.97 69.80 89.29
 86.10 86.80 83.41 60.56 88.80
 86.41 86.19

 Would you use the one-sample t confidence interval to estimate the population mean and median? Estimate the population median percentage of oil left using the interval suggested in Exercise 84, and determine the corresponding confidence level.

88. Wire electrical-discharge machining (WEDM) is a process used to manufacture conductive hard metal components. It uses a continuously moving wire that serves as an electrode. Coated wires have been used to substantially increase the cutting speed and precision of the process. Coating on the wire electrode allows for cooling of the wire electrode core and provides an improved cutting performance. The article “High-Performance Wire Electrodes for Wire Electrical-Discharge Machining—A Review” (J. of Engr. Manuf., 2012: 1757–1773) gave the following sample observations on total coating layer thickness (in μm) of eight wire electrodes used for WEDM:

 21 16 29 35 42 24 24 25

 a. Is it plausible that the given sample observations were selected from a normal distribution?
 b. Calculate and interpret a 95% CI for true average total coating layer thickness in all such electrodes.
 c. Predict the total coating layer thickness for a single electrode in a way that conveys information about precision and reliability.

89. Nine Australian soldiers were subjected to extreme conditions that involved a 100-min walk with a 25-lb pack when the temperature was 40°C (104°F). One of them overheated (above 39°C) and was removed from the study. Here are the rectal Celsius temperatures of the other eight at the end of the walk (“Neural Network Training on Human Body Core Temperature Data,” Combatant Protection and Nutrition Branch, Aeronautical and Maritime Research Laboratory of Australia, DSTO TN-0241, 1999):

 38.4 38.7 39.0 38.5 38.5 39.0 38.5 38.6

 We would like to get a 95% confidence interval for the population mean.
 a. Compute the t-based confidence interval of Section 7.4.
 b. Use the bootstrap method to find a 95% bootstrap interval for the population mean.
 c. Compare your results in parts (a) and (b).

90. Suppose that samples of size n1, n2, and n3 are independently selected from three different populations. Let μi and σi (i = 1, 2, 3) denote the population means and standard deviations, and consider estimating θ = a1μ1 + a2μ2 + a3μ3, where the ai’s are specified numerical constants. A point estimate of θ is θ̂ = a1x̄1 + a2x̄2 + a3x̄3. When the sample sizes are all large, θ̂ has approximately a normal distribution with variance

 σ²_θ̂ = a1²·(σ1²/n1) + a2²·(σ2²/n2) + a3²·(σ3²/n3)
An estimated variance s²_θ̂ results from replacing the σi²’s by the si²’s; θ̂ can then be standardized to obtain a z variable from which the confidence interval θ̂ ± (z crit)·s_θ̂ is obtained. Suppose that samples of three different brands of tires with identical lifetime ratings—a store brand (1) and two national brands (2 and 3)—are selected, and the lifetime of each tire is determined, resulting in the following data:

 Brand  Sample size  Sample mean  Sample standard deviation
 1      40           38,376       1522
 2      32           41,569       1711
 3      32           42,123       1645

Calculate and interpret a confidence interval with confidence level 95% for θ = μ1 − (μ2 + μ3)/2.

91. Recent information suggests that obesity is an increasing problem in America among all age groups. The Associated Press (October 9, 2002) reported that 1276 individuals in a sample of 4115 adults were found to be obese (a body mass index exceeding 30; this index is a measure of weight relative to height).
 a. Estimate the proportion of all American adults who are obese in a way that conveys information about the reliability and precision of the estimate.
 b. A 1998 survey based on people’s own assessments revealed that 20% of all adult Americans consider themselves obese. Does the estimate of part (a) suggest that the 2002 percentage is more than 1.5 times the 1998 percentage? Explain.

92. The one-sample CI for a normal mean and PI for a single observation from a normal distribution were both based on the central t distribution. A CI for a particular percentile (e.g., the 1st percentile or the 95th percentile) of a normal population distribution is based on the noncentral t distribution. A particular distribution of this type is specified by both df and the value of the noncentrality parameter δ (δ = 0 gives the central t distribution). The key result is that the variable

 t = [x̄ − μ − (z percentile)·σ] / (s/√n)

has a noncentral t distribution with df = n − 1 and δ = −(z percentile)·√n. Let t.025,ν,δ and t.975,ν,δ denote the critical values that capture lower tail area .025 and upper tail area .025, respectively, under the noncentral t curve with ν df and noncentrality parameter δ (when δ = 0, t.025 = −t.975, since central t distributions are symmetric about 0).
 a. Use the given information to obtain a formula for a 95% confidence interval for some particular percentile of a normal population distribution.
 b. For δ = 6.58 and df = 15, t.025 and t.975 are (from Minitab) 4.1690 and 10.9684, respectively. Use this information to obtain a 95% CI for the 5th percentile of the modulus of elasticity distribution considered in Example 7.10.
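The large-sample interval θ̂ ± (z crit)·s_θ̂ described in Exercise 90 is a direct computation from the summary statistics. A sketch of that arithmetic (the function name is invented for the illustration; z crit = 1.96 gives 95% confidence):

```python
import math

def linear_combo_ci(a, means, sds, ns, zcrit=1.96):
    """Large-sample CI for theta = sum(ai * mu_i), using the estimated
    variance sum(ai^2 * si^2 / ni) from the exercise's formula."""
    theta_hat = sum(ai * m for ai, m in zip(a, means))
    se = math.sqrt(sum(ai ** 2 * s ** 2 / n for ai, s, n in zip(a, sds, ns)))
    return theta_hat - zcrit * se, theta_hat + zcrit * se

# Tire-lifetime summary data from Exercise 90: theta = mu1 - (mu2 + mu3)/2
low, high = linear_combo_ci(
    a=[1, -0.5, -0.5],
    means=[38376, 41569, 42123],
    sds=[1522, 1711, 1645],
    ns=[40, 32, 32],
)
```

The numbers here give the point estimate 38376 − (41569 + 42123)/2 = −3470; interpreting the resulting interval is left to the exercise.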
Bibliography
DeGroot, Morris, and Mark Schervish, Probability and Statistics (4th ed.), Addison-Wesley, Reading, MA, 2011. A very good exposition of the general principles of statistical inference at a level somewhat above that of our book.

Devore, Jay, and Kenneth Berk, Modern Mathematical Statistics with Applications (2nd ed.), Springer, New York, 2012. An excellent survey of general concepts of inference.

Hahn, Gerald, and William Meeker, Statistical Intervals, Wiley, New York, 2011. Everything you ever wanted to know about statistical intervals—confidence, prediction, tolerance, and others.
8
Testing Statistical
Hypotheses
8.1 Hypotheses and Test Procedures
8.2 Tests Concerning Hypotheses About Means
8.3 Tests Concerning Hypotheses
About a Categorical Population
8.4 Testing the Form of a Distribution
8.5 Further Aspects of Hypothesis Testing
Introduction
Estimation of a parameter does not explicitly involve making a decision; instead we
wish to determine the most plausible value (a point estimate) or a range of plausible
values (a confidence interval). In contrast, the objective of a hypothesis-testing analysis
is to decide which of two competing claims (hypotheses) is true. We have already
encountered an informal situation of this sort in the context of quality control: At
each time point, we used sample information to decide whether a process was
out of control. The decision rule involved control limits, with the out-of-control
conclusion justified only if the value of some quality statistic fell outside the limits.
In Section 8.1, we discuss the forms of hypotheses about parameters and the
general nature of test procedures for deciding between the two relevant hypotheses.
Test procedures based on z and t distributions are developed in Section 8.2 for testing
hypotheses about a single mean μ or about the difference μ1 − μ2 between two
means. Sections 8.3 and 8.4 introduce procedures for hypotheses about certain
population proportions and population distributions. Finally, in Section 8.5, we con-
sider a variety of issues and concepts relating to the behavior of test procedures.
Hypothesis testing methods, as well as estimation methods, will be used extensively
throughout the remainder of the book.
8.1 Hypotheses and Test Procedures
Definitions The null hypothesis, denoted by H0, is the assertion that is initially assumed to be
true. The alternative hypothesis, denoted by Ha, is the claim that is contradictory to
H0. The null hypothesis will be rejected in favor of the alternative hypothesis only if
sample evidence suggests that H0 is false. If the sample does not strongly contradict
H0, we will continue to believe in the truth of the null hypothesis. The two possible
conclusions from a hypothesis-testing analysis are then reject H0 or fail to reject H0.
Example 8.1 Because of machining process variability, bearings produced by a certain machine
do not have identical diameters. Let μ denote the true average diameter for bearings cur-
rently being produced. The machine was initially calibrated to achieve the design spec-
ification μ = .5 in. However, the manufacturer is now concerned that the diameters no
longer conform to this specification. That is, the hypothesis μ ≠ .5 must now be consid-
ered a possibility. If sample evidence suggests that this latter hypothesis is indeed correct,
the production process will have to be halted while recalibration takes place. Stopping
the process is quite costly, so the manufacturer wants to be sure that recalibration is nec-
essary before this is done. Under these circumstances, a sensible choice of hypotheses is

 H0: μ = .5 (the specification is being met, so recalibration is unnecessary)
 Ha: μ ≠ .5
In many hypothesis-testing problems that we will consider, the null and alternative
hypotheses assume particular forms. H0 will be a claim that the parameter of interest
equals a specified null value. Ha then results from replacing the "=" in H0 by one of the
three possible inequalities: >, <, or ≠; the relevant inequality again depends on the
research objectives. One example of this is H0: σ = .002 versus Ha: σ < .002, where σ
is the process standard deviation of bearing diameter.
Example 8.2 A pack of a certain brand of cigarettes displays the statement "1.5 mg nicotine average
per cigarette by FTC method." Let μ denote the mean nicotine content per
cigarette for all cigarettes of this brand. The advertised claim is that μ = 1.5. People
who smoke this brand would probably be disturbed if it turned out that true average
nicotine content exceeded the claimed value, since excessive nicotine ingestion is a
known health hazard. Suppose a sample of cigarettes of this brand is selected and the
nicotine content of each cigarette is determined. Evidence from this sample against
the company’s claim would have to be quite strong before the accusation is made
that the claim is false, since serious financial and legal consequences could ensue
from any such action. This suggests that we test
H0: μ = 1.5 (the advertised claim is correct)
against the alternative hypothesis
Ha: μ > 1.5 (true average nicotine level exceeds the advertised value)
and reject H0 in favor of Ha only if sample evidence is very compelling for this conclusion.
Since the alternative hypothesis in Example 8.2 asserted that μ > 1.5, it might have
seemed sensible to state H0 as the inequality μ ≤ 1.5. This assertion is in fact the implicit
null hypothesis, but we will state H0 explicitly as a claim of equality. There are several
reasons for this. First of all, the development of a test procedure is most easily understood
if there is a unique value of μ (or σ, or whatever other parameter is under consideration)
when H0 is true. Second, suppose sample data gives much more support to μ > 1.5
than to μ = 1.5. Then there would also be more support for μ > 1.5 than for μ ≤ 1.5.
If, on the other hand, μ = 1.5 is much more plausible than μ > 1.5 in light of the data,
then μ ≤ 1.5 would also be deemed more plausible than μ > 1.5. So the conclusion
when testing H0: μ = 1.5 versus Ha: μ > 1.5 should be identical to that when considering
the more realistic null hypothesis μ ≤ 1.5 against this alternative. Similarly, whatever
conclusion is reached when testing H0: μ = .1 versus Ha: μ < .1 would also apply to
the implicit null hypothesis H0: μ ≥ .1.
No reasonable test procedure can guarantee complete protection against either type of
error; this is the price we pay for basing our inference on sample data.
Example 8.3 Suppose you have to purchase tires for your vehicle and have narrowed your choice
to a certain name-brand tire and another tire sold only through a particular chain of
stores. The name-brand tire is more expensive to purchase than the store-brand tire, but
the extra expense would be justified if the lifetime of the former significantly exceeded
that of the latter. Let μ1 denote true average tire lifetime for the brand-name tire under
specified testing conditions, and let μ2 denote true average lifetime for the store-brand
tire under these conditions. You have decided that the extra expense can be justified
only if μ1 exceeds μ2 by more than 10,000 miles, and you want to see persuasive evidence
before incurring this extra expense. The natural choice of hypotheses is then
H0: μ1 − μ2 = 10,000
Ha: μ1 − μ2 > 10,000
A type I error here involves rejecting H0 and purchasing the name-brand tire when
its true average mileage does not exceed that of the store-brand tire by more than
10,000 miles. A type II error consists of not rejecting H0 and purchasing the less
expensive tire when the true average lifetime of the name-brand tire actually does
exceed that of the store brand by more than 10,000 miles.
Recall that when sampling a population or a process, sampling variability will
virtually always be present. In particular, the value of a sample mean x̄ may be rather
different from the value of μ. In the tire situation, even if μ1 − μ2 does equal 10,000,
the name-brand tires in the sample may be unusually good and the store-brand
sample unusually bad, yielding data for which H0 should be rejected. On the other
hand, perhaps μ1 − μ2 = 12,000, so H0 is false; yet there is some chance that the
store-brand sample would be unusually good and the name-brand sample not so
impressive, suggesting that H0 should not be rejected.
definition The probability of making a type I error is denoted by α and is called the level of
significance or significance level of the test. Thus a test with α = .01 is said to
have a significance level of .01. This means that if H0 is actually true and the test
procedure is used repeatedly on different samples selected from the population
or process, in the long run H0 would be incorrectly rejected only 1% of the time.
The probability of a type II error is denoted by β.
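The long-run frequency interpretation of α can be checked by simulation. The following Python sketch is not from the text (whose software examples use Minitab and R); it repeatedly applies a two-tailed level-.01 z test, with known σ for simplicity, to samples drawn from a population for which H0 is true. All names are our own.

```python
# Monte Carlo check of the frequency interpretation of alpha: when H0 is
# true, a level-.01 z test should reject about 1% of the time.
import math
import random
import statistics

random.seed(1)          # reproducible
z_cutoff = 2.5758       # two-tailed z critical value for alpha = .01
n, reps = 50, 10_000
rejections = 0
for _ in range(reps):
    # sample from the null population: mu = mu0 = 0, known sigma = 1
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = statistics.mean(sample) * math.sqrt(n)   # (xbar - 0)/(1/sqrt(n))
    if abs(z) > z_cutoff:
        rejections += 1
rate = rejections / reps
print(rate)   # close to .01 in the long run
```

With the seed shown, the observed rejection rate lands near the nominal .01, illustrating the "incorrectly rejected only 1% of the time" statement in the definition.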
evidence to make it less likely that an innocent person will be convicted also makes it
more likely that a guilty person will go free). If a type I error is much more serious than
a type II error, a very small value of α is reasonable. When a type II error could have
quite unpleasant consequences, it is better to use a larger α to keep β under control.
This leads to the following general principle for specifying a test procedure:
After thinking about the relative consequences of type I and type II errors, decide on
the largest α that is tolerable for the situation under consideration. Then employ a test
procedure that uses this maximum acceptable value, rather than anything smaller,
as the significance level (because using a smaller level would increase β). In following
this principle, we are making β as small as possible subject to keeping a clamp on α.
Thus if you decide that α = .05 is tolerable, you should not use a test with α = .01 or .001,
because doing so would inflate β. The significance levels used most frequently in practice
are .05 and .01 (a 1-in-20 or 1-in-100 chance of rejecting H0 when it is true), but the level
that you decide to employ should reflect the seriousness of errors in your specific situation.
Having decided on a test statistic and calculated its value for the given sample, we now
ask the following key question: If H0 is true, how likely is it that a test statistic value at least as
contradictory to H0 as the one obtained would result? If the likelihood of this is very small,
then the test statistic value is quite extreme relative to what the null hypothesis suggests and
very contradictory to H0. On the other hand, if there is a large chance of a value at least this
extreme occurring when H0 is true, then what was observed is reasonably consistent with H0.
definition The P-value, or observed significance level (OSL), is the probability, calculated
assuming H0 is true, of obtaining a test statistic value at least as contradictory to H0
as the value that actually resulted. The smaller the P-value, the more contradic-
tory is the data to H0. The null hypothesis should then be rejected if the P-value
is sufficiently small. In particular, the following decision rule specifies a test with
the desired significance level (type I error probability) α:
Reject H0 if P-value ≤ α.
Do not reject H0 if P-value > α.
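Stated as code, the decision rule is a one-line comparison; a trivial Python sketch (the function name is ours, not the text's). Note that the boundary case P-value = α leads to rejection:

```python
# The level-alpha decision rule: reject H0 exactly when P-value <= alpha.
def decide(p_value, alpha):
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(decide(0.003, 0.05))   # reject H0
print(decide(0.350, 0.05))   # fail to reject H0
```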
Example 8.4 The recommended daily dietary allowance (RDA) for zinc among males older than
50 years is 15 mg/day (World Almanac, 1992). The article “Nutrient Intakes and Dietary
Patterns of Older Americans: A National Study” (J. of Gerontology, 1992: M145–M150)
reported the following data on zinc intake for a sample of males age 65–74 years:
n = 115   x̄ = 11.3   s = 6.43
Does this data suggest that μ, the average daily zinc intake for the entire population
of males age 65–74, is less than the RDA? The relevant hypotheses are
H0: μ = 15
Ha: μ < 15
Figure 8.1 shows a boxplot of data consistent with the given summary quantities.
Roughly 75% of the sample observations are smaller than 15 (the top edge of the box is
at the upper quartile). Furthermore, the observed x̄ value, 11.3, is certainly smaller than
15, but this could be just the result of sampling variability when H0 is true. Is it plausible
that a sample mean this much smaller than what was expected if H0 were true occurred
as a result of chance variation, or is μ < 15 a better explanation for what was observed?
Figure 8.1 A boxplot of the zinc intake data (vertical axis: zinc intake, roughly 10 to 30 mg/day)
In Example 8.4, given that the alternative hypothesis asserted μ < 15, it might
seem reasonable to state H0 as μ ≥ 15, previously referred to as the implicit null hypothesis.
However, our null hypothesis is explicitly stated as a claim of equality (H0: μ = 15).
On page 355 we asserted that the conclusion using H0: μ = 15 versus Ha: μ < 15 would be
identical to that when considering H0: μ ≥ 15 versus Ha: μ < 15. Let us see why this is
the case.
In the previous example, we tested H0: μ = 15 versus Ha: μ < 15 and rejected H0
in favor of Ha. Thus, we believe that μ < 15 is a much more plausible assertion than
μ = 15. It follows logically that we would also believe that μ < 15 is much more plausible
than the claim that μ = 16, or the claim that μ = 17, and so on. In other words,
when we reject H0: μ = 15 in favor of Ha: μ < 15, we are also implicitly saying that
μ < 15 is much more plausible than any value of μ that exceeds 15. This is why explicit
consideration of the null hypothesis with a claim of equality is equivalent to considering
the more realistic H0 that includes an appropriate inequality.
Let μ0 denote the value of μ asserted by the null hypothesis (μ0 = 15 in Example 8.4).
The test statistic for testing hypotheses about μ when the sample size n is large is
z = (x̄ − μ0) / (s/√n)
When H0 is true, this test statistic will have approximately a standard normal distribution
(this will be true for any test statistic labeled z in this book). The P-value is then a
z-curve area that depends on the inequality in Ha:

Inequality in Ha    P-value                                       Type of test
>                   Area to the right of the calculated z         Upper-tailed
<                   Area to the left of the calculated z          Lower-tailed
≠                   2 · (tail area captured by calculated z)      Two-tailed
These three cases are illustrated in Figure 8.2.
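The three tail areas in the table can also be computed from the standard normal cdf rather than read from Appendix Table I. Below is a standard-library Python sketch (the function name z_pvalue is ours, not the text's), applied to the lower-tailed zinc-intake test of Example 8.4:

```python
# P-values for the three types of z tests, computed from the standard
# normal cdf instead of a z table (stdlib-only sketch).
import math
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def z_pvalue(z, tail):
    """Tail area(s) under the z curve for a calculated test statistic z."""
    if tail == "upper":        # Ha contains >
        return 1 - std_normal.cdf(z)              # area to the right of z
    if tail == "lower":        # Ha contains <
        return std_normal.cdf(z)                  # area to the left of z
    if tail == "two":          # Ha contains "not equal"
        return 2 * (1 - std_normal.cdf(abs(z)))   # 2 * (captured tail area)
    raise ValueError(tail)

# Example 8.4 (zinc intake) is lower-tailed: n = 115, xbar = 11.3, s = 6.43
z = (11.3 - 15) / (6.43 / math.sqrt(115))
print(round(z, 2))            # -6.17
print(z_pvalue(z, "lower"))   # vanishingly small, so H0 is rejected
```

The lower-tailed P-value here is essentially 0, so μ < 15 is a far better explanation for the data than chance variation under H0.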
Figure 8.2 Determination of the P-value when the test statistic is z: (1) upper-tailed test (Ha contains the inequality >), the P-value is the z-curve area in the upper tail beyond the calculated z; (2) lower-tailed test (Ha contains <), the area in the lower tail; (3) two-tailed test (Ha contains ≠), twice the tail area captured by the calculated z.
where μ denotes true average bearing diameter. The large-sample test statistic is
z = (x̄ − .5) / (s/√n)
In this situation, values of x̄ either much larger or much smaller than .5, corresponding
to z values far from zero in either direction, are inconsistent with H0 and give support to
Ha. If, for example, z = −2.76, then
P-value = P(observing a z value at least as contradictory to H0 as −2.76
when H0 is true)
8.1 Exercises
The P-value would also be .0058 if z = 2.76. Using a significance level of .05, H0 would
be rejected because P-value ≤ α.
has been developed in an attempt to reduce warpage. The regular laminate will be used on one sample of specimens and the special laminate on another sample; the amount of warpage will then be determined for each specimen. The manufacturer will then switch to the special laminate only if it can be demonstrated that the true average amount of warpage for that laminate is less than for the regular laminate. State the relevant hypotheses, and describe the type I and type II errors in the context of this situation.
8. a. Use the definition of a P-value to explain why H0 would certainly be rejected if P-value = .0003.
   b. Use the definition of a P-value to explain why H0 would definitely not be rejected if P-value = .350.
9. For which of the given P-values will the null hypothesis be rejected when using a test with a significance level of .05?
   a. .001   b. .021   c. .078   d. .047   e. .156
10. For each of the given pairs of P-values and significance levels, state whether H0 should be rejected.
    a. P-value = .084, α = .05
    b. P-value = .003, α = .001
    c. P-value = .048, α = .05
    d. P-value = .084, α = .10
    e. P-value = .039, α = .01
    f. P-value = .017, α = .10
11. Let μ denote the true average reaction time to a certain stimulus. A test of H0: μ = 5 versus Ha: μ > 5 will be based on a large sample size so that when H0 is true, the test statistic z = (x̄ − 5)/(s/√n) has approximately a standard normal distribution (the z curve). Determine the value of z and the corresponding P-value in each of the following cases:
    a. n = 50, x̄ = 5.23, s = .89
    b. n = 35, x̄ = 5.72, s = 1.01
    c. n = 40, x̄ = 5.35, s = 1.67
12. Newly purchased automobile tires of a certain type are supposed to be filled to a pressure of 34 psi. Let μ denote the true average pressure. A test of H0: μ = 34 versus Ha: μ ≠ 34 will be based on a large sample of tires so that the test statistic z = (x̄ − 34)/(s/√n) will have approximately a standard normal distribution when H0 is true. Determine the value of z and the P-value in each of the following cases:
    a. n = 50, x̄ = 34.43, s = 1.06
    b. n = 50, x̄ = 33.57, s = 1.06
    c. n = 32, x̄ = 33.25, s = 1.89
    d. n = 36, x̄ = 34.66, s = 2.53
13. It is specified that a certain type of iron should contain .85 gm of silicon per 100 gm of iron (.85%). The silicon content of each of 32 randomly selected iron specimens was determined, and the accompanying Minitab output resulted from a test of the appropriate hypotheses:

    Variable  N   Mean    StDev   SE Mean  Z      P-Value
    sil cont  32  0.8228  0.1894  0.0335   -0.81  0.42

    a. What hypotheses were tested?
    b. What conclusion would be reached for a significance level of .05, and why? Answer the same question for a significance level of .10.
14. Lightbulbs of a certain type are advertised as having an average lifetime of 750 hours. The price of these bulbs is very favorable, so a potential customer has decided to go ahead with a purchase arrangement unless it can be conclusively demonstrated that the true average lifetime is smaller than what is advertised. A random sample of 50 bulbs was selected, the lifetime of each bulb determined, and the appropriate hypotheses were tested using Minitab, resulting in the accompanying output:

    Variable  N   Mean    StDev  SE Mean  Z      P-Value
    lifetime  50  738.44  38.20  5.40     -2.14  0.016

    a. How can you tell from the output that the alternative hypothesis was not Ha: μ > 750?
    b. What conclusion would be appropriate for a significance level of .05? A significance level of .01? What significance level and conclusion would you recommend?
15. A sample of 40 speedometers of a particular type is selected, and each speedometer is calibrated for accuracy at 55 mph, resulting in a sample mean and sample standard deviation of 53.87 and 1.36, respectively. Does this data suggest that the true average reading when speed is 55 mph is in fact something other than 55? State the relevant hypotheses, calculate the value
8.2 Tests Concerning Hypotheses About Means
of the appropriate z statistic, determine the P-value, and state the conclusion for a significance level of .01.
16. To obtain information on the corrosion-resistance properties of a certain type of steel conduit, 35 specimens are buried in soil for an extended period. The maximum penetration (in mils) is then measured for each specimen, yielding a sample mean penetration of 52.7 and a sample standard deviation of 4.8. The conduits were manufactured with the specification that true average penetration be at most 50 mils. Does the sample data indicate that specifications have not been met? State the relevant hypotheses, calculate the value of the appropriate z statistic, determine the P-value, and state the conclusion for a significance level of .05.
17. Automatic identification of the boundaries of significant structures within a medical image is an area of ongoing research. The article "Automatic Segmentation of Medical Images Using Image Registration: Diagnostic and Simulation Applications" (J. of Medical Engr. and Tech., 2005: 53–63) discussed a new technique for such identification. A measure of the accuracy of the automatic region is the average linear displacement (ALD). The paper gave the following ALD observations for a sample of 49 kidneys (units of pixel dimensions).

    1.38 0.44 1.09 0.75 0.66 1.28 0.51
    0.39 0.70 0.46 0.54 0.83 0.58 0.64
    1.30 0.57 0.43 0.62 1.00 1.05 0.82
    1.10 0.65 0.99 0.56 0.56 0.64 0.45
    0.82 1.06 0.41 0.58 0.66 0.54 0.83
    0.59 0.51 1.04 0.85 0.45 0.52 0.58
    1.11 0.34 1.25 0.38 1.44 1.28 0.51

    a. Summarize and describe the data.
    b. Is it plausible that ALD is at least approximately normally distributed? Must normality be assumed prior to testing hypotheses about true average ALD? Explain.
    c. The authors commented that in most cases the ALD is better than or on the order of 1.0. Does the data in fact provide strong evidence for concluding that true average ALD under these circumstances is less than 1.0? Carry out an appropriate test of hypotheses.
Figure 8.3 P-values for t tests: (1) upper-tailed; (2) lower-tailed; (3) two-tailed (in each case the P-value is an area under the t curve for the relevant df, beyond the calculated t)
8 df t distribution. If the calculated value of the test statistic is t = 1.6, then the P-value
for this upper-tailed test is .074. Because .074 exceeds .05, we would not be able to
reject H0 at significance level .05. If the alternative hypothesis is Ha: μ < 100 and a test
based on 20 df yields t = −3.2, then Appendix Table VI shows that the P-value is the
captured lower-tail area .002. The null hypothesis can be rejected at either level .05 or
.01. Consider testing H0: μ1 − μ2 = 0 versus Ha: μ1 − μ2 ≠ 0; the null hypothesis states
that the means of the two populations are identical, whereas the alternative hypothesis
states that they are different without specifying a direction of departure from H0. If the
test is based on 20 df and t = 3.2, then the P-value for this two-tailed test is 2(.002) =
.004. This would also be the P-value for t = −3.2. The tail area is doubled because
values both larger than 3.2 and smaller than −3.2 are more contradictory to H0 than
what was calculated (values farther out in either tail of the t curve). Notice that if the
calculated value of t exceeds 4.0, for all but very small df's the captured tail area is negligible.
Also note that the table jumps from 40 df to 60 df to 120 df to ∞ (the z or standard
normal curve). For example, for 45 df, one could either interpolate between 40 df and
60 df, or use the z-curve area as an approximation.
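Tail areas like the .074 and .002 just cited can also be reproduced without a table by numerically integrating the t density. A standard-library Python sketch (the helper names and the Simpson's-rule approach are ours, not the text's):

```python
# Upper-tail areas under the t curve, obtained by numerically integrating
# the t density (stdlib-only sketch; the text reads these areas from
# Appendix Table VI).
import math

def t_density(x, df):
    """Density of the t distribution with the given df at x."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=20000, upper=60.0):
    """P(T_df > t) via Simpson's rule on [t, upper]; the area beyond
    'upper' is negligible for df >= 2."""
    h = (upper - t) / steps
    total = t_density(t, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(t + i * h, df)
    return total * h / 3

print(round(t_upper_tail(1.6, 8), 3))       # .074 (upper-tailed, 8 df)
print(round(t_upper_tail(3.2, 20), 3))      # .002 (= lower tail of -3.2, by symmetry)
print(round(2 * t_upper_tail(3.2, 20), 3))  # .004 (two-tailed, 20 df)
```

The three printed areas agree with the table-based values in the discussion above; by symmetry of the t curve, the area below −3.2 equals the area above 3.2.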
Example 8.5 Glycerol is a major by-product of ethanol fermentation in wine production and
contributes to the sweetness, body, and fullness of wines. The article “A Rapid and
Simple Method for Simultaneous Determination of Glycerol, Fructose, and Glucose
in Wine” (American J. of Enology and Viticulture, 2007: 279–283) includes the fol-
lowing observations on glycerol concentration (mg/mL) for samples of standard-
quality (uncertified) white wines: 2.67, 4.62, 4.14, 3.81, 3.83. Suppose the desired
concentration value is 4. Does the sample data suggest that true average concen-
tration is something other than the desired value? The normal quantile plot in
Figure 8.4 provides strong support for assuming that the population distribution of
glycerol concentration is normal. Let’s carry out a test of appropriate hypotheses
using the one-sample t test with a significance level of .05.
Figure 8.4 Normal quantile plot for the data of Example 8.5 (glycerol concentration, 2.5 to 4.5 mg/mL, versus normal quantile, −2 to 2)
Our analysis employs a sequence of steps that we advocate using for any hy-
pothesis-testing investigation:
1. Parameter of interest: μ = true average glycerol concentration
2. Null hypothesis: H0: μ = 4
3. Alternative hypothesis: Ha: μ ≠ 4
4. Test statistic formula: t = (x̄ − 4)/(s/√n) (do not substitute sample quantities yet)
Therefore the area under the 4 df curve to the left of −.6 is .290. Because
the test is two-tailed, P-value = 2(.290) = .580.
7. Conclusion: The specified significance level is α = .05. Since P-value =
.580 > .05 = α, we cannot reject H0 at this (or any other reasonable) significance
level. The data does not provide strong evidence for concluding that
population mean glycerol concentration differs from 4. Notice that in not
rejecting H0, we may be committing a type II error (not rejecting the null
hypothesis when it is false); we hope, though, we came to this conclusion
for the right reason!
The R output from a request to carry out the test follows. The P-value differs
slightly from ours because R uses more decimal accuracy in computing t. Thus,
if H0 were true, about 59% of all samples would yield a value of t more extreme
than what we obtained. We decided not to reject H0 because −.58 is not in the
most extreme 5% of all t values.
One Sample t-test
data: concentration
t = -0.5789, df = 4, p-value = 0.5937
alternative hypothesis: true mean is not equal to 4
95 percent confidence interval: 2.921875 4.706125
sample estimates: mean of x 3.814
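The R results above can be reproduced with only the Python standard library; for 4 df the t cdf even has a closed form, so no table is needed. This sketch is our own, not part of the text:

```python
# Reproducing the one-sample t test of Example 8.5 (the text shows R's
# t.test output) using only the standard library.
import math
from statistics import mean, stdev

concentration = [2.67, 4.62, 4.14, 3.81, 3.83]
mu0 = 4                                    # null value
n = len(concentration)
xbar, s = mean(concentration), stdev(concentration)
t = (xbar - mu0) / (s / math.sqrt(n))      # test statistic, n - 1 = 4 df

# For 4 df the t cdf has a closed form: F(t) = 1/2 + (3/4)(u - u**3/3),
# where u = t / sqrt(t**2 + 4).
u = t / math.sqrt(t * t + 4)
cdf = 0.5 + 0.75 * (u - u ** 3 / 3)
p_value = 2 * cdf if t < 0 else 2 * (1 - cdf)   # two-tailed

print(round(xbar, 3))      # 3.814, as in the R output
print(round(t, 4))         # -0.5789
print(round(p_value, 4))   # 0.5937
```

The computed t and P-value match the R output to four decimal places, and the P-value differs from the hand calculation (.580) only because −.58 was rounded to −.6 when using the table.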
Suppose the sample size in Example 8.5 had been 45 rather than 5, with the same
values of x̄ and s. The normality assumption for glycerol concentration becomes unnecessary.
The test statistic would be labeled z, and its value would be z = −1.74.
Appendix Table I shows that the area under the z curve to the left of −1.74 is .0409, so
the P-value is 2(.0409) = .0818 and H0 would be rejected at level .10 but not at levels
.05 or .01.
Inferences about the value of μ1 relative to μ2 are based on two independently obtained
random samples, one from the first population, process, or treatment and the other
from the second. Let
n1 = number of observations in the first sample
x̄1 = sample mean of these n1 observations
s1² = sample variance of these n1 observations
and n2, x̄2, and s2² are defined analogously with respect to the second sample. Assume that
both population, process, or treatment response distributions are normal. A confidence
interval for the difference μ1 − μ2 was based on the fact that the standardized variable
t = (x̄1 − x̄2 − (μ1 − μ2)) / √(s1²/n1 + s2²/n2)
has approximately a t distribution. Suppose the null hypothesis is H0: μ1 − μ2 = 4 (i.e.,
the value of μ1 is 4 larger than the value of μ2). A test statistic results from replacing
μ1 − μ2 in the numerator of t by the null value 4. The test statistic then has approximately
a t distribution when the null hypothesis is true. The test will be upper-tailed if
the alternative hypothesis is Ha: μ1 − μ2 > 4, lower-tailed if the alternative contains the
inequality <, and two-tailed if ≠ appears in Ha.
A general description of the test procedure requires the use of a symbol for the null
value; we use the Greek letter Δ for that purpose. Most frequently, in practice, Δ = 0,
in which case the null hypothesis says there is no difference between the two μ's.
Example 8.6 The deterioration of many municipal pipeline networks across the country is a grow-
ing concern. One technology proposed for pipeline rehabilitation uses a flexible lin-
er threaded through existing pipe. The article “Effect of Welding on a High-Density
Polyethylene Liner” (J. of Materials in Civil Engr., 1996: 94–100) reported the fol-
lowing data on tensile strength (psi) of liner specimens both when a certain fusion
process was used and when this process was not used:
1. No fusion: 2748 2700 2655 2822 2511
              3149 3257 3213 3220 2753
   n1 = 10   x̄1 = 2902.8   s1 = 277.3   se1 = 87.69
2. Fused:     3027 3356 3359 3297 3125 2910 2889 2902
   n2 = 8    x̄2 = 3108.1   s2 = 205.9   se2 = 72.80
Figure 8.5 shows normal probability plots from Minitab. These plots employ a
probability scale rather than the normal quantiles discussed previously, but the criti-
cal issue is the same: Is the pattern of plotted points reasonably close to linear? There
certainly is some wiggling in these plots, but not enough to suggest that the normal-
ity assumption is implausible. Furthermore, the P-values that appear along with the
plots are for formal tests of the assertion that the underlying distributions are normal
(we discuss this test in Section 8.4). Because each P-value exceeds .1, the hypothesis
of normality cannot be rejected.
Figure 8.5 Normal probability plots of the tensile strength data from Minitab (not fused: Mean 2903, StDev 277.3, N 10, RJ 0.944, P-Value > 0.100; fused: Mean 3108, StDev 205.9, N 8, RJ 0.939, P-Value > 0.100)
The authors of the article stated that the fusion process increased the average tensile strength. The message from the comparative boxplot of Figure 8.6 is not all that clear. Let’s carry out a test of hypotheses to see whether the data supports this conclusion.

1. Let μ₁ be the true average tensile strength of specimens when the no-fusion treatment is used and μ₂ denote the true average tensile strength when the fusion treatment is used.
2. H₀: μ₁ − μ₂ = 0 (no difference in the true average tensile strengths for the two treatments)
Figure 8.6 Comparative boxplot of the tensile strength data for the two treatments (Type 1 and Type 2)
Suppose the issue in Example 8.6 had been whether fusing increased true average strength by more than 100 psi. Then the relevant hypotheses would have been H₀: μ₁ − μ₂ = −100 versus Hₐ: μ₁ − μ₂ < −100; that is, the null value would have been Δ = −100.
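As a numerical check of Example 8.6 (this sketch is ours, not part of the text), the test statistic and estimated df can be reproduced from the reported summary statistics, and the same code shows how a nonzero null value Δ enters the numerator:

```python
import math

# Summary statistics reported in Example 8.6
n1, xbar1, s1 = 10, 2902.8, 277.3   # no fusion
n2, xbar2, s2 = 8, 3108.1, 205.9    # fused

se_sq1, se_sq2 = s1**2 / n1, s2**2 / n2
se = math.sqrt(se_sq1 + se_sq2)

# H0: mu1 - mu2 = 0 versus Ha: mu1 - mu2 < 0 (fusion increases strength)
t = (xbar1 - xbar2) / se

# Estimated degrees of freedom for the two-sample t test
df = (se_sq1 + se_sq2)**2 / (se_sq1**2 / (n1 - 1) + se_sq2**2 / (n2 - 1))

# For H0: mu1 - mu2 = -100 (fusing increases strength by more than
# 100 psi), the null value Delta = -100 replaces 0 in the numerator:
t_100 = (xbar1 - xbar2 - (-100)) / se
```

Here t comes out near −1.80 with roughly 15.9 estimated df, and the shifted statistic t_100 is much closer to zero, so the Δ = −100 hypothesis is harder to reject than Δ = 0.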
chapter 8 Testing Statistical Hypotheses
be applied to one plot within each pair and the second formulation used on the other plot. This pairing is really a special case of blocking, as discussed in Chapter 4. The homogeneity of experimental units within each block (pair) makes it easier to detect a difference between the treatments if a difference actually exists.
Again, let μ₁ and μ₂ denote the two population, process, or treatment response means. The pairs in a sample can be viewed as having been selected from a much larger population of pairs. Now conceptualize subtracting the second number in each such pair from the first number to obtain a population of differences. If we let μd denote the population mean difference, it follows that

μd = μ₁ − μ₂

This relationship implies that any hypothesis about μ₁ − μ₂ is equivalent to a hypothesis about μd. For example, the assertion that μ₁ − μ₂ = 10 is the same as the claim μd = 10. But hypotheses about μd can be tested by using the sample differences. In particular, assuming that the underlying distribution of differences is normal, we can use a one-sample t test based on these sample differences.
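Because a paired analysis reduces to a one-sample t test on the differences, it takes only a few lines of code. A minimal Python sketch (ours, with made-up data; assumes the scipy library is available):

```python
from scipy import stats

# Paired observations (before, after); the differences carry the analysis
before = [12.1, 11.4, 13.0, 10.8, 12.7]
after  = [11.5, 11.0, 12.2, 10.9, 12.0]
diffs = [b - a for b, a in zip(before, after)]

# One-sample t test of H0: mu_d = 0 applied to the differences
t_stat, p_value = stats.ttest_1samp(diffs, popmean=0.0)

# Equivalently, scipy's paired test gives the same statistic and P-value
t2, p2 = stats.ttest_rel(before, after)
```

The equivalence of the two calls is exactly the point made above: a hypothesis about μ₁ − μ₂ for paired data is a hypothesis about μd.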
Example 8.7 Musculoskeletal neck-and-shoulder disorders are all too common among office
staff who perform repetitive tasks using visual display units. The article “Upper-
Arm Elevation During Office Work” (Ergonomics, 1996: 1221–1230) reported
on a study to determine whether more varied work conditions would have any
impact on arm movement. The accompanying data was obtained from a sample
of n = 16 subjects. Each observation is the amount of time, expressed as a proportion of total time observed, during which arm elevation was below 30°. The
two measurements from each subject were obtained 18 months apart. During
this period, work conditions were changed, and subjects were allowed to engage
in a wider variety of work tasks. Does the data suggest that true average time during which elevation is below 30° differs after the change from what it was before the change?

Subject:     1   2   3   4   5   6   7   8
Before:     81  87  86  82  90  86  96  73
After:      78  91  78  78  84  67  92  70
Difference:  3  −4   8   4   6  19   4   3

Subject:     9  10  11  12  13  14  15  16
Before:     74  75  72  80  66  72  56  82
After:      58  62  70  58  66  60  65  73
Difference: 16  13   2  22   0  12  −9   9
Figure 8.7 shows a normal probability plot of the 16 differences; the pattern in the plot is quite straight, supporting the normality assumption. A boxplot of these differences appears in Figure 8.8; the boxplot is located considerably to the right of zero, suggesting that perhaps μd > 0 (note also that 13 of the 16 differences are positive and only two are negative).
Figure 8.7 Normal probability plot of the differences from Minitab (Mean 6.75, StDev 8.234, N 16, RJ 0.992, P-Value > 0.100)

Figure 8.8 Boxplot of the differences
Let’s now use the recommended sequence of steps to test the appropriate hypotheses.

1. Let μd denote the true average difference between elevation time before the change in work conditions and time after the change.
2. H₀: μd = 0 (there is no difference between true average time before the change and true average time after the change)
3. Hₐ: μd ≠ 0
4. t = (d̄ − 0)/(sd/√n) = d̄/(sd/√n)
5. n = 16, Σdᵢ = 108, Σdᵢ² = 1746, from which d̄ = 6.75, sd = 8.234, and

   t = 6.75/(8.234/√16) = 3.28 ≈ 3.3

6. Appendix Table VI shows that the area to the right of 3.3 under the t curve with 15 df is .002. The inequality in Hₐ implies that a two-tailed test is appropriate, so the P-value is approximately 2(.002) = .004 (Minitab gives .0051).
7. Since .004 < .01, the null hypothesis can be rejected at either significance level .05 or .01. It does appear that the true average difference between times is something other than zero; that is, true average time after the change is different from that before the change.
Suppose the question posed had been, Does it appear that the change in work conditions decreases true average time by more than 5? The relevant hypotheses would then be H₀: μd = 5 versus Hₐ: μd > 5, for which the test statistic is t = (d̄ − 5)/(sd/√n).
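The computations in steps 4–6 are easy to reproduce; this Python sketch (our own check, not part of the text, and assuming scipy is available) applies a one-sample t test to the 16 differences:

```python
from scipy import stats

# The 16 before-minus-after differences from Example 8.7
diffs = [3, -4, 8, 4, 6, 19, 4, 3,
         16, 13, 2, 22, 0, 12, -9, 9]

# Two-tailed one-sample t test of H0: mu_d = 0
t_stat, p_value = stats.ttest_1samp(diffs, popmean=0.0)
# t_stat is about 3.28 and p_value about .0051, matching the Minitab value
```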
In Section 8.4, we show how a test of the null hypothesis that a population distribu-
tion is normal can be based on a normal quantile or probability plot. In Section 8.5, we
discuss several further aspects of hypothesis testing, including the determination of type
II error probabilities for t tests.
8.2 Exercises

18. Give as much information as you can about the P-value of a t test in each of the following situations:
a. Upper-tailed test, df = 8, t = 2.0
b. Lower-tailed test, df = 11, t = −2.4
c. Two-tailed test, df = 15, t = −1.6
d. Upper-tailed test, df = 19, t = 2.4
e. Upper-tailed test, df = 5, t = 5.0
f. Two-tailed test, df = 40, t = −4.8

19. The paint used to make lines on roads must reflect enough light to be clearly visible at night. Let μ denote the true average reflectometer reading for a new type of paint under consideration. A test of H₀: μ = 20 versus Hₐ: μ > 20 will be based on a random sample of size n from a normal population distribution. What conclusion is appropriate in each of the following situations?
a. n = 15, t = 3.2, α = .05
b. n = 9, t = 1.8, α = .01
c. n = 24, t = −.2

20. A certain pen has been designed so that true average writing lifetime under controlled conditions
(involving the use of a writing machine) is at least 10 hours. A random sample of 18 pens is selected, the writing lifetime of each is determined, and a normal quantile plot of the resulting data supports the use of a one-sample t test.
a. What hypotheses should be tested if the investigators believe a priori that the design specification has been satisfied?
b. What conclusion is appropriate if the hypotheses of part (a) are tested, t = −2.3, and α = .05?
c. What conclusion is appropriate if the hypotheses of part (a) are tested, t = −1.8, and α = .01?
d. What should be concluded if the hypotheses of part (a) are tested and t = −3.6?

21. The true average diameter of ball bearings of a certain type is supposed to be .5 in. A one-sample t test will be carried out to see whether this is the case. What conclusion is appropriate in each of the following situations?
a. n = 13, t = 1.6, α = .05
b. n = 13, t = −1.6, α = .05
c. n = 25, t = −2.6, α = .01
d. n = 25, t = −3.9

22. The article “The Foreman’s View of Quality Control” (Quality Engr., 1990: 257–280) described an investigation into the coating weights for large pipes resulting from a galvanized coating process. Production standards call for a true average weight of 200 lb per pipe. The accompanying descriptive summary and boxplot are from Minitab.

Variable   N    Mean    Median  TrMean  StDev  SEMean
ctg wt    30   206.73  206.00  206.81   6.35    1.16

b. A normal quantile plot of the data was quite straight. Use the descriptive output to test the appropriate hypotheses.

23. Exercise 5 in Chapter 2 gave n = 12 observations on daily energy demand readings (kW h) for remote telecommunications stations throughout Cameroon, from which the sample mean and sample standard deviation are 32.59 and 10.66, respectively. Suppose the investigators had believed a priori that true average daily energy demand would be at most 30 kW h. Does the data contradict this prior belief? Assuming normality, test the appropriate hypotheses using a significance level of .05.

24. Reconsider the sample observations introduced in Exercise 15 in Chapter 2 on the required force (N) to cause initial cracks in a thin enclosure for a subdermally implanted biotelemetry device:

2006.1 2065.2 2118.9 1686.6 1966.9 1792.5

Suppose the device will not be used unless the true average required force to cause initial cracks exceeds 1800 N. Does this requirement appear to have been satisfied? State and test the appropriate hypotheses.

25. Poly(3-hydroxybutyrate) (PHB), a semicrystalline polymer that is fully biodegradable and biocompatible, is obtained from renewable resources. From a sustainability perspective, PHB offers many attractive properties though it is more expensive to produce than standard plastics. The authors of “The Melting Behaviour of Poly(3-Hydroxybutyrate) by DSC. Reproducibility Study” (Polymer Testing, 2013: 215–220) wanted to investigate various physical
measured by DSC is at least approximately normal. The sample mean and standard deviation are 181.4 and .7242, respectively. Is there compelling evidence for concluding that true average melting point exceeds 181°C? Carry out a test of hypotheses using a significance level of .05.

26. The relative conductivity of a semiconductor device is determined by the amount of impurity “doped” into the device during its manufacture. A silicon diode to be used for a specific purpose requires an average cut-on voltage of .60 V, and if this is not achieved, the amount of impurity must be adjusted. A sample of diodes was selected and the cut-on voltage was determined. The accompanying SAS output resulted from a request to test the appropriate hypotheses.

N    Mean       Std Dev    T          Prob>|T|
15   0.0453333  0.0899100  1.9527887  0.0711

(Note: SAS explicitly tests H₀: μ = 0, so to test H₀: μ = .60, the null value .60 must be subtracted from each xᵢ; the reported mean is then the average of the (xᵢ − .60) values. Also, SAS’s P-value is always for a two-tailed test.) What would be concluded for a significance level of .01? .05? .10?

27. Determine the number of degrees of freedom for the two-sample t test in each of the following situations:
a. n₁ = 10, n₂ = 10, s₁ = 5.0, s₂ = 6.0
b. n₁ = 10, n₂ = 15, s₁ = 5.0, s₂ = 6.0
c. n₁ = 10, n₂ = 15, s₁ = 2.0, s₂ = 6.0
d. n₁ = 12, n₂ = 24, s₁ = 5.0, s₂ = 6.0

28. Urban storm water can be contaminated by many sources, including discarded batteries. When ruptured, these batteries release metals of environmental significance. The article “Urban Battery Litter” (J. of Environ. Engr., 2009: 46–57) presented summary data for characteristics of a variety of batteries found in urban areas around Cleveland. Here are data on zinc mass (g) for two different brands of size D batteries:

Brand       Sample Size   Sample Mean   Sample SD
Duracell        15          138.52        7.76
Energizer       20          149.07        1.52

Assuming that both zinc mass distributions are at least approximately normal, carry out a test at significance level .05 to decide whether true average zinc mass is different for the two types of batteries.

29. Quantitative noninvasive techniques are needed for routinely assessing symptoms of peripheral neuropathies, such as carpal tunnel syndrome (CTS). The article “A Gap Detection Tactility Test for Sensory Deficits Associated with Carpal Tunnel Syndrome” (Ergonomics, 1995: 2588–2601) reported on a test that involved sensing a tiny gap in an otherwise smooth surface by probing with a finger; this functionally resembles many work-related tactile activities, such as detecting scratches or surface defects. When finger probing was not allowed, the sample average gap detection threshold for n₁ = 8 normal subjects was 1.71 mm, and the sample standard deviation was .53; for n₂ = 10 CTS subjects, the sample mean and sample standard deviation were 2.53 and .87, respectively. Does this data suggest that the true average gap detection threshold for CTS subjects exceeds that for normal subjects? State and test the relevant hypotheses using a significance level of .01.

30. According to the article “Fatigue Testing of Condoms” (Polymer Testing, 2009: 567–571), “tests currently used for condoms are surrogates for the challenges they face in use,” including a test for holes, an inflation test, a package seal test, and tests of dimensions and lubricant quality. The investigators developed a new test that adds cyclic strain to a level well below breakage and determines the number of cycles to break. The article reported that for a sample of 20 natural latex condoms of a certain type, the sample mean and sample standard deviation of the number of cycles to break were 4358 and 2218, respectively, whereas a sample of 20 polyisoprene condoms gave a sample mean and sample standard deviation of 5805 and 3990, respectively. Is there strong evidence for concluding that the true average number of cycles to break for the polyisoprene condom exceeds that for the natural latex condom by more than 1000 cycles? Carry out a test using a significance level of .01. (Note: The cited paper reported P-values of t tests for comparing means of the various types considered.)
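Several later exercises (33 and 37 in particular) involve the pooled t test, which adds the assumption σ₁ = σ₂ to the two-sample setting. As a sketch (ours, not from the text; the function name is invented), the following Python code computes the pooled statistic and, as a check, reproduces the “Equal” variances row of the SAS output given in Exercise 33:

```python
import math

def pooled_t(xbar1, s1, n1, xbar2, s2, n2, delta=0.0):
    """Pooled t statistic for H0: mu1 - mu2 = delta, assuming equal
    population standard deviations; df = n1 + n2 - 2."""
    # Pooled variance: weighted average of the two sample variances
    sp_sq = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = math.sqrt(sp_sq * (1.0 / n1 + 1.0 / n2))
    return (xbar1 - xbar2 - delta) / se, n1 + n2 - 2

# Summary statistics from the SAS output of Exercise 33 (potential drop):
t, df = pooled_t(17.4990, 0.55012821, 20, 16.9000, 0.48998389, 20)
# t is about 3.6362 with df = 38, matching SAS's "Equal" row
```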
31. Fusible interlinings are being used with increasing frequency to support outer fabrics and improve the shape and drape of various pieces of clothing. The article “Compatibility of Outer and Fusible Interlining Fabrics in Tailored Garments” (Textile Res. J., 1997: 137–142) gave the accompanying data on extensibility (%) at 100 gm/cm for both high-quality fabric (H) and poor-quality fabric (P) specimens:

H: 1.2 .9 .7 1.0 1.7 1.7 1.1 .9 1.7 1.9 1.3 2.1 1.6 1.8 1.4 1.3 1.9 1.6 .8 2.0 1.7 1.6 2.3 2.0
P: 1.6 1.5 1.1 2.1 1.5 1.3 1.0 2.6

a. Construct normal quantile plots to verify the plausibility of both samples having been selected from normal population distributions.
b. Construct a comparative boxplot. Does it suggest that there is a difference between true average extensibility for high-quality fabric specimens and that for poor-quality specimens?
c. The sample mean and standard deviation for the high-quality sample are 1.508 and .444, respectively, and those for the poor-quality sample are 1.588 and .530. Use the two-sample t test to decide whether true average extensibility differs for the two types of fabrics.

32. The article cited in Exercise 41 in Chapter 7 gave the following data on work of adhesion measurements (in mJ/m²) for samples of ultra-high performance concrete adhered to two types of substrates:

Substrate   Observations
Steel:      107.1 109.5 107.4 106.8 108.1
Glass:      122.4 124.6 121.6 120.6 123.3

Assuming that both samples were selected from normal distributions, carry out a test of hypotheses to decide whether the true average work of adhesion for the glass substrate is more than 12 mJ/m² higher than that for the steel substrate.

33. The article “The Influence of Corrosion Inhibitor and Surface Abrasion on the Failure of Aluminum-Wired Twist-on Connections” (IEEE Trans. on Components, Hybrids, and Manuf. Tech., 1984: 20–25) reported data on potential drop measurements for one sample of connectors wired with alloy aluminum and another sample wired with EC aluminum. Does the accompanying SAS output suggest that the true average potential drop for alloy connections (type 1) is higher than that for EC connections (as stated in the article)? Carry out the appropriate test using a significance level of .01. In reaching your conclusion, what type of error might you have committed? (Note: SAS reports the P-value for a two-tailed test.)

Type  N   Mean     Std Dev     Std Error
1     20  17.4990  0.55012821  0.12301241
2     20  16.9000  0.48998389  0.10956373

Variances  T       DF    Prob>|T|
Unequal    3.6362  37.5  0.0008
Equal      3.6362  38.0  0.0008

34. The article “Evaluation of a Ventilation Strategy to Prevent Barotrauma in Patients at High Risk for Acute Respiratory Distress Syndrome” (New England J. of Medicine, 1998: 355–358) reported on an experiment in which 120 patients with similar clinical features were randomly divided into a control group and a treatment group, each consisting of 60 of the patients. The sample mean ICU stay (days) and sample standard deviation for the treatment group were 19.9 and 39.1, respectively, whereas these values for the control group were 13.7 and 15.8.
a. Calculate a point estimate for the difference between true average ICU stay for the treatment and control groups. Does this estimate suggest that there is a significant difference between true average stays under the two conditions?
b. Answer the question posed in part (a) by carrying out a formal test of hypotheses. Is the result different from what you conjectured in part (a)?
c. Does it appear that ICU stay for patients given the ventilation treatment is normally distributed? Explain your reasoning.

35. According to the article “Modelling and Predicting the Effects of Submerged Arc Weldment Process Parameters on Weldment Characteristics and Shape Profiles” (J. of Engr. Manuf., 2012: 1230–1240), the submerged arc welding (SAW) process is commonly used for joining thick plates and pipes. During welding, the SAW electrode causes a slight deformation on and in the surface of the base metal. This deformation is known as the
SAW weldment profile; research has shown that its shape could be related to plate melting efficiency. Authors of the article wanted to investigate how certain settings of the welding process affect macrostructure zones of the SAW weldment profile. The heat affected zone (HAZ), a band created within the base metal during welding, was of particular interest.
The article reported the impact of various SAW process settings (including current, voltage, and welding speed) on characteristics of the weldment profile. In one investigation, the SAW process was run on various current settings (A) and the depth (mm) of the HAZ was recorded. The data below is partitioned across high (525 A) and nonhigh (<525 A) current settings:

NonHigh: 1.04 1.15 1.23 1.69 1.92 1.98 2.36 2.49 2.72 1.37 1.43 1.57 1.71 1.94 2.06 2.55 2.64 2.82
High: 1.55 2.02 2.02 2.05 2.35 2.57 2.93 2.94 2.97

Does it appear that true average HAZ depth is larger for the high current condition than for the nonhigh current condition? Carry out a test of appropriate hypotheses using a significance level of .01.

36. Which factors are relevant to the time a consumer spends looking at a product on the shelf prior to selection? The article “Effects of Base Price Upon Search Behavior of Consumers in a Supermarket” (J. Econ. Psychol., 2003: 637–652) reported the following data on elapsed time (sec) for fabric softener purchasers and washing-up liquid purchasers; the former product is significantly more expensive than the latter. These products were chosen because they are similar with respect to allocated shelf space and number of alternative brands.

Product            Sample Size   Sample Mean   Sample SD
Fabric softener        15           30.47        19.15
Washing-up liquid      19           26.53        15.37

a. What if any assumptions are needed before the t inferential procedure can be used to compare true average elapsed times?
b. Carry out a test of hypotheses to decide whether the true average difference in elapsed times differs from zero.

37. Exercise 54 in Chapter 7 presented a t variable appropriate for making inferences about μ₁ − μ₂ when both population distributions are normal and, in addition, it can be assumed that σ₁ = σ₂.
a. Describe how this variable can be used to form a test statistic and test procedure, the pooled t test, for testing H₀: μ₁ − μ₂ = Δ.
b. Use the pooled t test to test the relevant hypotheses based on the SAS output given in Exercise 33.
c. Use the pooled t test to reach a conclusion in Exercise 35.

38. The drug diethylstilbestrol was used for years by women as a nonsteroidal treatment for pregnancy maintenance, but it was banned in 1971 when research indicated a link with the incidence of cervical cancer. The article “Effects of Prenatal Exposure to Diethylstilbestrol (DES) on Hemispheric Laterality and Spatial Ability in Human Males” (Hormones and Behavior, 1992: 62–75) discussed a study in which ten males exposed to DES and their unexposed brothers underwent various tests. This is the summary data on the results of a spatial ability test:

exposed mean = 12.6
unexposed mean = 13.8
standard error of difference = sd/√n = .5

Does DES exposure appear to be associated with reduced spatial ability? State and test the appropriate hypotheses using α = .05. Does the conclusion change if α = .01 is used?

39. Parents often urge their children to “sit up straight” when dining to practice good table manners. Although proper posture is part of maintaining good etiquette, research has shown that it can also help in reducing musculoskeletal disorders (MSDs). The authors of “Reducing Musculoskeletal Disorders Among Computer Operators: Comparison Between Ergonomics Interventions at the Workplace” (Ergonomics, 2012: 1571–1585) investigated the impact of a workplace intervention for reducing MSDs for computer workers. For one group of workers the intervention was in the form of a short oral presentation on how to sit; the preferred heights of chairs, tables, keyboards, and screens; and
optimal positions of the back, shoulders, elbows, and wrists.
Both an MSD score and a rapid upper limb assessment (RULA) score were obtained for each participant. The MSD score is the total number of painful body parts reported by the individual. The RULA score is a rating of the individual’s posture, with lower numbers indicating better posture. Each score was determined both before and after the oral presentation intervention. (The textbook author who found this article did find that his own posture improved at least while he was typing this exercise in the manuscript.)

Measurement   Sample Size   Mean Difference (After–Before)   SD of Difference
MSD Score         21                  .19                         1.03
RULA Score        21                −1.52                         1.56

a. Assuming that the difference in MSD scores (After–Before) is approximately normal, carry out a test at significance level .05 to decide whether true average difference in MSD scores is different from zero.
b. Assuming that the difference in RULA scores (After–Before) is approximately normal, carry out a test at significance level .05 to decide whether true average difference in RULA scores is different from zero.
c. From parts (a) and (b) you should have found that for one score the intervention had a significant impact but not for the other score. Keeping in mind what the scores measure, can you offer an explanation of why this may have occurred? (For a group of computer workers who were exposed to a more rigorous type of intervention, the article reported that intervention was beneficial for both MSD and RULA scores.)

40. The article “Selection of a Method to Determine Residual Chlorine in Sewage Effluents” (Water and Sewage Works, 1971: 360–364) reported the results of an experiment in which two different methods of determining chlorine content were used on samples of Cl2-demand-free water for various doses and contact times. Observations are in mg/L.

Sample:       1    2     3     4     5     6     7      8
MSI method:  .39  .84  1.76  3.35  4.69  7.70  10.52  10.92
SIB method:  .36 1.35  2.56  3.92  5.35  8.33  10.70  10.91

Does the true average content measured by one method appear to differ from that measured by the other method? State and test the appropriate hypotheses. Does the conclusion depend on whether a significance level of .05, .01, or .001 is used?

41. Shoveling is not exactly a high-tech activity but will continue to be a required task even in our information age. The article “A Shovel with a Perforated Blade Reduces Energy Expenditure Required for Digging Wet Clay” (Human Factors, 2010: 492–502) reported on an experiment in which each of 13 workers was provided with both a conventional shovel and a shovel whose blade was perforated with small holes. The authors of the cited article provided the following data on stable energy expenditure [kcal/kg(subject)/lb(clay)]:

Worker:        1     2     3     4     5     6     7
Conventional: .0011 .0014 .0018 .0022 .0010 .0016 .0028
Perforated:   .0011 .0010 .0019 .0013 .0011 .0017 .0024

Worker:        8     9    10    11    12    13
Conventional: .0020 .0015 .0014 .0023 .0017 .0020
Perforated:   .0020 .0013 .0013 .0017 .0015 .0013

Carry out a test of hypotheses at significance level .05 to see if true average energy expenditure using the conventional shovel exceeds that using the perforated shovel.

42. The article “Supervised Exercise Versus Non-Supervised Exercise for Reducing Weight in Obese Adults” (J. Sport. Med. Phys. Fit., 2009: 85–90) reported on an investigation in which participants were randomly assigned either to a supervised exercise program or a control group. Those in the control group were told only that
they should take measures to lose weight. After 4 months, the sample mean decrease in body fat for the 17 individuals in the experimental group was 6.2 kg with a sample standard deviation of 4.5 kg, whereas the sample mean and standard deviation for the 17 people in the control group were 1.7 kg and 3.1 kg, respectively. Assume normality of the two body fat loss distributions (as did the investigators). Does it appear that true average decrease in body fat is more than 2 kg larger for the experimental condition than for the control condition? Carry out a test of appropriate hypotheses using a significance level of .01.

43. The article “The Accuracy of Stated Energy Contents of Reduced-Energy, Commercially Prepared Foods” (J. of the Amer. Dietetic Assoc., 2010: 116–123) presented the accompanying data on vendor-stated gross energy and measured value (both in kcal) for 10 different supermarket convenience meals:

Meal:     1    2    3    4    5    6    7    8    9   10
Stated:  180  220  190  230  200  370  250  240   80  180
Meas.:   212  319  231  306  211  431  288  265  145  228

Carry out a test of hypotheses to decide whether the true average % difference from that stated differs from zero. (Note: The article stated “Although formal statistical methods do not apply to convenience samples, standard statistical tests were employed to summarize the data for exploratory purposes and to suggest directions for future studies.”)
Chi-Squared Distributions
Just as with t distributions, there is not a single chi-squared distribution. Rather there
is an entire family of distributions. A particular member of the family is identified by
specifying some number of degrees of freedom. Thus there is one chi-squared distribu-
tion with 1 df, another with 2 df, yet another with 3 df, and so on. Curves corresponding
to several different chi-squared distributions are shown in Figure 8.9. There is no den-
sity to the left of zero, so negative values of chi-squared variables are precluded. Each
8.3 Tests Concerning Hypotheses About a Categorical Population 381
chi-squared curve is positively skewed; as the number of df increases, the curves stretch
farther and farther to the right and become more symmetric.

Figure 8.9 Chi-squared density curves for df = 8, 12, and 20
Our chi-squared tests are all upper-tailed, so the P-value is the area captured under
a particular chi-squared curve to the right of the calculated test statistic value. The
fact that t curves were all centered at zero allowed us to tabulate t-curve tail areas in a
relatively compact way, with the left margin giving values ranging from 0.0 to 4.0 on
the horizontal t scale and various columns displaying corresponding upper-tail areas for
various df’s. The rightward movement of chi-squared curves as df increases necessitates
a somewhat different type of tabulation. The left margin of Appendix Table VII displays
various upper-tail areas: .100, .095, .090, . . . , .005, and .001. Each column of the table
is for a different value of df, and the entries are values on the horizontal chi-squared axis
that capture these corresponding tail areas. For example, moving down to tail area .085
and across to the 2 df column, we see that the area to the right of 4.93 under the 2 df
chi-squared curve is .085 (see Figure 8.10). To capture this same upper-tail area under
the 10 df curve, we must go out to 16.54. In the 2 df column, the top row shows that if
the calculated value of the chi-squared variable is smaller than 4.60, the captured tail
area (the P-value) exceeds .10. Similarly, the bottom row in this column indicates that
if the calculated value exceeds 13.81, the tail area is smaller than .001 (P-value < .001).
Figure 8.10 The area under the 2 df chi-squared curve to the right of 4.93 is .085
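The 2 df entries quoted above can be checked directly, because for 2 df the chi-squared upper-tail area has the closed form e^(−x/2). A minimal Python sketch (the function name chi2_tail_2df is ours):

```python
from math import exp

# Upper-tail area for the chi-squared distribution with 2 df.
# With 2 df the density is (1/2)e^(-x/2), so P(X > x) = e^(-x/2) exactly.
def chi2_tail_2df(x):
    return exp(-x / 2)

print(f"{chi2_tail_2df(4.93):.3f}")    # 0.085 -- the tabled area for 4.93
print(f"{chi2_tail_2df(4.60):.3f}")    # 0.100 -- top row of the 2 df column
print(f"{chi2_tail_2df(13.81):.3f}")   # 0.001 -- bottom row of the 2 df column
```

For other df values there is no such elementary formula, which is why Appendix Table VII (or software) is used instead.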
The π’s can also be interpreted as probabilities; πi is the probability that a randomly
selected individual or object will fall in the ith category. The null hypothesis completely
specifies the value of each πi; we denote these hypothesized values by adding a subscript 0
to each πi (as we used μ0 to denote the null value in a test involving μ):

πi0 = value of πi asserted to be true by the null hypothesis (i = 1, . . . , k)
As an example, suppose that the genotype for a particular genetic characteristic can be
either AA, Aa, or aa (k = 3). The standard genetic argument in this situation implies
the null hypothesis

H0: π1 = .25, π2 = .50, π3 = .25
The alternative hypothesis states simply that the specification in H0 is not correct—that
is, at least one of the πi0’s is incorrect (because the hypothesized values add to 1.0, if a
particular value is incorrect, at least one other value must also be incorrect). A test of these
hypotheses will be based on a random sample taken from the population or process. Each
individual or object in the sample will belong in exactly one of the k categories; thus we
will have a sample consisting of univariate categorical data. For example, we might select
n = 100 individuals and find that the first has genotype Aa, the second has genotype aa,
the third and fourth both have genotype Aa, the fifth has genotype AA, and so on. Let
n1 = number of sampled individuals or objects falling in the first category
⋮
nk = number of sampled individuals or objects falling in the kth category
The ni values are called observed category frequencies or counts. In the genetics
example with k = 3, we might have n = 100, n1 = 20, and n2 = 53, from which
n3 = 100 − 20 − 53 = 27.
The central idea of the test procedure is to compare the observed counts with what
would be expected were H0 true. If, for example, the three hypothesized values are .25,
.50, and .25, and n = 100, then when the null hypothesis is true,

expected number in the first category = nπ10 = 100(.25) = 25
expected number in the second category = nπ20 = 100(.50) = 50
expected number in the third category = nπ30 = 100(.25) = 25
More generally,
expected frequency for category i when H0 is true = nπi0 (i = 1, . . . , k)
That is, expected frequencies under H0 are obtained by multiplying each hypothesized
value by the sample size. Intuitively, the data supports the null hypothesis when the
observed frequencies are similar to the expected frequencies. If some of the observed
frequencies differ substantially from what would be expected if H0 were true, the null
hypothesis is no longer tenable.
We now need a quantitative measure of how different the observed frequencies
are from the expected frequencies, assuming H0 is true. A first thought is to subtract
each expected frequency from the corresponding observed frequency to obtain a
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
8.3 Tests Concerning Hypotheses About a Categorical Population 383
deviation, square these deviations, and add them together. Symbolically, this would be
Σ (ni − nπi0)2. Suppose, however, that nπ10 = 500 with n1 = 475, whereas nπ20 = 100 with
n2 = 75. Then both deviations are 25, so they both contribute the same amount to our
quantitative measure of discrepancy. However, the observed frequency for the first category
is only 5% smaller than what was expected, whereas the observed frequency for the
second category is fully 25% smaller than what we would expect if the null hypothesis
were true. Our proposed measure does not reflect the fact that, on a percentage basis,
the discrepancy for the second category is more sizable than that for the first category.
The chi-squared test statistic takes into account percentage deviations.
Example 8.8 A number of psychologists have considered the relationship between various deviant be-
haviors and geophysical variables such as the lunar phase. The article “Psychiatric and
Alcoholic Admissions Do Not Occur Disproportionately Close to Patients’ Birthdays”
(Psychological Reports, 1992: 944–946) investigated whether the chance of a patient’s
admission date for a particular treatment is smaller or larger than would be the case
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
www.ebook3000.com
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
384 chapter 8 Testing Statistical Hypotheses
under the assumption of complete randomness. Disregarding leap year, there are 365
possible admission days, so complete randomness would imply a probability of 1/365
for each day. However, this results in far too many categories and expected counts that
are too small for the chi-squared test. So the following four categories were established:
1. Within 7 days of an individual’s birthday (7 days before to 7 days after)
2. Between 8 and 30 days, inclusive, from the birthday
3. Between 31 and 90 days, inclusive, from the birthday
4. More than 90 days from the birthday
Let πi denote the true proportion of individuals in category i (i = 1, 2, 3, 4). Then
complete randomness with respect to admission date implies that

π1 = 15/365 = .041    π2 = 46/365 = .126    π3 = .329
π4 = 1 − (.041 + .126 + .329) = .504

Thus the relevant hypotheses are

H0: π1 = .041, π2 = .126, π3 = .329, π4 = .504

versus

Ha: the specification of π’s in H0 is not correct
The cited article gave data for n = 200 patients admitted for alcoholism treatment.
The expected counts when H0 is true are then

expected count for category 1 = nπ10 = 200(.041) = 8.2
nπ20 = 200(.126) = 25.2    nπ30 = 200(.329) = 65.8
nπ40 = 200 − (8.2 + 25.2 + 65.8) = 100.8 [= 200(.504)]
Since all expected counts exceed 5, the chi-squared test can be used. The observed
counts along with their expected counterparts are as follows:
Category: 1 2 3 4
Observed: 11 24 69 96
Expected: 8.2 25.2 65.8 100.8
The value of the chi-squared statistic is thus
X2 = (11 − 8.2)2/8.2 + (24 − 25.2)2/25.2 + (69 − 65.8)2/65.8 + (96 − 100.8)2/100.8
   = .96 + .06 + .16 + .23
   = 1.41
The test is based on k − 1 = 3 df. The smallest entry in the 3 df column of Appendix
Table VII is 6.25, corresponding to an upper-tail area of .10. Because 1.41 < 6.25,
the area captured to the right of 1.41 exceeds .10. That is, P-value > .10, so H0 cannot
be rejected at any reasonable significance level. Our analysis is consistent with
the title of the cited article; we have no evidence to suggest that admission date is
anything other than random.
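The computation in Example 8.8 is easy to reproduce. Here is a minimal Python sketch (the function name chi_squared_gof is ours) that forms the expected counts nπi0 and the chi-squared statistic:

```python
# Chi-squared goodness-of-fit statistic for Example 8.8 (admission-date data).
# Expected counts under H0 are n * pi_i0; X2 sums the scaled squared deviations.

def chi_squared_gof(observed, null_probs):
    """Return (X2, expected counts) for a test of H0: pi_i = null_probs[i]."""
    n = sum(observed)
    expected = [n * p for p in null_probs]
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return x2, expected

observed = [11, 24, 69, 96]              # counts in the four categories
null_probs = [.041, .126, .329, .504]    # probabilities under complete randomness

x2, expected = chi_squared_gof(observed, null_probs)
print(f"X2 = {x2:.2f}")   # X2 = 1.40 (the text's 1.41 comes from rounding each term)
# df = k - 1 = 3; since the statistic is below 6.25, P-value > .10.
```

The tiny discrepancy with the text's 1.41 is purely rounding: the text rounds each of the four contributions to two decimal places before summing.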
Table 8.1 Two possible configurations of proportions for four categorical populations
(a) Homogeneous populations                 (b) Nonhomogeneous populations

                 Category                                    Category
               R     P     S                               R     P     S
           A  .50   .30   .20                          A  .50   .30   .20
Population C  .50   .30   .20              Population  C  .60   .25   .15
           M  .50   .30   .20                          M  .50   .25   .25
Unless otherwise noted, all content on this page is © Cengage Learning.
The null hypothesis that we wish to test is that the populations are homogeneous.
For this purpose, we require a separate random sample from each of the populations;
let’s denote the corresponding sample sizes by n1, n2, and so on. Of the n1 individuals
or objects selected from the first population, some number will fall in the first category,
some number will be in the second category, and so on. This is also the case for the
samples from the other populations. The resulting category frequencies or counts can
be displayed in a rectangular table called a contingency table; there is a row for each
population and a column for each category. The row sums of these observed frequen-
cies are the sample sizes, so they are fixed by the experimenter. Table 8.2 shows one
possible set of observed frequencies when each sample size is 200.
Homogeneity asserts that there is a common value of π1, the proportion in the first
category, for all populations, a common value of π2 for all populations, and so on. If the
values of these π’s were known, then just as in the case of a single population, expected
frequencies would result from multiplying these π’s by the various sample sizes. Let’s now
assume that the populations are homogeneous and estimate the π’s from the observed
frequencies. Consider the frequencies in Table 8.2. Sensible estimates of π1, π2, and π3
are then just the proportions of the total sample size 800 falling in the various categories:

estimate of π1 = proportion of total sample size in first category = 403/800 = .50375
estimate of π2 = proportion of total sample size in second category = 245/800 = .30625
estimate of π3 = proportion of total sample size in third category = 152/800 = .19000
Multiplying these estimates by n1 = 200 gives the estimated expected frequencies for
the sample from the first population (assuming homogeneity). For example,

estimated expected frequency for the first category in the first sample
= 200(403/800) = 100.75
Notice that this estimated expected frequency is the product of the row total (200) and
the column total (403) divided by the “grand” total (800). This is in fact the general
prescription for obtaining estimated expected frequencies: (row total)(column total)/
(grand total). Once these have been calculated, the value of a chi-squared statistic can be computed.
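The row-by-column prescription is a one-liner; here is a minimal Python sketch (the helper name expected_frequency is ours), using the Table 8.2 margins quoted above:

```python
# Estimated expected frequency under homogeneity, as in the text:
# (row total)(column total)/(grand total).

def expected_frequency(row_total, col_total, grand_total):
    """Estimated expected count for one cell of the contingency table."""
    return row_total * col_total / grand_total

print(expected_frequency(200, 403, 800))   # 100.75, matching the text
```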
H0: the populations are homogeneous
    (the proportion falling in the first category is the same for all populations,
    the proportion falling in the second category is also the same for all
    populations, and so on)
Ha: the populations are not homogeneous
    (for at least one of the categories, the proportions are not identical for
    all populations)
Test statistic: Suppose the observed counts are displayed in a contingency table consisting
of r rows, one for the sample from each population, and k columns, one
for each category (an r by k table). Then the estimated expected frequency
corresponding to any particular observed frequency (i.e., to any particular
cell of the table) is computed as

estimated expected frequency = (row total)(column total)/n

where n is the sum of the individual sample sizes. The test statistic is then

X2 = Σ (observed − estimated expected)2/(estimated expected)    (sum over all cells)

P-value: When H0 is true and all estimated expected frequencies exceed 5, X2 has
approximately a chi-squared distribution with df = (r − 1)(k − 1). Because any
value larger than the calculated X2 is even more contradictory to H0, the test
is upper-tailed and the P-value is approximately the area to the right of the
calculated X2 under the (r − 1)(k − 1) df chi-squared curve. If at least one
estimated expected count is at most 5, categories should be combined in a sensible way.
Example 8.9 A company packages a particular product in cans of three different sizes, each one
using a different production line. Most cans conform to specifications, but a quality
control engineer has identified the following reasons for nonconformance:
1. Blemish on can
2. Crack in can
3. Improper pull tab location
4. Pull tab missing
5. Other
A sample of nonconforming units is selected from each of the three lines, and each
unit is categorized according to reason for nonconformity, resulting in the following
contingency table data:
                       Reason for nonconformity
                                                            Sample
              Blemish   Crack   Location   Missing   Other   size
Production 1     34       65       17         21       13     150
line       2     23       52       25         19        6     125
           3     32       28       16         14       10     100
Total            89      145       58         54       29     375
Does the data suggest that the proportions falling in the various nonconformance
categories are not the same for the three lines? The parameters of interest are the
various proportions, and the relevant hypotheses are
H0: the production lines are homogeneous with respect to the five nonconformance
categories
Ha: the production lines are not homogeneous with respect to the categories
To calculate X2, we must first compute the estimated expected frequencies (assuming
homogeneity). Consider the first nonconformance category for the first production
line. When the lines are homogeneous,
estimated expected number among the 150 selected units that are blemished
= (first row total)(first column total)/(total of sample sizes) = (150)(89)/375 = 35.60

The contribution of the cell in the upper-left corner to X2 is then

(observed − estimated expected)2/(estimated expected) = (34 − 35.60)2/35.60 = .072
The other contributions are calculated in a similar manner. Table 8.3 shows Minitab
output for the chi-squared test. The observed count is the top number in each cell,
and directly below it is the estimated expected count. The contribution of each cell
to X2 appears below the counts, and the test statistic value is X2 = 14.159. All estimated
expected counts exceed 5, so combining categories is unnecessary. The test
is based on (3 − 1)(5 − 1) = 8 df. Our chi-squared table shows that the values that
capture upper-tail areas of .08 and .075 under the 8 df curve are 14.06 and 14.26,
respectively. Thus the P-value is between .075 and .08; Minitab gives P-value = .079.
The null hypothesis of homogeneity should not be rejected at the usual significance
levels of .05 or .01, but it would be rejected for the higher significance level of .10.
Table 8.3 Minitab output for the chi-squared test of Example 8.9
Expected counts are printed below observed counts
blem crack loc missing other Total
1 34 65 17 21 13 150
35.60 58.00 23.20 21.60 11.60
2 23 52 25 19 6 125
29.67 48.33 19.33 18.00 9.67
3 32 28 16 14 10 100
23.73 38.67 15.47 14.40 7.73
Total 89 145 58 54 29 375
Chisq = 0.072 + 0.845 + 1.657 + 0.017 + 0.169 + 1.498 + 0.278 +
1.661 + 0.056 + 1.391 + 2.879 + 2.943 + 0.018 + 0.011 +
0.664 = 14.159
df = 8, p = 0.079
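The whole computation in Example 8.9 can be sketched in a few lines of Python (the function name chi2_homogeneity is ours); it reproduces the Minitab statistic above:

```python
# Homogeneity test statistic for Example 8.9: estimated expected frequencies are
# (row total)(column total)/n, and X2 sums the scaled squared deviations over cells.

def chi2_homogeneity(table):
    """Return (X2, df) for a list-of-lists contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n   # estimated expected frequency
            x2 += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return x2, df

# Nonconformance counts (blemish, crack, location, missing, other) by production line
counts = [[34, 65, 17, 21, 13],
          [23, 52, 25, 19, 6],
          [32, 28, 16, 14, 10]]

x2, df = chi2_homogeneity(counts)
print(f"{x2:.3f} on {df} df")   # 14.159 on 8 df
```

Referring the statistic to the 8 df chi-squared curve (via Appendix Table VII or software) then gives the P-value of about .079 shown in the Minitab output.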
with respect to a first categorical factor A and with respect to a second such factor B. For
example, each car of a certain type manufactured in a particular year can be classified
with respect to body style—two-door coupe, four-door sedan, or hatchback—and with
respect to color—white, black, blue, green, or red. Suppose we take a sample of size n
and classify each sampled individual or object with respect to both the A factor (style)
and the B factor (color). The resulting counts can be displayed in a contingency table
having a row for each category of the A factor and a column for each category of the B
factor—a 3 by 5 table in the example under consideration. In this situation, neither the
row nor the column totals are fixed in advance, only the sum of all counts, which equals
n. The number in the upper-left corner would be the number of sampled automobiles
that are both coupes and white, and so on. The null hypothesis of interest in this situa-
tion is that the two factors A and B are independent; that is, knowing the body style does
not change the likelihood of a particular color and vice versa.
Although homogeneity and independence are two different scenarios, the follow-
ing can be shown: (1) The estimated expected frequencies in the test of independence
are calculated exactly as they were for the test of homogeneity: row total times column
total divided by n; (2) X2 is still an appropriate test statistic; (3) the test is still upper-
tailed; and (4) the test is based on the same number of df as the homogeneity test.
Total 8 7 15
Chi-Sq = 8.040, DF = 1, P-Value = 0.005
4 cells with expected counts less than 5.
For a contingency table having more than two rows and two columns, if any estimated
expected count is at most 5, it may be possible to consolidate some categories and
generate a new contingency table whose estimated expected counts would all exceed 5.
However, this option is not available for a contingency table with two rows and
two columns, as the minimum number of categories for each variable has already been
reached. Instead of using the chi-squared approach, we now introduce a different method
that is popularly known as Fisher’s Exact Test.
Recall in our example that 8 out of the 15 boards were produced using Method A
and a total of 8 printed circuit boards had defects. If the null hypothesis of indepen-
dence between production method and board condition is true, given that Method A
accounts for 8 of the 15 boards and that 8 of the boards had defects, what is the
probability that we would obtain results at least as extreme as what we observed? This
probability is the P-value for Fisher’s Exact Test; it can be computed explicitly by using
a particular discrete distribution.
First, let us consider all possible contingency table configurations under the
assumption that Method A accounts for 8 of the 15 boards and that 8 of the
boards had defects. Figure 8.11 reveals that there are only 8 possible contingency
tables. If the null hypothesis is true, it can be shown that the probability of each of
the 8 possible outcomes can be determined by a discrete distribution known as the
hypergeometric. Statistical software packages can readily compute probabilities from
this distribution.
           A  B             A  B             A  B             A  B
Present    8  0   Present   7  1   Present   6  2   Present   5  3
Absent     0  7   Absent    1  6   Absent    2  5   Absent    3  4
Prob. = .0002     Prob. = .0087    Prob. = .0914    Prob. = .3046

           A  B             A  B             A  B             A  B
Present    4  4   Present   3  5   Present   2  6   Present   1  7
Absent     4  3   Absent    5  2   Absent    6  1   Absent    7  0
Prob. = .3807     Prob. = .1828    Prob. = .0305    Prob. = .0012
Figure 8.11 All possible contingency tables and corresponding hypergeometric
probabilities
With all table probabilities in hand, we can now obtain P-value information.
Our originally observed contingency table yielded 7 boards having defects manu-
factured by Method A. The corresponding table probability is .0087. To determine
the P-value we need to consider other tables that would be at least as extreme as
what was observed. This would include any tables having a corresponding probability
that is less than or equal to .0087. From Figure 8.11 we see that only two
other tables qualify (with probabilities .0002 and .0012). Combining these
probabilities, we have P-value = .0087 + .0002 + .0012 = .0101. Thus, at the .05
significance level we can reject the null hypothesis of independence between production
8.3 Exercises 391
method and board condition. Figure 8.12 is the corresponding output from SAS for
our example:
Sample Size = 15
From the output, the P-value we computed corresponds to the probability reported next
to Two-sided Pr <= P, as we were interested in testing whether any type of dependence
existed. As the output suggests, we can use a directional alternative for Fisher’s Exact
Test as well. Consult the book by Agresti cited in the chapter bibliography for more
details on this test.
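The hypergeometric computation above can be sketched in Python (the function name fisher_exact_two_sided is ours; math.comb supplies the binomial coefficients):

```python
from math import comb

# Fisher's exact test P-value for the 2x2 board example: with both margins fixed
# (8 Method-A boards, 8 defective boards, n = 15), the count of defective boards
# among the Method-A boards is hypergeometric.

def fisher_exact_two_sided(a_total, defect_total, n, observed):
    """Two-sided P-value: sum probabilities of tables no more likely than the observed one."""
    denom = comb(n, a_total)
    lo = max(0, a_total + defect_total - n)
    hi = min(a_total, defect_total)
    # P(X = x): x defective boards among the a_total Method-A boards
    probs = {x: comb(defect_total, x) * comb(n - defect_total, a_total - x) / denom
             for x in range(lo, hi + 1)}
    p_obs = probs[observed]
    return sum(p for p in probs.values() if p <= p_obs + 1e-12)

p = fisher_exact_two_sided(a_total=8, defect_total=8, n=15, observed=7)
print(round(p, 4))   # 0.0101, matching the text
```

The dictionary of table probabilities matches Figure 8.11 (.0002, .0087, .0914, . . .), and summing those no larger than the observed table's .0087 gives the two-sided P-value .0101.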
for social science students, and the other 10% from the course for agriculture students. A random sample of n = 120 clients revealed 52, 38, 21, and 9 from the four courses. Does this data suggest that the percentages on which staffing was based are not correct? State and test the relevant hypotheses using α = .05.

46. Criminologists have long debated whether there is a relationship between weather and violent crime. The author of the article “Is There a Season for Homicide?” (Criminology, 1988: 287–296) classified 1361 homicides according to season, resulting in

Ethnicity:   African American   Asian   Caucasian   Hispanic
Frequency:         57             11       330          6

The 2000 census proportions for these four ethnic groups are .177, .032, .734, and .057, respectively. Does the data suggest that the proportions in commercials are different from the census proportions? Carry out a test of appropriate hypotheses using a significance level of .01.

48. An information retrieval system has ten storage locations. Information has been stored with the expectation that the long-run proportion of requests
for location i is given by πi = (5.5 − |i − 5.5|)/30. A sample of 200 retrieval requests gave the following frequencies for locations 1–10, respectively: 4, 15, 23, 25, 38, 31, 32, 14, 10, and 8. Use a chi-squared test at significance level .10 to decide whether the data is consistent with the a priori proportions.

49. The article “The Gap Between Wine Expert Ratings and Consumer Preferences” (Intl. J. of Wine Business Res., 2008: 335–351) studied differences between expert and consumer ratings by considering medal ratings for wines: gold (G), silver (S), or bronze (B). Three categories were then established:
1. Rating is the same [(G,G), (B,B), (S,S)].
2. Rating differs by one medal [(G,S), (S,G), (S,B), (B,S)].
3. Rating differs by two medals [(G,B), (B,G)].
The observed frequencies for these three categories were 69, 102, and 45, respectively. On the hypothesis of equally likely expert ratings and consumer ratings being assigned completely by chance, each of the 9 medal pairs has probability 1/9. Carry out an appropriate chi-squared test using a significance level of .10.

50. A random sample of smokers was obtained, and each individual was classified by both gender and age when he or she first started smoking. The data in the accompanying table is consistent with summary results reported in the article “Cigarette Tar Yields in Relation to Mortality in the Cancer Prevention Study II Prospective Cohort” (British Med. J., 2004: 72–79).

               Gender
             Male  Female
      <16     25     10
Age   16–17   24     32
      18–20   28     17
      >20     19     34

51. A placebo—that is, a fake medication or treatment—is well known to sometimes have a positive effect just because patients often expect the medication or treatment to be helpful. The article “Beware the Nocebo Effect” (The New York Times, Aug. 12, 2012) gave examples of a less familiar phenomenon: the tendency for patients informed of possible side effects to actually experience those side effects. The article cited a study reported in The Journal of Sexual Medicine in which a group of patients diagnosed with benign prostatic hyperplasia was randomly divided into two subgroups. One subgroup of size 55 received a compound of proven efficacy along with counseling that a potential side effect of the treatment was erectile dysfunction. The other subgroup of size 52 was given the same treatment without counseling. The percentage of the no-counseling subgroup that reported one or more sexual side effects was 15.3%, whereas 43.6% of the counseling subgroup reported at least one sexual side effect. State and test the appropriate hypotheses at significance level .05 to decide whether the nocebo effect is operating here. (Hint: First arrange the data into a contingency table comparing subgroup versus presence of side effects.)

52. A random sample of individuals who drive to work in a large metropolitan area was obtained, and each individual was categorized with respect to both size of vehicle and commuting distance (in miles). Does the accompanying data suggest that there is an association between type of vehicle and commuting distance?

                       Commuting Distance
                      0–<10   10–<20   ≥20
         Subcompact      6      27      19
Type of  Compact         8      36      17
vehicle  Midsize        21      45      33
         Full-size      14      18       6

X2 = 14.16
law enforcement officers, dockworkers). Clearly, a dangerous job can lead to illness or death. But can the psychological stress of a work environment affect employees' overall health? This issue was investigated in the article "Are There Health Effects of Harassment in the Workplace? A Gender-Sensitive Study of the Relationships Between Work and Neck Pain" (Ergonomics, 2012: 147–159). The researchers wanted to identify workplace physical and psychosocial risk factors for neck pain among male and female workers. They also wanted to study the relationship between neck pain and intimidation or sexual harassment in the workplace. (Advanced statistical techniques were used to show that neck pain was significantly associated with intimidation at work among both male and female workers.)
This study was based on a representative sample (5405 men, 3987 women) of the Quebec working population. The following cross-classification table for this sample on gender versus level of neck pain is consistent with data reported in the article:

                                Gender
    Pain                     Men     Women
    Never                    3048     1842
    Occasionally             1767     1411
    At Least Fairly Often     590      734

Does it appear that there might be an association between gender and neck pain? Carry out a test of hypotheses using the .01 significance level.

54. The article cited in Exercise 53 classified each member of the sample of workers with respect to both gender and level of work-related psychological demands. The following table is consistent with summary results reported in the article:

                     Gender
    Job Demand     Men     Women
    Low            1692     1324
    Medium         1838     1352
    High           1875     1311

Does it appear that there might be an association between gender and work-related psychological demands? Carry out a test of hypotheses using the .05 significance level.

55. Children often suffer from a condition known as tonsillitis in which the tonsils become sore or swollen. When the condition becomes chronic, many sufferers have their tonsils surgically removed by the tonsillectomy (TE) procedure. TE is one of the most common surgeries performed in children and young adults worldwide. However, because of the invasive nature of the surgery, TE patients often experience severe postoperative complications. Tonsillotomy (TT), an alternative procedure to surgically removing the tonsils, has become increasingly popular because studies have shown it to be less invasive and to have lower risk of postoperative complications.
The article "Differences in Pain and Nausea in Children Operated on by Tonsillectomy or Tonsillotomy—a Prospective Follow-Up Study" (J. of Advanced Nursing, 2012) examined the differences in postoperative pain, nausea, and time of discharge in children 3–12 years of age after TE or TT. To compare differences in postoperative nausea, researchers kept track of the number of prescriptions of ondansetron (a drug to treat nausea and vomiting) that were issued to the TE and TT children. Four out of 34 TE children compared to none of the 53 TT children received such prescriptions.
a. Suppose we are interested in testing whether surgery method affects the provision of ondansetron prescriptions. Determine the estimated expected counts based on the chi-squared test method. Do all expected counts exceed 5?
b. Use Fisher's exact test to analyze this data and report the P-value based on a two-sided alternative (as did the authors of the cited article). If your software does not perform this test, there are many online calculators that will report the P-value based on this test. One such site is http://research.microsoft.com/en-us/um/redmond/projects/mscompbio/fisherexacttest

56. For many years, federal equal employment opportunity laws have prohibited compensation discrimination. However, according to the U.S. Equal Employment Opportunity Commission (EEOC), pay disparities continue to exist in various demographic groups. According to the EEOC website (visited on January 13, 2013), Section 10 of the EEOC
Compliance Manual describes the standards and suggested steps for investigating a charge of compensation discrimination. In the statistical analysis section, Fisher's exact test is recommended as the test of choice. The following is based on the example found in the EEOC Compliance Manual.
Suppose the employees of a particular company can be classified into one of two groups (1 and 2). There are 14 members in group 1 and 17 in group 2. Eight members of group 1 and three members of group 2 earn salaries greater than the company median salary. Use Fisher's exact test at significance level .05 to investigate whether group affiliation has an effect on salary status. (The previous exercise identifies a website that will carry out the calculations.)
8.4 Testing the Form of a Distribution
The test described in the next box involves a slight modification of what we have so far suggested. For technical reasons, rather than using z quantiles corresponding to (i − .5)/n, quantiles corresponding to (i − .375)/(n + .25) are used. These alternative "plotting positions" do not greatly alter the appearance of the plot, but they have been found to improve the behavior of the test.
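The Ryan-Joiner statistic is simply the sample correlation between the ordered observations and these normal quantiles. A sketch of the quantile computation using only the Python standard library (the function names are ours):

```python
from statistics import NormalDist

def normal_quantiles(n, modified=True):
    """z quantiles for the plotting positions (i - .375)/(n + .25) when
    modified=True, or the simpler (i - .5)/n positions otherwise."""
    nd = NormalDist()
    if modified:
        probs = [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]
    else:
        probs = [(i - 0.5) / n for i in range(1, n + 1)]
    return [nd.inv_cdf(p) for p in probs]

def ryan_joiner_r(sorted_data, quantiles):
    """Sample correlation between the ordered data and the normal quantiles."""
    n = len(sorted_data)
    mx = sum(sorted_data) / n
    my = sum(quantiles) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(sorted_data, quantiles))
    sxx = sum((x - mx) ** 2 for x in sorted_data)
    syy = sum((y - my) ** 2 for y in quantiles)
    return sxy / (sxx * syy) ** 0.5
```

For n = 17 the first (i − .5)/n quantile is about −1.89, matching the first entry in the Example 8.10 table below.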
Example 8.10 The following sample of n = 17 observations on length-diameter ratio (LDR) measurements based on static pile load tests first appeared in Example 2.17.

    Quantile:  −1.89   −1.35   −1.05   −0.82   −0.63   −0.46   −0.30   −0.15   0.00
    LDR:       30.86   37.68   39.04   42.78   42.89   42.89   45.05   47.08   47.08
[Figure 8.13 Minitab output from the Ryan-Joiner test for the data of Example 8.10: a normal probability plot (percent versus LDR), with legend Mean 47.31, StDev 7.560, N 17, RJ 0.990, P-Value > 0.100.]
Chi-Squared Tests
Carrying out a chi-squared test requires that categories be established so that observed frequencies can be compared with those expected if the hypothesized family is correct. Suppose, for example, that we have observations on x = number of defects for a sample of 200 automobiles. Possible values of x are 0, 1, 2, . . . . A reasonable null hypothesis is that x has a Poisson distribution. We might select the x value 0 as the first category, the value 1 as the second category, 2 as the third category, 3 as the fourth category, and aggregate all x values that are at least 4 as the remaining catchall category. The form of the Poisson mass function is p(x) = e^(−λ)λ^x/x! for x = 0, 1, 2, . . . . Substituting x = 0, 1, 2, and 3 and multiplying each result by n = 200 would give the expected frequencies for the first four categories; the last expected frequency could then be obtained by adding the first four and subtracting from 200.

However, carrying this out requires that we have a value of the parameter λ. The null hypothesis states only that the distribution is Poisson, without specifying the correct λ. So the value of λ must be estimated from the data before a test can be conducted, and the correct way to do this is to use the method of maximum likelihood introduced in Chapter 7. The estimate should be based on the grouped data (i.e., the number of observations falling in each of the five categories) rather than the individual observations, but this is virtually never done. Instead, the estimate λ̂ = x̄ based on the full data is customarily used (this estimate is intuitively appealing because the mean value of a Poisson variable is just μ = λ). Furthermore, the estimation of any parameters before calculating expected frequencies and carrying out the test reduces the number of degrees of freedom on which the test is based.

Each parameter that must be estimated from the data before calculating expected frequencies and carrying out a chi-squared test reduces the number of df for the test by one. Thus if the test is based on k categories, all (estimated) expected counts are at least 5, and if m parameters were estimated, the test is based on k − 1 − m df.
For a Poisson distribution, using the five categories suggested previously would result in a test based on df = 5 − 1 − 1 = 3 (provided that all expected counts were at least 5). A chi-squared test for normality (not recommended because the Ryan–Joiner test, as well as other tests, have smaller type II error probabilities for the same significance level) would require estimating both μ and σ, reducing degrees of freedom by two.
Example 8.11 Consider the accompanying data on the number of Larrea divaricata plants found in each of n = 48 identically shaped sampling regions (ecologists call such regions quadrats), taken from the article "Some Sampling Characteristics of Plants and Arthropods in the Arizona Desert" (Ecology, 1962: 567–571):

    Number of plants:   0   1   2   3   at least 4
    Frequency:          9   9  10  14       6

The author of the article fit a Poisson distribution to this data. Suppose that the six observations in the last category were actually 4, 4, 5, 5, 6, and 6; it is easily verified that λ̂ = x̄ = 2.10 (the value reported in the article). The (estimated) expected frequency for the first category is then

    48[e^(−2.1)(2.1)^0 / 0!] = 5.88

The other four expected frequencies, calculated in the same way, are 12.34, 12.96, 9.07, and (by subtraction) 7.75. All expected frequencies exceed 5, so the test will be based on 5 − 1 − 1 = 3 df. The test statistic value is

    X² = (9 − 5.88)²/5.88 + … + (6 − 7.75)²/7.75 = 6.31

The two smallest critical values in the 3 df column of our chi-squared table (Appendix Table VII) are 6.25 and 6.36, corresponding to upper-tail areas of .100 and .095, respectively. Thus the approximate P-value for the test is slightly less than .10. At a significance level of either .05 or .01, there is little reason to doubt that the distribution of the number of plants per quadrat is Poisson.
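The arithmetic in Example 8.11 is easy to script; here is a sketch in Python (standard library only; the function name is ours):

```python
from math import exp, factorial

def poisson_gof(observed, lam):
    """Chi-squared goodness-of-fit statistic for a Poisson model, where
    `observed` holds counts for x = 0, 1, ... plus a final catch-all
    category whose expected count is found by subtraction."""
    n = sum(observed)
    expected = [n * exp(-lam) * lam ** x / factorial(x)
                for x in range(len(observed) - 1)]
    expected.append(n - sum(expected))  # catch-all category
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - 1          # k - 1 - m, with m = 1 (lambda estimated)
    return x2, df, expected

x2, df, expected = poisson_gof([9, 9, 10, 14, 6], lam=2.10)
# expected is approximately [5.88, 12.34, 12.96, 9.07, 7.75]; x2 is about 6.31
```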
In the case of continuous data, the categories are simply class intervals. For example, we might select the following six classes: (−∞, 85), (85, 95), (95, 100), (100, 105), (105, 115), and (115, ∞). After estimating any parameters, the estimated expected frequency for the fourth class would be n · [∫ from 100 to 105 of f̂(x) dx], where parameters in the density function f(x) are replaced by their estimates.
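If, for instance, the fitted density were normal, the integral is just a difference of two cdf values. A sketch (the μ̂ and σ̂ values below are invented purely for illustration):

```python
from statistics import NormalDist

def expected_class_frequency(n, lo, hi, mu_hat, sigma_hat):
    """n times the estimated probability of the class (lo, hi) under a
    fitted normal density with estimated parameters mu_hat, sigma_hat."""
    fitted = NormalDist(mu_hat, sigma_hat)
    return n * (fitted.cdf(hi) - fitted.cdf(lo))

# Hypothetical fitted values mu_hat = 100, sigma_hat = 10 for n = 200
e4 = expected_class_frequency(200, 100, 105, mu_hat=100.0, sigma_hat=10.0)
# e4 is about 38.3
```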
58. The article cited in Exercise 31 of Section 8.2 gave the following observations on bending rigidity (μN · m) for medium-quality fabric specimens, from which the accompanying Minitab output was obtained:

    24.6   12.7   14.4   30.6   16.1    9.5   31.5   17.2
    46.9   68.3   30.8  116.7   39.5   73.8   80.6   20.3
    25.8   30.9   39.2   36.8   46.6   15.6   32.3

Would you use a one-sample t confidence interval to estimate true average bending rigidity? Explain your reasoning.

[Minitab normal probability plot of the bending rigidity data (Bendrig), with legend Mean 37.42, StDev 25.81, N 23, RJ 0.911, P-Value < 0.010.]
59. The article from which the data in Exercise 44 of Chapter 7 was obtained also gave the following data on the compressive strength (in MPa) for 7 specimens of internally cured concrete that have been set for 28 days:

    38.7   40.1   40.3   47.5   48.0   56.0   61.1

Minitab gives r = .953 as the value of the correlation coefficient test statistic and reported that P-value > .10. Would you use the one-sample t test to test hypotheses about the value of the true average compressive strength? Why or why not?

60. The data in Exercise 40 is paired, so a paired t analysis is appropriate if it is plausible that the values of the differences were selected from a normal distribution. Based on the accompanying plot from Minitab, does this appear to be the case?

[Minitab normal probability plot of the differences (percent versus Difference, with Difference ranging from −1.25 to 0.50).]
61. The article cited in Exercise 88 of Chapter 7 gave the following observations on conductivity (% IACS) for eight wire electrodes used for wire electrical-discharge machining:

    31   28   26   24   33   65   29   29

a. Employ software to perform a test for normality (such as the Ryan-Joiner test) using a significance level of .05.
b. Note that there is one unusually high conductivity reading. Suppose the researchers discovered there was a recording error for this observation. Remove it and repeat part (a). How does the removal of the observation affect the test for normality?

62. In a genetics experiment, investigators examined 300 chromosomes of a particular type and counted the number of sister-chromatid exchanges on each one ("On the Nature of Sister-Chromatid Exchanges in 5-Bromodeoxyuridine-Substituted Chromosomes," Genetics, 1979: 1251–1264). A Poisson model was hypothesized for the distribution of the number of exchanges. Test the fit of such a model to the accompanying data by first estimating λ and then combining the frequencies for x = 8 and x = 9.

    x:           0    1    2    3    4    5    6    7   8   9
    Frequency:   6   24   42   59   62   44   41   14   6   2

63. In an investigation into the distribution of output tuft weight x of cotton fibers when the input weight was x₀, a truncated exponential distribution, f(x) = λe^(−λx)/(1 − e^(−λx₀)) for 0 < x < x₀, was hypothesized ("Some Studies on Tuft Weight Distributions in the Opening Room," Textile Res. J., 1976: 567–573). The mean value of this distribution is μ = 1/λ − x₀e^(−λx₀)/(1 − e^(−λx₀)). Replacing μ by x̄ and λ by λ̂ and solving for the latter quantity gives an estimate of λ. The expected frequencies for various categories (class intervals) can then be calculated. Use the accompanying data along with x̄ = 13.086 to decide whether the truncated exponential distribution is a plausible model (x₀ = 70 here).

    Class:       0–<8   8–<16   16–<24   24–<32   32–<40   40–<48   48–<56   56–<64   64–<70
    Frequency:    20      8        7        1        2        1        0        1        0

64. It is hypothesized that when homing pigeons are disoriented in a certain manner, they will exhibit no preference for any direction of flight after takeoff (the direction x, a continuous variable, should be uniformly distributed on the interval from 0° to 360°, so f(x) = 1/360 on this interval). To test this, 50 pigeons were disoriented and released, resulting in the following observed directions. Use a chi-squared test based on eight classes to test the appropriate hypotheses at a significance level of .05.

    171   338   238    37    92   287   203   320    88
     36   131    32    61   250    99   138   155   183
    201   312    89   158   206   170   204    46   323
    289   141   319   242   179   249   185   277    95
     46   197   251   196   326   124   350   112    37
    104   290    47   310    86
8.5 Further Aspects of Hypothesis Testing
Example 8.12 Samples of two different automobile braking systems were selected and the braking distance (ft) for each was determined under specified experimental conditions, resulting in the following summary information:

    n₁ = 100    x̄₁ = 120    s₁ = 5.0
    n₂ = 100    x̄₂ = 118    s₂ = 5.0

Does it appear that true average braking distance for the first system differs from that for the second system? The relevant hypotheses are H₀: μ₁ − μ₂ = 0 versus the alternative Hₐ: μ₁ − μ₂ ≠ 0, and

    z = (x̄₁ − x̄₂)/√(s₁²/n₁ + s₂²/n₂) = (120 − 118)/√(25/100 + 25/100) = 2/.707 = 2.83

The P-value for this two-tailed z test is then 2 · (area under z curve to the right of 2.83) = .0046. Thus the null hypothesis should be rejected at a significance level of .05 or even at .01. We say that the data is statistically significant at either of these levels. However, because of the rather large sample sizes and relatively small standard deviations, it appears that μ₁ − μ₂ ≈ x̄₁ − x̄₂ = 2.0. From a practical point of view, a 2-foot difference in true average braking distance would appear to be relatively unimportant. This is an instance of statistical significance without any evidence of a practically significant difference.
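The z statistic and P-value in Example 8.12 can be checked with a few lines of Python (standard library only; the function name is ours):

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1bar, s1, n1, x2bar, s2, n2):
    """Two-sample z statistic and two-tailed P-value for H0: mu1 - mu2 = 0."""
    z = (x1bar - x2bar) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

z, p = two_sample_z(120, 5.0, 100, 118, 5.0, 100)
# z is about 2.83; p is about .0047 (the .0046 in the text comes from
# rounding z to 2.83 before consulting the z table)
```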
the chosen α. Using a small significance level results in a test that has good protection against the commission of a type I error. However, if at the same time the likelihood of committing a type II error (not rejecting the null hypothesis when in fact it is false) is large, then the test procedure will be quite ineffective at detecting departures from the null hypothesis. For example, consider testing H₀: μ = 100 versus Hₐ: μ > 100 using a test with a significance level of α = .01. If this test is used repeatedly on different samples selected from the population of interest and if H₀ is in fact true, in the long run only 1% of all samples will result in the incorrect rejection of the null hypothesis. Suppose, though, that the alternative μ = 105 represents an important departure from the null hypothesis, but that in this situation β = P(type II error) = .75. Then if the test procedure is used over and over on different samples and μ in fact really is 105 rather than 100, in the long run only 25% of all samples will result in the rejection of H₀, whereas the other 75% of all samples will yield an incorrect conclusion. The test procedure has rather poor ability to detect a departure from the null hypothesis that has substantial practical significance. In general, it makes little sense to expend the resources necessary to acquire sample data and carry out a test if the test procedure has very poor ability to detect important departures from the null hypothesis. This is why we recommend investigating the likelihood of committing a type II error before a test with a specified α is used.

One way to determine β is to use an appropriate set of β curves. Figure 8.14 shows three different β curves for a one-tailed t test (appropriate for either the alternative Hₐ: μ > μ₀ or the alternative Hₐ: μ < μ₀). Obtaining β requires that we specify an alternative value of μ (e.g., 105 in the situation considered in the previous paragraph) and also that we select a realistic value of the population or process standard deviation σ. Then we calculate the value of

    d = |(alternative value of μ) − μ₀| / σ

the distance between the alternative value and the null value expressed as some number of population standard deviations. Thus d = 2 means that the alternative value of μ is 2
[Figure 8.14: β curves for the one-tailed t test, plotting the associated value of β against the value of d. Three curves are shown: α = .01, df = 6; α = .05, df = 6; α = .01, df = 19.]
population standard deviations away from the null value. Finally, locate the value of d on the horizontal axis, move directly up to the curve for n − 1 df, and move over to the vertical axis to read the value of β.

The following general properties provide insight into how β behaves.

1. The larger the number of degrees of freedom, the lower is the corresponding β curve for any value of d. Because df increases as the sample size increases, we have the intuitively plausible result that β decreases as n increases.
2. The farther the alternative value of μ of interest is from the null value, the larger the value of d. Because every β curve decreases as d increases, it follows that β will be smaller for an alternative value far from what the null hypothesis asserts than for a value close to μ₀. Thus the test is more likely to detect a large departure from μ₀ than a small departure.
3. The larger the value of σ, the smaller the value of d and the larger the resulting value of β corresponding to any particular alternative value of μ. That is, the more underlying variability there is in the population or process, the more difficult it will be to detect a departure from H₀ of any given magnitude. Selecting a relatively large value of σ for the calculation gives a pessimistic value of β.

In recent years, the use of β curves has been superseded by statistical software, which is quicker and avoids the visual inaccuracies associated with the curves. In particular, Minitab will determine the power of the one-sample t test, where power = 1 − β, once the difference between the null value and alternative value of μ and also the value of σ have been specified (small β is equivalent to large power; a powerful test is one that has large power and therefore good ability to discriminate between the null hypothesis and the alternative value of μ). In addition, instead of specifying n and asking for power, the user can specify the desired power for the given difference and ask Minitab for the necessary sample size.
Example 8.13 The true average voltage drop from collector to emitter of insulated gate bipolar transistors of a certain type is supposed to be at most 2.5 volts. An investigator selects a sample of n = 10 such transistors and uses the resulting voltages as a basis for testing H₀: μ = 2.5 versus Hₐ: μ > 2.5 using a t test with significance level α = .05. If the standard deviation of the voltage distribution is σ = .100, how likely is it that H₀ will not be rejected when in fact μ = 2.6?

The difference value is 2.6 − 2.5 = .1. Providing this information to Minitab along with the sample size, value of σ, and the fact that the test is upper-tailed results in power = .8975, from which β ≈ .1. The investigator may think that this value of β is too large for such a substantial departure from H₀. When Minitab is supplied with the difference .1, σ = .1, and the target power of .95 (β = .05) for an upper-tailed test with α = .05, the necessary sample size is returned as 13. The actual power in this case is .9597, whereas using n = 12 would result in power being somewhat below the target.
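Without Minitab, the power in Example 8.13 can be approximated by simulation; a sketch in Python (standard library only; the critical value t with .05 upper-tail area and 9 df, 1.833, is taken from a t table):

```python
import random
from math import sqrt

def t_power_sim(n, mu_alt, sigma, mu0, t_crit, reps=20000, seed=1):
    """Monte Carlo estimate of the power of the upper-tailed one-sample
    t test (reject H0 when t > t_crit) at the alternative mu = mu_alt."""
    rng = random.Random(seed)
    reject = 0
    for _ in range(reps):
        sample = [rng.gauss(mu_alt, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        if (xbar - mu0) / (s / sqrt(n)) > t_crit:
            reject += 1
    return reject / reps

power = t_power_sim(n=10, mu_alt=2.6, sigma=0.10, mu0=2.5, t_crit=1.833)
# power should come out near the .8975 reported by Minitab
```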
Type II error probabilities for other tests can be determined in a similar manner using
appropriate statistical software.
You might ask whether there is another test procedure, based on a different test statis-
tic (a different function of the sample data), that outperforms the one-sample t test in the
sense that it has the same significance level but smaller type II error probabilities. It turns
out that there is no such test as long as the population distribution is normal. The one-
sample t test is really the best possible test in this situation. Furthermore, if the population
distribution is not too far from being normal, no test can improve on the one-sample t test
by very much. However, if the population distribution is highly nonnormal (heavy-tailed,
highly skewed, or multimodal), the t test should not be used. Then it is time to consult your
friendly neighborhood statistician to see what alternative methods of analysis are available.
Consider testing H₀: μ₁ − μ₂ = 0 versus Hₐ: μ₁ − μ₂ > 0. The key idea behind the test is that, if the null hypothesis is true, observations from the two samples should be intermingled in magnitude, so that the ranks are intermingled. However, when μ₁ exceeds μ₂, observations in the first sample will tend to be larger than those in the second sample. In this case, the larger ranks will be assigned to sample 1 observations and the smaller ranks to the observations from sample 2. The Wilcoxon test statistic w is the sum of the ranks assigned to observations in the first sample. For the data introduced,
Because the inequality > appears in Hₐ, values of w larger than 21 are even more contradictory to the null hypothesis than the value actually obtained. Thus

    P-value = P(W ≥ 21 when H₀ is true)

Now the only set of four ranks for which w = 21 is the one that resulted, and the only possible w value larger than 21 is w = 22, which occurs when the ranks are 4, 5, 6, and 7. So

    P-value = P(W = 21 or W = 22 when H₀ is true)

But when the null hypothesis is true, all seven observations have actually been selected from the same distribution, in which case any set of four ranks for the observations in the first sample has the same chance of resulting: the set 1, 2, 5, 7 or the set 1, 3, 4, 6, and so on. It is not difficult to see that there are 35 possible sets of four ranks that can be selected from the ranks 1, . . ., 7.¹ Since only two of these 35 sets have w ≥ 21,

    P-value = 2/35 = .0571

When H₀ is true and this test statistic is used repeatedly on different samples, in the long run about 5.7% of all samples will give a w value at least as contradictory to the null hypothesis as what we obtained. The P-value is small enough to justify rejection of H₀ at level .10 but not at level .05.
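The enumeration just carried out is easy to automate; a sketch in Python (the function name is ours):

```python
from itertools import combinations

def rank_sum_upper_p(n1, n2, w_obs):
    """Exact upper-tailed P-value for the Wilcoxon rank-sum statistic:
    under H0, every choice of n1 ranks from 1..n1+n2 is equally likely."""
    all_sets = list(combinations(range(1, n1 + n2 + 1), n1))
    extreme = sum(1 for s in all_sets if sum(s) >= w_obs)
    return extreme / len(all_sets)

p = rank_sum_upper_p(n1=4, n2=3, w_obs=21)
# p = 2/35, about .0571, as computed above
```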
Unless n1 and n2 are quite small, it can be time-consuming to determine the sets of
ranks corresponding to w values at least as extreme as what was obtained to calculate the P-
value. We recommend using a statistical computer package for this purpose. The Wilcox-
on test is valid whatever the nature of the two distributions as long as they are continuous
with the same shapes and spreads. This test is often described as being distribution-free (or
nonparametric), meaning that it is valid for a wide variety of underlying distributions rather
than just one particular type of distribution. The t test is not distribution-free, because its
validity is predicated on the two distributions being at least approximately normal. There
are a number of other distribution-free tests in a statistician’s toolbox, many of them based
on ranks of the observations. The best of these tests, including the Wilcoxon test, perform
almost as well as tests such as the t test that are developed with specific types of distributions in mind. That is, for the same significance level α, type II error probabilities for the
distribution-free tests are not much larger than those of the best tests in various situations.
Consult one of the chapter references for more information on procedures of this type.
¹ In general, there are (n₁ + n₂)!/(n₁!)(n₂!) ways to select the n₁ ranks for the observations from the first sample.
(e.g., 100 or 110), then the P-value # .05 and H0 can be rejected. In other words, the
95% interval consists precisely of all values 0 for which the null hypothesis H0: 5 0
cannot be rejected at a significance level of .05. This is intuitively reasonable, since the
confidence interval consists of all plausible values of at the designated confidence lev-
el, and not rejecting H0 means that 0 is plausible. The following generalization of this
situation describes an important relationship between tests and confidence intervals.
Let nL denote the lower confidence limit for some parameter and nU denote the upper
confidence limit, where the confidence level is 100(1 2 ),. Consider the test procedure
that rejects H0: 5 0 in favor of a: Þ 0 if 0 lies outside the interval and does not
reject the null hypothesis if 0 falls between nL and nU. (Notice that there is no explicit test
statistic, but we still have a decision rule.) This test procedure has a significance level of .
The result is important because a confidence interval can be used as a basis for testing hypotheses, and, by the same token, there is a confidence interval procedure corresponding to any particular test procedure. (Our discussion has focused on two-sided confidence intervals and two-tailed tests, but one-sided confidence intervals that specify a lower or an upper confidence bound give rise to one-tailed tests and vice versa.) For example, in Chapter 7 we discussed the bootstrap method for calculating confidence intervals; these intervals also form the basis for bootstrap tests of hypotheses. Similarly, the Wilcoxon rank-sum test, which was described previously, gives rise to a distribution-free confidence interval for μ1 − μ2. In summary, the duality between tests and confidence intervals has led to the development of many important inferential procedures.
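The duality is easy to see numerically. The following sketch (a hypothetical sample, not from the text, together with the tabled value t.025,4 = 2.776) confirms that the two-tailed t test at level .05 rejects H0: μ = μ0 exactly when μ0 falls outside the 95% t interval:

```python
from math import sqrt
from statistics import mean, stdev

x = [10.2, 9.8, 10.5, 10.1, 9.9]   # hypothetical sample, n = 5
n, xbar, s = len(x), mean(x), stdev(x)
t_crit = 2.776                      # t_.025 critical value with n - 1 = 4 df

half_width = t_crit * s / sqrt(n)
lo, hi = xbar - half_width, xbar + half_width   # 95% t confidence interval

for mu0 in [9.5, 10.0, 10.6]:
    t = (xbar - mu0) / (s / sqrt(n))
    reject = abs(t) > t_crit        # two-tailed t test at alpha = .05
    inside = lo <= mu0 <= hi        # is mu0 inside the 95% interval?
    # the test rejects H0: mu = mu0 exactly when mu0 lies outside the CI
    assert reject == (not inside)

print(round(lo, 2), round(hi, 2))
```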
Now consider general null and alternative hypotheses of the form H0: θ ∈ Ω0 versus Ha: θ ∈ Ωa (∈ is read as "lies in the set . . . "). For example, Ω0 might be the single value 10, and Ωa might consist of all numbers except 10, whence the hypotheses are H0: θ = 10 versus Ha: θ ≠ 10. Now consider the following likelihood ratio test statistic:

λ(x) = likelihood ratio test statistic
     = (maximum value of likelihood for all θ in Ω0) / (maximum value of likelihood for all θ in Ωa)
where x is compact notation for x1, . . . , xn. If the numerator of this statistic is much larger than the denominator (the ratio is much larger than 1.0), then there is a value of θ specified by the null hypothesis for which the observed data is a lot more likely than it would be for any value of θ specified by the alternative hypothesis. If, however, the ratio is much smaller than 1.0, there is an alternative value of θ for which the observed data is much more likely than would be the case if the null hypothesis were true. A ratio of the latter sort therefore suggests rejecting H0 in favor of Ha. Suppose, for example, that the value of λ(x) is .2. Then values of this statistic smaller than .2 are even more contradictory to H0 than what was obtained, implying that

P-value = P(λ(x) ≤ .2 when H0 is true)
Suppose that the population distribution is normal and that we wish to test the null hypothesis H0: μ = μ0 against one of the three alternatives considered previously. It is
not at all obvious by inspection, and the argument requires a bit of tedious algebra, but
it can be shown that application of the likelihood ratio principle here gives rise to the
one-sample t test. So this test procedure can be derived from a general principle for test
construction rather than being justified simply on intuitive grounds. This is also true of
a number of test procedures to be considered in the next several chapters.
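This connection can be checked numerically. For a normal sample with σ unknown, the maximized likelihoods have closed forms, and the ratio satisfies λ(x)^(−2/n) = 1 + t²/(n − 1), so small λ(x) corresponds exactly to large |t|. A sketch with hypothetical data (here the denominator maximizes over all parameter values, which for a continuous parameter agrees with maximizing over the alternative):

```python
from math import sqrt
from statistics import mean

# hypothetical sample and null value; normal population, sigma unknown
x = [1.9, 2.3, 2.1, 1.7, 2.6, 2.0]
n, mu0 = len(x), 2.5
xbar = mean(x)

ss_about_mean = sum((xi - xbar) ** 2 for xi in x)  # from the unrestricted MLEs
ss_about_mu0 = sum((xi - mu0) ** 2 for xi in x)    # from the MLE of sigma under H0

# likelihood ratio: max likelihood under H0 / max likelihood overall
lam = (ss_about_mean / ss_about_mu0) ** (n / 2)

s = sqrt(ss_about_mean / (n - 1))
t = (xbar - mu0) / (s / sqrt(n))                   # one-sample t statistic

# small lambda <=> large |t|: lambda^(-2/n) = 1 + t^2/(n - 1)
assert abs(lam ** (-2 / n) - (1 + t ** 2 / (n - 1))) < 1e-9
print(round(lam, 5), round(t, 3))
```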
Alpha = 0.01  Sigma = 0.8

Difference   Sample Size   Power
0.5          15            0.3311
0.8          15            0.7967

Alpha = 0.01  Sigma = 0.8

Difference   Sample Size   Target Power   Actual Power
0.5          42            0.9000         0.9047
0.8          19            0.9000         0.9147

68. The article "A Study of Wood Stove Particulate Emission" (J. of the Air Pollution Control Fed., 1979: 724–728) reported the following data on burn time (hr) for specimens of oak and pine. Use Wilcoxon's test at a significance level of .05 to decide whether true average burn time for oak exceeds that for pine. Hint: With n1 = 6 and n2 = 8, when H0 is true, P(w ≥ c) = .054 for c = 58 and is .010 for c = 63.

Pine: .98 1.40 1.33 1.52 .73 1.20
Oak: 1.72 .67 1.55 1.56 1.42 1.23 1.77 .48

69. In an experiment to compare the bond strength of two different adhesives, each adhesive was used in five bondings of two surfaces, and the force necessary to separate the two surfaces was determined for each bonding, resulting in the following data:

Adhesive 1: 229 286 245 299 250
Adhesive 2: 216 179 183 247 232

Use the Wilcoxon rank-sum test to decide whether true average bond strengths differ for the two adhesives. Hint: For these sample sizes, when H0 is true, P(w ≥ c) = .048 for c = 36, .028 for c = 37, and .008 for c = 39. Furthermore, when H0 is true, the distribution of w is symmetric about n1(n1 + n2 + 1)/2, so in this case P(w ≤ c) = .048 for c = 19.

70. The confidence interval associated with Wilcoxon's rank-sum test has the following general form. First, subtract each observation in the first sample from every observation in the second sample to obtain a set of n1n2 differences. Then the confidence interval extends from the cth smallest of these differences to the cth largest difference, where the value of c depends on the desired confidence level. In the case n1 = n2 = 5, c = 4 results in a confidence level of 94.4%, which is as close to 95% as can be obtained. Determine this CI for the strength data in Exercise 69.
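The interval construction described in Exercise 70 is mechanical enough to sketch in a few lines of Python. The samples below are hypothetical (not the exercise data), and c = 4 is the tabled choice giving the 94.4% level when n1 = n2 = 5:

```python
# Distribution-free CI based on Wilcoxon's rank-sum test: form all
# n1*n2 differences (second sample minus first) and take the cth
# smallest through the cth largest as the interval endpoints.
sample1 = [12, 15, 11, 18, 14]   # hypothetical observations
sample2 = [20, 17, 22, 16, 19]
c = 4                            # 94.4% confidence when n1 = n2 = 5

diffs = sorted(y - x for x in sample1 for y in sample2)
assert len(diffs) == len(sample1) * len(sample2)   # 25 differences

lower, upper = diffs[c - 1], diffs[len(diffs) - c]
print((lower, upper))  # prints (1, 8)
```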
Supplementary Exercises
71. Have you ever been frustrated because you could not get a container of some sort to release the last bit of its contents? The article "Shake, Rattle, and Squeeze: How Much Is Left in That Container?" (Consumer Reports, May 2009: 8) reported on an investigation of this issue for various consumer products. Suppose five 6.0-oz tubes of toothpaste of a particular brand are randomly selected and squeezed until no more toothpaste will come out. Then each tube is cut open and the amount remaining is weighed, resulting in the following data (consistent with what the cited article reported): .53, .65, .46, .50, .37. Does it appear that the true average amount left is less than 10% of the advertised net contents?
a. Check the validity of any assumptions necessary for testing the appropriate hypotheses.
b. Carry out a test of the appropriate hypotheses using a significance level of .05. Would your conclusion change if a significance level of .01 had been used?
c. Describe in context type I and II errors, and say which error might have been made in reaching a conclusion.

72. The article cited in Exercise 25 of Section 8.2 gave the following data on mass crystallinity (in %) for 12 samples of the PHB polymer:

42.97 38.81 38.83 41.03 41.25 36.99
49.57 41.77 34.50 44.77 36.92 40.48

a. Is it plausible that the mass crystallinity for this type of polymer is normally distributed?
b. Suppose researchers wanted to investigate whether the true average mass crystallinity exceeds 40%. Carry out a test of appropriate hypotheses using a significance level of .05.
73. The following summary data on daily caffeine consumption for a sample of adult women appeared in the article "Caffeine Knowledge, Attitudes, and Consumption in Adult Women" (J. of Nutrition Educ., 1992: 179–184): n = 47, x̄ = 215 mg, s = 235 mg, range of data: 5–1176.
a. Does it appear plausible that the population distribution of daily caffeine consumption is normal? Is it necessary to assume a normal population distribution to test hypotheses about population mean consumption? Explain your reasoning.
b. Suppose it had previously been believed that population mean consumption was at most 200 mg. Does the given data contradict prior belief?

74. Contamination of mine soils in China is a serious environmental problem. The article "Heavy Metal Contamination in Soils and Phytoaccumulation in a Manganese Mine Wasteland, South China" (Air, Soil, and Water Res., 2008: 31–41) reported that, for a sample of 3 soil specimens from a certain restored mining area, the sample mean concentration of total Cu was 45.31 mg/kg with a corresponding (estimated) standard error of the mean of 5.26. It was also stated that the China background value for this concentration was 20. The results of various statistical tests described in the article were predicated on assuming normality. Does the data provide strong evidence for concluding that the true average concentration in the sampled region exceeds the stated background value? Carry out a test at significance level .01.

75. In an investigation of the toxin produced by a certain poisonous snake, a researcher prepared 26 different vials, each containing 1 g of the toxin, and then determined the amount of antitoxin necessary to neutralize the toxin. The sample average amount of antitoxin necessary was found to be 1.89 mg, and the sample standard deviation was .42. Previous research had indicated that the true average neutralizing amount was 1.75 mg/g of toxin. Does the new data contradict the value suggested by prior research? State and test the relevant hypotheses using α = .05.

76. When the population distribution is normal, it can be shown that the variable X² = (n − 1)s²/σ² has a chi-squared distribution with n − 1 df. This can be used as a basis for testing H0: σ = σ0, as follows: Replace σ² in X² by its hypothesized value σ0² to obtain a test statistic. If the alternative hypothesis is Ha: σ > σ0, the P-value is the area under the n − 1 df chi-squared curve to the right of the calculated X² (an upper-tailed test).
a. To ensure reasonably uniform characteristics for a particular application, it is desired that the true standard deviation of the softening point of a certain type of petroleum pitch be at most .50°C. The softening points of ten different specimens were determined, yielding a sample standard deviation of .58°C. Assume that the distribution from which the observations were selected is normal. Does the data contradict the uniformity specification? State and test the appropriate hypotheses using α = .01.
b. Suppose that the investigator who performed the experiment described in part (a) had wished to test H0: σ = .70 versus Ha: σ < .70. Can this test be carried out using the chi-squared table in this book? Why or why not?

77. Let π denote the proportion of "successes" in some population. Consider selecting a random sample of size n, and let p denote the sample proportion of successes (number of successes in the sample divided by n). Suppose we wish to test H0: π = π0. When H0 is true and both nπ0 > 5 and n(1 − π0) > 5, the sampling distribution of p is approximately normal with mean value π0 and standard deviation √(π0(1 − π0)/n). This implies that a "large-sample" test statistic is z = (p − π0)/√(π0(1 − π0)/n) (i.e., we standardize p assuming that H0 is true); the P-value is calculated as was done in Section 8.1 for a z test concerning μ.
Seat belts help prevent injuries in vehicle accidents, but they don't offer complete protection in extreme situations. A sample of 319 front-seat occupants involved in head-on collisions in a certain region resulted in 95 who sustained no injuries ("Influencing Factors on the Injury Severity of Restrained Front Seat Occupants in Car-to-Car Head-on Collisions," Accident Analysis and Prevention, 1995: 143–150). Does this data suggest that less than one-third of all such accidents result in no injuries? State and test the relevant hypotheses using a significance level of .05.
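As a sketch of how the large-sample procedure described in Exercise 77 plays out, the code below standardizes the sample proportion from the seat-belt counts (95 no-injury outcomes among 319 occupants) under H0: π = 1/3 and computes a lower-tailed P-value with the standard normal cdf from the Python standard library; comparing the P-value with the chosen significance level completes the test:

```python
from math import sqrt
from statistics import NormalDist

n, successes = 319, 95
pi0 = 1 / 3
p = successes / n                    # sample proportion of no-injury accidents

# check the large-sample conditions n*pi0 > 5 and n*(1 - pi0) > 5
assert n * pi0 > 5 and n * (1 - pi0) > 5

z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)   # standardize p assuming H0 is true
p_value = NormalDist().cdf(z)               # lower-tailed, since Ha is pi < 1/3

print(round(z, 3), round(p_value, 4))
```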
78. Some of the deadliest mass shootings in U.S. history occurred in 2012. These events led to many calls for stricter national gun control. On December 27, 2012, the Gallup organization reported that roughly 600 of 1038 American adults surveyed said they would be in favor of strengthening laws covering the sale of firearms.
a. Does this provide strong evidence for concluding that more than 50% of the population of American adults was in favor of making laws covering the sale of firearms more strict? Conduct an appropriate test of hypotheses using a .01 significance level. (Hint: Read the first paragraph of the previous problem.)
b. This poll was conducted December 19–22, just days after a mass shooting at an elementary school in Connecticut. Discuss what effects this event may have had on the poll's outcome.

79. Headability is the ability of a cylindrical piece of material to be shaped into the head of a bolt, screw, or other cold-formed part without cracking. The article "New Methods for Assessing Cold Heading Quality" (Wire J. Intl., Oct. 1996: 66–72) described the result of a headability impact test applied to 30 specimens of aluminum killed steel and 30 specimens of silicon killed steel. The sample mean headability rating number for the steel specimens was 6.43 and the sample mean for aluminum specimens was 7.09. Suppose that the sample standard deviations were 1.08 and 1.19, respectively. Do you agree with the article's authors that the difference in headability ratings is significant at the 5% level?

80. The article "Two Parameters Limiting the Sensitivity of Laboratory Tests of Condoms as Viral Barriers" (J. of Testing and Eval., 1996: 279–286) reported that, in brand A condoms, among 16 tears produced by a puncturing needle, the sample mean tear length was 74.0 μm, whereas for the 14 brand B tears, the sample mean length was 61.0 μm (determined using light microscopy and scanning electron micrographs). Suppose the sample standard deviations are 14.8 and 12.5, respectively (consistent with the sample ranges given in the article). The authors commented that the thicker brand B condom displayed a smaller mean tear length than the thinner brand A condom. Is this difference in fact statistically significant? State the appropriate hypotheses and test at α = .05.

81. Information about hand posture and forces generated by the fingers during manipulation of various daily objects is needed for designing high-tech hand prosthetic devices. The article "Grip Posture and Forces During Holding Cylindrical Objects with Circular Grips" (Ergonomics, 1996: 1163–1176) reported that for a sample of 11 females, the sample mean four-finger pinch strength (N) was 98.1 and the sample standard deviation was 14.2. For a sample of 15 males, the sample mean and sample standard deviation were 129.2 and 39.1, respectively.
a. A test carried out to see whether true average strengths for the two genders were different resulted in t = 2.51 and P-value = .019. Does the appropriate test procedure described in this chapter yield this value of t and the stated P-value?
b. Is there substantial evidence for concluding that true average strength for males exceeds that for females by more than 25 N? State and test the relevant hypotheses.

82. The article "Pine Needles as Sensors of Atmospheric Pollution" (Environ. Monitoring, 1982: 273–286) reported on the use of neutron-activity analysis to determine pollutant concentration in pine needles. According to the article's authors, "These observations strongly indicated that for those elements which are determined well by the analytical procedures, the distribution of concentration is lognormal. Accordingly, in tests of significance the logarithms of concentrations will be used." The given data refers to bromine concentration in needles taken from a site near an oil-fired steam plant and from a relatively clean site. The summary values are means and standard deviations of the log-transformed observations.

Site          Sample Size   Mean log concentration   Standard deviation of log concentration
Steam plant   8             18.0                     4.9
Clean         9             11.0                     4.6

Let μ1* be the true average log concentration at the first site and define μ2* analogously for the second site.
a. Use the pooled t test (based on assuming normality and equal standard deviations), described in Exercise 37, to decide at significance level .05 whether the two concentration distribution means are equal.
b. If σ1* and σ2*, the standard deviations of the two log concentration distributions, are not equal, would μ1 and μ2, the means of the concentration distributions, be equal if μ1* = μ2*? Explain your reasoning.

83. The article cited in Exercise 78 of Chapter 7 gave additional data on breaking force (N):

Temp   Medium   n   x̄       s
22°    Dry      6   170.60   39.08
37°    Dry      6   325.73   34.97
22°    Wet      6   366.36   34.82
37°    Wet      6   306.09   41.97

a. Is there strong evidence for concluding that true average force in a dry medium at the higher temperature exceeds that at the lower temperature by more than 100 N?
b. Is there strong evidence for concluding that true average force in a wet medium at the lower temperature exceeds that at the higher temperature by more than 50 N?

84. Long-term exposure of textile workers to cotton dust released during processing can result in substantial health problems, so textile researchers have been investigating methods that will result in reduced risks while preserving important fabric properties. The accompanying data on roving cohesion strength (kN·m/kg) for specimens produced at five different twist multiples is from the article "Heat Treatment of Cotton: Effect on Endotoxin Content, Fiber and Yarn Properties, and Processability" (Textile Research J., 1996: 727–738):

Twist multiple:    1.054  1.141  1.245  1.370  1.481
Control strength:  .45    .60    .61    .73    .69
Heated strength:   .51    .59    .63    .73    .74

The authors of the cited article stated that strength for heated specimens appeared to be slightly higher on average than for the control specimens. Is the difference statistically significant? State and test the relevant hypotheses using α = .05.

85. Tardive dyskinesia refers to a syndrome comprising a variety of abnormal involuntary movements assumed to follow long-term use of antipsychotic drugs. An experiment carried out to investigate the effect of the drug deanol also used a placebo treatment, something that resembled deanol in every way but was known to be inert and have absolutely no medical effect. The two treatments were administered for 4 weeks each in random order to 14 patients, resulting in the following total severity index scores ("Double Blind Evaluation of Deanol in Tardive Dyskinesia," J. of the Amer. Med. Assoc., 1978: 1997–1998):

Patient:  1     2     3     4     5     6     7
Deanol:   12.4  6.8   12.6  13.2  12.4  7.6   12.1
Placebo:  9.2   10.2  12.2  12.7  12.1  9.0   12.4

Patient:  8     9     10    11    12    13    14
Deanol:   5.9   12.0  1.1   11.5  13.0  5.1   9.6
Placebo:  5.9   8.5   4.8   7.8   9.1   3.5   6.4

Does the data indicate that, on average, deanol yields a higher total severity index score than does the placebo treatment?

86. The authors of the article "Predicting Professional Sports Game Outcomes from Intermediate Game Scores" (Chance, 1992: 18–22) used statistical analysis to determine whether there was any merit to the idea that basketball games are not settled until the last quarter, whereas baseball games are "over" by the seventh inning. They also considered football and hockey. Data was collected for a sample of games of each type, selected from all games played during the 1990 season for baseball and football and during the 1990–1991 season for the other two sports. For each game, the late-game leader was determined, and it was noted whether the leader actually ended up winning the game. The leader was defined as the team ahead after three quarters in basketball and football, two periods in hockey, and seven innings in baseball. The results follow:

Sport        Leader wins   Leader loses
Basketball   150           39
Baseball     86            6
Hockey       65            15
Football     72            21
Do the four sports appear to be identical with respect to the proportion of games won by the late-game leader? State and test the appropriate hypotheses using α = .05. Do you think your conclusion can be attributed to a single sport being an anomaly?

87. As the population ages, there is increasing concern about accident-related injuries to the elderly. The article "Age and Gender Differences in Single-Step Recovery from a Forward Fall" (J. of Gerontology, 1999: M444–M450) reported on an experiment in which the maximum lean angle (the furthest a subject is able to lean and still recover in one step) was determined for both a sample of younger females (21–29 years) and a sample of older females (67–81 years). The following observations are consistent with summary data given in the article:

YF: 29 34 33 27 28 32 31 34 32 27
OF: 18 15 23 13 12

Does the data suggest that true average maximum lean angle for older females is more than 10 degrees smaller than it is for younger females? State and test the relevant hypotheses at significance level .10.

88. Adding computerized medical images to a database promises to provide great resources for physicians. However, there are other methods of obtaining such information, so the issue of efficiency of access needs to be investigated. The article "The Comparative Effectiveness of Conventional and Digital Image Libraries" (J. of Audiovisual Media in Medicine, 2001: 8–15) reported on an experiment in which 13 computer-proficient medical professionals were timed both while retrieving an image from a library of slides and while retrieving the same image from a computer database with a Web front end.

Subject:     1   2   3   4   5   6   7
Slide:       30  35  40  25  20  30  35
Digital:     25  16  15  15  10  20  7
Difference:  5   19  25  10  10  10  28

Subject:     8   9   10  11  12  13
Slide:       62  40  51  25  42  33
Digital:     16  15  13  11  19  19
Difference:  46  25  38  14  23  14

Does the true mean difference between slide retrieval time and digital retrieval time appear to exceed 10 sec? Be sure to check the validity of any assumptions on which your chosen inferential method is based.

89. The NCAA basketball tournament begins with 64 teams that are apportioned into four regional tournaments, each involving 16 teams. The 16 teams in each region are then ranked (seeded) from 1 to 16. During the 12-year period from 1991 to 2002, the top-ranked team won its regional tournament 22 times, the second-ranked team won 10 times, the third-ranked team won 5 times, and the remaining 11 regional tournaments were won by teams ranked lower than 3. Let Pij denote the probability that the team ranked i in its region is victorious in its game against the team ranked j. Once the Pij's are available, it is possible to compute the probability that any particular seed wins its regional tournament (a complicated calculation because the number of outcomes in the sample space is quite large). The paper "Probability Models for the NCAA Regional Basketball Tournaments" (The American Statistician, 1991: 35–38) proposed several different models for the Pij's.
a. One model postulated Pij = .5 + λ(j − i) with λ = 1/32 (from which P16,1 = λ, P16,2 = 2λ, etc.). Based on this, P(seed #1 wins) = .27477, P(seed #2 wins) = .20834, and P(seed #3 wins) = .15429. Does this model appear to provide a good fit to the data?
b. A more sophisticated model has Pij = .5 + .2813625(zi − zj), where the z's are measures of relative strengths related to standard normal percentiles (percentiles for successive highly seeded teams are closer together than is the case for teams seeded lower, and .2813625 ensures that the range of probabilities is the same as for the model in part (a)). The resulting probabilities of seeds 1, 2, or 3 winning their regional tournaments are .45883, .18813, and .11032, respectively. Assess the fit of this model.

90. One way to reduce the equipment problems that occur during die casting is to apply a thin coating to the core pins. The paper "Tool Treatment Extends
Core and Pin Life" (Die Casting Engineer, March/April 1999: 88) reported on an experiment in which one group of core pins was coated using the traditional nitride process and a second group was coated using a new thermal diffusion process. Use the accompanying data to decide at significance level .01 whether there is strong evidence for concluding that true average lifetime for the thermal treatment is more than four times that of the nitride treatment. Hint: Consider the parameter θ = 4μ1 − μ2 with corresponding estimator θ̂ = 4x̄1 − x̄2. This estimator is unbiased and normally distributed provided that the two population distributions are normal, and its variance can be determined from the fact that for any two independent random variables y1 and y2 and numerical constants a1 and a2, V(a1y1 + a2y2) = a1²V(y1) + a2²V(y2).

Nitride:  9000    20,000  10,000  20,000   21,000   3000    4000
Thermal:  49,000  23,000  20,000  100,000  114,000  35,000  30,000
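The hint's estimator can be assembled directly from the two samples. The sketch below computes θ̂ = 4x̄1 − x̄2 for the lifetime data and, via the variance rule quoted above, its estimated standard error √(16s1²/n1 + s2²/n2); what reference distribution the resulting ratio should be compared with is left to the exercise:

```python
from math import sqrt
from statistics import mean, variance

nitride = [9000, 20000, 10000, 20000, 21000, 3000, 4000]
thermal = [49000, 23000, 20000, 100000, 114000, 35000, 30000]

n1, n2 = len(nitride), len(thermal)
theta_hat = 4 * mean(nitride) - mean(thermal)   # estimates 4*mu1 - mu2

# V(4*xbar1 - xbar2) = 16*V(xbar1) + V(xbar2) = 16*s1^2/n1 + s2^2/n2
se = sqrt(16 * variance(nitride) / n1 + variance(thermal) / n2)

ratio = theta_hat / se   # compare with an appropriate reference distribution
print(round(theta_hat, 1), round(se, 1), round(ratio, 3))
```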
Bibliography
Agresti, Alan, An Introduction to Categorical Data Analysis (2nd ed.), Wiley, New York, 2007. An excellent treatment of various aspects of categorical data analysis by one of the most prominent researchers in this area.

Devore, Jay L., Probability and Statistics for Engineering and the Sciences (8th ed.), Brooks/Cole Cengage, Belmont, CA, 2012. A somewhat more comprehensive and slightly advanced treatment of hypothesis testing and other topics than what is presented in this book. See also the books by Devore et al. and by DeGroot et al. listed in the Chapter 7 bibliography.
9
The Analysis of Variance
9.1 Terminology and Concepts
9.2 Single-Factor ANOVA
9.3 Interpreting ANOVA Results
9.4 Randomized Block Experiments
Introduction
As we saw in Chapter 8, there is more than one way to make comparisons between two populations or processes. Choosing the best approach involves using one's technical knowledge of a problem to select an appropriate statistical technique. In some cases, the independent samples test (Section 8.2) may be the best approach. At other times, the paired-samples test (also Section 8.2) may be superior. In Chapter 9, both of these methods are extended to comparisons between more than two population means. The independent samples test generalizes to the single-factor analysis of variance (Section 9.2), whereas the paired-samples test generalizes to the randomized block design (Section 9.4). The procedures in Chapter 9 are the tip of a large statistical iceberg called experimental design, which is discussed further in Chapter 10.

One of the important features of the designs in Chapter 9 is that they combine the sample data from several populations into a test capable of detecting when one or more of the population means differ from the rest. That is, analysis of variance tests are not conducted by simply performing the two-sample tests of Chapter 8 on all the different pairings of several populations. Only after an analysis of variance test signals a possible difference between the population means do we begin to conduct multiple comparisons of the populations, discussed in Section 9.3, to pinpoint the specific populations whose means differ from one another.
9.1 Terminology and Concepts
When an ANOVA problem is expressed in terms of a factor and a response, the goal
of the study is to determine whether the different factor levels have different effects on
the response variable. Think of the factor as the independent variable and the response
as the dependent variable. It often helps to draw a picture, as shown in Figure 9.1, to
visualize the data from an ANOVA study.
Figure 9.1 Data from an ANOVA study: observed responses plotted for samples at factor levels 1, 2, 3, . . .
This is the natural extension of the independent samples situation of Section 8.2. To determine whether the population means differ, the ANOVA approach compares the variation between the four sample means to the inherent variability within each sample (see Figure 9.2). The more the sample means differ, the larger will be the between-samples variation shown at the right in Figure 9.2. The test statistic that compares these two types of variation is the ratio of the between-samples variation to the within-samples variation,

test statistic = between-samples variation / within-samples variation
Figure 9.3 shows how this test statistic behaves when there is no difference between the four means (i.e., when H0: μ1 = μ2 = μ3 = μ4 is true) and when the means do differ (when H0 is false). In essence, large values of the test statistic tend to support
the alternative hypothesis (that some of the means differ from the others), whereas small
values of the statistic support the null hypothesis.
[Figure 9.2: sample means x̄1–x̄4 plotted by sample number (1–4), with the variation between sample means indicated at the right]
F Distributions
When the hypothesis H0: μ1 = μ2 = μ3 = … = μk is true, it can be shown that the
test statistic described previously follows a continuous probability distribution called
an F distribution. F distributions arise in statistical tests that involve ratios of two varia-
tion measures, such as the ratio of between-samples variation to within-samples varia-
tion, as shown in Figure 9.3. The variation measures used in an F ratio are based on
9.1 Exercises
certain sums of squares calculated from the sample data, and each sum of squares has
an associated number of degrees of freedom. The numerator degrees of freedom, de-
noted df1, is the number of degrees of freedom associated with the sum of squares in the
numerator of an F ratio. Similarly, df2 denotes the denominator degrees of freedom.
There is a different F distribution for every different combination of positive in-
tegers df1 and df2. For example, there is an F distribution with 4 numerator degrees
of freedom and 12 denominator degrees of freedom, another with 3 numerator degrees
of freedom and 20 denominator degrees of freedom, and so forth. Because they are
ratios of nonnegative quantities, variables that follow F distributions have only nonnega-
tive values, and their density curves have a shape similar to that shown in Figure 9.4.
ANOVA tests, as Figure 9.3 illustrates, are upper-tailed tests. In other words, only
large values of the F ratio lead to rejecting H0; small values do not reject H0. In terms
of P-values, this means that the P-value associated with a calculated F ratio is the area
under the F distribution to the right of the calculated F ratio. Figure 9.4 shows the P-value
associated with a calculated F ratio of 9.15 based on df1 = 4 and df2 = 6. Tables of critical
values for F distributions can be found in Appendix Table VIII.
[Figure 9.4: F density curve with the P-value shown as the area to the right of the calculated F ratio of 9.15]
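Instead of a table lookup, the upper-tail area can be approximated numerically. The following Python sketch (ours, not the text's; the function names are our own) integrates the standard F density with Simpson's rule, and the 9.15 example reproduces the P-value of about .01 described for Figure 9.4:

```python
import math

# Approximate the upper-tail area (P-value) of an F distribution by
# integrating its density with Simpson's rule. The density formula is
# the standard one; function names are our own choices.
def f_pdf(t, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if t <= 0:
        return 0.0
    log_c = (math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2)
             - math.lgamma(d2 / 2) + (d1 / 2) * math.log(d1 / d2))
    log_f = (log_c + (d1 / 2 - 1) * math.log(t)
             - ((d1 + d2) / 2) * math.log(1 + d1 * t / d2))
    return math.exp(log_f)

def f_upper_tail(f, d1, d2, steps=20000):
    """P(F >= f) via a Simpson's-rule integral of the density on [0, f]."""
    h = f / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1 - (h / 3) * total

# Figure 9.4's example: F = 9.15 with df1 = 4 and df2 = 6
p = f_upper_tail(9.15, 4, 6)   # about .01, matching the table lookup
```

Working on the log scale avoids overflow in the gamma functions for moderate degrees of freedom.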
The F table (Table VIII) contains critical F values associated with tail areas of .10,
.05, .01, and .001. To use the table with an F ratio based on df1 = 4 and df2 = 6, read
across the top of the table to find the column with df1 = 4 and read down the left side of
the table to find the row with df2 = 6. At the intersection of this column and row, there
will be four critical values, corresponding to right-tail probabilities of .10, .05, .01, and
.001. For example, a P-value of .01 is associated with an F ratio of 9.15, a P-value of .05
c. Suppose the ANOVA test does not reveal any significant differences in strength between the three types of beams. If the builder must use one of the three types, which type should be chosen?

2. Suppose you have a fixed budget to allocate to the samples used in a study of the effect of the factor "chemical concentration" on the plating thickness of electroplated plastic parts. Describe in general terms how you would allocate the samples. Specifically, what information would make you want to use fewer levels of chemical concentration and, correspondingly, more plastic parts at each concentration level? Conversely, what scenario would lead you to use a larger number of concentration levels and, therefore, fewer plastic parts per concentration level? Include the two sources of variation in an ANOVA experiment in your answers.

3. In a one-way ANOVA test for comparing the mean strengths (in kilograms) of three different alloys, suppose that the measuring instrument used is out of calibration, causing it to give readings that are consistently 2.5 kilograms higher than the true measured strength. Using the general description of the techniques given in this section, explain what effect you think such data would have on the results of an ANOVA test comparing samples of the three alloys. Do you think an ANOVA test based on accurate measurements of the same samples of alloys will lead to a different conclusion?

4. Repeated measurements in an ANOVA study are supposed to indicate what would happen if another researcher tried to repeat your study. In particular, simply measuring the same sampled item several times, which gives repeated measurements of that item, is not considered to be a valid form of replication. Instead, several different items should each be measured once. What is the danger in using repeated measurements of the same item instead of truly replicating an experimental result? What do you expect the effect on the F statistic to be if repeated measurements of a single item are used at each level of a factor?

5. As a simple method of determining which of k factor levels maximizes the average value of a certain response variable, inexperienced researchers sometimes calculate the k sample means and then simply select the factor level having the largest sample mean. This strategy has been called the "pick the winner" approach in the literature on experimental design. Explain what is wrong with this approach and why it does not take the place of an ANOVA test.

6. Use the table of F distribution critical values (Appendix Table VIII) to find
   a. The F critical value based on df1 = 5 and df2 = 8 that captures an upper-tail area of .05
   b. The F critical value based on df1 = 8 and df2 = 5 that captures an upper-tail area of .05
   c. The F critical value based on df1 = 5 and df2 = 8 that captures an upper-tail area of .01
   d. The F critical value based on df1 = 8 and df2 = 5 that captures an upper-tail area of .01
   e. The 95th percentile of an F distribution with df1 = 3 and df2 = 20
   f. The probability P(F ≤ 6.16) for df1 = 6 and df2 = 4
   g. The probability P(4.74 ≤ F ≤ 7.87) for df1 = 10 and df2 = 5

7. Based on your answers to Exercise 6(a)–(d), what effect does interchanging df1 and df2 have on the critical F value (for a fixed upper-tail area)?

8. An experiment was carried out to compare flow rates for four different types of nozzles.
   a. Samples of five type-A nozzles, six type-B nozzles, seven type-C nozzles, and six type-D nozzles were tested. ANOVA calculations yielded an F value of 3.68 with df1 = 3 and df2 = 20. State and test the relevant hypotheses using α = .01.
   b. Analysis of the data using statistical software yielded a P-value of P = .029. Using α = .01, what conclusion would you draw regarding the test in part (a)?

9. In a test of the hypothesis H0: μ1 = μ2 = μ3 = μ4, samples of size 6 were selected from each of four populations, and an F statistic value of 4.12 was calculated (using the methods in the next section). The appropriate degrees of freedom for the F distribution in this exercise are df1 = 3, df2 = 20. Using α = .05, conduct the test to determine whether you can conclude that there are differences between μ1, μ2, μ3, and μ4.
9.2 Single-Factor ANOVA
ANOVA Notation
Sample sizes: n1, n2, n3, . . . , nk
Sample means: x̄1, x̄2, x̄3, . . . , x̄k
Sample variances: s1², s2², s3², . . . , sk²
Total sample size: n = n1 + n2 + n3 + … + nk
Grand average: x̿ = average of all n responses

The double bar in the notation for the grand average is meant to imply that x̿ is an
average of averages. More accurately, x̿ is a weighted average of the k sample means:

x̿ = (n1/n)x̄1 + (n2/n)x̄2 + (n3/n)x̄3 + … + (nk/n)x̄k
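The grand average's weighted-average property can be checked numerically. A short Python sketch (ours, not the text's), with made-up data:

```python
# A grand average computed directly over all responses equals the weighted
# average (n_i/n) * xbar_i of the sample means. Data below are made up.
samples = [
    [13.1, 15.0, 14.0],        # sample 1
    [16.3, 15.7, 17.2, 14.9],  # sample 2
    [13.7, 13.9],              # sample 3
]
n = sum(len(s) for s in samples)                 # total sample size (9 here)
grand_direct = sum(sum(s) for s in samples) / n  # average of all responses
means = [sum(s) / len(s) for s in samples]       # sample means xbar_i
grand_weighted = sum((len(s) / n) * m for s, m in zip(samples, means))
assert abs(grand_direct - grand_weighted) < 1e-9
```

Note that with unequal sample sizes the grand average is not the simple average of the k sample means; the weights n_i/n matter.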
SSTr and SSE form the basis of the between-samples variation and within-samples
variation described in Section 9.1. Before these quantities are used to conduct the
hypothesis test, however, they must be adjusted to take into account the effects of sample
sizes. This is done later in the section.
SSE can also be written in the form

SSE = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3² + … + (nk − 1)sk²
which more clearly shows how SSE combines or pools the information in the k sample
variances s1², s2², s3², . . . , sk². Together, these two sources of variation comprise the total
sum of squares (denoted SST). That is,
SST = SSTr + SSE

where SST is the sum of squared deviations from the grand mean:

SST = Σ(i = 1 to k) Σ(j = 1 to ni) (xij − x̿)²
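The decomposition SST = SSTr + SSE can be verified numerically. A short Python sketch (ours, not the text's), using made-up samples of unequal sizes:

```python
from statistics import mean

# Verify numerically that SST = SSTr + SSE for made-up samples (k = 3
# groups of unequal sizes).
data = [
    [3.1, 2.8, 3.0],
    [2.5, 2.9],
    [3.4, 3.2, 3.6, 3.3],
]
n = sum(len(s) for s in data)
grand = sum(sum(s) for s in data) / n
sstr = sum(len(s) * (mean(s) - grand) ** 2 for s in data)    # between samples
sse = sum(sum((x - mean(s)) ** 2 for x in s) for s in data)  # within samples
sst = sum((x - grand) ** 2 for s in data for x in s)         # total
assert abs(sst - (sstr + sse)) < 1e-9
```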
Hypothesis Tests
Until now, our ANOVA formulas have been merely arithmetic constructs. To put these
ingredients together to form a statistical procedure, we must be willing to make a few
assumptions about the populations being studied. ANOVA tests, in particular, are based
on the following assumptions:
ANOVA Assumptions
1. All of the k population variances are equal (i.e., σ1² = σ2² = σ3² = … = σk²).
2. Each of the k populations follows a normal distribution.
These assumptions are identical to those for the two-sample equal-variance procedures
in Exercise 54 of Chapter 7 and Exercise 37 of Chapter 8, which, in fact, are just special
cases (when k 5 2) of the more general single-factor ANOVA test we are currently dis-
cussing. If the ratio of the largest sample variance to the smallest one does not exceed 4
by very much, then Assumption 1 is plausible. And for very small sample sizes, this rule
is conservative, so 4 can be replaced by 6. Formal test procedures can be found in the
chapter references. Assumption 2 can be checked by examining normal quantile plots
of each sample or, if sample sizes are quite small, a single quantile plot of the deviations
xij − x̄i calculated separately within each sample.
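The informal equal-variance check can be carried out in a couple of lines. A small Python sketch (ours, not the text's) with hypothetical sample standard deviations:

```python
# Informal check of the equal-variance assumption: ratio of largest to
# smallest sample variance. The standard deviations here are hypothetical.
sds = [1.2, 1.1, 0.8, 0.9, 0.5]
variances = [sd ** 2 for sd in sds]
ratio = max(variances) / min(variances)   # 1.44 / 0.25 = 5.76
# 5.76 exceeds 4 but not 6, so with very small samples (where the rule
# is conservative) Assumption 1 would still be considered plausible.
```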
When sampling from normal populations, each sum of squares (such as SST, SSTr,
and SSE) has its own unique number of degrees of freedom. Furthermore, just as SST
can be decomposed into the sum of SSTr and SSE, the degrees of freedom associated with
these sums of squares also decompose in a similar fashion. In a one-way classification, the
total degrees of freedom (associated with SST) is n − 1, which equals the sum of k − 1
(the degrees of freedom for treatments) plus n − k (the degrees of freedom for error)¹:

n − 1 = (k − 1) + (n − k)

¹The total degrees of freedom is always n − 1, regardless of the ANOVA test we use. However, the other sums
of squares have df values that depend on the particular test. For example, the df for SSE in a one-way ANOVA
is different from the df for SSE in a two-way ANOVA.
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
9.2 Single-Factor ANOVA 421
Our purpose in finding the degrees of freedom is to convert the sums of squares
into mean squares by dividing each sum of squares by its associated df. Thus we define
Mean square for treatments (between-samples) = MSTr = SSTr / (k − 1)

Mean square error (within-samples) = MSE = SSE / (n − k)
MSTr and MSE serve as our measures of the between-samples and within-samples varia-
tion described in Section 9.1. All of this information is usually organized into an ANOVA
table (Figure 9.5). The ANOVA table is arranged in column form to emphasize the fact
that the sums of squares and degrees of freedom sum to SST and n − 1, respectively.
Source of variation            df      SS     MS     F          P-value
Between samples (treatments)   k − 1   SSTr   MSTr   MSTr/MSE
Within samples (error)         n − k   SSE    MSE
Total variation                n − 1   SST

Figure 9.5 ANOVA table for the one-way classification
The entry in the F column of the ANOVA table is the test statistic value

F = MSTr/MSE

which is used to test the hypothesis H0: μ1 = μ2 = μ3 = … = μk. This F distribution has
(k − 1, n − k) degrees of freedom since the numerator in MSTr/MSE has df = k − 1
and the denominator has df = n − k. As we mentioned in Section 9.1, the test procedure
is always right-tailed; that is, the P-value associated with an F statistic is equal to
the area to the right of the statistic under the appropriate F density curve. We reject H0
whenever the P-value of the test statistic F is less than or equal to the desired significance
level α. Software packages usually include the P-value in the table.
Test statistic: F = MSTr/MSE
P-value: the area under the F density with (k − 1, n − k) degrees of freedom
to the right of the calculated F.
Decision: Reject H0 if P-value ≤ α.
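The table entries and test statistic can be assembled in a few lines of code. The following Python sketch (ours, not the text's; the function name and dictionary keys are our own choices) computes df, SS, MS, and F for a one-way classification:

```python
from statistics import mean

# Assemble the one-way ANOVA table entries (df, SS, MS, F) as defined
# above. Function name and dictionary keys are our own choices.
def one_way_anova(samples):
    k = len(samples)
    n = sum(len(s) for s in samples)
    grand = sum(sum(s) for s in samples) / n
    sstr = sum(len(s) * (mean(s) - grand) ** 2 for s in samples)
    sse = sum(sum((x - mean(s)) ** 2 for x in s) for s in samples)
    mstr = sstr / (k - 1)   # mean square for treatments
    mse = sse / (n - k)     # mean square error
    return {"df1": k - 1, "df2": n - k, "SSTr": sstr, "SSE": sse,
            "SST": sstr + sse, "MSTr": mstr, "MSE": mse, "F": mstr / mse}

# Example with three small samples:
res = one_way_anova([[39, 47, 44], [52, 56, 53], [40, 46, 42]])
# res["df1"] == 2, res["df2"] == 6, and res["F"] is about 11.4
```

The returned value can then be compared with an F critical value for (df1, df2) degrees of freedom, or converted to a P-value with software.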
Example 9.1 Numerous factors contribute to the smooth running of an electric motor (“Increas-
ing Market Share Through Improved Product and Process Design: An Experimental
Approach,” Quality Engineering, 1991: 361–369). In particular, it is desirable to keep
motor noise and vibration to a minimum. To study the effect that the brand of bearing
has on motor vibration, five different motor bearing brands were examined by install-
ing each type of bearing on different random samples of six motors. The amount of
motor vibration (measured in microns) was recorded when each of the 30 motors
was running. The data for this study is given in Table 9.1. Because each sample of
six motors was selected independently of the other samples, this is a completely ran-
domized design with the factor brand at five levels (brand 1, brand 2, . . . , brand 5).
Determining whether the bearing brands have different effects on the response vari-
able (motor vibration) can be accomplished with a one-way ANOVA test. The null
hypothesis is H0: μ1 = μ2 = μ3 = μ4 = μ5, where μi = average vibration (in microns) for
motors using bearings of brand i. We use a significance level of .05 to conduct this test.
Table 9.1 Vibration (in microns) in five groups of electric motors
with each group using a different brand of bearing
Brand 1 Brand 2 Brand 3 Brand 4 Brand 5
13.1 16.3 13.7 15.7 13.5
15.0 15.7 13.9 13.7 13.4
14.0 17.2 12.4 14.4 13.2
14.4 14.9 13.8 16.0 12.7
14.0 14.4 14.9 13.9 13.4
11.6 17.2 13.3 14.7 12.3
Mean: 13.68 15.95 13.67 14.73 13.08
St. dev.: 1.194 1.167 .816 .940 .479
ANOVA Table
Source    df      SS      MS      F
Factor     4   30.88    7.72   8.45
Error     25   22.83    .913
Total     29   53.71
and

SSE = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3² + … + (n5 − 1)s5²
    = (6 − 1)(1.194)² + (6 − 1)(1.167)² + … + (6 − 1)(.479)²
    = 22.83

Putting these results into the formulas for MSTr and MSE, we find

MSTr = SSTr/(k − 1) = 30.88/(5 − 1) = 7.72

MSE = SSE/(n − k) = 22.83/(30 − 5) = .913

which yields the test statistic value

F = MSTr/MSE = 7.72/.913 = 8.45
Using Appendix Table VIII for the F distribution with (k − 1, n − k) = (5 − 1, 30 − 5) =
(4, 25) degrees of freedom, we find that the P-value associated with the test statistic
F = 8.45 is less than .001. Since this P-value is smaller than the prescribed α of .05,
we can reject the hypothesis that all five means are equal and conclude that the type
of motor bearing used does have a significant effect on motor vibration. In particular,
a visual inspection of the sample means in Table 9.1 suggests that brand 5 is the best
choice for reducing vibration. In Section 9.3, we present a statistical procedure to
sort out which brands are indeed the better ones to use.
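The hand calculation above can be replicated from the raw Table 9.1 data. A short Python sketch (ours, not the text's); the small discrepancies from 30.88, 22.83, and 8.45 come from the rounded summary statistics used in the text:

```python
from statistics import mean, stdev

# Reproduce the Example 9.1 calculations from the raw Table 9.1 data.
brands = [
    [13.1, 15.0, 14.0, 14.4, 14.0, 11.6],  # brand 1
    [16.3, 15.7, 17.2, 14.9, 14.4, 17.2],  # brand 2
    [13.7, 13.9, 12.4, 13.8, 14.9, 13.3],  # brand 3
    [15.7, 13.7, 14.4, 16.0, 13.9, 14.7],  # brand 4
    [13.5, 13.4, 13.2, 12.7, 13.4, 12.3],  # brand 5
]
k = len(brands)
n = sum(len(b) for b in brands)                       # 30 motors
grand = sum(sum(b) for b in brands) / n
sstr = sum(len(b) * (mean(b) - grand) ** 2 for b in brands)
sse = sum((len(b) - 1) * stdev(b) ** 2 for b in brands)
f_stat = (sstr / (k - 1)) / (sse / (n - k))
# sstr and sse come out near 30.86 and 22.84, and f_stat near 8.44;
# the text's 30.88, 22.83, and 8.45 use rounded summary statistics.
```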
9.2 Exercises

has shown that the pulsed current gas tungsten arc welding (PCGTAW) process offers superior SDSS welds compared to other methods. The authors of "Optimization of Experimental Conditions of the Pulsed Current GTAW Parameters for Mechanical Properties of SDSS UNS S32760 Welds Based on the Taguchi Design Method" (J. of the Air and Waste Mgmt. Assoc., 2012: 1978–1988) researched the impact of different PCGTAW process parameters on mechanical properties of the welds of a particular SDSS. One investigation focused on seeing how pulse current (A) of the PCGTAW affects the toughness (J) of the SDSS welds. Here are experimental results for toughness measurements under three pulse current settings:

   Pulse Current: 100 100 100 120 120 120 140 140 140
   Toughness:      39  47  44  52  56  53  40  46  42

   Use α = .05 to conduct the test for whether there are any differences in the true average weld toughness that may be attributable to the different pulse currents.

13. The article "Influence of Contamination and Cleaning on Bond Strength to Modified Zirconia" (Dental Materials, 2009: 1541–1550) reported on an experiment in which 50 zirconium-oxide disks were divided into 5 groups of 10 each. Then a different contamination/cleaning protocol was used for each group. The following summary data on shear bond strength (MPa) appeared in the article:

   Treatment:      1     2     3     4     5
   Sample mean:  10.5  14.8  15.7  10.0  21.6
   Sample sd:     4.5   6.8   6.5   6.7   6.0

   a. State the hypotheses of interest in this experiment.
   b. Using a significance level of .01, can you conclude that there is a difference between the mean shear bond strength of the five groups?

14. In "Investigation on Machining Performance of Inconel 718 Under High Pressure Cooling Conditions" (J. of Mech. Engr., 2012: 683–690), researchers varied selected high-pressure jet-assisted (HPJA) machining parameters for the nickel-based alloy Inconel 718 and investigated their effect on tool wear.
   In one experiment, the researchers machined six specimens of Inconel 718 at each of three different HPJA coolant pressure levels (.6, 10, and 30 MPa) and recorded the corresponding average tool flank wear (ATFW), a combination of abrasive and depth of cut notch wear:

   Pressure
    .6: 145.00 158.14 157.32 409.42 143.00 135.50
    10:  75.00 113.82  76.02 378.65  61.58 183.39
    30:  94.03  65.90 102.31 131.62  53.12 108.41

   Consider conducting an ANOVA test to see if there are any differences in the true mean ATFW caused by the different coolant pressures. The validity of an ANOVA test depends on the extent to which the two fundamental ANOVA assumptions (normal populations; equal population variances) are satisfied.
   a. Create a single normal probability (quantile) plot based on the deviations of the sample data from the sample mean for each of the three samples. Does the assumption of normality appear to hold?
   b. The assumption of equal population variances is plausible if the ratio of the largest sample variance to the smallest sample variance is not much more than 4. Is it plausible that the population variances are approximately equal?

15. It is common practice in many countries to destroy (shred) refrigerators at the end of their useful lives. In this process, material from insulating foam may be released into the atmosphere. The article "Release of Fluorocarbons from Insulation Foam in Home Appliances During Shredding" (J. of the Air and Waste Mgmt. Assoc., 2007: 1452–1460) gave the following data on foam density (g/L) for each of two refrigerators produced by four different manufacturers:

   Manufacturer:   1    1    2    2    3    3    4    4
   Foam Density: 30.4 29.2 27.7 27.1 27.1 24.8 25.5 28.8

   Does it appear that true average foam density is not the same for all these manufacturers? State and test the relevant hypotheses using a significance level of α = .05. Summarize your analysis in an ANOVA table.

16. According to "Evaluating Fracture Behavior of Brittle Polymeric Materials Using an IASCB Specimen" (J. of Engr. Manuf., 2013: 133–140), researchers have recently proposed an improved test for the investigation of fracture toughness of brittle polymeric materials. The authors applied this new fracture test to the brittle polymer polymethylmethacrylate (PMMA), more popularly known as Plexiglas, which is widely used in commercial products.
   The test was performed by applying asymmetric three-point bending loads on PMMA specimens and varied the location of one of the three loading points to determine its effect on fracture load. In one experiment, three loading point locations based on different distances (mm) from the center of the specimen's base were selected, resulting in the following fracture load data (kN):

   Distance   Fracture Load
     42       2.62 2.99 3.39 2.86
     36       3.47 3.85 3.77 3.63
     31.2     4.78 4.41 4.91 5.06

   Here is the corresponding Minitab ANOVA table:

   One-way ANOVA: Fracture versus Distance
   Source  DF      SS      MS      F      P
   Dist.    2  6.7653  3.3826  48.58  0.000
   Error    9  0.6267  0.0696
   Total   11  7.3920

   a. Use your calculator to confirm Minitab's computations.
   b. At a significance level of .01, can you conclude there is a difference among true average fracture loads for the three loading point locations?
   c. Returning to the Minitab output, note that the number reported under P corresponds to the P-value. Is the P-value exactly zero? What does it mean when Minitab reports 0.000?

17. In an experiment to study the possible effects of four different concentrations of a chemical on heights of newly grown plants, suppose that an ANOVA test is conducted and that plant height is measured in inches. At a later date, the experimenter decides that plant heights should have been measured in centimeters instead of inches. After multiplying the data in the original samples by 2.54 (1 in. = 2.54 cm), the experimenter wants to know what effect this data conversion will have on the conclusions drawn from the ANOVA test.
   a. Use the formulas for SSTr, SSE, SST, MSE, and MSTr to discuss the effect that changing from inches to centimeters has on the ANOVA calculations.
   b. Based on your conclusions in part (a), what general statement can you make about the effect of changing units of measure in an ANOVA test?

18. The accompanying summary data on skeletal-muscle citrate synthase activity (nmol/min/mg) appeared in the article "Impact of Lifelong Sedentary Behavior on Mitochondrial Function of Mice Skeletal Muscle" (J. of Gerontology, 2009: 927–939):

                 Young   Old Sedentary   Old Active
   Sample size     10          8             10
   Sample mean   46.68      47.71          58.24

   Suppose that the total sum of squares for the experimental data is SST = 2116.81.
   a. Construct an ANOVA table for this experiment.
   b. Using α = .05, can you conclude that true average activity differs for the three groups?

19. A study was conducted to determine whether certain physical properties of asphalt are related to portions of a gel permeation chromatogram of the asphalt ("Methodology for Defining LMS Portion in Asphalt Chromatogram," J. of Materials in Civil Engr., 1997: 31–39). To determine whether certain bands or slices of the chromatogram can be used to distinguish different aging conditions in asphalt, samples of grade AC-10 asphalt were sampled from several sources and artificially aged, some samples for 5 hours and others for 24 hours. Another group of samples was not aged. The following table shows the percentage area of the same slice of the chromatograms of these samples (i.e., area of the slice as a percentage of the entire chromatogram):

                          Age of asphalt
                      0 hours  5 hours  24 hours
   Mean                 3.43     3.18      3.22
   Standard deviation    .22      .13       .11
   Sample size             6        6         6

   Can you conclude (using α = .05) that there is a difference between the means for the three age categories?
9.3 Interpreting ANOVA Results
Using a significance level of 5%, can it be concluded that there is a difference among the true mean temperature measurements for the three structure thicknesses?

23. In Exercise 3, a measuring instrument that was out of calibration was used to measure strengths (in kg) of three different alloys. Use the formulas in Section 9.2 to give a more specific answer to the question posed in Exercise 3. That is,
   a. Using the formulas for SSTr, SSE, SST, MSE, MSTr, and F, describe the exact effect the calibration problem in Exercise 3 will have on the entries in the ANOVA table.
   b. Based on your conclusions in part (a), what general statement can you make about the effect of calibration problems in measuring the response variable of a single-factor ANOVA test?

24. Check the validity of the two fundamental ANOVA assumptions for the data in Exercise 21 by following the steps stated in Exercise 14.
Example 9.2 In Example 9.1, we compared five different brands of motor bearings to find out
which brands, if any, are better for reducing motor vibration. Because the ANOVA
test in that example shows that the factor “brand” is statistically significant, it is per-
missible to create the effects plot for the five sample means given in Table 9.1. This
plot (Figure 9.6) clearly shows that the sample from brand 5 gives the lowest average
vibration. However, this fact still does not allow us to conclusively say that brand 5 is
the best. It might prove to be the case, for instance, that brands 1 and 3 are about the
same as brand 5 in their effectiveness for reducing vibration, even though the effects
plot shows that their sample means are slightly higher than the mean for brand 5.
Example 9.3 further clarifies the results of this study.
[Figure 9.6: effects plot of mean vibration (microns, scale roughly 13–16) for brands 1–5]
inventing the fast Fourier transform (FFT) method and for introducing the term bit as
a shortened version of binary digit.
Tukey's procedure allows us to conduct separate tests to decide whether μi = μj for
each pair of means in an ANOVA study of k population means. The method is based on
the selection of a "family" significance level, α, that applies to the entire collection of
pairwise hypothesis tests. For example, when using the Tukey procedure with a significance
level of, say, 5%, we are assured that there is at most a 5% chance of obtaining a
false positive among the entire set of pairwise tests. That is, there is at most a 5% chance
of mistakenly concluding that two population means differ when, in fact, they are equal.
This is very different from simply conducting all the pairwise tests as individual tests,
each at α = .05, which can result in a high probability of finding false positives among
the pairwise tests.
Consider first the case of equal sample sizes. Tukey's procedure is based on
comparing the distance between any two sample means, |x̄i − x̄j|, to a threshold value
T that depends on α as well as on the MSE from the ANOVA test. The formula
for T is

T = q √(MSE / ni)

where ni is the size of the sample drawn from each population. The value of q is found from
a table of right-tail values of a statistic, q, that follows the Studentized range distribution.
A table of the values of q is given in Appendix Table IX. The Studentized range distribution
is a probability distribution that depends on a pair of degrees of freedom (k, m), where

k = number of population means to be compared
m = error degrees of freedom
  = n − k, for single-factor ANOVA

and n = n1 + n2 + n3 + … + nk.
2. Compute T = q·√(MSE/ni).
3. Conclude that μi ≠ μj if |x̄i − x̄j| > T.
4. Use bars to connect each pair of means x̄i and x̄j for which |x̄i − x̄j| does not exceed
T in Step 3. The corresponding means μi and μj of such pairs are not considered to dif-
fer statistically from one another.
One easy method for keeping track of the results of all these pairwise tests is to ar-
range the sample means x1, x2, x3, . . . , xk in increasing order, plot the ordered means
along a horizontal line, and then draw horizontal bars connecting pairs of means that
are no farther than T units apart. These connecting bars are usually drawn in several
rows beneath the corresponding means to keep the diagram uncluttered. The bars show
which population means do not significantly differ from one another. Likewise, means
that are not connected by a bar do differ significantly from one another. Figure 9.7
illustrates how this graphical procedure would be used to summarize the multiple com-
parisons of an ANOVA test using k = 4 populations.
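The steps above can be sketched in code. This is a minimal illustration, not the textbook’s software: the function name `tukey_pairs` is ours, the value of q must still be taken from a Studentized range table (Appendix Table IX), and the means, MSE, and sample size below are the bearing-brand values used in Example 9.3.

```python
# Sketch of Tukey's procedure for equal sample sizes.
# q is supplied by the user from a Studentized range table.
import math
from itertools import combinations

def tukey_pairs(means, mse, n_i, q):
    """Return T and the pairs of labels whose sample means differ significantly."""
    T = q * math.sqrt(mse / n_i)              # critical distance (Step 2)
    differ = []
    for (a, xa), (b, xb) in combinations(sorted(means.items()), 2):
        if abs(xa - xb) > T:                  # conclude mu_a != mu_b (Step 3)
            differ.append((a, b))
    return T, differ

# Example 9.3 values: five bearing brands, MSE = .913, n_i = 6, q ~ 4.15
means = {1: 13.6833, 2: 15.9500, 3: 13.6667, 4: 14.7333, 5: 13.0833}
T, differ = tukey_pairs(means, mse=0.913, n_i=6, q=4.15)
```

The pairs not returned by `tukey_pairs` are exactly the ones that would be joined by bars in the graphical summary.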
[Figure 9.7: the ordered sample means x̄3, x̄2, x̄4, x̄1 plotted along a horizontal line, with bars beneath connecting pairs of means that do not differ significantly]
Example 9.3 Because the ANOVA test in Example 9.1 is significant, it is necessary to conduct a
multiple comparisons procedure to delineate exactly which of the five bearing brands
are better than the others. Using Tukey’s procedure with a family significance level of
α = .05, we calculate the critical distance between sample means to be

T = q·√(MSE/ni) = (4.15)·√(.913/6) = 1.62

where q is based on (k, n − k) = (5, 25) degrees of freedom and is approximated by inter-
polating between the values of q.05(5, 24) and q.05(5, 30) found in Appendix Table IX. The
pairwise distances between the five sample means in Table 9.1 are then compared to T:
The information from these ten tests is summarized in Figure 9.8 by arranging
the five sample means in ascending order and then drawing rows of bars connecting
the pairs whose distances do not exceed T 5 1.62. Starting at the left, the top row
connects the means that do not significantly differ from x̄5; the next row shows the
means that do not differ from x̄4; etc. Using this diagram along with the effects plot
(Figure 9.6), we can now summarize what is happening in the ANOVA test. Although
brand 5 has the lowest mean, it does not significantly differ from brands 1 and 3 in its
effect on vibration. We can conclude, however, that brand 5 is definitely better than
brands 2 and 4. Thus the choice of which bearing brand is best has been narrowed
to brands 1, 3, and 5. If we are satisfied that the average vibration levels produced
by these three brands are acceptable for use in the motors, then the choice could be
further narrowed by considering additional factors, such as unit cost and reliability.
Figure 9.9 shows the SAS output from the application of Tukey’s procedure.
   Tukey Grouping      Mean    N   brand
        A            15.9500   6       2
        A
   B    A            14.7333   6       4
   B
   B    C            13.6833   6       1
   B    C
   B    C            13.6667   6       3
        C
        C            13.0833   6       5
Experimental designs that use the same sample size for each treatment level are
called balanced designs, whereas those with different sample sizes for some treatment
levels are called unbalanced designs. For an unbalanced design, the Tukey procedure
is often run by choosing the minimum of the numbers n1, n2, n3, . . . , nk to use in the
calculation of the critical value T. This leads to a slightly larger value of T than neces-
sary for multiple comparisons; consequently, this practice is considered a conservative
procedure. That is, differences between sample means that exceed T would surely re-
main significant if larger values of ni were to be used in the calculation of T. Other mod-
ifications of Tukey’s procedure include using Tij = q·√((MSE/2)(1/ni + 1/nj)) in place
of q·√(MSE/ni) when comparing two sample means based on unequal sample sizes.
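The two unbalanced-design options just described can be sketched as follows. This is an illustrative fragment, not the book’s code; the function names are ours, and q is again assumed to come from the Studentized range table.

```python
# Sketch: two Tukey thresholds for an unbalanced single-factor design.
import math

def tukey_T_conservative(q, mse, sizes):
    """Conservative choice: use the smallest sample size for every pair."""
    return q * math.sqrt(mse / min(sizes))

def tukey_T_pairwise(q, mse, n_i, n_j):
    """Pair-specific threshold T_ij = q * sqrt((MSE/2)(1/n_i + 1/n_j))."""
    return q * math.sqrt((mse / 2.0) * (1.0 / n_i + 1.0 / n_j))
```

When all sample sizes are equal, both formulas reduce to the balanced-design threshold q·√(MSE/ni).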
One question that sometimes arises when first encountering multiple comparisons pro-
cedures is: Why not simply conduct such procedures at the outset and bypass the step of
conducting the ANOVA test? One answer is that most multiple comparison procedures tend
to be less powerful than the ANOVA test for detecting differences between means. The
main reason for this is that, faced with a large number of pairwise hypothesis tests, multiple
comparisons procedures attempt to avoid the problem of making too many type I errors
(i.e., falsely detecting differences between means that are, in fact, equal) by using family
significance levels. These family significance levels essentially put more demands on the
individual pairwise tests than we might normally do if we were comparing only one pair of
means, not several. By controlling the overall, or family, error rate of all the tests, each of
the individual pairs tested must pass a higher standard (i.e., the significance levels for each
individual test are much smaller than the family error rate). The end result is that a multiple
comparisons procedure can sometimes miss significant findings that the ANOVA test would
detect. For these and other reasons, it is usually recommended that multiple comparisons
procedures be run after determining that the appropriate ANOVA test is significant.
Example 9.4 To illustrate Dunnett’s method, we reconsider the data of Example 9.1. Suppose that
bearings of brand 2 are currently used to manufacture the electric motors and that
we want to compare each of four competing brands to brand 2. To conduct such a
test of k = 5 means, we would use Dunnett’s method and compare the k − 1 = 4
treatment samples (brands 1, 3, 4, and 5) to the control sample (brand 2). Because
the sample sizes are equal, the same T value would be used for all four comparisons.
Using a family significance level of α = .05, we find

T = t(k − 1, n − k)·√(MSE(1/ni + 1/nc)) = (2.61)·√((.913)(1/6 + 1/6)) = 1.440
where the value of t.05(4, 25) is found in Appendix Table X to be approximately 2.61.
The four comparisons to the control sample yield the following results:
[Table: the four comparisons to the control sample, listing for each pair the Samples compared, the Distance between sample means, T, and the Conclusion]
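These four comparisons can be reproduced with a short sketch. This is not the book’s code: the function name is ours, the t value (2.61) is the Appendix Table X value cited in Example 9.4, and the brand means are those shown in the Figure 9.9 output.

```python
# Sketch of Dunnett's comparisons of each treatment to a control.
import math

def dunnett_compare(treatment_means, control_mean, mse, n_i, n_c, t):
    """Return, for each treatment label, whether it differs from the control."""
    T = t * math.sqrt(mse * (1.0 / n_i + 1.0 / n_c))   # critical distance
    return {k: abs(m - control_mean) > T for k, m in treatment_means.items()}

# Example 9.4: brand 2 (mean 15.95) is the control; t.05(4, 25) ~ 2.61
treatments = {1: 13.6833, 3: 13.6667, 4: 14.7333, 5: 13.0833}
result = dunnett_compare(treatments, control_mean=15.95,
                         mse=0.913, n_i=6, n_c=6, t=2.61)
```

With T = 1.440, brands 1, 3, and 5 differ from the control brand 2, while brand 4 does not.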
sample results is due to the variability between the various brands in the population
(from which the five brands in the study were selected) and how much is due to the
experimental error. These two components of variance sum to the total variation, σ²:

σ² = σ²A + σ²ε

where σ²A denotes the variability in the population from which the treatment levels
are chosen and σ²ε is the experimental, or within-samples, error. In the random effects
model, the hypotheses we test are H0: σ²A = 0 versus Ha: σ²A > 0. For the case of equal
sample sizes, estimates of σ²A and σ²ε are given by the formulas

σ̂²ε = MSE

σ̂²A = (MSTr − MSE)/ni
Example 9.5 The study of nondestructive forces and stresses in materials furnishes important
information for efficient engineering design. The paper “Zero-Force Travel-Time
Parameters for Ultrasonic Head-Waves in Railroad Rail” (Materials Evaluation,
1985: 854–858) reports on a study of travel time for a certain type of wave that results
from longitudinal stress of rails used for railroad track. Three measurements were
made on each of six rails randomly selected from a population of rails. The investiga-
tors used random effects ANOVA to decide whether some of the variation in travel
time could be attributed to “between-rail variability.” The data for this experiment
and the corresponding ANOVA table appear in Table 9.2.
The error variance is estimated by σ̂²ε = MSE = 16.17, and the estimated variation
in the population of rails is σ̂²A = (MSTr − MSE)/ni = (1862.1 − 16.17)/3 =
615.31. Furthermore, since the F ratio of 115.2 is highly significant (i.e., it has a low
P-value), we can conclude that the differences between rails are an important source
of travel-time variability.
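The variance-component estimates in Example 9.5 follow directly from the two formulas above; a minimal sketch (function name ours, inputs taken from the rail-data ANOVA table):

```python
# Sketch: variance-component estimates for the random effects model.
def variance_components(mstr, mse, n_i):
    """Equal sample sizes: sigma^2_eps estimated by MSE,
    sigma^2_A estimated by (MSTr - MSE)/n_i."""
    var_error = mse
    var_treat = (mstr - mse) / n_i
    return var_treat, var_error

# Example 9.5: MSTr = 1862.1, MSE = 16.17, three measurements per rail
var_A, var_eps = variance_components(mstr=1862.1, mse=16.17, n_i=3)
```
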
9.4 Randomized Block Experiments 435
mpg for the other samples, even if brand 1 were the worst of the four in terms of average
fuel efficiency. In fact, it is easy to imagine many scenarios in which the sample means
might reflect more about the particular sizes of the automobiles chosen than about the
efficiency of the gasoline brands. To avoid such problems, we should use an experimental
design that ensures that each brand of fuel is tested on the same range of car sizes.
External influences, such as car size, can be thought of as additional factors to be
included in an experimental design. There is usually no need to test such external factors
for statistical significance. Either common sense or technical knowledge tells us that they
are influential, and our reason for considering them is to make sure that they do not in-
validate conclusions about the factor in which we are truly interested (e.g., brand of fuel).
The effect of such external influences can be eliminated, or at least substantially
reduced, by using them as blocks in an experiment. Blocks are groups of items in a
population that have similar characteristics, such as the block of compact cars and the
block of mid-size cars. By making sure that a member of each block is included in each
of the samples, we can eliminate the effect of external factors on the differences between
average responses for the factor we are studying.
For example, to eliminate the influence of car size in the fuel efficiency study, we
could select a range of car sizes, call them B1, B2, B3, . . . , Bb, and then make sure that
each gasoline brand is used on a car from each of these blocks. Denoting the levels of
the factor “gasoline brand” by A1, A2, A3, . . . , Aa, we can summarize the data from such
an experiment in matrix form (Figure 9.10).
The design in Figure 9.10, called a randomized block design, is the natural exten-
sion of the paired-samples test of Section 8.2. In Figure 9.10, blocks of homogeneous
experimental units take the place of the data pairs of Section 8.2. Notice, for instance,
that the observations in any two rows of this matrix are paired because each level of the
blocking factor is represented in each row. Just as in the paired-samples test, the effect
of the different blocks is subtracted out when calculating the difference between any
two row means (i.e., the differences in the average responses for the levels of factor A).
[Figure 9.10: an a × b data matrix for a randomized block design, with treatment levels A1, A2, …, Aa as rows and blocks B1, B2, …, Bb as columns; each cell has one response value.]
Sums of Squares
In a randomized block design, the total variation SST in the response variable decom-
poses into three terms, one representing the variation due to the differences in treatment
levels (SSTr), one representing the variation between the block means (SSB), and the error
term (SSE, which accounts for all other variation):
SST = SSTr + SSB + SSE
SST, SSTr, and SSB are computed from the formulas shown in the following box:
SST = Σi=1..a Σj=1..b (xij − x̄)²

SSTr = b Σi=1..a (Āi − x̄)², where Āi denotes the mean of the data in the ith row

SSB = a Σj=1..b (B̄j − x̄)², where B̄j denotes the mean of the data in the jth column
Hypothesis Tests
Under the usual ANOVA assumptions of normal populations and equal variances, the
total degrees of freedom associated with SST is n − 1, where n = a·b. The degrees
of freedom for treatments and the blocking factor B are a − 1 and b − 1, respectively.
The remaining degrees of freedom, (a − 1)(b − 1), are associated with the error term:
ANOVA decomposition: SST = SSTr + SSB + SSE
Degrees of freedom: ab − 1 = (a − 1) + (b − 1) + (a − 1)(b − 1)
The mean squares are given by
MSTr (treatments) = SSTr/(a − 1)

MSB (blocks) = SSB/(b − 1)

MSE (error) = SSE/[(a − 1)(b − 1)]
The hypothesis tests for a randomized block design are summarized in the fol-
lowing box:
Example 9.6 The application of statistics to crop studies, which began in the 1920s, frequently
makes use of a particular blocking variable, the plot. As farmers have long known, dif-
ferent plots of land have unique combinations of water, sunlight, and soil chemicals,
each having a significant effect on crop growth and yield. Oranges, for example, are
so sensitive to different amounts of sunlight that it is a well-known fact that the sweet-
est oranges come from the south side of the tree.2
In a study of different rootstocks for orange trees, four different varieties of
rootstock are tested by planting each variety on the same ten plots of land.3 The
numbers of oranges produced by these trees are recorded in Figure 9.11. In this
study, the factor A = “variety” has a = 4 levels. The blocking factor B = “plot” has
b = 10 levels.
                                    Block (plot)
                      1      2      3      4      5      6      7      8      9     10   Average
Treatment    1       11     12     10     10     10      9     10     10     10     12     10.4
(variety)    2       12     12     10     10     10      9     10     10     10     12     10.5
             3       14     15     12     13     12     12     13     13     14     16     13.4
             4       12     13     10     10     11      9     11     12     11     14     11.3
Average:          12.25   13.0   10.5  10.75  10.75   9.75   11.0  11.25  11.25   13.5

Figure 9.11 Number of oranges per tree (in 100s) for Example 9.6
The grand average of the 40 values is x̄ = 11.4, and the ANOVA calculations for
this data are as follows:
SSTr = b Σi=1..a (Āi − x̄)²
     = 10[(10.4 − 11.4)² + (10.5 − 11.4)² + (13.4 − 11.4)² + (11.3 − 11.4)²] = 58.2

SSB = a Σj=1..b (B̄j − x̄)²
² McPhee, J., Oranges, Farrar, Straus, Giroux, New York, 1967, p. 8.
³ Oranges, like roses, are grown by grafting plants with desirable characteristics onto the root structure
of another plant whose root system is known to be resistant to disease and other problems.
SST = Σi=1..a Σj=1..b (xij − x̄)²
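The full sum-of-squares decomposition for the orange data can be checked with a short sketch. This is our illustration, not the book’s software; the data are those of Figure 9.11, and SSE is obtained by subtraction, SSE = SST − SSTr − SSB.

```python
# Sketch: sums of squares for the randomized block design of Example 9.6
# (rows = varieties, columns = plots; data from Figure 9.11).
data = [
    [11, 12, 10, 10, 10, 9, 10, 10, 10, 12],
    [12, 12, 10, 10, 10, 9, 10, 10, 10, 12],
    [14, 15, 12, 13, 12, 12, 13, 13, 14, 16],
    [12, 13, 10, 10, 11, 9, 11, 12, 11, 14],
]
a, b = len(data), len(data[0])
grand = sum(map(sum, data)) / (a * b)                    # grand average x-bar
row_means = [sum(r) / b for r in data]                   # variety means A_i
col_means = [sum(data[i][j] for i in range(a)) / a
             for j in range(b)]                          # plot means B_j

sstr = b * sum((m - grand) ** 2 for m in row_means)      # treatments: 58.2
ssb = a * sum((m - grand) ** 2 for m in col_means)       # blocks
sst = sum((x - grand) ** 2 for r in data for x in r)     # total
sse = sst - sstr - ssb                                   # error, by subtraction
```

The value sstr = 58.2 matches the hand calculation above; the remaining sums of squares follow the same formulas.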
36. A certain county uses three assessors to determine the values of residential properties. To see whether the three assessors differ in their assessments, five houses are selected and each assessor is asked to determine the market value of each house. Let A denote the factor “assessors” and B denote the blocking factor “houses.” An ANOVA calculation revealed that SSA = 11.7, SSB = 113.5, and SSE = 25.6.
a. Using α = .05, test the hypothesis that there are no differences between the average values reported by the three assessors.
b. Based on the ANOVA results, was the use of houses as a blocking factor warranted in this study?

37. The article “A Software-Based Resource Selection Process in Competitive Network Environment Using ANOVA (A Case Study)” (Intl. J. of Comp. Appl., 2012: 17–21) reported on a study in which three types of lathes were compared. Each of three operators used each of the lathes for the equivalent of a full workday shift. For each shift, the researchers recorded the percentage of acceptable products manufactured by the operator. The data from the experiment is given here:

                 Lathe Brand
                  1    2    3
   Operator 1    86   86   88
            2    85   86   91
            3    82   83   85

a. Using the three operators as blocks, can you conclude that there is a difference among the percent of acceptable products due to lathes? (Use α = .05.)
b. Can you conclude that the different operators have differing effects on product acceptability rate? (Use α = .05.)

38. In the article “The Effects of a Pneumatic Stool and a One-Legged Stool on Lower Limb Joint Load and Muscular Activity During Sitting and Rising” (Ergonomics, 1993: 519–535), the following data is given on the effort (measured on the Borg scale) required by a subject to arise from sitting on four different stools. Because it was suspected that different people could exhibit large differences in effort, even from the same type of stool, a sample of nine people was selected and each was tested on all four stools:

                           Subject
                   1   2   3   4   5   6   7   8   9
   Type of    A   12  10   7   7   8   9   8   7   9
   stool      B   15  14  14  11  11  11  12  11  13
              C   12  13  13  10   8  11  12   8  10
              D   10  12   9   9   7  10  11   7   8

a. Using a significance level of α = .05, can you conclude that there is a difference in the average effort required to rise from each type of stool?
b. Do the differences in rising effort that the researchers expected seem to be confirmed by the data?

39. To assess the potential risks associated with failure of a particular process, investigators often perform a failure modes and effects analysis (FMEA). An FMEA identifies opportunities for failure, known as failure modes, in a given process. Each mode is assessed with a numeric score based on (1) severity of the consequences of failure, (2) likelihood of failure occurrence, and (3) likelihood that failure would not be detected. The product of these scores is the risk priority number (RPN) for the mode. Modes having the highest RPN values are usually given the highest priority in carrying out further analyses.
The article “Continuous Quality Improvement in Investment Castings: An Experimental Study using a Modified FMEA Approach Called FEAROM” (Eur. J. of Sci. Res., 2012: 308–325) reported on a study that compared four design methods (M1, M2, M3, M4) in preproduction trials of the upper range for a particular casting valve. The design methods are applied by human operators, which introduces potential operator-to-operator variation in RPN values. To account for this, each of the four design methods was used (in random order) by all 21 individuals in the study. The data was analyzed by the R software, giving the following output. Note that the format of the ANOVA table in R is very similar to the one we use, except R eliminates the row of “totals” and
uses the word residuals instead of error. The column labeled ‘Pr(>F)’ represents P-value.

               Df   Sum Sq   Mean Sq   F-value   Pr(>F)
   DESIGN       ?   519515         ?         ?        ?
   PERSON       ?        ?      5023         ?    0.445
   Residuals    ?   293009         ?

a. Fill in the missing values in the table above.
b. Using α = .05, can it be concluded that there is a difference in the true average RPN among the four design methods?
c. Do the person-to-person differences in RPN seem to be confirmed by the data? Explain.

40. In the study described in Exercise 20, the wood grade is known to affect wood strength. To incorporate this information, three wood grades were studied: SS (select structural), grade 2, and grade 3. Wood grades are determined by visual inspection. The following table shows bending strengths from testing wood samples of each type and grade:

                                 Wood grade
                              SS   Grade 2   Grade 3
   Species  Douglas Fir       65        43        41
            Hem-Fir           45        38        32
            Spruce-Pine-Fir   42        35        30

a. Using the three wood grades as blocks, can you conclude that there is a difference between the mean bending strengths of the three species of wood? (Use α = .05.)
b. Can you conclude that there are differences between the mean bending strengths for the three grades of wood? (Use α = .05.)
c. Suppose that wood with a large bending strength is needed for a particular structure and that any wood grade is acceptable. Which type and grade of wood is best for such a structure?
d. Explain why your conclusions about wood types in this experiment differ from the conclusions reached in Exercise 20.

41. Example 4.15 (Chapter 4) describes a randomized block experiment for comparing three different methods (A, B, and C) of curing concrete. Different batches of concrete are used as the blocks in the experiment. For convenience, the data from Table 4.1 is repeated here:

               Strength (in MPa)
   Batch   Method A   Method B   Method C
     1         30.7       33.7       30.5
     2         29.1       30.6       32.6
     3         30.0       32.2       30.5
     4         31.9       34.6       33.5
     5         30.5       33.0       32.4
     6         26.9       29.3       27.8
     7         28.2       28.4       30.7
     8         32.4       32.4       33.6
     9         26.6       29.5       29.2
    10         28.6       29.4       33.2

a. Using a significance level of 5%, can you conclude that there is a difference in mean concrete strength between the three curing methods?
b. Can you conclude that there are differences between the batch means? (Use α = .05.)
c. Suppose that you ignore the fact that the batches are blocks in this experiment and that you simply run a one-factor ANOVA test, treating the three columns of data as three random samples. Using a significance level of .05, what conclusion do you reach regarding the differences between the three curing methods?
Supplementary Exercises
42. The authors of “Statistical Analysis and Optimization Study on the Machinability of Beryllium-Copper Alloy in Electro Discharge Machining” (J. of Engr. Manuf., 2012: 1847–1861) investigated the machinability of beryllium-copper alloy in an electro discharge machining (EDM) process. The
accompanying data resulted from an EDM process using an oil dielectric medium where researchers applied four different EDM pulse times (μs) and recorded the corresponding material removal rate (MRR, in mm³/s).

                          MRR
   Pulse    20   0.1797   0.3353   0.4073   0.7548
   Time     40   0.2433   0.3830   0.5625   0.7258
            60   0.2338   0.3372   0.5552   0.7453
            80   0.1341   0.2806   0.5502   0.8212

Use α = .05 to conduct the test for whether there are any differences in the true average MRR that may be attributable to the different pulse times.

43. The lumen output was determined for three different brands of 60-watt soft-white light bulbs, with eight bulbs of each brand tested. From the resulting lumen measurements, the following sums of squares were computed: SSE = 4773.3 and SSTr = 591.2.
a. State the hypotheses of interest. Describe, in words, the parameters that appear in the hypotheses.
b. Compute each of the entries in the ANOVA table for this experiment.
c. Using α = .05, can you conclude that there are any differences between the average lumen outputs for the three brands?

44. In the study described in Exercise 12, the authors also investigated how pulse current affects the hardness of the SDSS welds. Hardness is measured in HV (known as the Vickers number; higher values indicate harder metals).

   Pulse Current:   100  100  100  120  120  120  140  140  140
   Hardness:        326  296  312  245  273  276  299  296  282

Use α = .05 to conduct the test for whether there are any differences in the true average weld hardness attributable to the different pulse currents.

45. In the special case where df1 = 1, the right-tail areas associated with an F distribution are related to similar areas under a t distribution’s density curve. In particular, it can be shown that Fα = (tα/2)², for an F distribution with df1 = 1 and any value of df2 and for a t distribution with df = df2. The subscripts α and α/2 on Fα and tα/2 denote right-tail areas of α and α/2 under the density curves for the F and t distributions, respectively.
a. Verify this relationship by looking up F.05(df1 = 1, df2 = 10) and t.025(df = 10) in the F and t tables, Appendix Tables VIII and IV, respectively.
b. For α = .05, the values of tα/2 approach zα/2 = 1.96 as the degrees of freedom increase. What limit does F.05(df1 = 1, df2) approach as df2 increases?

46. Consider the following data on plant growth after the application of five different types of growth hormone:

         Data
   A   13  17   7  14
   B   21  13  20  17
   C   18  15  20  17
   D    7  11  18  10
   E    6  11  15   8

a. Perform the F test for this single-factor ANOVA at α = .05.
b. Apply Tukey’s procedure to this data with α = .05. Compare your results to the conclusion obtained in part (a).

47. Consider a single-factor ANOVA in which samples of size 5 each are measured at each of three levels of a certain factor. The means of the three samples are 10, 12, and 20. Find a value of SSE that satisfies the following two requirements:
(1) The calculated F statistic is larger than the tabled value of F for α = .05, df1 = 2, and df2 = 12, so the hypothesis H0: μ1 = μ2 = μ3 is rejected at α = .05.
(2) When Tukey’s procedure is applied, none of the three μi’s can be said to differ from one another (again using α = .05).

48. For the data referenced in Exercise 39, the article reported that there was a difference in RPN means for the four design methods (M1, M2, M3, M4). Perform a post hoc analysis by applying Tukey’s procedure
(as the authors did) using the following output from the SAS software:

    Alpha = 0.05   df = 60   MSE = 4883.488
    Critical Value of Studentized Range = 3.73709
    Minimum Significant Difference = 56.989

    Means with the same letter are not significantly different.

    Tukey Grouping     Mean     N    trt
          A           336.00    21   M2
          A
          A           301.00    21   M4
          B           171.43    21   M3
          B
          B           155.71    21   M1

49. In Exercise 47, suppose that the three sample means are 10, 15, and 20. Can you now find a value of SSE that satisfies the two conditions in Exercise 47?

50. Helmet-mounted displays (HMDs) are computer displays that are presented on see-through screens attached to the helmets of helicopter pilots. HMDs are normally employed to aid night flights. In a study of HMDs, researchers tested Apache helicopter pilots to determine whether the presence of in-flight vision problems has an effect on a pilot’s ability to focus the HMD panel. Thirteen pilots were divided into two groups: those who experience certain in-flight vision problems and those who do not. Subjects were asked to set the focus of the HMD for a fixed test pattern, and their focus settings were then measured with a dioptometer (“Oculomotor Responses with Aviator Helmet-Mounted Displays and Their Relation to In-Flight Symptoms,” Human Factors, 1995: 699–710). The data from one such experiment is given here:

    In-Flight symptom tested: Distance misperception (measurements are in diopters)

                                  Symptom present    Symptom absent
    Sample size                          9                  4
    Sample mean                        2.83               2.70
    Sample standard deviation          .172               .184

    a. Using α = .01, conduct an ANOVA test to determine whether there is a difference in the average focus settings between the two groups of pilots.
    b. Which test procedure in Chapter 8 could have been used on this data in place of the ANOVA test in part (a)?
    c. Conduct the appropriate test you identified in part (b), using α = .01, and compare your answer to the answer in part (a).

51. The results on the effectiveness of line drying on the smoothness of fabric were studied in the paper “Line-Dried vs. Machine-Dried Fabrics: Comparison on Appearance, Hand, and Consumer Acceptance” (Home Econ. Research J., 1984: 27–35). Smoothness scores were given for nine types of fabric and five different drying methods. Because the different types of fabric were expected to have large differences in smoothness, regardless of drying method, each of the five drying methods was used on five samples of each fabric type. The smoothness scores for this experiment were as follows:

                        Drying method
    Fabric type      1     2     3     4     5
    Crepe           3.3   2.5   2.8   2.5   1.9
    Double knit     3.6   2.0   3.6   2.4   2.3
    Twill           4.2   3.4   3.8   3.1   3.1
    Twill mix       3.4   2.4   2.9   1.6   1.7
    Terry           3.8   1.3   2.8   2.0   1.6
    Broadcloth      2.2   1.5   2.7   1.5   1.9
    Sheeting        3.5   2.1   2.8   2.1   2.2
    Corduroy        3.6   1.3   2.8   1.7   1.8
    Denim           2.6   1.4   2.4   1.3   1.6

    a. Construct an ANOVA table for this experiment.
    b. Using a significance level of .05, can you conclude that there is a difference between the mean smoothness scores for the five drying methods?

52. A consumer protection organization carried out a study to compare the electricity usage for four
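The Minimum Significant Difference reported in the SAS output above can be reproduced from the other quantities SAS prints, since for equal group sizes Tukey’s MSD is q · sqrt(MSE/n). A sketch (the only inputs are the values quoted in the output):

```python
import math

# Quantities from the SAS Tukey output for the four design methods (M1-M4)
q_crit = 3.73709       # critical value of the Studentized range (alpha = .05, df = 60)
mse = 4883.488         # mean squared error from the ANOVA
n_per_group = 21       # observations per treatment

# Tukey's minimum significant difference: q * sqrt(MSE / n)
msd = q_crit * math.sqrt(mse / n_per_group)
print(round(msd, 3))   # close to the reported 56.989

# Two treatment means differ significantly only if they differ by more than msd
means = {"M1": 155.71, "M2": 336.00, "M3": 171.43, "M4": 301.00}
```

With MSD near 57, M2 and M4 (difference 35.00) are not significantly different, which matches their shared letter A in the grouping.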
444 chapter 9 The Analysis of Variance
Bibliography
Montgomery, D. C., Design and Analysis of Experiments (8th ed.), Wiley, New York, 2012. The first half of the book gives a good introduction to statistical inference and the analysis of variance method. The remaining chapters give an equally readable account of the experimental design techniques described in Chapter 10.

Ott, R. L., and M. Longnecker, Introduction to Statistical Methods and Data Analysis (6th ed.), Cengage Learning, Belmont, CA, 2008. A practitioner’s guide to analysis of variance and experimental design. Emphasizes applications, calculations, and interpretation of results.
10
Experimental Design
10.1 Terminology and Concepts
10.2 Two-Factor Designs
10.3 Multifactor Designs
10.4 2^k Designs
10.5 Fractional Factorial Designs
Introduction
Methods of experimental design are used to evaluate the effects of several dif-
ferent treatments on a response variable. In the field of agronomy, where experi-
mental design techniques were first applied in the 1920s, different fertilizer blends
(the treatments) were applied to a crop in an effort to find the particular blend
that maximized crop yield (the response). The essential statistical ideas underlying
experimental design lie in the commonsense notion that the usefulness of the con-
clusions drawn from an experiment will critically depend on how the experiment
is conducted.
Scientific applications of experimental design methods are often called design
of experiments (abbreviated DOE). Furthermore, the designs discussed in this
chapter are from a special class called factorial designs. The multifactor designs
presented in this chapter are an extension of the single-factor designs discussed
in Chapter 9. Consequently, the terminology in Section 10.1 builds on that already
introduced in Chapter 9. Sections 10.2 and 10.3 show how to conduct factorial
experiments and how to interpret the results from such experiments.
Throughout the chapter, the statistical tool of analysis of variance
(ANOVA) is used to analyze the data from experiments and to make decisions
about whether a given factor has a significant impact on a response variable.
445
446 chapter 10 Experimental Design
In addition, the graphical tools of effects plots and probability plots provide
very simple, yet powerful methods for visually summarizing the results of an ex-
periment and for sorting out factors that are influential from those that are not.
Effects plots, which were introduced in Section 9.3, are discussed in Section 10.2,
and probability (quantile) plots, first introduced in Section 2.4, are used through-
out Sections 10.4 and 10.5.
Sections 10.4 and 10.5 deal with a class of factorial designs called 2^k designs.
These designs have been widely used in industrial and scientific applications. Because
each factor in a 2^k design is restricted to only two levels, the resulting statistical
analyses are simplified, making these designs very intuitive and easy to use.
10.1 Terminology and Concepts 447
[Figure 10.1: Experimental runs in a one-factor-at-a-time experiment, plotted against Machine (1, 2) and Brand (A, B)]
The arrows in the figure indicate the direction in which the factor is varied (e.g., test
runs for brand are first done for brand A and machine 1). Two experimental runs
are made at each fixed combination of factor settings to help increase the preci-
sion of the estimates derived from the data. Employing repeated measurements, or
replication, is an intuitive method often used in experiments to reduce errors intro-
duced by outside factors that can bias experimental results. Along the horizontal axis
in Figure 10.1, the experimenter holds the brand factor fixed (i.e., only brand A is
used) and allows the machine factor to vary. Then, holding the machine factor fixed
(at machine 1), the brand factor is varied as shown on the vertical axis. A total of
six experimental measurements are made using this one-factor-at-a-time method. To
estimate the effect of changing from machine 1 to machine 2, the experimenter can
compare the average of the two response values for machine 1 with the average of the
two responses for machine 2. The difference between these two averages is a measure
of how much the response changes when the machine factor is varied. Similarly,
the difference in the two averages associated with brands A and B can be used to
measure the effect of varying the brand factor.
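The effect estimates just described can be sketched numerically. The hardness values below are hypothetical (the text gives no numbers for this example); each list holds the two replicate measurements from one run:

```python
# One-factor-at-a-time effect estimates via differences of averages.
# All hardness values are invented for illustration.
brandA_machine1 = [44.0, 46.0]   # brand A on machine 1 (two replicates)
brandA_machine2 = [50.0, 52.0]   # brand A on machine 2
brandB_machine1 = [45.0, 45.0]   # brand B on machine 1

def avg(xs):
    return sum(xs) / len(xs)

# Machine effect: average at machine 2 minus average at machine 1 (brand A held fixed)
machine_effect = avg(brandA_machine2) - avg(brandA_machine1)
# Brand effect: average for brand B minus average for brand A (machine 1 held fixed)
brand_effect = avg(brandB_machine1) - avg(brandA_machine1)
print(machine_effect, brand_effect)  # 6.0 0.0
```

With these invented numbers the design would report almost no brand effect, because brand B is never tried on machine 2; this is exactly the blind spot discussed next.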
Figure 10.1 highlights one of three problems with one-factor-at-a-time experi-
ments mentioned above: the inability of this design to capture all the information
about the interplay between factors. Suppose, for illustration, that the plastic of brand
B works about the same as brand A does in machine 1, but that brand B works signifi-
cantly better in machine 2 than in machine 1. If so, such information would not be
seen in the results of the experiment shown in Figure 10.1. Instead, the data from the
one-factor-at-a-time experiment would show that there was very little effect on hard-
ness when changing the brand factor, since brand B is evaluated only on machine 1.
From those results, an experimenter would incorrectly conclude that changing
plastic brands has little effect on the hardness of the molded parts. As you can see
from Figure 10.1, this potential problem is caused by the fact that the one-factor-at-
a-time approach does not include any experimental runs using plastic of brand B on
machine 2.
In contrast, the designs introduced in this chapter are constructed to expressly take
into account the possibility of significant interplay between factors. In statistics, such in-
terplay between factors is called interaction. Two or more factors are said to interact if,
as described in the previous paragraph, the magnitude of a factor’s effect on the response
variable depends on the particular level(s) of the other factor(s) in the experiment.
448 chapter 10 Experimental Design
In our example, the effect of changing plastic brands on plastic hardness was negligible
when machine 1 was used, but the brand effect becomes substantial when machine 2
was used. Thus there is an interaction between brand and machine. Interactions be-
tween factors are discussed in more detail in Section 10.2.
Figure 10.2 shows an experimental design that does allow for detecting such an
interaction, if it exists. This design is an example of the factorial designs discussed
throughout the chapter. One of the important features of such designs is that experi-
mental tests are conducted at many, if not all, combinations of the levels of the factors.
In particular, note that the design in Figure 10.2 includes a test measurement for the
combination of machine 2 with plastic brand B. If there is an interaction between the
two factors, this design will be able to detect it.
[Figure 10.2: A factorial design; runs 1–4 at the four combinations of Machine (1, 2) and Brand (A, B)]
Another significant feature of the design in Figure 10.2 is that only one measure-
ment is made at each of the combinations of factor levels, which means that a total of
four experimental runs are needed. This brings up the question of whether this four-run
experiment is capable of estimating the factor effects with the same precision as the
one-factor-at-a-time experiment, in which each factor effect is estimated as the differ-
ence between two averages, each based on two measurements. To answer this question,
we denote the four test measurements in Figure 10.2 by y1, y2, y3, and y4. First consider
the factor machine. The difference y2 − y1 estimates the change in plastic hardness caused by changing machines when brand A is used on both machines. Similarly, the difference y4 − y3 estimates the effect of changing machines when brand B is used on both. Therefore, by averaging these two estimates, we obtain a more precise estimate of
10.1 Terminology and Concepts 449
does so with fewer experimental runs. Using the same reasoning, we can show that the
factor brand is also measured with the same precision and can be written
brand effect = (1/2)(y3 + y4) − (1/2)(y1 + y2)
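Assuming hypothetical values for y1 through y4 (the text gives no numbers for Figure 10.2, and the run labels follow the figure: y1 = brand A/machine 1, y2 = A/machine 2, y3 = B/machine 1, y4 = B/machine 2), both effect estimates can be sketched from the same four runs:

```python
# Effect estimates for the four-run factorial design of Figure 10.2.
# The hardness values are invented for illustration.
y1, y2, y3, y4 = 45.0, 51.0, 45.0, 57.0

# Brand effect: average of the brand-B runs minus average of the brand-A runs
brand_effect = (y3 + y4) / 2 - (y1 + y2) / 2
# Machine effect: average of the machine-2 runs minus average of the machine-1 runs
machine_effect = (y2 + y4) / 2 - (y1 + y3) / 2
print(brand_effect, machine_effect)  # 3.0 9.0
```

Note that all four observations enter both estimates, which is the “each piece of data works twice” efficiency described below.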
As the preceding paragraph illustrates, factorial experiments are more efficient than
one-at-a-time experiments. In fact, as you will see in later sections, the efficiency of
factorial designs compared to one-factor-at-a-time experiments increases as more and
more factors are included in an experiment. As Figure 10.2 shows, factorial experi-
ments achieve their efficiency by using the data more than once. Note, for example,
that the same four data values in Figure 10.2 are used in both of the effects estimates
described in the previous paragraph. Cuthbert Daniel, one of the pioneers in apply-
ing factorial designs to industrial processes, describes this feature of factorial designs as
“making each piece of data work twice,” an expression originally credited to the statisti-
cian W. J. Youden.1
To demonstrate that one-factor-at-a-time experiments do not generally yield the op-
timum settings for each factor, it is helpful to imagine what would happen if we were for-
tunate enough to know the exact relationship between the factors and the response vari-
able. Suppose, for instance, that such information is available for a particular response
value y and two factors whose measured values are denoted by x1 and x2. Thus we can
find the exact value of y associated with any two values of x1 and x2 and, therefore, create
a graph of y versus x1 and x2. Such a graph is called a response surface. Figure 10.3 is
an idealized example of a response surface, which illustrates how the percentage yield
y of a process might be related to the levels of two factors known to affect process yield.
From this graph, it is easy to find the particular values of x1 and x2 that will maximize the
percentage yield y. In a real experiment, of course, the shape of the response surface is
unknown, and the experimenter’s goal is to come as close as possible to the settings of x1
and x2 that optimize the response variable.
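A minimal sketch of this idea, using an invented yield function (the true surface behind Figure 10.3 is not given in equation form, so the peak location here is an assumption):

```python
import math

# Hypothetical response surface: percent yield peaks at x1 = 30, x2 = 25.
# The functional form is invented for illustration only.
def percent_yield(x1, x2):
    return 90.0 * math.exp(-((x1 - 30.0) ** 2 + (x2 - 25.0) ** 2) / 400.0)

# When the surface is known exactly, the optimum can be located by a
# simple grid search over the factor region.
best = max((percent_yield(a, b), a, b)
           for a in range(0, 51) for b in range(0, 51))
print(best)  # the grid search recovers the peak at x1 = 30, x2 = 25
```

In a real experiment the surface is unknown, so the optimum must instead be approached through a sequence of designed runs.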
[Figure 10.3: A response surface of process yield (in percent) versus the values x1 and x2 of two factors]
1. Daniel, C., Application of Statistics to Industrial Experimentation, John Wiley & Sons, New York, 1976: 3.
450 chapter 10 Experimental Design
[Contour plot of a process-yield response surface; contour labels run from 30 to 90 percent]
Figures 10.5 and 10.6 show two different experimental strategies that could be fol-
lowed in a one-factor-at-a-time experiment. Suppose, for illustration, that a process is
currently running with the two factors set at the values associated with point A in the
figures. Starting with Figure 10.5, suppose that an experimenter begins by varying the
values of x1 (keeping x2 fixed) and tries to maximize the process yield. As Figure 10.5
shows, the best value of x1 occurs near point B in the figure. Next, keeping x1 fixed at
its value from point B, the experimenter then varies x2 until its optimum value is found
near point C. The experimenter would conclude that both factors had been optimized
and that the best process yield possible is about 86%.
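How a one-factor-at-a-time search stalls below the true maximum can be sketched with an invented surface whose two factors interact; the function, grid, and starting point are all assumptions, not taken from the book:

```python
# A hypothetical yield surface with a diagonal ridge, so the two factors
# interact; its true maximum, 90%, is at (x1, x2) = (40, 40).
def yield_pct(x1, x2):
    return 90.0 - 0.05 * (x1 - x2) ** 2 - 0.02 * (x1 + x2 - 80.0) ** 2

def best_along(fixed, vary_first):
    # Optimize one coordinate over a 0..60 grid while the other is held fixed.
    grid = [i * 0.5 for i in range(0, 121)]
    if vary_first:
        return max(grid, key=lambda x1: yield_pct(x1, fixed))
    return max(grid, key=lambda x2: yield_pct(fixed, x2))

# One-factor-at-a-time search starting from point "A" = (20, 10):
x1, x2 = 20.0, 10.0
x1 = best_along(x2, vary_first=True)    # move to point "B"
x2 = best_along(x1, vary_first=False)   # stop at point "C"
print((x1, x2), round(yield_pct(x1, x2), 1))
```

The search halts at a yield well below 90% because the ridge runs diagonally: the best x1 depends on x2, so fixing one factor while varying the other cannot follow the ridge to the true maximum.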
10.1 Terminology and Concepts 451
[Figure 10.5: Contour plot of a one-factor-at-a-time search; the path runs from starting point A to B to C, with the true maximum marked separately]
[Figure 10.6: Contour plot of a second one-factor-at-a-time strategy, with points A, B, C, and D and the true maximum marked]
452 chapter 10 Experimental Design
[Figure 10.7: Using factorial designs to search for optimum factor settings (contour plot with points A, B, and C near the 90% maximum)]
10.2 Two-Factor Designs 453
1. What statistical purpose does replication serve in an experimental design?

2. Factors A and B are thought to have an effect on a certain response value, y. The following table contains data on the response variable measured at each combination of the two levels of factors used in a study:

                          Factor A level
                             1      2
    Factor B level    1     5.2    7.4
                      2     4.0    6.3

    a. Calculate an estimate of the effect of changing factor A from level 1 to level 2.
    b. Calculate an estimate of the effect of changing factor B from level 1 to level 2.

3. Suppose that the response surface for a two-factor experiment can be described by the function f(x, y) = e^(−(1/2)[(x−2)² + (y−5)²]).
    a. Use a computer package to create a graph of the response surface.
    b. From the graph in part (a), determine the approximate coordinates of the point (x, y) at which the response surface is at its maximum.
    c. Find an equation that describes the typical contour of the response surface.
    d. Sketch some of the contours using your answer to part (b). From this sketch, determine the approximate coordinates of the point at which the response surface is at its maximum from these contours.

4. Suppose that the response surface for a two-factor experiment can be described by the function f(x, y) = e^(−(x−y)²).
    a. Use a computer package to create a graph of the response surface.
    b. Find an equation that describes the contours of the response surface.
    c. Sketch some of the contours using the equation(s) in part (b). Using these results, determine from these contours the approximate coordinates of the point(s) at which the response surface is at its maximum.
454 chapter 10 Experimental Design
[Figure 10.8: Matrix layout of a two-way design, with a rows (levels of factor A), b columns (levels of factor B), and r response values in each cell]
The most convenient way to keep track of the information in a two-way design is to display it in matrix form, as shown in Figure 10.8. Each of the a × b combinations has its own cell in which r response values are recorded. With r values in each of the a × b cells, the total number of experimental runs is n = a × b × r.
21 = (10 + 14 + ... + 27)/6        19 = (18 + 14 + ... + 25)/6
In the margins of the matrix, we have included the averages of all the responses in
the rows and columns. For instance, the average response for the first row is 14, which
is the average of all four numbers in that row. Notice that these four numbers each cor-
respond to the first level of factor A, which we will denote by A1 in the graphs that follow.
Also, because we have used a balanced design, each level of B is included an equal
number of times in these four numbers, which is what makes the average response of 14
a good representation of what to expect at level A1. As you can see, each level of B is also
represented in the four numbers used to find the average responses for levels A2 and A3.
10.2 Two-Factor Designs 455
Following our general rule for computing effects averages, we compute the average
responses for B1 and B2 from six numbers, because each column (i.e., each level of B)
in the matrix contains a total of six measurements.
By plotting the average response versus the levels of a factor, we obtain a graph of the
main effect of that factor. In Figure 10.9, for example, the plots of the main effects of factors
A and B in our example show that the average response tends to increase as factor A changes
from level A1 to A2 to A3, whereas changing factor B from B1 to B2 has the effect of decreas-
ing the average response from 21 to 19. Plotting both the A and B main effects on the same
graph allows you to easily compare the magnitudes of the A and B effects. In those cases
where the average response stays relatively constant from level to level (e.g., if the average
response had been 20 at all three levels of A), we say that a factor has no main effect.
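Using the example’s cell data (the responses that appear in the interaction calculations later in this section), the main effect averages can be reproduced directly:

```python
# Cell responses for the 3 x 2 example: keys are (level of A, level of B),
# and each cell holds r = 2 responses.
cells = {
    (1, 1): [10, 14], (1, 2): [18, 14],
    (2, 1): [23, 21], (2, 2): [16, 20],
    (3, 1): [31, 27], (3, 2): [21, 25],
}

def level_avg(factor, level):
    # Average all responses observed at the given level of factor "A" or "B".
    vals = [v for (i, j), ys in cells.items() for v in ys
            if (i if factor == "A" else j) == level]
    return sum(vals) / len(vals)

print([level_avg("A", i) for i in (1, 2, 3)])  # [14.0, 20.0, 26.0]
print([level_avg("B", j) for j in (1, 2)])     # [21.0, 19.0]
```

Each A average pools four observations and each B average pools six, matching the margins of the data matrix.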
[Figure 10.9: Main effects plots; the average response is 14, 20, 26 at levels 1, 2, 3 of factor A and 21, 19 at levels 1, 2 of factor B]
From Figure 10.9, we can see that going from level A1 to level A3 has the net effect of
raising the average response by 12 units (i.e., from 14 at A1 to 26 at A3) and that going from
B1 to B2 lowers the average response by 2 units (from 21 at B1 to 19 at B2). Looking at this
figure, it is tempting to want to treat A and B separately, by simply choosing desirable set-
tings first for A, then for B. If this were always the case, it would make the results of a two-way
experiment exceedingly easy to interpret. Unfortunately, things are not always that simple.
It is possible, as noted in Section 10.1, that two (or more) factors do not act independently of one another. Two factors are said to interact when the effect of changing
the levels of one factor depends on the particular level of the other factor. This is the case
in our example. The following calculations show that the effect of changing factor A
depends on the particular setting of factor B:
    Effect of changing A (B fixed at B1)        Effect of changing A (B fixed at B2)
    A1 and B1: (10 + 14)/2 = 12                 A1 and B2: (18 + 14)/2 = 16
    A2 and B1: (23 + 21)/2 = 22                 A2 and B2: (16 + 20)/2 = 18
    A3 and B1: (31 + 27)/2 = 29                 A3 and B2: (21 + 25)/2 = 23
    Increase from A1 to A3: 29 − 12 = 17        Increase from A1 to A3: 23 − 16 = 7
Notice that the effect of going from A1 to A3 is an increase of 17 units when B is at level
B1, whereas the corresponding increase is only 7 units when B is at level B2. Thus the
effect of changing the levels of A seems to depend on the particular level of B.
456 chapter 10 Experimental Design
Like main effects, such two-factor interaction effects can also be plotted. This can
be done as shown in Figure 10.10 by overlaying separate graphs, one for each level of
factor B. Alternatively, the two values of B could be used on the horizontal axis with
three overlaid graphs (one for each level of A). The presence of interaction between
two factors is indicated by graphs that either cross one another or, more generally, are
not parallel. Parallel graphs, as depicted in Figure 10.11, are a sign of no interaction
between the factors. Why?
[Figure 10.10: Interaction plot of cell means versus level of A; the B1 line runs through 12, 22, 29 and the B2 line through 16, 18, 23. Figure 10.11: Parallel lines, indicating no interaction]
Keep in mind that effects plots do not take the place of statistical tests. You should
always run an ANOVA test first to determine which of the effects are significant and
which are not. It may turn out, for example, that the interaction effect is not statistically
significant, in which case you can interpret the main effects without worrying about
factor interactions. At other times, you may discover that a factor that you initially
thought was important turns out to have no significant effect on the response variable.
When statistical testing shows that an interaction effect is significant, then the re-
sults of the experiment must be interpreted by examining the interaction plots, not the
main effects plots. When interactions exist, the conclusions drawn from the main effects
plots may or may not agree with those drawn from the interaction plots. On the other
hand, if the interaction between factors is not significant, then you can simply examine and interpret the main effects plots. For instance, in our example, neither the main effect for factor B nor the interaction effect is significant at α = .05 (see Exercise 6).
This means that we need only examine the main effects plot for factor A. If the goal of
the study is, say, to maximize the response value, then the main effects plot suggests that
we set factor A at level 3. Because the main effect for factor B is not significant and
because the interaction between A and B is not significant, choosing either level of B
should give substantially the same response value.
ANOVA Formulas
All ANOVA procedures share a common goal: to analyze the total variation (SST) in
a response variable by breaking it into identifiable sources of variation. This is accom-
plished by defining a separate sum of squares for each source of variation and then
10.2 Two-Factor Designs 457
decomposing SST into a sum of these components. Such formulas are called ANOVA
decompositions.
The general ANOVA decomposition for a two-factor analysis of variance is
SST = SSA + SSB + SS(AB) + SSE
where SSA, SSB, and SS(AB) denote the sums of squares associated with factor A, factor
B, and the AB interaction, respectively. SSE, the error or residual sum of squares, rep-
resents the variation from all sources of variation other than A, B, and their interaction.
The formulas for these sums of squares are given in the following box.2 Note that once
SST, SSA, SSB, and SSE are computed, SS(AB) can easily be found by rewriting the
ANOVA decomposition as
SS(AB) = SST − SSA − SSB − SSE
    SSA = br Σi (ȳi·· − ȳ···)²    (sum over i = 1, …, a)
    SSB = ar Σj (ȳ·j· − ȳ···)²    (sum over j = 1, …, b)
    SST = Σi Σj Σk (yijk − ȳ···)²    (i = 1, …, a; j = 1, …, b; k = 1, …, r)
    SSE = Σi Σj Σk (yijk − ȳij·)²    (over the same ranges)

where yijk denotes the kth response at level i of factor A and level j of factor B, ȳij· is the mean of the r responses in cell (i, j), ȳi·· and ȳ·j· are the level means for A and B, and ȳ··· is the grand mean.
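As a sketch, the decomposition can be checked numerically on the chapter’s 3 × 2 example, taking the cell responses from the interaction calculations shown earlier in this section:

```python
# Two-way ANOVA sums of squares computed directly from their definitions,
# for the 3 x 2 example with r = 2 responses per cell.
cells = {
    (1, 1): [10, 14], (1, 2): [18, 14],
    (2, 1): [23, 21], (2, 2): [16, 20],
    (3, 1): [31, 27], (3, 2): [21, 25],
}
a, b, r = 3, 2, 2

all_vals = [v for ys in cells.values() for v in ys]
grand = sum(all_vals) / len(all_vals)
row_mean = {i: sum(v for (i2, _), ys in cells.items() if i2 == i for v in ys) / (b * r)
            for i in range(1, a + 1)}
col_mean = {j: sum(v for (_, j2), ys in cells.items() if j2 == j for v in ys) / (a * r)
            for j in range(1, b + 1)}
cell_mean = {ij: sum(ys) / r for ij, ys in cells.items()}

sst = sum((v - grand) ** 2 for v in all_vals)
ssa = b * r * sum((m - grand) ** 2 for m in row_mean.values())
ssb = a * r * sum((m - grand) ** 2 for m in col_mean.values())
sse = sum((v - cell_mean[ij]) ** 2 for ij, ys in cells.items() for v in ys)
ssab = sst - ssa - ssb - sse      # interaction SS by subtraction
print(sst, ssa, ssb, sse, ssab)
```

The pieces add back to SST, as the ANOVA decomposition requires.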
Hypothesis Tests
We now proceed to find the degrees of freedom and mean squares associated with each
source of variation. The total degrees of freedom is n 2 1, where n 5 abr. The degrees
of freedom associated with a factor is simply its number of levels minus 1, and the
2. In the precomputer era, shortcut formulas were often used instead of the formulas we have given. The interested reader may consult other texts for these formulas.
458 chapter 10 Experimental Design
degrees of freedom for an interaction term is the product of the degrees of freedom of
the corresponding factors. The error degrees of freedom equals ab(r – 1). Decomposi-
tion of degrees of freedom mimics that for sums of squares:
ANOVA decomposition:    SST = SSA + SSB + SS(AB) + SSE
Degrees of freedom:     abr − 1 = (a − 1) + (b − 1) + (a − 1)(b − 1) + ab(r − 1)
By dividing each sum of squares by its degrees of freedom, we form the mean squares:

    MSA = SSA/(a − 1)             MS(AB) = SS(AB)/[(a − 1)(b − 1)]
    MSB = SSB/(b − 1)             MSE = SSE/[ab(r − 1)]
These are used to form the F ratios used in our hypothesis tests. In a two-way ANOVA, we
can conduct separate tests for the presence of each main effect and the interaction effect.
In each such test, the null hypothesis is that the effect does not exist, and the alternative
hypothesis is that the effect is present. To conclude, for example, that the factor A (or B)
effect is present means that the average response differs at different levels of A (or B). The
following box summarizes the test procedures for a two-factor ANOVA. An ANOVA table
(Figure 10.12) provides the most convenient way to summarize these results.
If H0 is rejected, then the interaction plot takes precedence over the main effects plots when interpreting the effects of A and B.
Source of variation    df                 SS        MS        F
Factor A               a − 1              SSA       MSA       MSA/MSE
Factor B               b − 1              SSB       MSB       MSB/MSE
AB interaction         (a − 1)(b − 1)     SS(AB)    MS(AB)    MS(AB)/MSE
Error                  ab(r − 1)          SSE       MSE
Total variation        abr − 1            SST
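A sketch of how the table in Figure 10.12 is filled in for the chapter’s 3 × 2 example; the sums of squares below were computed from the example’s cell data rather than quoted from the text:

```python
# ANOVA table for the 3 x 2 example (a = 3, b = 2, r = 2).
a, b, r = 3, 2, 2
ss = {"A": 288.0, "B": 12.0, "AB": 56.0, "Error": 42.0}
df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "Error": a * b * (r - 1)}

ms = {src: ss[src] / df[src] for src in ss}                    # mean squares
f_ratio = {src: ms[src] / ms["Error"] for src in ("A", "B", "AB")}

for src in ("A", "B", "AB", "Error"):
    row = [src, df[src], ss[src], round(ms[src], 2)]
    if src != "Error":
        row.append(round(f_ratio[src], 2))
    print(row)
```

Here F for the AB interaction is 28/7 = 4.0, below the .05 critical value F(2, 6) = 5.14, and F for factor B is 12/7 ≈ 1.71, consistent with the text’s remark that neither effect is significant at α = .05.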
10.2 Two-Factor Designs 459
Technically speaking, the statistical tests just described are based on a fixed effects
model, in which the particular levels of A and B are assumed to be the only ones
of interest in the study. If, on the other hand, we think of the levels only as samples
from all of the possible levels of A and B, then a random effects model should be
used. Recall that the distinction between fixed and random effects was first introduced
in Section 9.3. Although the distinction between fixed and random factors does not
alter the ANOVA calculations for one-factor experiments (Chapter 9), this situation
changes for multifactor designs. In particular, the calculation of F ratios for random
effects models and mixed models (one factor fixed, the other random) differs slightly
from that for fixed effects models. These topics are beyond the scope of our
introductory discussion. Throughout this chapter, we consider all factors in a design
to be fixed factors.
Example 10.1  Refer to Example 9.1, where we examined the possible causes of electric motor vibration.
Suppose that we have identified two product characteristics (factors) that are
thought to influence the amount of vibration (the response, measured in microns)
of running motors: factor A = the brand of bearing used in the motor and B = the
material used for the motor casing. Figure 10.13 shows the data from an experiment
in which a = 5 brands of bearings were tested along with b = 3 types of casing material
(steel, aluminum, and plastic). Two motors (r = 2) were constructed and tested
for each of the ab = 5 · 3 = 15 combinations of bearing brand and casing type, giving
a total sample size of abr = 5 · 3 · 2 = 30.
[Figure 10.13: vibration data, arranged with columns for the three casing materials (factor B, levels 1, 2, 3), rows for the five bearing brands (factor A), and row and column averages]
Source of variation          df                  SS       MS                  F
Factor A (bearing brand)     5 - 1 = 4           36.709   36.709/4 = 9.177    9.177/.1113 = 82.45
Factor B (casing material)   3 - 1 = 2           .705     .705/2 = .353       .353/.1113 = 3.17
AB interaction               (5 - 1)(3 - 1) = 8  11.571   11.571/8 = 1.446    1.446/.1113 = 12.99
Error                        5 · 3(2 - 1) = 15   1.670    1.670/15 = .1113
Total variation              5 · 3 · 2 - 1 = 29  50.655
At a significance level of α = .05, let's first test for the presence of any interactions.
Because the P-value for F = 12.99 (based on df1 = 8, df2 = 15) is less than
.001, H0AB must be rejected. It appears that there is interaction between the two factors.
Therefore we should consider the corresponding effects plot (see Figure 10.14)
to draw conclusions. Although the casing material does not have a significant effect
by itself, it does influence the A main effect (because the AB interaction is signifi-
cant). The lowest vibration occurs for bearing brand A3, but only if casing B3 (plastic
casing) is used with A3.
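The P-value statements in this example can be checked with any F-distribution routine; a quick sketch, assuming scipy is installed (sf is the survival function, P(F > observed value)):

```python
from scipy.stats import f

# F ratios and degrees of freedom from the Example 10.1 ANOVA table
p_AB = f.sf(12.99, 8, 15)   # AB interaction: P < .001, so reject H0AB
p_A = f.sf(82.45, 4, 15)    # bearing brand: also highly significant
p_B = f.sf(3.17, 2, 15)     # casing material: P > .05, not significant by itself
```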
[Figure 10.14: interaction plot of mean vibration response (roughly 12 to 17 microns) versus bearing brand (factor A, levels 1-5), with one line for each casing material (1, 2, 3)]
10.2 Exercises

…surements are available for each combination of factor levels. Complete the following ANOVA table for this experiment:

Source of variation    df    SS    MS    F
Factor A               20
Factor B               8.1
AB interaction
Error                  2
Total variation        200

Factor A
1    200, 211    226, 219    240, 249    261, 250
2    278, 267    312, 324    330, 337    381, 375
3    369, 355    416, 402    462, 457    517, 524
4    500, 487    575, 593    645, 632    733, 718

a. Is there evidence of a significant interaction between the two factors? Use α = .01.
b. Use α = .01 to test the hypothesis that gas flow rate has no effect on the heat transfer coefficient.
c. Use α = .01 to test the hypothesis that liquid flow rate has no effect on the heat transfer coefficient.
9. The following data was obtained in an experiment to investigate whether the yield from a certain chemical process depends on either the chemical formulation of the input materials or the mixer speed, or on both factors:

                            Speed
                      60       70       80
               1      189.7    185.1    189.0
                      188.6    179.4    193.0
                      190.1    177.3    191.0
Formulation
               2      165.1    161.7    163.3
                      165.9    159.8    166.6
                      167.6    161.6    170.3

A statistical software package gave these results: SS(formulation) = 2253.44, SS(speed) = 230.81, SS(interaction) = 18.58, and SSE = 71.87.
a. Does there appear to be interaction between the two factors? (Use α = .05.)
b. Does the yield appear to depend on either the formulation or the speed? (Use α = .05.)

10. Draw an interaction plot for the data of Exercise 9.

11. Lightweight aggregate asphalt mix has been found to have lower thermal conductivity, which is desirable, than a conventional mix would have. The article "Influence of Selected Mix Design Factors on the Thermal Behavior of Lightweight Aggregate Asphalt Mixes" (J. of Testing and Eval., 2008: 1-8) reported on an experiment in which various thermal properties of mixes were determined. Three different binder grades were used in combination with three different coarse aggregate contents (%), with two observations made for each such combination, resulting in the conductivity data (W/m·K) given here:

                       Coarse Aggregate Content (%)
                       38            41            44
Asphalt    PG58    .835, .845    .822, .826    .785, .795
Binder     PG64    .855, .865    .832, .836    .790, .800
Grade      PG70    .815, .825    .800, .820    .770, .790

a. Test for the presence of interaction between the two factors. Use α = .01.
b. Use α = .01 to test the hypothesis that coarse aggregate content has no effect on thermal conductivity.
c. Use α = .01 to test the hypothesis that asphalt binder grade has no effect on thermal conductivity.

12. Factorial designs have been used to study productivity of software engineers ("Experimental Design and Analysis in Software Engineering," Software Engineering Notes, 1995: 14-16). Suppose that an experiment is conducted to study the time it takes to code a software module. Factors that may affect the coding time are the size of the module and whether the programmer has access to a library of previously coded submodules. Module size is studied at two levels, large and small, whereas access to a library of submodules is either available or not. After running a two-factor design on sample modules, suppose that the interaction between module size and library access is found to be significant.
a. If the goal is to reduce coding time, describe the conclusions you can draw from the experiment if the interaction plot looks like this:
[Interaction plot: coding time versus module size (small, large), with one line for no library access and one for library access]
b. What possible reasons can you give for an interaction plot that looks like the following one?
[Interaction plot: coding time versus module size (small, large), again with lines for library access and no library access]

13. The article "Fatigue Limits of Enamel Bonds with Moist and Dry Techniques" (Dental Materials, 2009: 1527-1531) described an experiment to investigate the ability of adhesive systems to bond to mineralized tooth structures. The response variable is shear bond strength (MPa), and two different
adhesives—Adper Single Bond Plus (SBP) and OptiBond Solo Plus (OBP)—were used in combination with two different surface conditions. The accompanying data was supplied by the authors of the article. The first 12 observations came from the SBP-dry treatment, the next 12 from the SBP-moist treatment, the next 12 from the OBP-dry treatment, and the last 12 from the OBP-moist treatment.

SBP-Dry      56.7   57.4   53.4   54.0   49.9   49.9
             56.2   51.9   49.6   45.7   56.8   54.1
SBP-Moist    49.2   47.4   53.7   50.6   62.7   48.8
             41.0   57.4   51.4   53.4   55.2   38.9
OBP-Dry      38.8   46.0   38.0   47.0   46.2   39.8
             25.9   37.8   43.4   40.2   35.4   40.3
OBP-Moist    40.6   35.5   58.7   50.4   43.1   61.7
             33.3   38.7   45.4   47.2   53.3   44.9

a. Construct a comparative boxplot of the data on the four different treatments and comment.
b. Carry out an appropriate analysis of variance and state your conclusions (use a significance level of .01 for any tests). Include any graphs that provide insight.
c. If a significance level of .05 is used for the two-way ANOVA, the interaction effect is significant (just as in general different glues work better with some materials than with others). So now it makes sense to carry out a one-way ANOVA on the four treatments SBP-D, SBP-M, OBP-D, and OBP-M. Do this and identify significant differences among the treatments.

14. Experiments often have more than one response value of interest. In the article "Towards Improving the Properties of Plaster Moulds and Castings" (J. Engr. Manuf., 1991: 265-269), a study was undertaken to determine the effects of carbon fiber (in %) and sand addition (in %) on two response variables, casting hardness and wet-mold strength.

Sand           Carbon fiber     Casting     Wet-mold
addition (%)   addition (%)     hardness    strength
 0             0                61.0        34.0
 0             0                63.0        16.0
15             0                67.0        36.0
15             0                69.0        19.0
30             0                65.0        28.0
30             0                74.0        17.0
 0             .25              69.0        49.0
 0             .25              69.0        48.0
15             .25              69.0        43.0
15             .25              74.0        29.0
30             .25              74.0        31.0
30             .25              72.0        24.0
 0             .50              67.0        55.0
 0             .50              69.0        60.0
15             .50              69.0        45.0
15             .50              74.0        43.0
30             .50              74.0        22.0
30             .50              74.0        48.0

a. Construct an ANOVA table for the effects of these factors on wet-mold strength. Test for the presence of significant effects using α = .05.
b. Construct an ANOVA table for the effects of these factors on casting hardness. Test for the presence of significant effects using α = .05.
c. From your results in parts (a) and (b), what levels of each factor would you select to maximize wet-mold strength? What factor levels would you choose to maximize casting hardness?
10.3 Multifactor Designs
The "×" notation used to describe two-factor designs also provides compact descriptions
of multifactor designs. For instance, a 3 × 2 × 2 factorial design is one that has
three factors A, B, and C, with a = 3 levels of A, b = 2 levels of B, and c = 2 levels of C.
Figure 10.15 shows a data layout for such a design. Note that the number of replicates, r,
is not included in this notation. To indicate that repeated measurements have been made
at each factor–level combination, we simply state this fact when referring to the design, for
example, a replicated 3 × 2 × 2 design, or a 3 × 2 × 2 design with r replicates.
[Figure 10.15: data layout for a 3 × 2 × 2 design; the three rows are the levels of factor A, the four columns are the combinations of the two levels of B with the two levels of C, and asterisks mark the observations in each cell]
ANOVA Formulas
ANOVA decompositions for factorial designs contain a sum of squares term for every
possible main effect and interaction. For example, a factorial design based on factors A,
B, and C gives rise to three main effects terms (A, B, and C), three two-factor interactions
(AB, AC, and BC), and one three-factor interaction (ABC). The sum of squares for an
interaction term is denoted by putting the interaction term in parentheses after the SS
notation. Thus SS(AB) denotes the sum of squares associated with the AB interaction,
and so on. The ANOVA decomposition for a three-factor model is given by
SST = SSA + SSB + SSC + SS(AB) + SS(AC) + SS(BC) + SS(ABC) + SSE
where SST (the total sum of squares) measures the total variation in the response data
and SSE (the error sum of squares) is the variation from all sources other than the factors
included in the experiment.
For a three-factor design including every possible main effect and interaction, com-
putational formulas for sums of squares are given in the following box. The key is to start
by computing the sums of squares of main effects and then use these results to find sums
of squares for the two-factor interactions. Similarly, the sums of squares of the two-factor
interactions are used to find SS(ABC). Although the patterns evident in these formulas
can be extended to the case of four or more factors, in practice one usually relies on
statistical software to perform the calculations.
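The strategy just described (main-effect sums of squares first, then the two-factor interactions, then SS(ABC)) can be sketched directly. Here is a minimal Python version for a balanced design, assuming numpy is available and the data are stored in an array indexed by the levels of A, B, C and the replicate number (the function itself is an illustration, not the book's notation):

```python
import numpy as np

def three_factor_ss(y):
    """Sums of squares for a balanced three-factor design.
    y has shape (a, b, c, r): levels of A, B, C, and r replicates.
    Main effects are computed first, then two-factor interactions,
    then the three-factor interaction, as described in the text."""
    a, b, c, r = y.shape
    g = y.mean()                      # grand mean
    mA = y.mean(axis=(1, 2, 3))       # marginal means of A, B, C
    mB = y.mean(axis=(0, 2, 3))
    mC = y.mean(axis=(0, 1, 3))
    mAB = y.mean(axis=(2, 3))         # two-way cell means
    mAC = y.mean(axis=(1, 3))
    mBC = y.mean(axis=(0, 3))
    mABC = y.mean(axis=3)             # three-way cell means

    SSA = b * c * r * ((mA - g) ** 2).sum()
    SSB = a * c * r * ((mB - g) ** 2).sum()
    SSC = a * b * r * ((mC - g) ** 2).sum()
    SSAB = c * r * ((mAB - mA[:, None] - mB[None, :] + g) ** 2).sum()
    SSAC = b * r * ((mAC - mA[:, None] - mC[None, :] + g) ** 2).sum()
    SSBC = a * r * ((mBC - mB[:, None] - mC[None, :] + g) ** 2).sum()
    # ABC interaction: three-way cell effects with all lower-order
    # effects removed
    eABC = (mABC
            - mAB[:, :, None] - mAC[:, None, :] - mBC[None, :, :]
            + mA[:, None, None] + mB[None, :, None] + mC[None, None, :]
            - g)
    SSABC = r * (eABC ** 2).sum()
    SSE = ((y - mABC[..., None]) ** 2).sum()   # within-cell variation
    SST = ((y - g) ** 2).sum()
    return {"A": SSA, "B": SSB, "C": SSC, "AB": SSAB, "AC": SSAC,
            "BC": SSBC, "ABC": SSABC, "E": SSE, "T": SST}
```

For balanced data the eight component sums of squares add up to SST, which is a useful self-check on the computation.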
For a three-factor design restricted to main effects and two-factor interactions (i.e., a
design that excludes the ABC interaction), we can determine the sums of squares for total
variation, main effects, and two-factor interactions using the computational formulas in
the foregoing box. However, now that we are excluding the ABC term, the SSE must
necessarily change. It is no longer just the sum of squared cell deviations (shown in the
foregoing table); it is increased by an amount equal to SS(ABC). The correct
ANOVA decomposition is now given by

SST = SSA + SSB + SSC + SS(AB) + SS(AC) + SS(BC) + SSE′

Rearrangement of this decomposition yields an expression for the new error sum of squares:

SSE′ = SST - SSA - SSB - SSC - SS(AB) - SS(AC) - SS(BC)

Comparing SSE′ to the SSE for the full three-factor model, we see that the error term
now includes the ABC contribution in the sense that SSE′ = SSE + SS(ABC).

Similarly, if we want to restrict the model to only main effects, the ANOVA decomposition
becomes

SST = SSA + SSB + SSC + SSE′

from which

SSE′ = SST - SSA - SSB - SSC

Again, this error term has simply absorbed the SSE for the full three-factor model and
the sums of squares of all the terms omitted from the model. This also happens when terms
from a two-factor design are dropped (cf. page 457) or, in general, from a model having
any number of factors, including multiple regression models (discussed in Chapter 11).
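A quick numeric illustration of this absorption, using the sums of squares from the three-factor ANOVA of Example 10.2 (Figure 10.17):

```python
# Sums of squares from Figure 10.17 (full three-factor model)
ss = {"A": 215.38, "B": 74.51, "C": 602.72,
      "AB": 13.45, "AC": 107.41, "BC": 4.37}
sst, ss_abc, sse_full = 1056.26, 12.47, 25.95

# Error SS for the model that drops the ABC term
sse_reduced = sst - sum(ss.values())

# The dropped interaction is folded into the error term:
# sse_reduced = sse_full + ss_abc = 25.95 + 12.47 = 38.42
assert abs(sse_reduced - (sse_full + ss_abc)) < 1e-6
```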
Hypothesis Tests
Hypothesis tests concerning main effects and interactions are based on the familiar
ANOVA assumption that the response values at each fixed factor–level combination
follow a normal distribution and that the variances of these distributions are the same,
regardless of the particular factor–level combination. From these assumptions, a sepa-
rate degrees of freedom and mean square can be computed for each source of variation.
The total degrees of freedom is n 2 1, where n is the total number of experimental
runs. The degrees of freedom for each main effect equals its number of levels minus 1
and the degrees of freedom for any interaction term is simply the product of the degrees
of freedom for its component factors. The mean square associated with any main ef-
fect or interaction equals its sum of squares divided by its degrees of freedom. All of this
information is summarized in the form of an ANOVA table. For example, Figure 10.16
shows the general form of the ANOVA table for a three-factor design.
Source of variation    df                       SS        MS        F
A                      a - 1                    SSA       MSA       MSA/MSE
B                      b - 1                    SSB       MSB       MSB/MSE
C                      c - 1                    SSC       MSC       MSC/MSE
AB                     (a - 1)(b - 1)           SS(AB)    MS(AB)    MS(AB)/MSE
AC                     (a - 1)(c - 1)           SS(AC)    MS(AC)    MS(AC)/MSE
BC                     (b - 1)(c - 1)           SS(BC)    MS(BC)    MS(BC)/MSE
ABC                    (a - 1)(b - 1)(c - 1)    SS(ABC)   MS(ABC)   MS(ABC)/MSE
Error                  abc(r - 1)               SSE       MSE
Total variation        abcr - 1                 SST

Figure 10.16 ANOVA table for a factorial design with three factors, A, B, and C
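The degrees-of-freedom bookkeeping in this table can be checked mechanically; a small sketch (the function name is illustrative):

```python
def three_factor_df(a, b, c, r):
    """Degrees of freedom for the three-factor ANOVA table
    (a, b, c levels of factors A, B, C; r replicates per cell)."""
    df = {"A": a - 1, "B": b - 1, "C": c - 1,
          "AB": (a - 1) * (b - 1),
          "AC": (a - 1) * (c - 1),
          "BC": (b - 1) * (c - 1),
          "ABC": (a - 1) * (b - 1) * (c - 1),
          "Error": a * b * c * (r - 1)}
    df["Total"] = a * b * c * r - 1   # n - 1 for n = abcr runs
    return df
```

For any choice of a, b, c, and r the individual entries sum to the total abcr - 1, just as in the two-factor case.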
Example 10.2 Over the past decade researchers and consumers have shown increased interest
in renewable fuels such as biodiesel, a form of diesel fuel derived from vegetable
oils and animal fats. According to www.fueleconomy.gov, compared to petroleum
diesel, the advantages of using biodiesel include its nontoxicity, biodegradability and
lower greenhouse gas emissions. One popular biodiesel fuel is fatty acid ethyl ester
(FAEE). The authors of “Application of the Full Factorial Design to Optimization
of Base-Catalyzed Sunflower Oil Ethanolysis” (Fuel, 2013: 433−442) performed an
experiment to determine optimal process conditions for producing FAEE from the
ethanolysis of sunflower oils. In one study, the effects of three process factors on
FAEE purity (%) were investigated.
Table 10.1 shows the data from this 3 × 3 × 3 experiment. Note that there are
r = 2 repeated tests run at each combination of factor levels. Figure 10.17 shows
the resulting ANOVA table. All effects except the BC and ABC interaction effects
are significant at α = .05. Because some interaction terms are significant, the
interaction plots must be examined when drawing conclusions about the factor
effects.
Plots of all two-factor interactions are shown in Figure 10.18, along with the
main effects plots for the three factors. Suppose we are interested in maximizing
the value of the response variable, FAEE purity. Looking at the interaction plots,
the combination of factor levels that best accomplishes this objective is A = 75°C,
B = 12:1, and C = 1.25%. In this example, the conclusions from the interaction
plots agree with the conclusions that we would have drawn from inspecting the main
effects plots.
[Table 10.1: FAEE purity (%) data for the 3 × 3 × 3 experiment; factor A = temperature (25, 50, 75°C), B = molar ratio (6:1, 9:1, 12:1), and C = catalyst loading (.75, 1.00, 1.25 wt.%)]
Source df SS MS F P
A 2 215.38 107.69 112.07 .0000
B 2 74.51 37.26 38.77 .0000
C 2 602.72 301.36 313.60 .0000
AB 4 13.45 3.36 3.50 .0200
AC 4 107.41 26.85 27.94 .0000
BC 4 4.37 1.09 1.14 .3598
ABC 8 12.47 1.56 1.62 .1649
Error 27 25.95 .961
Total 53 1056.26
Figure 10.17 ANOVA for the data of Table 10.1
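The MS and F columns of Figure 10.17 follow from the df and SS columns by the usual rules (MS = SS/df, F = MS/MSE); a quick arithmetic check, allowing for the table's rounding:

```python
# df, SS, and reported F for each effect in Figure 10.17
rows = {"A": (2, 215.38, 112.07), "B": (2, 74.51, 38.77),
        "C": (2, 602.72, 313.60), "AB": (4, 13.45, 3.50),
        "AC": (4, 107.41, 27.94), "BC": (4, 4.37, 1.14),
        "ABC": (8, 12.47, 1.62)}
mse = 25.95 / 27                       # MSE = SSE / error df
for name, (df, ss, f_reported) in rows.items():
    f_computed = (ss / df) / mse       # F = MS / MSE
    assert abs(f_computed - f_reported) < 0.1, name
```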
[Figure 10.18: two-factor interaction plots for FAEE (TEMP × RATIO, TEMP × LOAD, RATIO × LOAD) and main effects plots for TEMP (25, 50, 75), RATIO (6, 9, 12), and LOAD (0.75, 1.00, 1.25); the vertical axes show mean FAEE purity (%), roughly 85 to 96]
Figure 10.18 Two-factor interaction plots and main effects plots for Example 10.2
Based on years of empirical evidence, the results of a factorial experiment usually show
that, over the range of factor levels studied, only a few factors are significant and even fewer
interaction terms are significant. When all main effects and interactions are significant, the
experimenter should carefully examine how the test runs were conducted to make sure that
correct procedures were followed. Recall from Section 4.3 that the proper method of con-
ducting repeated tests is to completely replicate the experimental conditions for each test.
For example, in Example 10.2, the two repeated tests made at A = 25°C, B =
6:1 molar ratio, and C = .75 wt.% catalyst loading should be conducted by resetting
the apparatus used in the first test, substituting a new sunflower oil sample using the
specified molar ratio and catalyst loading, allowing the temperature to change and be
reset to 25°C, and then running the second test. If, instead, the experimenter simply
leaves the apparatus from the first test in place and immediately conducts a second test,
then the variation between the two FAEE purity responses is more likely to be a mea-
sure of the repeatability of the purity measurement system. It will not truly capture the
experimental error we would expect for any sunflower oil sample under the conditions
A = 25°C, B = 6:1 molar ratio, and C = .75 wt.% catalyst loading. Test runs that are
incorrectly conducted by simply taking two successive measurements usually result in
underestimating the experimental error MSE, thereby artificially increasing the F ratios
on which hypothesis tests are based.
10.3 Exercises

…The response variable is an average rating of five buds from a seedling. The ratings are 0 (bud not broken), 1 (bud partially expanded), and 2 (bud fully expanded).
a. Using a significance level of 5%, conduct an ANOVA test for this data. Indicate which factors are significant and whether the interaction term is significant.
b. Create an effects plot for the factors that were found to be significant in part (a).
c. What conclusions can you draw regarding the effects of the two factors on bud rating?

17. The output of a continuous extruding machine that coats steel pipe with plastic was studied as a function of thermostat temperature profile (A, at three levels), type of plastic (B, at three levels), and the speed (C, at three levels) of the rotating screw that forces the plastic through a tube-forming die. Two replications were obtained at each factor–level combination, yielding a total of 54 observations. The sums of squares were SSA = 14,144.44, SSB = 5,511.27, SSC = 244,696.39, SS(AB) = 1,069.62, SS(AC) = 62.67, SS(BC) = 331.67, SSE = 3127.50, and SST = 270,024.33.
a. Construct an ANOVA table for this experiment.
b. Use the appropriate F ratios to show that none of the two- or three-factor interactions is significant at α = .05.
c. Which main effects are significant at α = .05?

18. To see whether the force in drilling is affected by the drilling speed (A), feed rate (B), or material used (C), an experiment using four speeds, three rates, and two materials was performed, with two replicate samples drilled at each combination of levels of the three factors. A software package was used to obtain the sums of squares for the experimental data: SSA = 19,149.73, SSB = 2,589,047.62, SSC = 157,437.52, SS(AB) = 53,238.21, SS(AC) = 9,033.73, SS(BC) = 91,880.04, SSE = 56,819.50, and SST = 2,983,164.81.
a. Construct an ANOVA table for this experiment, and identify significant effects using α = .01.
b. Is there any single factor that appears to have no effect on thrust force? If so, how would you go about choosing the level of this factor that would minimize thrust force?

19. An experiment was conducted to investigate how the length of steel bars is affected by the time of day (A), heat treatment applied (B), and machine used (C). The three times were 8:00 a.m., 11:00 a.m., and 3:00 p.m. Two types of heat and four machines were used. The data from this 3 × 2 × 4 factorial design is given in the following table. Note: Data is coded as 1000(length - 4.380); this does not affect the analysis.

B1
       C1             C2              C3              C4
A1     6, 9, 1, 3     7, 9, 5, 5      1, 2, 0, 4      6, 6, 7, 3
A2     6, 3, 1, -1    8, 7, 4, 8      3, 2, 1, 0      7, 9, 11, 6
A3     5, 4, 9, 6     10, 11, 6, 4    -1, 2, 6, 1     10, 5, 4, 8

B2
       C1             C2              C3              C4
A1     4, 6, 0, 1     6, 5, 3, 4      -1, 0, 0, 1     4, 5, 5, 4
A2     3, 1, 1, -2    6, 4, 1, 3      2, 0, -1, 1     9, 4, 6, 3
A3     6, 0, 3, 7     8, 7, 10, 0     0, -2, 4, -4    4, 3, 7, 0

a. Construct an ANOVA table for this data.
b. Test to see whether any interaction effects are significant at α = .05.
c. Test to see whether any main effects are significant at α = .05.

20. The deposition of thick protective coatings on substrates can be facilitated by laser cladding, in which an alloy powder is melted on the substrate surface. Experiments were conducted to determine how three processing parameters, laser power (A), scanning velocity (B), and powder flow rate (C), affect the coating hardness ("Laser Cladding: An Experimental Study of Geometric Form and Hardness of Coating Using Statistical Analysis," J. of Engr. Manuf., 2006: 1549-1554). Each factor had three levels, and there was one observation at each factor-level combination. The following corresponds to the ANOVA table from the article; only main effects and two-factor interactions were considered there:

SOURCE    DF         SS        MS
?         ?          ?         63.24
?         2034.74    ?         ?
?         ?          480.26    ?
?         ?          ?         6.48
?         729.04     ?         ?
?         ?          115.26    ?
Error     ?          ?         104.26
Total     ?          ?
a. Fill in the missing entries in the table.
b. Identify significant effects using α = .01.

21. Recently, nickel titanium (NiTi) shape memory alloy (SMA) has become widely used in medical devices. This is attributable largely to the alloy's shape memory effect (material returns to its original shape after heat deformation), superelasticity, and biocompatibility. An alloy element is usually coated on the surface of NiTi SMAs to prevent toxic Ni release. The alloy element is coated by laser cladding, a technique first described in Exercise 20.
The authors of "Parametrical Optimization of Laser Surface Alloyed NiTi Shape Memory Alloy with Co and Nb by the Taguchi Method" (J. of Engr. Manuf., 2012: 969-979) conducted a study to see whether the percent by weight of nickel in the alloyed layer is affected by carbon monoxide powder paste thickness (A, at three levels), scanning speed (B, at three levels), and laser power (C, at three levels). One observation was made at each factor-level combination (Note: Thickness column headings were incorrect in the cited article):

                       Paste Thickness
Power    Speed     .2        .3        .4
600      600       38.64     35.13     19.20
         900       38.16     34.24     26.23
         1200      37.54     33.46     30.44
700      600       36.56     35.91     34.62
         900       39.16     33.10     28.71
         1200      37.06     31.78     21.50
800      600       39.44     40.42     37.21
         900       39.34     37.64     35.65
         1200      39.30     34.97     32.50

a. Construct an ANOVA table for this experiment, including all main effects and two-factor interactions (as did the authors of the cited article).
b. Use the appropriate F ratios to show that none of the two-factor interactions is significant at α = .05.
c. Which main effects are significant at α = .05?

22. A four-factor factorial design was used to investigate the effect of fabric (A), type of exposure (B), level of exposure (C), and fabric direction (D) on the extent of color change as measured by a spectrocolorimeter (from "Accelerated Weathering of Marine Fabrics," J. Testing and Eval., 1992: 139-143). Two observations were made at each combination of the factor levels. The resulting mean squares were MSA = 2,207.329, MSB = 47.255, MSC = 491.783, MSD = .44, MS(AB) = 15.303, MS(AC) = 275.446, MS(AD) = .470, MS(BC) = 2.141, MS(BD) = .273, MS(CD) = .247, MS(ABC) = 3.714, MS(ABD) = 4.072, MS(ACD) = .767, MS(BCD) = .280, and MSE = .977. Perform an analysis of variance using α = .01 for all tests, and summarize your conclusions.

23. One property of automobile air bags that contributes to their ability to absorb energy is the permeability of the woven material used to construct the air bags. Understanding how permeability is influenced by various factors is important for increasing effectiveness. In one study, the effects of three factors were studied: temperature (A), fabric denier (B), and air pressure (C). Two specimens were measured at each factor-level combination ("Analysis of Fabrics Used in Passive Restraint Systems—Airbags," J. of the Textile Institute, 1996: 554-571).

                                   Temperature
                     8                    50                    75
Pressure      17.2   34.4   103.4   17.2   34.4   103.4   17.2   34.4   103.4
       420-D   73    157    332      52    125    281      37     95    276
               80    155    332      51    118    264      31    106    281
Denier 630-D   35     91    288      16     72    169      30     91    213
               43     98    271      12     78    173      41    100    211
       840-D  125    234    477      90    149    338     102    170    307
              111    233    464     100    155    350      98    160    311

a. Construct an ANOVA table for this data.
b. Test to see whether any interaction effects are significant at α = .01.
c. Test to see whether any main effects are significant at α = .01.
10.4 2^k Designs
The minimum number of experimental runs needed for a factorial experiment can increase rapidly as more factors are added to an experiment. Recall, for instance, that to study factor A at three levels, factor B at two levels, and factor C at four levels, a minimum of 3 × 2 × 4 = 24 runs are needed, one run for each different combination of factor levels. If each test run is replicated r times, then the total number of runs will further increase by a factor of r. As a consequence, the cost of resources needed to conduct a factorial experiment can quickly become prohibitive.

One method of combating the problem of extremely large numbers of runs is to use only two levels of each of the factors of interest. Using this approach to study k different factors, each having only two levels, the minimum number of experimental runs needed is 2 × 2 × 2 × ⋯ × 2 = 2^k, which is the reason such experiments are called 2^k factorial designs. These designs are very popular in the research and development of products and processes, not only because they require smaller sample sizes but also because the associated statistical analyses are exceedingly simple and, if necessary, can even be done by hand.
The ±1 coding scheme provides a quick method for listing all 2^k experimental runs. Using capital letters A, B, C, . . . , to denote the names of the k factors in an experiment, we form k columns of +1 and −1 values according to the following rule: the column for the jth factor begins with 2^(j−1) values of −1 followed by 2^(j−1) values of +1, and this pattern is repeated until the column is filled. The resulting list of runs is said to be in Yates standard order.
Example 10.3  For a 2^3 experiment based on the factors A, B, and C, the eight experimental runs in Yates standard order are as follows:

Run     A     B     C
1      −1    −1    −1
2      +1    −1    −1
3      −1    +1    −1
4      +1    +1    −1
5      −1    −1    +1
6      +1    −1    +1
7      −1    +1    +1
8      +1    +1    +1
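The block-alternation pattern in this table is easy to mechanize. The following minimal sketch (the function name `yates_design` is ours, not from the text) reproduces the eight runs above for k = 3 and generalizes to any k:

```python
def yates_design(k):
    """Design matrix of a 2^k factorial in Yates standard order:
    the column for the j-th factor alternates blocks of 2^(j-1)
    -1's and +1's, so the first factor alternates fastest."""
    runs = []
    for i in range(2 ** k):
        # bit j of the run index i gives the level of the j-th factor
        runs.append([+1 if (i >> j) & 1 else -1 for j in range(k)])
    return runs
```

For example, `yates_design(3)` returns the eight rows of the table above, beginning with [−1, −1, −1] and ending with [+1, +1, +1].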
The alternative coding scheme used with 2^k designs is based on lowercase letters a, b, c, d, . . . , which are intended to denote the high levels of the corresponding factors A, B, C, D, . . . . To denote a particular experimental run, we form a string of lowercase letters showing which factors in the run are set to their high levels. Letters are omitted for factors that are set to their low levels. For instance, in a 2^3 experiment with factors A, B, and C, the combination of letters ab refers to the test run in which both A and B are set at their high levels and C is set at its low level. Similarly, the letter b denotes the run with B high and both A and C low. The notation (1) is used for the one test run in which all factors are set to their low levels.
Example 10.4  Using the letter coding method, the eight test runs of the 2^3 experiment in Example 10.3 are coded as follows. The table shows the letter codes that correspond to runs that have been written in Yates standard order:

Run     1     2     3     4     5     6     7     8
Code   (1)    a     b    ab     c    ac    bc   abc
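The letter code for any run follows mechanically from its ±1 levels. A short sketch (the helper name `letter_code` is ours):

```python
def letter_code(levels, names="abcd"):
    """Letter code of one run: the lowercase letters of all factors
    set to +1, or "(1)" when every factor is at its low level."""
    code = "".join(name for level, name in zip(levels, names) if level == +1)
    return code if code else "(1)"
```

For instance, `letter_code([1, 1, -1])` gives "ab" and `letter_code([-1, -1, -1])` gives "(1)", matching the table above.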
Conducting an Experiment

Yates’s method for generating the columns of the design matrix provides a quick and organized method for laying out the factor–level combinations of a 2^k experiment. However, when it comes to actually performing the experimental tests, test runs should be conducted in random order. Randomization of experimental runs, first discussed in Section 4.3, helps reduce the possible effects of unknown factors on the test results.
To see why randomization is used, suppose that we begin to conduct the runs in a 2^3 experiment in standard order (as in Example 10.3) but that unforeseen problems occur and only half the runs can be performed in one day, the remaining runs being postponed until later in the week. Because the runs are not randomized, factor C is always at its low level during the first day of testing. Later in the week, the remaining runs will be conducted when C is at its high level and when other external conditions may possibly have changed. Consequently, any effect that C has on the response will be commingled with the effects of changing conditions during the week. If statistical tests eventually show that factor C has a significant effect on the response, the experimenter will not be able to tell whether this effect is really caused by factor C or, instead, if it is caused by changes in other conditions that might have arisen between the two days of testing. If the test runs had been randomized, there would have been a much smaller chance that such external factors could systematically influence the test results. For instance, it is highly unlikely that a randomized run sequence would have resulted in having C always at its low level during the first half of the runs.
Example 10.5  To randomize the test runs in a 2^3 experiment, first find the total number of runs required, including replicated runs. For example, if we decide to conduct two replicate runs for each factor–level combination, then a total of N = r·2^k = 2 × 2^3 = 16 test runs is required.
8 6 10 16 4 15 7 3 14 1 11 12 5 13 2 9
Proceeding down the rows of the design matrix, we write the first set of eight random numbers. Returning to the top row, we record the second set of eight random numbers. According to this randomization, the experimenter should begin by conducting run 2, followed by run 7, then run 8, run 5, run 5, run 2, and so forth. The response value measured at each run is recorded in the row corresponding to its test number.
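Assigning random numbers to the rows of the design matrix, as in this example, is equivalent to shuffling a list that contains each run number once per replicate. A minimal sketch (the function name and its `seed` parameter are ours; the seed is only for reproducibility):

```python
import random

def randomized_run_order(k, r, seed=None):
    """Random test sequence for r replicates of a 2^k design:
    each run number 1..2^k appears r times, in shuffled order."""
    rng = random.Random(seed)
    order = [run for run in range(1, 2 ** k + 1) for _ in range(r)]
    rng.shuffle(order)
    return order
```

Calling `randomized_run_order(3, 2)` produces a random sequence of 16 run numbers in which each of the runs 1 through 8 appears twice.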
definitions The main effect of a factor is the average response value for all test runs at the
high level of the factor minus the average response value for runs at the low level
of the factor.
The two-factor interaction effect is one-half of the difference between the main
effects of one factor calculated at the two levels of the other factor.
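These definitions translate directly into code. A sketch of the main-effect calculation (the function name is ours), taking the coded levels of one factor and the corresponding responses:

```python
def main_effect(levels, responses):
    """Main effect of a factor: the average response over runs where
    the factor is at +1 minus the average over runs where it is at -1."""
    highs = [y for level, y in zip(levels, responses) if level == +1]
    lows = [y for level, y in zip(levels, responses) if level == -1]
    return sum(highs) / len(highs) - sum(lows) / len(lows)
```

For example, with levels [−1, +1, −1, +1] and responses [10, 14, 12, 16], the main effect is (14 + 16)/2 − (10 + 12)/2 = 4.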
Table 10.2

Run     A     B     C     D       y
1      −1    −1    −1    −1    27.22
2      +1    −1    −1    −1    25.19
3      −1    +1    −1    −1    23.23
4      +1    +1    −1    −1    18.93
5      −1    −1    +1    −1    25.32
6      +1    −1    +1    −1    22.61
7      −1    +1    +1    −1    26.80
8      +1    +1    +1    −1    20.20
9      −1    −1    −1    +1    44.53
10     +1    −1    −1    +1    42.44
11     −1    +1    −1    +1    43.78
12     +1    +1    −1    +1    37.66
13     −1    −1    +1    +1    42.16
14     +1    −1    +1    +1    38.97
15     −1    +1    +1    +1    48.85
16     +1    +1    +1    +1    42.05
[Figure: main effect plot for factor D; the average response is 23.69 at the low (−) level and 42.56 at the high (+) level.]
The interaction effect between two factors A and B is denoted by writing either AB or A × B. Both notations are found in the literature. The interaction between three factors A, B, and C is written either as ABC or A × B × C; four-factor interactions are written ABCD or A × B × C × D, and so forth. To illustrate the calculation of a two-factor interaction effect, consider the BC interaction for the experiment in Table 10.2. Figure 10.20 shows the BC interaction graph created by plotting the average response values for all four combinations of levels of B and C. Each plotted point is now the average of four data points, not eight. For instance, the point where B is low and C is low is the average of the data points 27.22, 25.19, 44.53, and 42.44. With B on the horizontal axis, the pairs of points with the same level of C are joined by line segments. These two lines show the main effect of changing B from low to high while holding each level of C fixed. As you can see from the graph, the effect of changing B from low to high is very different for the two levels of C. The BC interaction is defined to be one-half of the difference between the main effect for B with C at its high level and the main effect for B with C at its low level:

    BC interaction effect = (1/2)[(34.48 − 32.27) − (30.90 − 34.85)] = 3.08
[Figure 10.20: BC interaction plot; average response versus the level of B (− to +), with one line per level of C. Plotted averages: 34.85 (B low, C low), 30.90 (B high, C low), 32.27 (B low, C high), 34.48 (B high, C high). The BC interaction is half of the B effect when C is held at +1 minus the B effect when C is held at −1.]
and CB interaction effects are exactly the same and are both treated as measures of the
same two-factor interaction.
The definitions of higher-order interaction effects become more complex as the
number of factors increases. For example, the three-factor ABC interaction is defined
to be one-half of the difference between the AB interaction values calculated at the two
levels of C. As is the case with all interaction calculations, this definition is symmetric
in the sense that the ABC interaction can also be calculated by using the difference
between the BC interactions at the two levels of A or by using the difference between
the AC interaction values at the two levels of factor B.
Fortunately, there is a much simpler method for calculating interactions of any order. Starting with the design matrix, we first create additional columns by forming all possible products (two at a time, three at a time, etc.) of columns in the design matrix. It is convenient to append these columns to the right of the design matrix. For example, an AB column is formed by multiplying the corresponding entries in columns A and B, an ABC column is formed by multiplying across the rows of A, B, and C, and so forth. Next, a contrast is calculated for each column in the extended matrix by multiplying the signs in a particular column by the column of response values and then summing. In the case where there are repeated runs at each factor–level setting (as illustrated in Example 10.5), the column signs are multiplied by the total of the responses at each factor–level combination. Each contrast is given the name of the column from which it is constructed. For instance, in a 2^3 design, there will be contrasts for A, B, C, AB, AC, BC, and ABC. The final step is to divide each contrast by half the number of runs:

    effect estimate = contrast / (r·2^(k−1)) = contrast / (half the number of runs)

The resulting values will be the estimates for each main effect and each interaction effect.
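The contrast procedure can be sketched for an unreplicated design as follows (the function and argument names are ours; with r replicates, apply the signs to the per-run totals and divide by r·2^(k−1) instead):

```python
from itertools import combinations
from math import prod

def all_effects(design, y, names="ABCD"):
    """All main and interaction effects of an unreplicated full 2^k
    factorial. Each contrast is the sum of response times the product
    of that run's signs for the factors involved; each effect is the
    contrast divided by half the number of runs, 2^(k-1)."""
    k = len(design[0])
    n = len(design)  # n = 2^k runs
    effects = {}
    for size in range(1, k + 1):
        for combo in combinations(range(k), size):
            contrast = sum(y[i] * prod(design[i][j] for j in combo)
                           for i in range(n))
            effects["".join(names[j] for j in combo)] = contrast / (n / 2)
    return effects
```

For a 2^2 design with responses 10, 14, 12, 16 in Yates order, this returns A = 4, B = 2, and AB = 0, matching the averaging definitions given earlier.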
Example 10.6  To calculate all main effects and interaction effects for the 2^4 design in Table 10.2, the design matrix (in Yates standard order) is first extended to include all possible products of columns. For illustration, the BC and ABC columns are shown here. As part of Exercise 25, the reader should fill in the remaining columns.

Run     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
BC     +1  +1  −1  −1  −1  −1  +1  +1  +1  +1  −1  −1  −1  −1  +1  +1
ABC    −1  +1  +1  −1  +1  −1  −1  +1  −1  +1  +1  −1  +1  −1  −1  +1
Analyzing a 2^k Experiment

Having obtained estimates of all main effects and interaction effects, we must now use a statistical procedure to sort the important effects from the unimportant ones. The particular procedure used depends on whether the experiment is replicated. If only one response value is measured for each test run, then there is no replication of test runs; consequently, no estimate of experimental error is available. In this commonly occurring situation, the recommended procedure is to create a normal quantile or probability plot of the effects and fit, by eye, a straight line through the “small” effects, that is, the effects with magnitudes close to zero. Only the effects that do not fall on or near the straight line are considered to be the important ones. The effects falling near the line are thought to be due to experimental error or “noise.” To date, there is no universally agreed-upon method for deciding which group of “small” effects to fit with a straight line. Fortunately, though, decades of empirical studies have shown that the nonsignificant effects usually comprise the majority of the plotted points, so fitting an appropriate straight line is usually fairly easy.
Example 10.7  The 15 effects A, B, C, D, AB, . . . , ABCD for the 2^4 experiment in Table 10.2 are shown in a normal quantile plot in Figure 10.21. As expected, many of the small effects tend to fall very close to a straight line (fit by eye). The effects that fall off the line appear to be A, D, AB, BD, and BC, although it is possible the last three may be close enough to the line to ignore. Based on these results, we tentatively propose that these five effects are the only ones that matter in the experiment. In particular, factor D (temperature) has a large positive effect on the response variable (% of chemical converted). Specifically, changing D from its low to high level causes an increase in the percentage of chemical converted. The situation for the other factors is not as clear since the three smaller interaction terms, AB, BD, and BC, are potentially significant, which means that their interaction plots must be examined before deciding on the best settings for these factors.
[Figure 10.21: normal quantile plot of the 15 effect estimates; horizontal axis: effect (roughly 0 to 20), vertical axis: normal quantile.]
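The normal scores behind such a plot can be computed without plotting software. A common convention, used in the sketch below, pairs the i-th smallest of m effects with the standard normal quantile at (i − 0.5)/m; this plotting position is one of several in use, and the function name is ours:

```python
from statistics import NormalDist

def normal_quantile_pairs(effects):
    """Pair each sorted effect with its normal score, the standard
    normal quantile at (i - 0.5)/m for the i-th smallest of m effects.
    Points hugging a straight line are the 'noise' effects."""
    vals = sorted(effects)
    m = len(vals)
    nd = NormalDist()
    return [(nd.inv_cdf((i - 0.5) / m), v) for i, v in enumerate(vals, 1)]
```

Plotting the returned (score, effect) pairs reproduces the normal quantile plot; effects far from the fitted line are the candidates for significance.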
Another method for separating the important effects from the others is to assume that certain higher-order effects are nonsignificant and to use these effects to obtain an estimate of the experimental error. This procedure is based on decades of empirical evidence suggesting that main effects and two-factor interaction effects are usually the most important ones in an experiment. Given a choice, the method based on normal probability plotting is usually more reliable than simply making assumptions about the outcomes of an experiment. However, when a normal probability plot suggests that the higher-order interactions may indeed be insignificant, there is more justification for combining these effects to calculate an SSE.
Suppose that m effects, call them E1, E2, E3, . . . , Em, are thought to be insignificant. To create an estimate of the variance of any effect, we form the average of the squared effect values:

    variance of any effect ≈ (1/m) Σ(i=1 to m) Ei²

This estimate has m degrees of freedom associated with it. Consequently, confidence intervals for the remaining effects in the experiment can be constructed by using the following formula:

    confidence interval for effect E:  E ± (t critical value)·√[(1/m) Σ(i=1 to m) Ei²]   (df = m)
Example 10.8  The normal quantile plot in Figure 10.21 shows that only a few of the main effects and two-factor interactions are likely to be significant for the experiment in Table 10.2. Consequently, it is reasonable to assume that at least the three-factor and four-factor interactions are negligible and can be safely used to derive confidence intervals for the remaining 10 effects. Combining these m = 5 effects allows us to approximate the variance of any effect as follows:

     Effect name    Effect estimate    Squared effect
1    ABC                −.150            (−.150)²
2    ABD                −.185            (−.185)²
3    ACD                 .150             (.150)²
4    BCD                 .748             (.748)²
5    ABCD                .255             (.255)²
                                      Sum = .70375

    variance of any effect ≈ .70375/5 = .14075

We can then determine the 95% confidence interval for an effect E as follows:

    E ± t_(α/2)·√[(1/m) Σ(i=1 to m) Ei²] = E ± (t critical value for 5 df)·√.14075
                                         = E ± (2.571)(.3752)
                                         = E ± .965
For instance, a 95% confidence interval for the D effect is 18.87 ± .965. Since this interval does not contain 0, we conclude that the D effect is significantly different from 0. Similarly, a 95% interval estimate for the BC interaction is 3.08 ± .965, which indicates that the BC interaction is also significant. The same effects identified by the normal quantile plot (A, D, AB, BC, and BD) turn out to be the only significant effects identified when we assume that the three- and four-factor interactions are negligible.
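The interval computation in Example 10.8 can be checked with a short sketch (the function name is ours, and the t critical value must be supplied from a table, e.g. 2.571 for 95% confidence with 5 df):

```python
from math import sqrt

def effect_ci(effect, negligible_effects, t_crit):
    """Confidence interval for an effect, estimating the effect
    variance as the average squared value of the m effects assumed
    negligible (df = m)."""
    m = len(negligible_effects)
    variance = sum(e * e for e in negligible_effects) / m
    half_width = t_crit * sqrt(variance)
    return (effect - half_width, effect + half_width)
```

With the five higher-order effects from Example 10.8 and t = 2.571, the half-width comes out to about .965, so the interval for the D effect (18.87 ± .965) excludes 0.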
When test runs are replicated, that is, when r ≥ 2, then a 2^k experiment can be analyzed using ANOVA techniques. The sum of squares for any main effect or interaction effect can easily be computed from the effect’s contrast:

    sum of squares for an effect = contrast²/(r·2^k) = contrast²/(total number of runs)

The error sum of squares, SSE, can be computed in two ways: (1) by calculating the total sum of squares, SST, for the data and then subtracting the sums of squares of the effect estimates or (2) directly, by finding the error variation for each of the 2^k test runs. Both methods are illustrated in Example 10.9.
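The sum-of-squares formula is a single line of code. As a check, an effect estimate of −5.875 in a 2^3 design with r = 2 corresponds to a contrast of −5.875 × r·2^(k−1) = −47, giving SS = (−47)²/16 = 138.0625, the 138.063 that appears in Example 10.9:

```python
def effect_sum_of_squares(contrast, r, k):
    """ANOVA sum of squares for one effect in a replicated 2^k design:
    SS = contrast^2 / (r * 2^k), i.e. contrast squared divided by the
    total number of runs."""
    return contrast ** 2 / (r * 2 ** k)
```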
Example 10.9  Compact discs (CDs) and digital video discs (DVDs) are manufactured by the same process. First, a master disc is created by baking a photosensitive material on a round glass plate. Next, timed pulses from a laser beam etch digital signals in a tight spiral on the plate. The plate is then “developed” to reveal a sequence of surface “pits” that encode the digital information. Master plates are electroplated to produce metal “stampers,” which, when placed in plastic injection molding machines, press thousands of copies of the final disc.

Some of the factors that affect the mastering stage are listed here, along with the factor levels that were used in a 2^3 experiment on compact discs. The goal of the experiment was to minimize an electronic response called “jitter,” which is a measure of how well the CD can be read by a CD-ROM device. The factor “linear velocity” is a measure of the speed with which the laser travels in a slowly increasing spiral path as it burns the pits in the photosensitive material.

Factor              Low level    High level
Laser power         90%          110%
Developing time     20 sec       30 sec
Linear velocity     1.20         1.30
Data from two replicated runs of the 2^3 experiment is given in Table 10.3. The 16 test runs were conducted in random order, but the data is presented in the table in Yates standard order. By extending the design matrix (see Example 10.6) and applying the columns of “+” and “−” signs to the column of response totals, we obtain the contrasts, effects, and sums of squares listed below Table 10.3.
The total sum of squares SST can be found by calculating the sample variance of all 16 measurements and then multiplying by 15. Thus SST = 15(6.8869)² = 711.441. Subtracting all of the effects sums of squares from SST gives the value of SSE = 711.441 − 138.063 − 85.563 − ⋯ − .063 − 3.063 = 26.5. Alternatively, the error variation can be calculated separately for each run by finding Σ(i=1 to r) (yᵢ − ȳ)² and then summing all 2^3 = 8 results:
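The direct SSE computation just described can be sketched as follows (the function name is ours; `groups` holds the replicate responses for each of the 2^k factor-level combinations):

```python
def sse_from_replicates(groups):
    """SSE computed directly: within each factor-level combination,
    sum the squared deviations of the replicate responses about their
    mean, then add over all 2^k combinations."""
    sse = 0.0
    for ys in groups:
        mean = sum(ys) / len(ys)
        sse += sum((y - mean) ** 2 for y in ys)
    return sse
```

For instance, a single combination with replicate responses 26 and 29 contributes (26 − 27.5)² + (29 − 27.5)² = 4.5 to the total.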
At a significance level of α = .01, this table shows that the significant effects are A, B, C, and the AC interaction. Because B does not appear to interact with A or C (i.e., neither the AB nor the BC interaction is significant), we can immediately conclude that increasing the linear velocity will, on average, cause the response variable to decrease by about 4.625 units. Because the AC interaction is significant, it is necessary to examine the AC interaction plot before deciding on the proper settings for A and C (Figure 10.22). From the plot, we see that the settings that minimize the response variable are A = −1 and C = +1. In this example, the conclusions from the interaction plot do not agree with those from the main effects plots, which would have (incorrectly) indicated that both A and C should be set at their −1 levels.
[Figure 10.22: AC interaction plot; mean response versus the level of A, with one line per level of C (vertical axis roughly 25 to 35).]
Fitting a Model

After the important effects have been identified, it is often useful to write an equation for predicting the response value. Although in other applications this task would require the methods of regression analysis (Chapters 3 and 11), the special arrangement of factor levels in a factorial design makes it especially easy to find prediction equations. All that is needed is to define k predictor variables, one for each factor in a 2^k experiment. These predictor variables, also called indicator variables or dummy variables, use the same +1 and −1 coding that we used to form the design matrix. For example, the indicator variable for factor A is denoted by xA and is defined as follows:

    xA = +1 when A is at its high level
    xA = −1 when A is at its low level

In the same fashion, indicator variables xB, xC, . . . are defined for the remaining factors in the experiment. Interaction terms are represented by products of the indicator variables for the factors comprising the interaction term. For instance, the AB interaction term in a prediction equation is represented by the product xAxB, the ABC interaction by the product xAxBxC, and so forth.
Example 10.10  In Example 10.9, our analysis showed that the important effects are A, B, C, and the AC interaction. The corresponding effect estimates are −5.875 (A), −4.625 (B), −9.375 (C), and 5.125 (AC). Furthermore, the grand average of all 16 data points in the experiment is 25.313. Using indicator variables xA, xB, and xC, the prediction equation based on A, B, C, and AC is

    predicted value of y = ŷ = 25.313 − 2.938xA − 2.313xB − 4.688xC + 2.563xAxC

This equation can then be used to find predicted values of the response variable for selected values of factors A, B, and C. For example, suppose that we set A high and both B and C low. This corresponds to the choice xA = +1, xB = −1, xC = −1. Substituting these values into the prediction equation, we get

    ŷ = 25.313 − 2.938(+1) − 2.313(−1) − 4.688(−1) + 2.563(+1)(−1) = 26.813

Notice that the predicted value of 26.813 agrees reasonably well with the average of the two response values (26 and 29) that were measured at this combination of factor settings.
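This prediction equation codes directly (the function name is ours; each coefficient is half of the corresponding effect estimate, and each x is −1 or +1):

```python
def predict_jitter(x_a, x_b, x_c):
    """Prediction equation from Example 10.10: grand average plus
    half of each retained effect times its coded factor value."""
    return 25.313 - 2.938 * x_a - 2.313 * x_b - 4.688 * x_c + 2.563 * x_a * x_c
```

Evaluating it at xA = +1, xB = −1, xC = −1 reproduces the 26.813 computed above.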
Prediction equations are used for several purposes: (1) to generate diagnostic checks on the adequacy of the chosen model, (2) to create response surface and contour plots, and (3) to establish factor settings that lie between the +1 and −1 levels. Discussing all of these applications is beyond the scope of our presentation. However, Example 10.11 illustrates how the prediction equation can help in choosing factor settings.
Example 10.11  Based on our analysis of the compact disc experiment in Examples 10.9 and 10.10, the prediction equation

    ŷ = 25.313 − 2.938xA − 2.313xB − 4.688xC + 2.563xAxC

should provide an adequate description of how the response variable is affected by the factors A (laser power), B (linear velocity), and C (developing time). To increase the speed with which discs are manufactured, the compact disc company would like to set the linear velocity (factor B) as fast as possible, while shortening the developing time as much as possible. Within the range of factor values studied in this experiment,
this means that they would like to operate the mastering process at the high setting for B and the low setting for C. Given this situation, what setting should they choose for the laser power if the goal is to minimize the response variable “jitter”?

Substituting xB = +1 and xC = −1 into the prediction equation and collecting terms, we find that

    ŷ = 25.313 − 2.938xA − 2.313(+1) − 4.688(−1) + 2.563xA(−1)

or

    ŷ = 27.688 − 5.501xA

From this equation, we see that minimizing the response value y can be accomplished by making xA as large as possible. Within the range of values studied in the experiment, the best setting for xA should be xA = +1; that is, laser power should be set at its high level of 110%. When this is done, the value of the response variable should be about 27.688 − 5.501(+1) = 22.187. If the value of 22.187 is small enough to satisfy customer requirements for jitter, then the company can proceed to use these factor settings. If not, then it can further reduce the jitter by choosing the settings xA = −1, xB = +1, and xC = +1 as in Example 10.9, even though these settings will necessarily increase the production time for each master disc (since developing time, C, will now be at its high level).
24. Write the design matrix (in Yates standard order) for a complete 2^3 experiment. Denote the response measurements associated with the runs as y1, y2, . . . , y8.
a. Using the definition that the BC interaction is one-half the difference between the main effect for B with C at its high level and the main effect for B at its low level, write the formula for the BC interaction in terms of the data y1, y2, . . . , y8.
b. Reversing the order of the factors, repeat the calculation in part (a) for the CB interaction.
c. Show that the formulas in parts (a) and (b) are equivalent.

25. Fill in the remaining columns of contrasts for the 2^4 design in Example 10.6.

26. Polyolefin blends and composites can often improve the strength of existing polymers. In a study to determine which blends lead to increased material strength, composites of isotactic polypropylene (PP) and linear low-density polyethylene (LLDPE) were mixed with red mud (RM) particles (“Application of Factorial Design of Experiments to the Quantitative Study of Tensile Strength of Red Mud Filled PP/LLDPE Blends,” J. of Materials Science Letters, 1996: 1343–1345). The factors studied were the ratio of PP to LLDPE and the amount of red mud particles (in parts per hundred parts of resin). The levels at which these factors were studied are given in the following table:

                  Lower level    Upper level
PP/LLDPE ratio        .25             4
RM particles           4             10

Composites made with each combination of factor levels were strength tested, with the following results:

                                      Strength (in MPa)
Run   PP/LLDPE ratio   RM particles   Replication 1   Replication 2
1           4               10             19.3            20.2
2          .25              10              8.1             9.7
3           4                4             20.3            24.5
4          .25               4             10.4            11.8
a. Calculate the main effects and the two-factor interaction effect for this experiment.
b. Create the ANOVA table for the experiment. Which factors appear to have an effect on strength? (Use α = .05.)
c. Draw the main effects and interaction effects plots for the factors identified in part (b).
d. Which settings (high or low) of the factors in part (b) lead to maximizing the strength of a composite?
e. Using the important effects identified in part (b), write a model for predicting strength of a composite.

27. The following data resulted from a study of the dependence of welding current on three factors: welding voltage (A), wire feed speed (B), and tip-to-workpiece distance (C). Two levels of each factor were used, with two replicate observations made at each combination of factor levels.

Test run    Response values
  (1)       200.0, 204.2
   a        215.5, 219.5
   b        272.7, 276.9
   ab       299.5, 302.7
   c        166.6, 172.6
   ac       186.4, 192.0
   bc       232.6, 240.8
   abc      253.4, 261.6

a. Create the ANOVA table for this experiment.
b. At α = .01, which effects appear to be important?

28. The article “Effect of Cutting Conditions on Tool Performance in CBN Hard Turning” (J. of Manuf. Processes, 2005: 10–16) reported the accompanying data, from a 2³ design, on cutting speed (m/s), feed (mm/rev), depth of cut (mm), and tool life (min). Perform an ANOVA to investigate two-factor interactions and main effects.

Obs    Cut spd    Feed     Cut Depth    Life
 1       1.21     0.061      0.102      27.5
 2       1.21     0.168      0.102      26.5
 3       1.21     0.061      0.203      27.0
 4       1.21     0.168      0.203      25.0
 5       3.05     0.061      0.102       8.0
 6       3.05     0.168      0.102       5.0
 7       3.05     0.061      0.203       7.0
 8       3.05     0.168      0.203       3.5

29. As with many dried products, sun-dried tomatoes can exhibit an undesirable discoloration during the drying and storage process. A replicated 2³ experiment was conducted in an effort to optimize color by considering storage time, temperature, and packaging type (“Use of Factorial Experimental Design for Analyzing the Effect of Storage Conditions on Color Quality of Sun-Dried Tomatoes,” Sci. Res. and Essays, 2012: 477–489). In the following table, higher values of the response variable (based on chromaticity measurements) are associated with higher color quality:

                                              Color Quality
       Storage    Storage                 Replication    Replication
Run     time       temp      Packaging         1              2
 1        −          −           −           2.38           2.40
 2        +          −           −           2.38           2.40
 3        −          +           −           2.42           2.40
 4        +          +           −           2.31           2.29
 5        −          −           +           2.38           2.40
 6        +          −           +           2.38           2.40
 7        −          +           +           1.94           1.94
 8        +          +           +           1.93           1.92

a. Calculate all main effects and two-factor interaction effects.
b. Construct an ANOVA table and use it as a basis for deciding which factors appear to affect color quality. (Use α = .01.)
c. Create main effects and interaction effects plots for the factors identified in part (b).
d. Which settings (high or low) of the factors in part (b) lead to maximizing color quality?

30. Self-consolidating concrete (SCC) is a highly flowable product that can easily fill heavily congested reinforcement areas. Despite its low viscosity, SCC also maintains high stability to prevent segregation. The authors of “Effect of SCC Mixture Composition on Thixotropy and Formwork Pressure” (J. Mater. Civ. Engr., 2012: 876–888) conducted a study to determine the effect of three mixture parameters—base material slump flow (A), sand-to-total aggregate ratio by volume (B), and relative content of coarse aggregate (C)—on characteristics of the resulting SCC mixtures. The following table gives the coded factor levels along with values of the time
488 chapter 10 Experimental Design
(s) required for the SCC mixture to reach 500-mm slump flow:

Run    Slump    S/A    Coarse    Time
 1      −1      −1       −1      1.71
 2      −1      −1        1      3.19
 3      −1       1       −1      1.75
 4      −1       1        1      3.06
 5       1      −1       −1       .88
 6       1      −1        1      2.44
 7       1       1       −1      1.34
 8       1       1        1      3.37

a. Calculate all main effects and interaction effects.
b. Create a probability plot of the effects from part (a). Which effects appear to be important?
c. Which settings (high or low) of the factors in part (b) lead to maximizing the response variable? Which settings lead to minimizing the value of the response variable?
d. Determine a model equation relating time needed to reach 500-mm slump flow to the effects identified in part (b).

31. Combustion experiments of medium crude oil were conducted to determine which of three factors (oxygen partial pressure, oxygen flow rate, and oxygen molar concentration) affect various aspects of the combustion process (“Factorial Analysis of In Situ Combustion Experiments,” Trans. of the Institution of Chemical Engineers, 1991: 237–244). Two response variables, combustion time (in hours) and coke burnoff (in grams/hour), were studied using a full 2³ design with no replications:

       Partial    Flow      Molar           Combustion    Coke
Run    pressure   rate      concentration      time       burnoff
 1       −1        −1           −1             10.6        5.73
 2        1        −1           −1             11.2        5.70
 3       −1         1           −1             24.4        3.05
 4        1         1           −1             20.3        2.87
 5       −1        −1            1              9.2        5.57
 6        1        −1            1              7.0        5.87
 7       −1         1            1             14.3        3.13
 8        1         1            1             17.5        3.05

a. For the response variable combustion time, calculate all main effects and interaction effects for this experiment.
b. Create a probability plot of the effects in part (a). Which effects appear to be important?
c. Which settings (high or low) of the factors in part (b) lead to maximizing combustion time? Which settings lead to minimizing combustion time?
d. Determine a model equation relating combustion time to the effects identified in part (b).
e. Repeat parts (a)–(d) for the response variable coke burnoff.

32. Impurities in the form of iron oxides lower the economic value and usefulness of industrial minerals, such as kaolins, to ceramic and paper-processing industries. A 2⁴ experiment was conducted to assess the effects of four factors on the percentage of iron removed from kaolin samples (“Factorial Experiments in the Development of a Kaolin Bleaching Process Using Thiourea in Sulphuric Acid Solutions,” Hydrometallurgy, 1997: 181–197). The factors and their levels are displayed in the following table:

                                        Low level    High level
Factor    Description      Units          (−1)          (+1)
  A       H2SO4              M             .10           .25
  B       Thiourea          g/l            0.0           5.0
  C       Temperature       °C             70            90
  D       Time              min            30           150

The data from an unreplicated 2⁴ experiment is given in the table below:

 Test       Iron             Test       Iron
 run     extraction (%)      run     extraction (%)
 (1)          7               d           28
  a          11               ad          51
  b           7               bd          33
  ab         12               abd         57
  c          21               cd          70
  ac         41               acd         95
  bc         27               bcd         77
  abc        48               abcd        99
a. Calculate all main effects and two-factor interaction effects for this experiment.
b. Create a probability plot of the effects. Which effects appear to be important?
c. Which settings (high or low) of the factors in part (b) lead to maximizing the percentage of iron extracted?
d. Write a model for predicting iron extraction percentage from the factors identified in part (b).

33. An unreplicated 2⁵ experiment was performed to determine which factors affect the percent of arsenic removed from contaminated water by electrocoagulation (EC) (“Prediction of Arsenic Removal by Electrocoagulation: Model Development by Factorial Design,” J. Hazard. Toxic Radioact. Waste, 2011: 48–54). The factors and corresponding levels are shown here along with the resulting data.

Factor    Description    Units    Low level (−1)    High level (+1)
  A       Time             s            30                120
  B       Current         amp           .6                3.0
  C       EC area         cm²           57                91.2
  D       Volume           L             1                  3
  E       Arsenic        mg/L           .23               1.18

Test   Removal    Test   Removal    Test   Removal    Test    Removal
run      (%)      run      (%)      run      (%)      run       (%)
(1)     48.70      d      35.70      e      57.20      de      36.40
 a      86.50      ad     59.60      ae     81.00      ade     52.50
 b      89.10      bd     69.10      be     85.10      bde     61.00
 ab     97.00      abd    89.10      abe    96.90      abde    89.30
 c      58.30      cd     37.00      ce     57.60      cde     47.50
 ac     84.80      acd    64.80      ace    78.80      acde    55.90
 bc     90.90      bcd    71.70      bce    87.30      bcde    58.50
 abc    95.20      abcd   93.90      abce   97.10      abcde   89.00

a. Calculate all main effects and two-factor interaction effects.
b. Create a probability plot of the effects. Three effects in particular should appear to be important; what are they?
c. Which settings (high or low) of the factors in part (b) lead to maximizing the percentage of arsenic extracted?
d. Develop a model equation for predicting arsenic removal percentage from the factors identified in part (b).
10.5 Fractional Factorial Designs
This four-column matrix is the design matrix of a fractional factorial design based on four factors. In fact, these 8 test runs correspond to certain rows in the full 2⁴ design, as indicated here.
Run     A     B     C     D
 1*    −1    −1    −1    −1
 2     +1    −1    −1    −1
 3     −1    +1    −1    −1
 4*    +1    +1    −1    −1
 5     −1    −1    +1    −1
 6*    +1    −1    +1    −1
 7*    −1    +1    +1    −1
 8     +1    +1    +1    −1
 9     −1    −1    −1    +1
10*    +1    −1    −1    +1
11*    −1    +1    −1    +1
12     +1    +1    −1    +1
13*    −1    −1    +1    +1
14     +1    −1    +1    +1
15     −1    +1    +1    +1
16*    +1    +1    +1    +1

(* marks the 8 runs, those satisfying D = ABC, that make up the fractional design.)
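The row selection just described is easy to mimic in code. The following Python sketch (not from the text; it assumes the half fraction is the one generated by D = ABC, the design analyzed in Example 10.13) builds the full 2⁴ matrix in Yates standard order and keeps the 8 qualifying rows:

```python
from itertools import product

# Full 2^4 design in Yates standard order: column A changes fastest.
full = [(a, b, c, d) for d, c, b, a in product([-1, 1], repeat=4)]

# Keep the rows whose D entry equals the ABC interaction entry;
# these are the 8 runs of the 2^(4-1) half fraction.
half = [(a, b, c, d) for a, b, c, d in full if d == a * b * c]

print(len(full), len(half))  # 16 8
```

The surviving rows are exactly the rows of the full design that satisfy D = ABC.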
Because the 8 test runs comprise only a fraction of the 16 runs required in a full 2⁴ design, we say that the 8-run experiment is a fractional factorial experiment. Furthermore, since this design uses only half of the 16 runs, we say that it is a half fraction of the full factorial design based on four factors.

All of the information about the 8-run design can be compactly summarized using the following notation system. The particular fractional factorial design we have created is denoted as a 2⁴⁻¹ design. This notation carries the following information:
1. The design has 8 test runs (because 2⁴⁻¹ = 2³ = 8).
2. Four factors are studied in the experiment.
3. Each factor has two levels.
4. One factor (factor D) has been added to a full design based on 8 runs.
5. The design uses a fraction, 1/2¹, of the runs of a full 2⁴ design.

In general, any fractional factorial design can be described by the notation 2ᵏ⁻ᵖ, which is intended to convey that
1. The design has a total of 2ᵏ⁻ᵖ test runs.
2. k factors are studied in the experiment.
3. Each factor has two levels.
4. p factors have been added to a full design based on 2ᵏ⁻ᵖ runs.
5. The design uses a fraction, 1/2ᵖ, of the runs of a full 2ᵏ design.
The general procedure for creating a fractional factorial design is similar to that in
the previous example: First, create the extended design matrix for a full design based
on 2ᵏ⁻ᵖ test runs, and then rename p of the interaction columns with the p additional factors. It is convenient to use sequential capital English letters to denote the factors. As we will see subsequently, the choice of which columns to replace with the additional factors is important and cannot simply be made arbitrarily.
Example 10.12  Suppose that you want to study five factors using only 8 test runs. How do you create a fractional factorial design to accomplish this? First, start with the full 2³ design (i.e., the full 2ᵏ design that has 8 runs). Write the column headings of the extended design matrix: A, B, C, AB, AC, BC, and ABC. Finally, choose two of the interaction columns, say, ABC and AC, and assign the additional two factors, D and E, to these columns. Denote this column assignment by writing D = ABC and E = AC. Because we are adding two factors (D and E) to a full design based on three factors (A, B, and C), this design is called a 2⁵⁻² fractional factorial. To create the design matrix for this particular 2⁵⁻² experiment, first write the design matrix in Yates standard order for the 2³ experiment with factors A, B, and C. Then append columns D and E. The entries in D are found by multiplying the entries of columns A, B, and C. Similarly, the entries in column E are found by multiplying the entries of columns A and C. Exercise 35 asks you to write this design matrix.
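The construction in Example 10.12 can be sketched in a few lines of Python (a minimal illustration, not code from the text):

```python
from itertools import product

# 2^3 design in Yates standard order (column A changes fastest).
base = [(a, b, c) for c, b, a in product([-1, 1], repeat=3)]

# Append the generated columns D = ABC and E = AC to each run.
design = [(a, b, c, a * b * c, a * c) for a, b, c in base]

for row in design:
    print(row)
```

The 8 rows are one quarter of the 32 runs of a full 2⁵ design, as the 2⁵⁻² notation indicates.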
Example 10.13  Let’s determine the alias structure of the 2⁴⁻¹ design where D is aliased with ABC. The generator of this design is D = ABC. Multiplying both sides by D gives D · D = D(ABC), or I = ABCD. Since there is only one “word” in this equation, the defining relation is also of the form I = ABCD. Multiplying each of the 2⁴ − 1 effects by the relation I = ABCD yields the following:
There is a lot of repetition in this list. Eliminating duplicate equations, we can summarize the alias structure of the design as follows:

A = BCD        AB = CD
B = ACD        AC = BD
C = ABD        AD = BC
D = ABC        ABCD = I

In words: (1) each main effect is aliased with a three-factor interaction; (2) all two-factor interactions are aliased with one another; and (3) the single four-factor interaction is aliased with the grand average of the data.
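The “multiply and cancel squares” arithmetic used above can be mimicked with a symmetric set difference, since a letter appearing twice corresponds to a squared column, which equals I. A minimal Python sketch (not from the text):

```python
def multiply(word1, word2):
    # Letters common to both words cancel (a squared column equals I),
    # so the product is the symmetric difference of the letter sets.
    return "".join(sorted(set(word1) ^ set(word2))) or "I"

# Aliases in the 2^(4-1) design with defining relation I = ABCD:
relation = "ABCD"
for effect in ["A", "B", "C", "D", "AB", "AC", "AD", "ABCD"]:
    print(effect, "=", multiply(effect, relation))
```

The printed pairs reproduce the alias table above, e.g., A = BCD, AB = CD, and ABCD = I.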
Example 10.14  The 2⁵⁻² design of Example 10.12 provides a better illustration of how the defining relation is formed. Recall that the design generators in that example are D = ABC and E = AC. Writing these in the form I = ABCD and I = ACE, we can see that the defining relation is formed from the “words” ABCD and ACE and all possible products of these words. Since there is only one such product, namely, (ABCD)(ACE) = A²BC²DE = BDE, the defining relation is I = ACE = BDE = ABCD. Multiplying each of the 2⁵ − 1 effects through by the defining relation gives the following alias structure (Exercise 37):

Notice that each main effect is now aliased with at least one two-factor interaction as well as higher-order interactions in this design.
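For more than one generator, the defining relation consists of the generator words together with all of their products, and the same set arithmetic reproduces this example's calculation (again a sketch, not code from the text):

```python
from itertools import combinations

def multiply(*words):
    # Letters occurring an even number of times cancel (squared
    # columns equal the identity column I).
    result = set()
    for w in words:
        result ^= set(w)
    return "".join(sorted(result))

generators = ["ABCD", "ACE"]  # from I = ABCD and I = ACE
words = [multiply(*combo)
         for r in range(1, len(generators) + 1)
         for combo in combinations(generators, r)]
print(words)                              # ['ABCD', 'ACE', 'BDE']

# Each effect is aliased with its product with every word of the
# defining relation; e.g., main effect A:
print([multiply("A", w) for w in words])  # ['BCD', 'CE', 'ABDE']
```

The alias set of A includes the two-factor interaction CE, which is exactly the behavior the example points out.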
each combination of factor levels. Therefore, normal quantile or probability plots are
generally used to analyze fractional designs. In those fortunate cases where replicated
test runs are available, ordinary ANOVA tests can be used to distinguish the important
effects from the others.
To begin the analysis of an unreplicated fractional design, first compute all 2ᵏ⁻ᵖ effects (this includes the grand average) associated with the design. Then construct a normal plot of all effects except the grand average. Analyze the plot in the usual fashion
normal plot of all effects except the grand average. Analyze the plot in the usual fashion
by fitting a straight line, by eye, through the effects with small magnitudes. Finally, use
the alias structure to formulate the model that is most likely to explain the pattern in the
plot. One common practice is to opt for main effects and two-factor interactions rather
than higher-order effects when formulating a tentative model.
Example 10.15 Pyrometallurgical processes are normally used to extract manganese from raw
mineral ores, but alternative methods based on chemical reactions are currently
being studied. One such method, based on reductive chemical leaching, uses
sucrose in a solution of sulfuric acid to extract manganese dioxide (“Fractional
Factorial Experiments in the Development of Manganese Dioxide Leaching by
Sucrose in Sulfuric Acid Solutions,” Hydrometallurgy, 1994: 215–230). In this in-
vestigation, five factors were studied to determine their effect on the percentage
of manganese dioxide, MnO2, obtained from the leaching process (Table 10.5,
page 496).
A 2⁵⁻¹ design with generator E = ABCD was used. From the data in Table 10.5,
a normal quantile plot of the effects was created (Figure 10.23, page 496). From this
plot, it appears that only factors A (sucrose concentration), B (particle size of ore),
and E (sulfuric acid concentration) have a significant effect on the percentage of
MnO2 extracted by the leaching process. None of the interaction terms appears to be
significant. The effect estimates are
From these results, we can conclude that raising the sucrose concentration and using
ores of larger particle size tend to increase the MnO2 yield. In addition, because rais-
ing the sulfuric acid concentration tends to reduce the yield, it would be better to use
the lower concentration. We divide the effects by 2 to obtain the model coefficients.
In addition, we can write a model for predicting the percentage yield, y, given the (coded) values of the variables x_A, x_B, and x_E.
The fact that lowering the sulfuric acid concentration has such a large effect on yield
suggests that further experiments be conducted with even lower H2SO4 levels.
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
www.ebook3000.com
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
496 chapter 10 Experimental Design
Table 10.5  2⁵⁻¹ design for studying the effects of five factors on percentage yield of a chemical process

Factor    Factor name                  Low level    High level
A         Sucrose (g/L)                    5            10
B         Ore particle size (μm)        90–125       200–300
C         Mixing rate (min⁻¹)             150           200
D         Temperature (°C)                 30            50
E         Sulfuric acid (M)                 1             2

                 Particle                                        Yield
Run    Sucrose     size    Agitation    Temperature    H2SO4       %
  1      −1        −1         −1            −1            1      14.0
  2       1        −1         −1            −1           −1      56.0
  3      −1         1         −1            −1           −1      63.5
  4       1         1         −1            −1            1      38.0
  5      −1        −1          1            −1           −1      48.0
  6       1        −1          1            −1            1      25.5
  7      −1         1          1            −1            1      26.5
  8       1         1          1            −1           −1      81.0
  9      −1        −1         −1             1           −1      45.0
 10       1        −1         −1             1            1      25.0
 11      −1         1         −1             1            1      24.0
 12       1         1         −1             1           −1      51.5
 13      −1        −1          1             1            1      18.0
 14       1        −1          1             1           −1      67.5
 15      −1         1          1             1           −1      62.0
 16       1         1          1             1            1      42.0
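As a numerical check on Example 10.15, the main effects can be recomputed directly from the data of Table 10.5. This sketch (not the authors' code) takes each main effect to be the mean response at the factor's high level minus the mean at its low level, the definition used throughout this chapter:

```python
# Coded columns A (sucrose), B (particle size), C (agitation),
# D (temperature), E (H2SO4), followed by the % yield.
runs = [
    (-1, -1, -1, -1,  1, 14.0), ( 1, -1, -1, -1, -1, 56.0),
    (-1,  1, -1, -1, -1, 63.5), ( 1,  1, -1, -1,  1, 38.0),
    (-1, -1,  1, -1, -1, 48.0), ( 1, -1,  1, -1,  1, 25.5),
    (-1,  1,  1, -1,  1, 26.5), ( 1,  1,  1, -1, -1, 81.0),
    (-1, -1, -1,  1, -1, 45.0), ( 1, -1, -1,  1,  1, 25.0),
    (-1,  1, -1,  1,  1, 24.0), ( 1,  1, -1,  1, -1, 51.5),
    (-1, -1,  1,  1,  1, 18.0), ( 1, -1,  1,  1, -1, 67.5),
    (-1,  1,  1,  1, -1, 62.0), ( 1,  1,  1,  1,  1, 42.0),
]

def main_effect(col):
    hi = [y for *x, y in runs if x[col] == +1]
    lo = [y for *x, y in runs if x[col] == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

effects = {name: main_effect(i) for i, name in enumerate("ABCDE")}
grand_mean = sum(y for *_, y in runs) / len(runs)

# Model coefficients are half the effects, as noted in the text.
# A and B come out near +10.7 and +11.2, and E near -32.7, which is
# consistent with the conclusions stated in the example.
print({k: round(v, 4) for k, v in effects.items()}, round(grand_mean, 4))
```

The large positive effects for A and B and the large negative effect for E agree with the pattern shown in Figure 10.23.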
Figure 10.23  Normal quantile plot of the effects (response is yield %)
[Plot not reproduced; its legend labels the plotted effects Sucrose, Size, Agitate, Temp., and H2SO4, and the effect axis runs from about −30 to 10.]
10.5 Exercises
34. In a 2⁷⁻³ fractional factorial design,
a. How many factors are being studied?
b. How many experimental runs are required (assuming no replications)?
c. What fraction of the runs of a full 2⁷ design are used by this experiment?

35. Fill in all the columns in the design matrix for the 2⁵⁻² design of Example 10.12.

36. A 2⁴⁻¹ design is specified by setting D = ABC.
a. Fill in the columns of the design matrix for this fractional factorial design.
b. By multiplying the appropriate columns in the design matrix from part (a), show that the AB and CD contrasts are identical.

37. Using the design generators I = ABCD and I = ACE, verify all the entries in the alias structure of the 2⁵⁻² design of Example 10.14.

38. A quarter-fraction of a 2⁷ experiment (factors A, B, . . . , G) is constructed using the design generators ABCDE = F and CDE = G.
a. How many experimental runs (assuming no replications) must be conducted?
b. Write down the alias structure for this design.

39. A fractional factorial experiment with 16 test runs was conducted to determine the effects of several factors on the antioxidant capacity in carotenoid extracts of the bacterium Thermus filiformis (“Evaluation of Biomass Production, Carotenoid Level and Antioxidant Capacity Produced by Thermus Filiformis Using Fractional Factorial Design,” Braz. J. Microbiol., 2012: 126–134). The variables studied were temperature (at 65°C and 75°C), pH (at 7 and 8), tryptone (at 5 and 10 g/L), yeast extract (at 5 and 10 g/L), and Nitsch’s trace elements (2 and 5 mL/L). The Nitsch’s trace elements factor was aliased with the highest-order interaction term.
a. What are k and p for this 2ᵏ⁻ᵖ design?
b. Determine the alias structure of the design.
c. Suppose that it is reasonable to assume that all interactions consisting of three or more factors are negligible. In this case, will any of the estimates of the remaining effects be aliased with one another?

40. Metal “leads” that protrude from electronic components often have their bases sealed with glass to protect against moisture ingress. Fractures in the glass can be caused by bending or twisting the leads and by large thermal changes. In an experiment designed to evaluate how different factors affect the peak stress applied to a glass seal, the following factors and factor levels were studied (“A Fractional Factorial Numerical Technique for Stress Analysis of Glass-to-Metal Lead Seals,” J. of Electronic Packaging, 1994: 98–104):

                                                        Low level    High level
Factor    Description                                    (L, in.)     (H, in.)
s         Half the distance between neighboring leads      .025          .35
wlead     Horizontal width of lead                         .010          .020
hlead     Distance from package base to center of lead     .127          .381
rport     Radius of port in package for lead seal         .4572         .5588
twall     Wall thickness of package                        .030          .050

The design matrix for the study was

Run    s    wlead    hlead    rport    twall
 1     L      L        L        L        L
 2     L      L        L        H        H
 3     L      L        H        L        H
 4     L      L        H        H        L
 5     L      H        L        L        H
 6     L      H        L        H        L
 7     L      H        H        L        L
 8     L      H        H        H        H
 9     H      L        L        L        H
10     H      L        L        H        L
11     H      L        H        L        L
12     H      L        H        H        H
13     H      H        L        L        L
14     H      H        L        H        H
15     H      H        H        L        H
16     H      H        H        H        L

a. Find k and p for this 2ᵏ⁻ᵖ design.
b. Determine the alias structure of this design.
41. In an effort to reduce the variation in copper plating thickness on printed circuit boards, a fractional factorial design was used to study the effect of three factors—anode height (up or down), circuit board orientation (in or out), and anode placement (spread or tight)—on plating thickness (“Characterization of Copper Plating Process for Ceramic Substrates,” Quality Engr., 1990: 269–284). The following factor combinations were run:

Anode     Board          Anode        Thickness
height    orientation    placement    variation
  −           −              −          11.63
  −           +              +           3.57
  +           −              +           5.57
  +           +              −           7.36

a. Find k and p for this 2ᵏ⁻ᵖ design.
b. Determine the alias structure of this design.
c. Calculate estimates of the effects for this experiment.
d. Assuming that the AB interaction is negligible, use this information to obtain an estimate of SSE and perform hypothesis tests for both main effects. (Use α = .05.)
e. From the results in part (d), which factors have a significant effect on plating thickness variation?
f. If the objective of the study is to minimize the variation in plating thickness, what setting of each factor do you recommend?

42. Lateritic nickel ore deposits are an important source of nickel. Atmospheric acid leaching (AL) has grown in popularity as a method to extract nickel from such deposits. In the AL process, a high concentration of ferric iron may remain in the leach solution, which would diminish the purity of the desired nickel. A study was conducted to investigate how five AL process factors impact iron removal efficiency (%) from leach solutions. These factors were pH (2 versus 4), temperature (25°C and 85°C), neutralizing agents [15% (W/W) MgO and 25% (W/W) CaCO3], Fe/Ni ratio (6 versus 18), and stirring speed (200 and 500 rpm) (“The Effect of Iron Precipitation Upon Nickel Losses from Synthetic Atmospheric Nickel Laterite Leach Solutions: Statistical Analysis and Modelling,” Hydrometallurgy, 2011: 140–152). Here is data from the resulting fractional factorial experiment:

pH    Temp    Agents    Ratio    Speed    Iron Removal (%)
 −      −       −         +        +          29.19
 +      −       −         −        +          84.72
 −      +       −         −        −          95.25
 +      +       −         +        −          96.08
 −      −       +         +        −          49.89
 +      −       +         −        −          87.92
 −      +       +         −        +          89.22
 +      +       +         +        +          96.17

a. What are k and p for this 2ᵏ⁻ᵖ design?
b. Determine the alias structure of this design. Hint: Each of the last two design columns is a product of two of the initial three columns.
c. Calculate estimates of the effects for this study.
d. Create a normal probability plot for the effects determined in part (c) and identify any effects that appear to be important.

43. Exercise 39 described a half-fraction of a factorial experiment in which the Nitsch’s trace elements factor was aliased with the highest-order interaction term. The response variable, antioxidant capacity, was measured in percent protection against singlet oxygen [O2(¹Δg)]. The cited article reported the following data:

Temp    pH    Yeast    Tryptone    Nitsch    %Prot
 −      −       −          −          +       51.5
 +      −       −          −          −       85.1
 −      +       −          −          −       46.1
 +      +       −          −          +       49.0
 −      −       +          −          −       33.6
 +      −       +          −          +       82.9
 −      +       +          −          +       57.1
 +      +       +          −          −       71.9
 −      −       −          +          −       34.4
 +      −       −          +          +       42.7
 −      +       −          +          +       31.4
 +      +       −          +          −       64.8
 −      −       +          +          +        4.3
 +      −       +          +          −       40.4
 −      +       +          +          −       48.9
 +      +       +          +          +       60.5

a. Calculate estimates of the various effects.
b. Suppose that additional experimentation shows that only those effects whose magnitudes exceed 15 are important. Which factors or interactions have a significant effect on percent protection?
c. Create an effects plot for the important effects identified in part (b).
d. If the objective of the study is to maximize percent protection, what setting of each factor do you recommend?
Supplementary Exercises
44. The following data was used to investigate whether the compressive strength of concrete depends on the type of capping material used or on the type of curing method used. The numbers in the matrix are totals, each based on three replications. In addition, SSE = 4716.67 and SST = 35,954.31 for this data.

                          Curing method
                      1      2      3      4      5
Capping     1       1847   1942   1935   1891   1795
material    2       1779   1850   1795   1785   1626
            3       1806   1892   1889   1891   1756

a. Construct an ANOVA table for this experiment.
b. Using α = .01, test to see whether either factor or their interaction is significant. Describe your conclusions from these tests.

45. In an experiment to assess the effects of curing time (factor A) and type of mix (factor B) on the compressive strength of concrete cylinders, three different curing times were used in combination with four different mixes, with three replicate observations obtained for each of the 12 factor–level combinations. The resulting sums of squares were SSA = 30,763.0, SSB = 34,185.6, SSE = 97,436.8, and SST = 205,966.6.
a. Construct an ANOVA table for this experiment.
b. Using α = .05, can you conclude that there is a significant interaction between the two factors?
c. Test, at α = .05, the hypothesis that factor A has no effect on compressive strength.
d. Test, at α = .05, the hypothesis that factor B has no effect on compressive strength.

46. The authors of the article cited in Exercise 15 also performed an experiment to see whether the maximum peak to valley profile height (Rmax) is affected by the abrasive size (A), abrasive quantity (B), and quill gap (C); the experiment involved three sizes, three quantities, and three gaps, with two replicates at each of the factor combinations. The resulting sums of squares were SSA = 12,209.77, SSB = 19,641.09, SSC = 367,688.98, SS(AB) = 8721.72, SS(AC) = 40,008.11, SS(BC) = 44,347.01, SS(ABC) = 94,554.41, SSE = 334,393.64, and SST = 921,564.7275.
a. Construct an ANOVA table for this data.
b. Test to see whether any interaction effects are significant at α = .05.
c. Test to see whether any main effects are significant at α = .05.

47. Exercise 20 described an experiment involving three processing parameters: laser power (A), scanning velocity (B), and powder flow rate (C). Another experiment considered how depth penetration of the cladding layer is affected by these same factors. Each factor had three levels and there was one observation at each factor combination. Here is the ANOVA table from the article, which only considered main effects and two-factor interactions:

SOURCE    DF    SS          MS          F
  A        ?    ?           ?           162.38
  B        ?    0.080570    ?           ?
  C        ?    ?           0.130195    ?
  AB       ?    ?           ?           0.56
  AC       ?    0.145137    ?           ?
  BC       ?    ?           ?           0.76
Error      ?    ?           0.006387
Total      ?    ?

a. Fill in the missing entries in the table.
b. Identify significant effects using α = .01.

48. The article “An Assessment of the Effects of Treatment, Time, and Heat on the Removal of Erasable Pen Marks” (J. Testing and Eval., 1991: 394–397) reports the following sums of squares for the response
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
www.ebook3000.com
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
variable “degree of removal of marks” (larger values of this variable are associated with more complete removal of marks): SSA = 39.171, SSB = .665, SSC = 21.508, SS(AB) = 1.432, SS(AC) = 15.953, SS(BC) = 1.382, SS(ABC) = 9.016, and SSE = 115.820. Four different laundry treatments (factor A), three different types of pen (factor B), and six different fabrics (factor C) were used in the experiment. Three observations were obtained for each combination of the factor levels. Perform an analysis of variance using α = .01 for all tests, and state your conclusions.

49. The article cited in Exercise 21 also reported on another experiment in which the authors investigated whether the percent by weight of nickel in the alloy layer is affected by niobium powder paste thickness (A, at three levels), scanning speed (B, at three levels), and laser power (C, at three levels). One observation was made at each factor-level combination, yielding the accompanying data (Note: Thickness column headings were incorrect in the cited article):

                    Paste Thickness
Power   Speed     .2       .3       .4
700     600       17.14    20.16    18.73
        900       24.75    17.19    26.54
        1200      18.78    18.80    21.42
800     600       26.55    13.03    18.92
        900       19.96    29.37    21.41
        1200      26.66    19.80    22.01
900     600       33.33    27.65    28.71
        900       37.33    28.81    23.22
        1200      34.98    26.40    15.44

a. Construct an ANOVA table for this experiment including only main effects and two-factor interactions (as did the authors of the cited article).
b. Use the appropriate F ratios to show that none of the two-factor interactions are significant at α = .05.
c. Which main effects are significant at α = .05?

50. Even under the increased levels of security sought by current airport security practices, airports try to assure rapid processing of individuals through security checkouts. In an experiment designed to find combinations of factors that will minimize travelers’ processing times at security checkpoints, three factors were studied: the number of ticket checkers (2 or 3), the number of X-ray machines (1 or 2), and the number of metal detectors (1 or 2) (“Operation of Airport Security Checkpoints Under Increased Threat Conditions,” J. of Transp. Engr., 1996: 264–269). Each of the possible combinations of these factors was studied by using eight separate random samples of 67 travelers. The processing times (in seconds) are summarized in the table below.

                                                                        Processing time
Test   Ticket checkers   X-ray machines   Metal detectors   Number of replicates    Mean    Standard deviation
1      2                 2                2                 67                      39.10   1.29
2      3                 2                1                 67                      46.50   4.30
3      2                 2                1                 67                      50.56   5.41
4      3                 2                2                 67                      35.07   1.05
5      2                 1                2                 67                      93.37   37.75
6      3                 1                1                 67                      90.55   33.52
7      2                 1                1                 67                      97.70   34.79
8      3                 1                2                 67                      88.86   37.58

a. Calculate all main effects and interaction effects for this experiment.
b. Pool the standard deviations of the replicated runs to find a value for SSE.
c. Using the SSE from part (b), determine which effects are significant (at α = .05).
d. Which settings (high or low) of the factors in part (c) lead to minimizing processing time?
e. What is the best way to staff a security checkpoint if management wants to limit the number
of employees to five per checkpoint? Note: X-ray machines and metal detectors each require one operator.
f. Is the disparity in magnitudes of the standard deviations a possible cause for concern in this experiment?

51. Shea tree oxidation experiments were conducted to determine which of three factors (reaction time, air pressure, reaction temp.) affect various aspects in converting the woody biomass into a renewable biofuel. Optimal enzymatic conversion of the Shea tree into ethanol occurs when the cellulose content is maximized and lignin content is minimized (“Optimization of Pretreatment Conditions Using Full Factorial Design and Enzymatic Convertibility of Shea Tree Sawdust,” Biomass and Bioenergy, 2013: 130–138). The response variable lignin removal (g/kg) was studied using a full 2³ design with no replication:

Run   Time   Pressure   Temp   Lignin
1     −1     −1         −1      30
2      1     −1         −1     110
3     −1      1         −1     241
4      1      1         −1     192
5     −1     −1          1     116
6      1     −1          1     201
7     −1      1          1     230
8      1      1          1     191

a. Calculate all main effects and interaction effects for this experiment.
b. Create a probability plot of the effects in part (a).
c. Suppose that additional experimentation shows that only those effects whose magnitudes exceed 40 are important. Which factors or interactions have a significant effect on lignin removal?
d. Draw an effects plot for the important effects identified in part (c).
e. Suppose that additional experiments show that the AB and BC interactions are not significant. If the objective of the study is to maximize lignin removal, what setting of each factor do you recommend?

52. In an automated chemical coating process, the speed with which objects on a conveyor belt are passed through a chemical spray (belt speed), the amount of chemical sprayed (spray volume), and the brand of chemical used (brand) are factors that may affect the uniformity of the coating applied. A replicated 2³ experiment was conducted in an effort to increase the coating uniformity. In the following table, higher values of the response variable are associated with higher surface uniformity:

                                              Surface uniformity
Run   Spray volume   Belt speed   Brand   Replication 1   Replication 2
1     2              2            2       40              36
2     1              2            2       25              28
3     2              1            2       30              32
4     1              1            2       50              48
5     2              2            1       45              43
6     1              2            1       25              30
7     2              1            1       30              29
8     1              1            1       52              49

a. Calculate all main effects and two-factor interaction effects for this experiment.
b. Create the ANOVA table for this experiment. Which factors appear to have an effect on surface uniformity? (Use α = .01.)

53. A half-fraction of a 2⁵ experiment is used to study the effects of heating time (A), quenching time (B), drawing time (C), position of heating coils (D), and measurement position (E) on the hardness of steel castings. The following data was obtained:

Test run   Obs     Test run   Obs
a          70.4    acd        66.6
b          72.1    ace        67.5
c          70.4    ade        64.0
d          67.4    bcd        66.8
e          68.0    bce        70.3
abc        73.8    bde        67.9
abd        67.0    cde        65.9
abe        67.8    abcde      68.0

Assuming that second- and higher-order interactions are negligible, conduct tests (at α = .01) for the presence of main effects.
chapter 10 Experimental Design
Bibliography
Box, G. E. P., W. G. Hunter, and J. S. Hunter, Statistics for Experimenters (2nd ed.), Wiley, New York, 2005. This is one of the definitive texts on industrial experimental design, with emphasis on 2ᵏ designs and fractional factorial designs.

Daniel, C., Applications of Statistics to Industrial Experimentation, Wiley, New York, 1976. A classic text that briefly, yet eloquently, explains 2ᵏ and fractional factorial designs from the point of view of the practitioner. The author’s considerable experience in applying these designs makes it a very valuable reference.

Montgomery, D. C., Design and Analysis of Experiments (8th ed.), Wiley, New York, 2012. This book gives complete coverage of experimental designs, including general factorials, blocking, 2ᵏ designs, fractional factorial designs, and more. Rigorous treatment, good examples, and easy to read.

Myers, R. H., D. C. Montgomery, and C. M. Anderson-Cook, Response Surface Methodology: Process and Product Optimization Using Designed Experiments (3rd ed.), Wiley, New York, 2009. Easy-to-read presentations of response surface analysis and factorial designs. Includes some of the most recent developments and tools in experimental design.
11
Inferential Methods in Regression and Correlation
11.1 Regression Models Involving a Single
Independent Variable
11.2 Inferences About the Slope Coefficient β
11.3 Inferences Based on the Estimated
Regression Line
11.4 Multiple Regression Models
11.5 Inferences in Multiple Regression
11.6 Further Aspects of Regression Analysis
Introduction
Regression and correlation were introduced in Chapter 3 as techniques for describing and summarizing data consisting of observations on a dependent or response variable y and one or more independent variables. We first focused on the case of a single independent variable x and suggested constructing a scatterplot of sample data (x₁, y₁), . . . , (xₙ, yₙ) to gain preliminary insight into the nature of any relationship between the two variables. When the scatterplot exhibits a linear pattern, a line fit to the data by the principle of least squares provides a convenient summary of the approximate relationship; the coefficient of determination r² describes what proportion of the total variation in the observed y values can be attributed to this relation. Substituting a particular x value into the linear equation results in a prediction for the value of y that would be observed if one more observation were made at this particular x value.
the plot shows a linear pattern, it is natural to take f (x) to be a linear function, resulting
in what is called the simple linear regression model.
DEFINITIONS  The simple linear regression model assumes that there is a line with slope β and vertical or y intercept α, called the true or population regression line. When a value of the independent variable x is fixed and an observation on the dependent variable y is made, the variables are related by the model equation

y = α + βx + e

Without the random deviation e, all points would fall exactly on the population regression line. We shall assume that for any fixed x value, e has a normal distribution with mean value 0 (μₑ = 0) and standard deviation σ (σₑ = σ). We also assume that the random deviations e₁, e₂, . . . , eₙ associated with different observations are independent of one another.
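The model equation in the box lends itself to a short simulation. The sketch below (plain Python, with hypothetical parameter values chosen purely for illustration) draws repeated observations on y at a fixed x and checks that they center on the line height α + βx:

```python
import random

# A minimal simulation of the model y = alpha + beta*x + e described in the
# box above; the parameter values here are hypothetical, for illustration only.
random.seed(1)
alpha, beta, sigma = 10.0, 2.0, 0.5   # true line y = 10 + 2x, error sd 0.5

def observe(x):
    """One observation on y at a fixed x: line height plus a random deviation."""
    e = random.gauss(0, sigma)        # e ~ N(0, sigma), independent each call
    return alpha + beta * x + e

ys = [observe(3.0) for _ in range(10_000)]
mean_y = sum(ys) / len(ys)
# The sample mean of y at x = 3 should be close to the line height
# alpha + beta*(3) = 16, with the observations scattering around it.
print(abs(mean_y - 16.0) < 0.05)      # True
```

With 10,000 simulated observations the sample mean lands within a few thousandths of the true line height, illustrating why the population regression line is called the line of mean values.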
Figure 11.1 shows several observations in relation to the population regression line.
[Figure 11.1: observations scattered about the population regression line (slope β), with positive and negative deviations from the line]
μ_y·x* = α + βx*

which is just the height of the population regression line above the value x = x*. Similarly, taking the variance on both sides of the equation and using the fact that the variance of a constant is zero gives

σ²_y·x* = σ²ₑ = σ²        σ_y·x* = σ

That is, for a given value of x, the amount of variability in y is the same as the amount of variability in e, which in turn is the amount of variability about the population line. Finally, e is assumed to have a normal distribution, and the sum of a constant α + βx* and a normally distributed variable itself has a normal distribution. Thus the distribution of y for any fixed x value is normal.

For any fixed x value, the dependent variable y has a normal distribution with

mean value α + βx    (so the population regression line is the line of mean values)

and

standard deviation σ
The key features of the model are illustrated in Figures 11.2 and 11.3. The three
normal curves in Figure 11.2 have identical spreads because the amount of variability
in y is the same at each x value.
[Figure 11.2: normal curves, each with standard deviation σ, centered at the mean values α + βx₁, α + βx₂, α + βx₃ along the population regression line y = α + βx (the line of mean values)]

Figure 11.3 Data from the simple linear regression model: (a) σ small; (b) σ large
Example 11.1 Recently the use of granite in construction and as an ornamental material has grown
in popularity. However, due to its textural properties, granite is a difficult material to
process by traditional machining methods. Abrasive waterjet (AWJ) is an advanced
cutting process that has shown promise in improving granite machining. The authors
of “Performance of Abrasive Waterjet in Granite Cutting: Influence of the Textural
Properties” (J. of Materials in Civil Engr., 2012: 944–949) examined the effect of
textural properties on the cutting performance of AWJ. The article suggested the
simple linear regression model as a way to relate y 5 AWJ cut depth (mm) to x 5
granite grain size (mm).
Suppose that the parameter values for the actual model (as suggested by data in
the cited article) are
β = −.4        α = 25.5        σ = .9 mm

Then for any particular fixed x value, y is normally distributed with

mean value = μ_y·x = 25.5 − .4x
standard deviation = σ_y·x = .9

For example, when x = 5, AWJ cut depth has mean value 25.5 − .4(5) = 23.5 mm. Because 23.5 ± 2σ gives 21.7 and 25.3, roughly 95% of all AWJ cut depths made when granite grain size is 5 mm will be between these limits. The slope β = −.4 is the mean decrease in AWJ cut depth associated with a 1-mm increase in granite grain size. Thus, if we make one observation on AWJ cut depth when x = 5 and another when x = 6, we expect the former cut depth to exceed the latter by .4 mm (but the actual difference in y values will almost always be either larger or smaller than this because observations will deviate from the population line).
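The normal-distribution calculation in this example can be reproduced with Python’s standard library; the parameter values below are the ones given in the example, and NormalDist takes the place of a normal table:

```python
from statistics import NormalDist

# Distribution of AWJ cut depth y at a fixed grain size x, per Example 11.1:
# y is normal with mean 25.5 - .4x and standard deviation .9.
alpha, beta, sigma = 25.5, -0.4, 0.9

def cut_depth(x):
    return NormalDist(mu=alpha + beta * x, sigma=sigma)

d = cut_depth(5)                  # grain size x = 5 mm
p = d.cdf(25.3) - d.cdf(21.7)     # P(21.7 < y < 25.3), i.e. within 2 sd of the mean
print(d.mean)                     # 23.5
print(round(p, 4))                # 0.9545
```

The probability 0.9545 is the familiar "within two standard deviations" figure, matching the "roughly 95%" statement in the example.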
In practice, the judgment as to whether the simple linear regression model is ap-
propriate is virtually always based on sample data and a scatterplot. The plot should
show a linear rather than a curved pattern, and the vertical spread of points should be
relatively homogeneous throughout the range of x values. Figure 11.4 shows plots with
three different patterns, only one of which is consistent with the model.
Figure 11.4 Some commonly encountered patterns in scatterplots: (a) consistent with the simple linear regression model; (b) suggests a nonlinear probabilistic model; (c) suggests that variability in y changes with x
The least squares estimates of the slope and intercept of the population regression line are the slope and intercept, respectively, of the least squares line, given by

b = point estimate of β = Sxy/Sxx
a = point estimate of α = ȳ − b·x̄

where

Sxy = Σxᵢyᵢ − (Σxᵢ)(Σyᵢ)/n        Sxx = Σxᵢ² − (Σxᵢ)²/n

The estimate of the population regression line is then just the least squares line

ŷ = a + bx

Let x* denote some particular value of the predictor variable x. Then a + bx* has two different interpretations:
1. It is a point estimate of the mean y value when x = x* (i.e., of α + βx*).
2. It is a point prediction of an individual y value to be observed when x = x*.
Example 11.2 Variations in clay brick masonry weight have implications not only for structural and
acoustical design but also for design of heating, ventilating, and air conditioning sys-
tems. The article “Clay Brick Masonry Weight Variation” (J. of Architectural Engr., 1996:
135–137) gave a scatterplot of y 5 mortar dry density (lb/ft3) versus mortar air content (%)
for a sample of mortar specimens, from which the following representative data was read:
x: 5.7 6.8 9.6 10.0 10.7 12.6 14.4 15.0 15.3
y: 119.0 121.3 118.2 124.0 112.3 114.1 112.2 115.1 111.3
x: 16.2 17.8 18.7 19.7 20.6 25.0
y: 107.2 108.9 107.8 111.0 106.2 105.0
The scatterplot of this data in Figure 11.5 certainly suggests the appropriateness of the
simple linear regression model; there appears to be a substantial negative linear rela-
tionship between air content and density, one in which density tends to decrease as air
content increases.
The values of the summary statistics required for calculation of the least squares
estimates are
Σxᵢ = 218.1        Σyᵢ = 1693.6        Σxᵢ² = 3577.01
Σxᵢyᵢ = 24,252.54        Σyᵢ² = 191,672.90
[Figure 11.5: scatterplot of y = mortar dry density (lb/ft³) versus x = air content (%)]
from which

Sxy = 24,252.54 − (218.1)(1693.6)/15 = −372.404000
Sxx = 3577.01 − (218.1)²/15 = 405.836000
b = −372.404000/405.836000 = −.917622 ≈ −.9176
a = 1693.6/15 − (−.917622)(218.1/15) = 126.248889 ≈ 126.25

The equation of the estimated regression line (the least squares line) is then

ŷ = 126.25 − .9176x

Substitution of the air content value 12.0 into this equation gives ŷ = 115.24, which can be interpreted either as a point estimate of the mean dry density for all specimens whose air content is 12% or as a prediction for the dry density of a single mortar specimen whose air content is 12%.
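As an arithmetic check on this example, the least squares formulas can be evaluated directly in Python from the fifteen (x, y) pairs listed above (a sketch for verification, not part of the original example):

```python
# Least squares estimates for the mortar data of Example 11.2, computed
# directly from the summary-statistic formulas in the text.
x = [5.7, 6.8, 9.6, 10.0, 10.7, 12.6, 14.4, 15.0, 15.3,
     16.2, 17.8, 18.7, 19.7, 20.6, 25.0]
y = [119.0, 121.3, 118.2, 124.0, 112.3, 114.1, 112.2, 115.1, 111.3,
     107.2, 108.9, 107.8, 111.0, 106.2, 105.0]
n = len(x)
Sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
Sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
b = Sxy / Sxx                        # slope estimate
a = sum(y) / n - b * sum(x) / n      # intercept estimate
print(round(b, 4), round(a, 2))      # -0.9176 126.25
yhat = a + b * 12.0                  # point estimate/prediction at x = 12
print(round(yhat, 2))                # 115.24
```

The printed values agree with the hand calculation: slope −.9176, intercept 126.25, and a predicted dry density of 115.24 lb/ft³ at 12% air content.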
Inferences based on the fitted model require that the error standard deviation σ be estimated. The estimate is based on calculating the vertical deviations from the estimated regression line. First, the predicted or fitted values are obtained by substituting the x values from the sample into the equation of the estimated regression line: ŷ₁ = a + bx₁, ŷ₂ = a + bx₂, and so on. The residuals are then the differences between the observed y values and the predicted y values: y₁ − ŷ₁, . . . , yₙ − ŷₙ. These are the vertical deviations from the points in the scatterplot to the estimated regression line (least squares line). Squaring and summing these residuals gives the residual or error sum of squares, denoted either by SSResid or by SSE:

SSResid = Σ(yᵢ − ŷᵢ)² = Syy − bSxy
Each sum of squares in statistics has associated with it a specified number of degrees of freedom. In simple linear regression, SSResid is based on n − 2 df, because before SSResid can be calculated, the two parameters α and β must be estimated, resulting in a loss of 2 df (just as in the case of a single sample, estimating μ by x̄ gives the sum of squares Σ(xᵢ − x̄)² based on n − 1 df). The statistic for estimating the third model parameter σ² is the mean square error, obtained by dividing error SS by its df:

estimate of σ² = sₑ² = SSResid/(n − 2)
estimate of σ = sₑ = √(sₑ²)

Roughly speaking, sₑ is the size of a typical deviation in the sample from the estimated regression line.

In Chapter 3, SSResid was interpreted as a measure of the variation in observed y values not explained by the approximate linear relationship between x and y. We also introduced the total sum of squares

SSTo = Syy = Σ(yᵢ − ȳ)² = Σyᵢ² − (Σyᵢ)²/n

interpreted as a measure of total variation in the observed y values. In the present context, the coefficient of determination

r² = 1 − SSResid/SSTo
Example 11.3  Let’s reconsider the data on x = air content and y = mortar dry density from Example 11.2. The first predicted value and residual are

ŷ₁ = 126.248889 − .917622(5.7) = 121.0184
y₁ − ŷ₁ = 119.0 − 121.0184 = −2.0184

(The negative residual implies that the point (5.7, 119.0) lies below the estimated regression line.) The relevant sums of squares are

SSTo = Syy = 191,672.90 − (1693.6)²/15 = 454.1693
SSResid = Syy − bSxy = 454.1693 − (−.917622)(−372.4040) = 112.4432
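These sums of squares, together with r² and sₑ from the formulas above, can be verified in Python using the summary values already computed for this data set (a sketch for checking the arithmetic):

```python
# Example 11.3 follow-up: total and error sums of squares, r^2, and s_e
# for the mortar data, using the shortcut formulas from the text.
Syy = 191_672.90 - 1693.6 ** 2 / 15    # SSTo = Syy
b, Sxy = -0.917622, -372.404           # slope and Sxy from Example 11.2
SSResid = Syy - b * Sxy
r2 = 1 - SSResid / Syy                 # coefficient of determination
se = (SSResid / (15 - 2)) ** 0.5       # n - 2 = 13 df
print(round(Syy, 4))       # 454.1693
print(round(SSResid, 4))   # 112.4432
print(round(r2, 3))        # 0.752
print(round(se, 2))        # 2.94
```

So about 75% of the observed variation in dry density is attributable to the linear relationship with air content, and a typical deviation from the estimated line is roughly 2.94 lb/ft³.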
Exponential Regression

A scatterplot of data obtained in a scientific or engineering investigation will often show curvature rather than a linear pattern. The scatterplot of Figure 11.7 shows a monotonic pattern, a tendency for y to decrease as x increases (alternatively, y might tend to increase as x increases). In this case, an exponential regression model may be a reasonable way to relate y to x. The model equation is multiplicative rather than additive:

y = αe^(βx) · ε,    ε > 0

(The multiplicative random deviation is denoted by ε to avoid confusion with the base e of the natural logarithm system, whose value is approximately 2.7182818.) The population regression function is αe^(βx). When ε > 1, the point (x, y) lies above the graph of the regression function, and ε < 1 implies that the point lies below the graph. Now consider the percentage change in the population regression function when x increases by 1:

100 · [αe^(β(x+1)) − αe^(βx)]/(αe^(βx)) = 100(e^β − 1)

a constant not dependent on x. In simple linear regression, when x increases by 1 unit, on average y will increase by a constant amount β; in this case, when x increases by 1 unit, on average y will increase (or decrease, if β < 0) by a constant percentage.
[Figure 11.7: scatterplot of y = time to rupture versus x = applied stress, showing a curved, decreasing pattern]
Let’s now take the logarithm of both sides of the model equation:

ln(y) = ln(α) + βx + ln(ε),    that is,    y′ = α′ + β′x + ε′

where α′ = ln(α), β′ = β, and ε′ = ln(ε). This is exactly the equation for simple linear regression. Thus to say that y and x are related via the exponential regression model is the same as saying that ln(y) and x are related by the simple linear regression model (provided that ln(ε) is normally distributed, which is equivalent to ε itself having a lognormal distribution). In particular, using the previous formulas for the slope and intercept of the least squares line on the (xᵢ, ln(yᵢ)) pairs gives point estimates of β and
ln(α), respectively. A point estimate of α results from taking the antilog of the estimate for ln(α). Figure 11.8 shows the result of transforming the y values in Figure 11.7 by logs and then fitting the simple linear regression model. The r² value from this regression is obviously very high, so the simple linear regression model explains virtually all of the observed variation in ln(time to rupture).
Regression Plot:  Y = 5.08298 − 5.55E-02 X,  R-Sq = 98.8%

Figure 11.8 Minitab output from fitting the simple linear regression model to the (x, ln(y)) pairs resulting from the data of Figure 11.7
The key point here is that making a transformation [transformed y 5 ln(y)] results
in the simple linear regression model. There are many other models nonlinear in y or x
for which a transformation on one or both of the variables recaptures the simple linear
regression model. Parameters of the original model can then be estimated in a relatively
straightforward way.
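A transformation of this kind is easy to sketch in code. The data below are hypothetical values chosen to mimic an exponential decay (they are not the rupture-time data of Figure 11.7); fitting ln(y) on x and taking the antilog recovers estimates of α and β:

```python
import math

# Sketch of the log-transform fit described above. The (x, y) pairs are
# hypothetical illustrative values, chosen to follow y ~ alpha * e**(beta*x).
x = [20, 30, 40, 50, 60, 70]
y = [55.0, 31.0, 18.5, 10.2, 6.1, 3.4]

lny = [math.log(v) for v in y]            # transformed responses ln(y)
n = len(x)
Sxy = sum(u * v for u, v in zip(x, lny)) - sum(x) * sum(lny) / n
Sxx = sum(u * u for u in x) - sum(x) ** 2 / n
b = Sxy / Sxx                             # point estimate of beta
ln_a = sum(lny) / n - b * sum(x) / n      # point estimate of ln(alpha)
alpha_hat = math.exp(ln_a)                # antilog recovers the estimate of alpha
pct = 100 * (math.exp(b) - 1)             # constant % change in mean y per unit x
print(b < 0, alpha_hat > 0)               # a decreasing pattern gives a negative slope
```

Because the fitted slope is negative, 100(e^b − 1) is a negative percentage: each 1-unit increase in x is estimated to shrink the mean response by that fixed percentage.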
1. The flow rate y (m³/min) in a device used for air-quality measurement depends on the pressure drop x (in. of water) across the device’s filter. Suppose that for x values between 5 and 20, the two variables are related according to the simple linear regression model with true regression line y = −.12 + .095x.
a. What is the expected (i.e., true average) change in flow rate associated with a 1-in. increase in pressure drop? Explain.
b. What change in flow rate can be expected when pressure drop increases from 10 in. to 15 in.?
c. What is the expected (i.e., true average) flow rate when the pressure drop is 10 in.? When the pressure drop is 15 in.?
d. Suppose that σ = .025 and consider making repeated observations on flow rate when the pressure drop is 10 in. What is the long-run
proportion of observed flow rates that will exceed .835 [that is, what is P(y > .835 when x = 10)]?

2. In a certain chemical process the reaction time y (hr) is known to be related according to the simple linear regression model to the temperature x (°F) in the chamber in which the reaction takes place. The model equation is y = 5.00 − .01x + e, with σ = .075.
a. What is the true average change in reaction time associated with a 1°F increase in temperature? A 10°F increase in temperature?
b. What is the true average reaction time when temperature is 200°F? When temperature is 250°F?
c. What is P(2.4 < y < 2.6 when x = 250)? If an investigator makes five independent experimental runs, each for a temperature of 250°F, what is the probability that all five observed reaction times are between 2.4 and 2.6?

3. Let V be the vapor pressure of water (mm Hg) at a specific temperature T (°K). The Clausius–Clapeyron equation from physical chemistry suggests that y = ln(V) is related to x = 1/T according to the simple linear regression model.
a. What is the implied probabilistic relationship between V and T?
b. If the coefficients in the simple linear regression model are α = 20.607 and β = −5200.762, what would you predict for the value of vapor pressure when temperature is 300?

4. The article “Characterization of Highway Runoff in Austin, Texas, Area” (J. of Envir. Engr., 1998: 131–137) gave a scatterplot, along with the least squares line, of x = rainfall volume (m³) and y = runoff volume (m³) for a particular location. The accompanying values were read from the plot:

x:   5   12   14   17   23   30   40   47
y:   4   10   13   15   15   25   27   46

x:  55   67   72   81   96  112  127
y:  38   46   53   70   82   99  100

a. Does a scatterplot of the data support the use of the simple linear regression model?
b. Calculate point estimates of the slope and intercept of the population regression line.
c. Calculate a point estimate of the true average runoff volume when rainfall volume is 50.
d. Calculate a point estimate of the error standard deviation σ.
e. What proportion of the observed variation in runoff volume can be attributed to the simple linear regression relationship between runoff and rainfall?

5. The bond behavior of reinforcing bars is an important determinant of strength and stability. The article “Experimental Study on the Bond Behavior of Reinforcing Bars Embedded in Concrete Subjected to Lateral Pressure” (J. of Materials in Civil Engr., 2012: 125–133) reported the results of one experiment in which the researchers applied varying levels of lateral pressure on 21 concrete cube specimens, each with an embedded 16-mm plain steel round bar, and measured the corresponding bond capacity. Due to differing concrete cube strengths (fcu, in MPa), the applied lateral pressure was equivalent to a fixed proportion of the specimen’s fcu (0, .1fcu, . . . , .6fcu). Also, since bond strength can be heavily influenced by the specimen’s fcu, bond capacity was expressed as the ratio of bond strength (MPa) to √fcu.

Pressure:   0      0      0      .1     .1     .1     .2
Ratio:      0.123  0.100  0.101  0.172  0.133  0.107  0.217

Pressure:   .2     .2     .3     .3     .3     .4     .4
Ratio:      0.172  0.151  0.263  0.227  0.252  0.310  0.365

Pressure:   .4     .5     .5     .5     .6     .6     .6
Ratio:      0.239  0.365  0.319  0.312  0.394  0.386  0.320

a. Does a scatterplot of the data support the use of the simple linear regression model?
b. Calculate point estimates of the slope and intercept of the population regression line.
c. Calculate a point estimate of the true average bond capacity when lateral pressure is .45fcu.
d. Calculate a point estimate of the error standard deviation σ.

6. A study reported in the article “The Effects of Water Vapor Concentration on the Rate of Combustion of an Artificial Graphite in Humid Air Flow” (Combustion and Flame, 1983: 107–118) gave data on x = temperature of a nitrogen–oxygen mixture (1000s of °F) under specified conditions and
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
516 chapter 11 Inferential Methods in Regression and Correlation
y = oxygen diffusivity. Summary quantities are

   n = 9   Σxi = 12.6   Σyi = 27.68
   Σxi² = 18.24   Σxiyi = 40.968   Σyi² = 93.3448

   a. Assuming that the variables are related by the simple linear regression model, determine the equation of the estimated regression line.
   b. Calculate a point estimate of mean diffusivity when temperature is 1.5. How does this point estimate compare to a point prediction of the diffusivity value that would result from making one more observation when temperature is 1.5?
   c. Estimate the error standard deviation σ.
   d. Calculate and interpret the coefficient of determination.

7. Timber piles are often used to buttress multiple-span simply supported (MSSS) bridges that are commonly found in rural areas. The authors of "Bridge Timber Piles Load Rating under Eccentric Loading Conditions" (J. Bridge Engr., 2012: 700–710) examined the effect of various geometric and structural characteristics on the critical rating (an overall structural assessment score) of MSSS bridges. The article reported the following data (read from a graph) for x = timber pile length (m) and y = critical rating for a particular timber profile at various damage levels.

   x = Timber pile length (m):          7.32   7.93   8.54   9.14   9.75
   y = Critical rating (damage = 0%):   59.09  54.79  49.74  44.11  37.99
   y = Critical rating (damage = 20%):  57.52  52.63  44.28  33.85  25.74
   y = Critical rating (damage = 40%):  43.94  30.70  19.12   9.77   2.48

   a. Create the scatterplots for the (x, y) pairs at each of the three damage levels. Does each scatterplot suggest that a simple linear regression model holds for the respective variables?
   b. For each pair, calculate point estimates of the slope and intercept of the respective population regression line and determine the corresponding coefficients of determination.
   c. Given the slope coefficients from the regression, summarize the relationship between critical rating and pile length as timber damage changes from 0%, to 20%, and to 40%.
   d. Calculate a point estimate of the error standard deviation σ for each of the pairs. How do these point estimates change as timber damage increases from 0% to 20% and then to 40%?

8. Exercise 30 in Section 3.4 gave data on x = testing temperature and y = dynamic shear modulus for a particular asphalt binder type. A scatterplot of x and y′ = log(y) shows a substantial linear pattern, suggesting that these variables are related by the simple linear regression model.
   a. What probabilistic model for relating y = dynamic shear modulus to x = testing temperature is implied by the simple linear regression relationship between x and y′?
   b. Summary quantities calculated from the data are

      n = 7   Σxi = 211.4   Σy′i = 40.64
      Σxi² = 8449.68   Σ(y′i)² = 282.58   Σxiy′i = 917.48

      Calculate estimates of the parameters for the model in part (a), and then obtain a point prediction of dynamic shear modulus when temperature is 35°F.

9. The authors of the article "Long-Term Effects of Cathodic Protection on Prestressed Concrete Structures" (Corrosion, 1997: 891–908) presented a scatterplot of y = steady-state permeation flux (μA/cm²) versus x = inverse foil thickness (cm⁻¹); the substantial linear pattern was used as a basis for an important conclusion about material behavior. This is the Minitab output from fitting the simple linear regression model to the data.

   The regression equation is
   flux = -0.398 + 0.260 invthick

   Predictor      Coef     Stdev   t-ratio       p
   Constant    -0.3982    0.5051     -0.79   0.460
   invthick    0.26042   0.01502     17.34   0.000

   s = 0.4506   R-sq = 98.0%   R-sq(adj) = 97.7%

   Analysis of Variance
   Source        DF        SS        MS        F       p
   Regression     1    61.050    61.050   300.64   0.000
   Error          6     1.218     0.203
   Total          7    62.269
11.2 Inferences About the Slope Coefficient 517
   Obs   invthick   flux      Fit   Stdev.Fit   Residual   St.Resid
     1       19.8    4.3    4.758       0.242     -0.458      -1.20
     2       20.6    5.6    4.966       0.233      0.634       1.64
     3       23.5    6.1    5.722       0.203      0.378       0.94
     4       26.1    6.2    6.399       0.182     -0.199      -0.48
     5       30.3    6.9    7.493       0.161     -0.593      -1.41
     6       43.5   11.2   10.930       0.236      0.270       0.70
     7       45.0   11.3   11.321       0.253     -0.021      -0.06
     8       46.5   11.7   11.711       0.271     -0.011      -0.03

   a. Interpret the estimated slope and the coefficient of determination.
   b. Calculate a point estimate of true average flux when inverse foil thickness is 23.5.
   c. Predict the value of flux that would result from a single observation made when inverse foil thickness is 45.
   d. Verify that the sum of the residuals is zero and that squaring and summing the residuals results in the value of SSResid given in the output.
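As an illustration of the kind of check part (d) asks for, the verification can be carried out directly on the residuals printed in the Minitab output above (a Python sketch; the book itself works from the Minitab table):

```python
# Residuals read from the Minitab output for the flux/inverse-thickness regression.
residuals = [-0.458, 0.634, 0.378, -0.199, -0.593, 0.270, -0.021, -0.011]

total = sum(residuals)                       # least squares residuals sum to zero
ss_resid = sum(r ** 2 for r in residuals)    # should reproduce SSResid from the output

print(f"sum of residuals = {total:.3f}")             # 0.000
print(f"sum of squared residuals = {ss_resid:.3f}")  # 1.219 vs. 1.218 printed (rounding)
```

The tiny discrepancy in the sum of squares comes from the residuals being printed to only three decimals.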
Recall that the standardized variable t = (x̄ − μ)/(s/√n) has a t distribution with n − 1 degrees of freedom, from which the interval estimate x̄ ± (t critical value)(s/√n) emerges.
In the same way that the statistic x̄ varies in value from sample to sample, the statistic b does also. For example, if the slope of the population regression line is actually β = −5.0, a first sample might result in b = −4.2, a second in an estimate of −6.5, a third in −5.4, and so on.
3. b is normally distributed (because e in the model equation is assumed to have a normal distribution).
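The sample-to-sample variability of b described above can be illustrated with a small simulation; the population line, error standard deviation, and x values below are hypothetical choices made purely for illustration:

```python
import random

random.seed(1)
alpha, beta, sigma = 100.0, -5.0, 4.0    # hypothetical population line, slope -5.0
xs = [1, 2, 3, 4, 5, 6, 7, 8]            # fixed x values for each simulated sample

def one_sample_slope():
    """Generate one (x, y) sample from the model and return the least squares slope b."""
    ys = [alpha + beta * x + random.gauss(0, sigma) for x in xs]
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx

slopes = [one_sample_slope() for _ in range(2000)]
mean_b = sum(slopes) / len(slopes)
print(f"average of 2000 sample slopes: {mean_b:.2f}")  # close to the true slope -5.0
```

Each call produces a different b (one run might give −4.2, another −6.5), but the values center on the true slope −5.0, reflecting the unbiasedness of b.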
The smaller the value of σb, the more precisely β will tend to be estimated. Because σ is in the numerator of σb, the less variability there is about the population line, the smaller is the standard deviation of b and the more concentrated is its sampling distribution. The value of σ is, of course, not under our control. However, we may be able to have an impact on the value of Sxx. Because this quantity is in the denominator of σb, the larger its value, the smaller is the value of the standard deviation. Since Sxx is a measure of how much the xi values in the sample spread out, the implication is that spreading out the values of the independent variable tends to give a more precise estimate than if these values were quite close together. Intuitively, if the sample xi values were highly concentrated, very small changes in the resulting yi's might substantially affect the slope of the least squares line, whereas such changes would have little effect on the slope if the xi's were quite spread out. So if the investigator can select the x values at which observations will be made (frequently not possible in social science and business scenarios), they should be spread out as much as possible while still preserving the approximate linearity of the relationship between x and y.
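This effect of spreading out the x values is easy to see by simulation; the two x configurations and the error standard deviation below are hypothetical:

```python
import random

random.seed(2)
sigma = 1.0  # assumed error standard deviation, same for both configurations

def slope_sd(xs, reps=3000):
    """Simulate many samples from y = 2x + e and return the sample sd of the slope b."""
    xbar = sum(xs) / len(xs)
    sxx = sum((x - xbar) ** 2 for x in xs)
    slopes = []
    for _ in range(reps):
        ys = [2.0 * x + random.gauss(0, sigma) for x in xs]
        ybar = sum(ys) / len(ys)
        sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
        slopes.append(sxy / sxx)
    m = sum(slopes) / reps
    return (sum((b - m) ** 2 for b in slopes) / (reps - 1)) ** 0.5

concentrated = [4.8, 4.9, 5.0, 5.1, 5.2]   # small Sxx
spread = [1.0, 3.0, 5.0, 7.0, 9.0]         # large Sxx
print(slope_sd(concentrated), slope_sd(spread))  # first is far larger
```

With the concentrated x's, Sxx = .1 and the simulated slopes are wildly variable; with the spread-out x's, Sxx = 40 and the slope estimate is roughly twenty times more precise, matching σb = σ/√Sxx.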
The standardized variable t = (b − β)/sb has a t distribution based on n − 2 df. This in turn implies that a confidence interval for β is

b ± (t critical value) · sb

Appendix Table IV contains t critical values corresponding to the most frequently used confidence levels.
Example 11.4  Let's reconsider the data on air content and mortar dry density introduced in Examples 11.2 and 11.3. In this context, β is the average or expected change in dry density associated with an increase of 1% in air content. We previously calculated Sxx = 405.836000, b = −.918, and se = 2.941, from which the estimated standard deviation (standard error) of b is

sb = 2.941/√405.836 = .1460

The confidence interval is based on n − 2 = 15 − 2 = 13 df, and the corresponding t critical value for a confidence level of 95% is 2.160. The confidence interval is

−.918 ± (2.160)(.1460) = −.918 ± .315 = (−1.233, −.603)
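The interval arithmetic of Example 11.4 can be reproduced with a short sketch (Python here purely for illustration; the summary values are those given in the text):

```python
import math

# Summary quantities from Examples 11.2-11.4: Sxx, slope estimate b, and se,
# with 2.160 the 95% t critical value for 13 df from Appendix Table IV.
Sxx, b, se = 405.836, -0.918, 2.941
t_crit = 2.160

sb = se / math.sqrt(Sxx)                       # estimated standard error of b
lo, hi = b - t_crit * sb, b + t_crit * sb      # 95% confidence interval for beta
print(f"sb = {sb:.4f}, 95% CI = ({lo:.3f}, {hi:.3f})")  # sb = 0.1460, CI = (-1.233, -0.603)
```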
Null hypothesis: H0: β = β0

Test statistic: t = (b − β0)/sb, which is based on n − 2 df
In practice, the most frequently tested null hypothesis is H0: β = 0. When the slope of the population regression line is zero, there is no useful linear relationship between x and y. The usual alternative hypothesis is Ha: β ≠ 0, according to which there is a useful linear relationship between the two variables. A test of these two hypotheses is often referred to as the model utility test in simple linear regression. Unless H0 can be rejected at a reasonably small significance level, the simple linear regression model should not be used as a basis for making various inferences (e.g., for predicting y from knowledge of x). In practice, the model will generally be judged useful by this test when r² is reasonably large. On occasion, the alternatives Ha: β > 0 or Ha: β < 0 may be of interest; the former says that there is in fact a positive linear relationship between the two variables (a tendency for y to increase linearly as x increases). The test statistic in all three cases is the t-ratio b/sb.
Example 11.5 The presence of hard alloy carbides in high chromium white iron alloys results in
excellent abrasion resistance, making them suitable for materials handling in the
mining and materials processing industries. The accompanying data on x = retained austenite content (%) and y = abrasive wear loss (mm³) in pin wear tests with garnet
as the abrasive was read from a plot in the article “Microstructure-Property Relation-
ships in High Chromium White Iron Alloys” (Intl. Materials Reviews, 1996: 59–82).
x: 4.6 17.0 17.4 18.0 18.5 22.4 26.5 30.0 34.0
y: .66 .92 1.45 1.03 .70 .73 1.20 .80 .91
x: 38.8 48.2 63.5 65.8 73.9 77.2 79.8 84.0
y: 1.19 1.15 1.12 1.37 1.45 1.50 1.36 1.29
A scatterplot of the data (not shown) suggests that the simple linear regression model may specify a useful relationship between these two variables. Is this indeed the case? Let's base our analysis on the SAS output in Figure 11.9.
Figure 11.9 SAS output from a simple linear regression of the data in Example 11.5
The parameter of interest is β, the average change in wear loss associated with a 1% (i.e., 1-unit) increase in austenite content. The relevant hypotheses are

H0: β = 0 (the model is not useful)
Ha: β ≠ 0 (there is a useful linear relationship between the variables)

The test statistic is the model utility t-ratio t = b/sb. From the Parameter Estimates table in Figure 11.9,

b = .007570   sb = .00192626   t = .007570/.00192626 = 3.93 ≈ 3.9

The two-tailed test is based on n − 2 = 15 df. In Appendix Table VI, the area under the 15 df t curve to the right of 3.9 is .001, so the P-value for the test is roughly .002. Figure 11.9 gives this P-value as .0013 (so the area to the right of 3.93 must be about .00065). Clearly the P-value is smaller than either .05 or .01. H0 can obviously be rejected in favor of the conclusion that there is a useful linear relationship. Notice that the r² value is .507, which is not terribly impressive. But as long as n is not too small, the model will be judged useful even when r² is moderate to small.

The article's authors asserted that "increasing the austenite content leads to greater wear rates with garnet as the abrasive." The implied alternative hypothesis is Ha: β > 0 (a positive linear relationship). The P-value for this upper-tailed test is about .001 (more exactly, .00065), which clearly supports the authors' contention.
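The SAS estimates in Figure 11.9 can be reproduced (up to rounding) from the raw data listed at the start of the example; here is a sketch using only the Python standard library (Python is for illustration; the book's analysis uses SAS):

```python
import math

# Retained austenite content (%) and abrasive wear loss (mm^3) from Example 11.5.
x = [4.6, 17.0, 17.4, 18.0, 18.5, 22.4, 26.5, 30.0, 34.0,
     38.8, 48.2, 63.5, 65.8, 73.9, 77.2, 79.8, 84.0]
y = [0.66, 0.92, 1.45, 1.03, 0.70, 0.73, 1.20, 0.80, 0.91,
     1.19, 1.15, 1.12, 1.37, 1.45, 1.50, 1.36, 1.29]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Syy = sum((yi - ybar) ** 2 for yi in y)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b = Sxy / Sxx                           # slope: .007570 in the SAS output
ss_resid = Syy - b * Sxy                # residual sum of squares
se = math.sqrt(ss_resid / (n - 2))      # estimate of the error sd
sb = se / math.sqrt(Sxx)                # .00192626 in the SAS output
t = b / sb                              # the model utility t-ratio, about 3.93
r2 = 1 - ss_resid / Syy                 # .507, as noted in the text

print(f"b = {b:.6f}, sb = {sb:.8f}, t = {t:.2f}, r^2 = {r2:.3f}")
```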
where df = 1 for SSRegr and df = n − 2 for SSResid. The two mean squares are then MSRegr = SSRegr/1 and MSResid = SSResid/(n − 2), and the F ratio is given by F = MSRegr/MSResid. The calculations are usually summarized in an ANOVA table, as shown in Table 11.1.
Table 11.1

Source of variation    df      Sum of squares   Mean square       F                P-value
Model (Regression)      1      SSRegr           SSRegr/1          MSRegr/MSResid   Area to right of calculated F
Error                 n − 2    SSResid          SSResid/(n − 2)
Total                 n − 1    SSTo
Looking at the ANOVA table on the SAS output of Figure 11.9, we see that the calculated F ratio for the data of Example 11.5 is F = 15.444, and the corresponding P-value (the area under the F curve with 1 numerator and 15 denominator df to the
right of 15.444) is .0013. That this P-value is identical to the P-value for the model utility t test is no accident: It can be shown that t² = F [(3.930)² = 15.444 in Example 11.5], and the distribution of the square of a t variable with ν df is the F distribution with 1 numerator and ν denominator df. However, in multiple regression, the test for model utility is an F test, and t tests are used for another purpose.
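The t² = F identity is easy to check numerically with the values reported for Example 11.5:

```python
# t-ratio and ANOVA F ratio reported in Figure 11.9 for Example 11.5.
t = 3.930
F = 15.444
print(t ** 2)  # 15.4449, equal to the F ratio up to rounding
```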
Correlation Revisited
The sample correlation coefficient r was introduced in Chapter 3 as a measure of the
extent of linear association between values of x and y in a sample. An analogous mea-
sure for the entire population from which the sample of pairs was selected is called the
population correlation coefficient and is denoted by ρ. The most important properties of r are also satisfied by ρ; in particular, −1 ≤ ρ ≤ 1, so the closer ρ is to 1 or −1, the stronger the linear relationship within the population. The value ρ = 0 indicates the complete absence of any linear relationship in the population. Even if ρ = 0, the value of r will usually differ somewhat from zero because of sampling variability—r is a statistic and its value will vary from sample to sample in the same way that x̄ and b do. It is therefore important to have a formal test of the null hypothesis that ρ = 0. The usual
test procedure assumes that (x1, y1), . . . , (xn, yn) have been randomly selected from a
bivariate normal population distribution (introduced in Section 3.6). This assumption
is difficult to check. A partial assessment of plausibility is based on constructing one
normal quantile plot of the x’s and another of the y’s. A nonlinear pattern in either plot
is a warning of implausibility.
When H0: ρ = 0 is true, the test statistic has a t distribution based on n − 2 df, so a P-value is computed as was done for previous tests. In particular, the usual alternative hypothesis is Ha: ρ ≠ 0 (some linear association, positive or negative, in the population), for which the test is two-tailed and the P-value is twice the tail area captured by the calculated |t|.
Example 11.6 Neurotoxic effects of manganese are well known and are usually caused by high oc-
cupational exposure over long periods of time. In the fields of occupational hygiene
and environmental hygiene, the relationship between lipid peroxidation, which is
responsible for deterioration of foods and damage to live tissue, and occupational
exposure has not been previously reported. The article “Lipid Peroxidation in Work-
ers Exposed to Manganese” (Scand. J. Work and Environ. Health, 1996: 381–386)
gave data on x = manganese concentration in blood (ppb) and y = concentration (μmol/L) of malondialdehyde, which is a stable product of lipid peroxidation, both
11.2 Exercises 523
for a sample of 22 workers exposed to manganese and for a control sample of 45 individuals. The value of r for the control sample was .29, from which

t = (.29)√(45 − 2) / √(1 − (.29)²) ≈ 2.0

The corresponding P-value for a two-tailed t test based on 43 df is roughly .052 (the cited article reported only that P-value > .05). We would not want to reject the assertion that ρ = 0 at either significance level .01 or .05. For the sample of exposed workers, r = .83 and t ≈ 6.7, clear evidence that there is a linear relationship in the entire population of exposed workers from which the sample was selected.
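Both t values in Example 11.6 follow directly from the statistic t = r√(n − 2)/√(1 − r²); a quick sketch:

```python
import math

def corr_t(r, n):
    """t statistic for testing H0: rho = 0 from a sample correlation r of size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(round(corr_t(0.29, 45), 1))  # control sample: 2.0
print(round(corr_t(0.83, 22), 1))  # exposed workers: 6.7
```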
The hypothesis H0: β = 0 for the model utility test in regression also asserts that there is no linear relationship between x and y. Although it is certainly not obvious by inspection, it can be shown that the t-ratio b/sb is algebraically identical to the t statistic in the previous box for testing ρ = 0. The value of the latter statistic is easier to compute, since it requires only r and not any of the calculations appropriate for regression.
Test procedures for H0: ρ = ρ0 when ρ0 ≠ 0 are rather complicated, as is the procedure for obtaining a confidence interval for ρ. Please consult one of the chapter references for further information.
for 18 asphalt mixture samples having 5% asphalt content. The following R output is from a simple linear regression of y on x:

                   Estimate  Std. Error  t value  Pr(>|t|)
       (Intercept) 4.858691    0.059768   81.293    <2e-16
       AirVoid    -0.074676    0.009923   -7.526  1.21e-06

       Residual standard error: 0.03551 on 16 degrees of freedom
       Multiple R-squared: 0.7797, Adjusted R-squared: 0.766
       F-statistic: 56.63 on 1 and 16 DF, p-value: 1.214e-06

       Analysis of Variance Table
       Response: Dielectric
                  DF   Sum Sq   Mean Sq  F value     Pr(>F)
       AirVoid     1 0.071422  0.071422   56.635  1.214e-06
       Residuals  16 0.020178  0.001261

    a. What are the values of SSRegr, SSResid, and SSTo?
    b. Determine and interpret the value of r² for this regression. What is the corresponding value of r? Note that the sign of r can be determined based on the output.
    c. Use the output to calculate a confidence interval with a confidence level of 95% for the slope of the population regression line and interpret the resulting interval.
    d. Suppose it had previously been believed that when air void increased by 1 percent, the associated true average change in dielectric constant would be at least −.05. Does the sample data contradict this belief? State and test the relevant hypotheses.

15. Suppose that the unit of measurement for y = wear loss in Example 11.5 is changed from mm³ to in³, which amounts to multiplying each y value by the same conversion factor c. How does this change affect the value of the t-ratio for testing model utility? Explain your reasoning.

16. The value of the sample correlation coefficient is .722 for the n = 14 observations on average anterior maximum inclination angle (AMIA) in both the clockwise (Cl) and counterclockwise (Co) directions given in Exercise 10 (Section 3.2) of Chapter 3. Carry out a test at significance level .05 to decide whether these two variables are linearly related in the population from which the data was selected (assuming that the population distribution is bivariate normal).

17. A sample of n = 13 steel specimens was selected, and the values of x = nickel content and y = percentage austenite were determined, resulting in

    Σ(xi − x̄)² = 1.183   Σ(yi − ȳ)² = .05080   Σ(xi − x̄)(yi − ȳ) = .2073

    Does there appear to be a positive linear relationship between these two variables in the sampled population? State and test the relevant hypotheses.

18. In what was surely an unpleasant data collection experience, the article "Annual Variations of Odor Concentrations and Emissions from Swine Gestation, Farrowing, and Nursery Buildings" (J. of the Air and Waste Mgmnt., 2011: 1361–1368) reported on monthly odor concentrations and emission rates from a Canadian swine farm for a period of one year. One study objective was to identify possible relationships, if any, between odor and presence of other gases such as ammonia (NH3), hydrogen sulfide (H2S), carbon dioxide (CO2), and methane (CH4). Identifying such relationships would be helpful in that the gas concentration could be used as an odor indicator.
    a. A scatterplot of the n = 32 observations on y = odor concentration (OU/m³) and x = H2S concentration (ppb) suggested the plausibility of a positive linear relationship. The coefficient of determination for the simple linear regression of y on x was .58. State and test the relevant hypotheses to see if the message from the scatterplot can be confirmed.
    b. A scatterplot of the n = 32 observations on y = odor concentration (OU/m³) and x = CH4 concentration (ppm) also suggested the plausibility of a positive linear relationship. The coefficient of determination for the simple linear regression of y on x was .33. State and test the relevant hypotheses to see if the message from the scatterplot can be confirmed.

19. How does lateral acceleration—side forces experienced in turns that are largely under driver control—affect nausea as perceived by bus passengers? The article "Motion Sickness in Public Road Transport: The Effect of Driver, Route, and Vehicle" (Ergonomics, 1999: 1646–1664) reported data on x = motion sickness dose (calculated in accordance with a British standard for evaluating
11.3 Inferences Based on the Estimated Regression Line 525
similar motion at sea) and y = reported nausea (%). Relevant summary quantities are

    n = 17   Σxi = 221.1   Σyi = 193
    Σxi² = 3056.69   Σxiyi = 2759.6   Σyi² = 2975

    Values of dose in the sample ranged from 6.0 to 17.6.
    a. Assuming that the simple linear regression model is valid for relating these two variables (this is supported by the raw data), calculate and interpret an estimate of the slope parameter that conveys information about the precision and reliability of estimation.
    b. Does it appear that there is a useful linear relationship between these two variables?
    c. Would it be sensible to use the simple linear regression model as a basis for predicting % nausea when dose = 5.0? Explain your reasoning.
    d. When Minitab was used to fit the simple linear regression model to the raw data, the observation (6.0, 2.50) was flagged as possibly having a substantial impact on the fit. Eliminate this observation from the sample and recalculate the estimate of part (a). Based on this, does the observation appear to be exerting an undue influence?

20. Mineral mining is one of the most important economic activities in Chile. Mineral products are frequently found in saline systems composed largely of natural nitrates. Freshwater is often used as a leaching agent for the extraction of nitrate, but the Chilean mining regions have scarce freshwater resources. An alternative leaching agent is seawater. The authors of "Recovery of Nitrates from Leaching Solutions Using Seawater" (Hydrometallurgy, 2013: 100–105) evaluated the recovery of nitrate ions from discarded salts using freshwater and seawater leaching agents. Tests were performed in salt columns irrigated at the same rate for a period of more than 150 hours. Here is data on x = leaching time (h), yfw = nitrate extraction percentage (freshwater), and ysw = nitrate extraction percentage (seawater):

    x:    25.5  31.5  37.5  43.5  49.5  55.5
    yfw:  25.7  43.2  55.3  62.9  68.6  73.2
    ysw:  26.4  40.1  50.2  57.4  62.7  67.3

    x:    61.5  67.5  73.5  79.5  85.5  91.5
    yfw:  76.7  79.4  81.8  83.7  85.1  86.5
    ysw:  71.4  74.7  77.8  80.3  82.3  84.1

    x:    97.5  103.5  109.5  115.5  121.5  127.5
    yfw:  87.7   88.6   89.6   90.5   90.7   91.2
    ysw:  85.5   86.6   87.9   89.0   89.9   90.6

    x:    133.5  139.5  145.5  151.5  157.5
    yfw:   91.9   92.5   93.1   93.9   94.7
    ysw:   91.2   91.8   92.3   92.8   93.3

    a. Construct scatterplots of yfw versus x and ysw versus x. Note the nonlinearity of the plots. Would it be reasonable to describe the patterns in both plots as curved and monotonic?
    b. In Section 3.4, we described how a power transformation can be applied to create a linear pattern in the transformed data. Using the transformation x′ = 1/x, construct scatterplots of yfw versus x′ and ysw versus x′. For each set of pairs, calculate point estimates of the slope and intercept of the respective population regression line.
    c. Does the simple linear regression model appear to specify a useful relationship between either dependent variable and x′ in part (b)? State and test the relevant hypotheses.
    d. The researchers concluded that the freshwater and seawater leaching agents yield similar nitrate extraction efficiencies. Using the regression models from part (b), calculate a point estimate of true nitrate extraction percentage when leaching time is 150 hours. Are the two estimates similar?
x = x* and also how to calculate a prediction interval for the value of a single y to be observed at some time in the future when x = x*. For example, x might be the tensile force applied to a steel specimen (1000s of lb) and y the resulting amount of elongation (thousandths of an inch). Then we might wish to calculate a confidence interval (interval of plausible values) for the average amount of elongation for all specimens to which a tensile force of 5000 lb is applied (so x* = 5). Alternatively, we might subject a single specimen to a force of 5000 lb and wish to calculate a prediction interval (interval of plausible values) for the resulting amount of elongation.

Recall that substituting a particular value x* into the equation of the estimated regression line gives a number ŷ = a + bx* that has two different interpretations: It can be regarded either as a point estimate of the mean y value when x = x* or as a point prediction of the y value that would result from making a single observation when x has this value. Because the point estimate and point prediction are single numbers, they convey no information about the reliability or precision of estimation or prediction. An interval gives information about reliability through its confidence or prediction level (e.g., 95%) and about precision from the width of the interval.

Before we obtain sample data, both a and b are subject to sampling variability—that is, they are both statistics whose values will vary from sample to sample. Suppose, for example, that α = 50 and β = 2. Then a first sample of (x, y) pairs might give a = 52.35, b = 1.895, a second sample might result in a = 46.52, b = 2.056, and so on. It follows that ŷ = a + bx* itself varies in value from sample to sample, so it is a statistic. If the intercept and slope of the population line are the aforementioned values 50 and 2, respectively, and x* = 10, then this statistic is trying to estimate the value 50 + 2(10) = 70. The estimate from a first sample might be 52.35 + 1.895(10) = 71.30, from a second sample might be 46.52 + 2.056(10) = 67.08, and so on. In the same way that a confidence interval for β was based on properties of the sampling distribution of b, a confidence interval for a mean y value in regression is based on properties of the sampling distribution of the statistic ŷ.
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
11.3 Inferences Based on the Estimated Regression Line 527
The values of both σ_ŷ and s_ŷ increase as the value of (x* − x̄)² gets larger. That is,
these standard deviations increase in value as the specified value x* deviates farther from
x̄, the center of the x values for the sample observations. Thus the farther x* is from x̄, the
less precisely ŷ tends to estimate α + βx*. The estimated standard deviation of ŷ is

s_ŷ = s_e·√(1/n + (x* − x̄)²/Sxx)

This implies that a confidence interval for α + βx*, the mean y value when x = x*, is

ŷ ± (t critical value)·s_ŷ

The t critical values corresponding to the usual confidence levels are given in Appendix
Table IV; a value from the n − 2 df row of this table should be used.
Example 11.7 Corrosion of steel reinforcing bars is the most important durability problem for
reinforced concrete structures. Carbonation of concrete results from a chemical
reaction that lowers the pH value by enough to initiate corrosion of the rebar.
Representative data on x = carbonation depth (mm) and y = strength (MPa) for
a sample of core specimens taken from a particular building follow (read from
a plot in the article “The Carbonation of Concrete Structures in the Tropical
Environment of Singapore,” Magazine of Concrete Res., 1996: 293–300):
x: 8.0 15.0 16.5 20.0 20.0 27.5 30.0 30.0 35.0
y: 22.8 27.2 23.7 17.1 21.5 18.6 16.1 23.4 13.4
x: 38.0 40.0 45.0 50.0 50.0 55.0 55.0 59.0 65.0
y: 19.5 12.4 13.2 11.4 10.3 14.1 9.7 12.0 6.8
A scatterplot of the data (see Figure 11.11 on p. 529) gives strong support to use of the
simple linear regression model. Relevant quantities are as follows:
Σxᵢ = 659.0    Σxᵢ² = 28,967.50    x̄ = 36.61111    Sxx = 4840.7778
Σyᵢ = 293.2    Σxᵢyᵢ = 9293.95    Σyᵢ² = 5335.76
b = −.297561    a = 27.182936    SSResid = 131.2402
r² = .766    s_e = 2.8640
Let’s now calculate a confidence interval, using a 95% confidence level, for the
mean strength for all core specimens having a carbonation depth of 45 mm—that is,
a confidence interval for α + β(45). The interval is centered at

ŷ = a + b(45) = 27.182936 − .297561(45) = 13.79

The estimated standard deviation of ŷ is

s_ŷ = 2.8640·√(1/18 + (45 − 36.61111)²/4840.7778) = .7582

The 16 df t critical value for a 95% confidence level is 2.120, from which we deter-
mine the desired interval to be

13.79 ± (2.120)(.7582) = 13.79 ± 1.61 = (12.18, 15.40)
The narrowness of this interval suggests that we have reasonably precise informa-
tion about the mean value being estimated. Remember that if we recalculated this
interval for sample after sample, in the long run about 95% of the calculated intervals
would include 1 (45). We can only hope that this mean value lies in the single
interval that we have calculated.
Figure 11.10 shows Minitab output resulting from a request to fit the simple
linear regression model and calculate confidence intervals for the mean value of
strength at depths of 45 mm and 35 mm. The intervals are at the bottom of the
output; note that the second interval is narrower than the first, because 35 is much
closer to x̄ than is 45. Figure 11.11 (on page 529) shows a Minitab scatterplot with
(1) curves corresponding to the confidence limits for each different x* value and
(2) prediction limits, to be discussed shortly. Notice how the curves get farther and
farther apart as x* moves away from x̄.
Figure 11.10 Minitab regression output for the data of Example 11.7
[Figure: Minitab regression plot of strength vs. depth, Y = 27.1829 − 0.297561X, R-Sq = 76.6%, showing the fitted regression line together with 95% CI and 95% PI curves]
Figure 11.11 Minitab scatterplot with confidence intervals and prediction intervals for
the data of Example 11.7
The estimation error is the difference between a random quantity (ŷ) and a fixed quan-
tity, whereas the prediction error is the difference between two random quantities. This
implies that there is more uncertainty associated with making a prediction than with
estimating a mean y value. The mean value of the prediction error is 0.
Furthermore, ŷ and y* are independent of one another, because the former is based on
the sample data and the latter is to be observed at some future time. This implies that

σ²_(ŷ−y*) = σ²_ŷ + σ²_(y*) = σ²·[1/n + (x* − x̄)²/Sxx] + σ²
The standard deviation of the prediction error is the square root of this expression, and
the estimated standard deviation results from replacing σ² by s_e². Using these results to
standardize the prediction error gives a t variable from which the prediction interval is
obtained.
The standardized variable

t = (y* − ŷ)/√(s_e² + s_ŷ²)

has a t distribution with n − 2 df. This implies that a prediction interval for a future
y value y* to be observed when x = x* is

ŷ ± (t critical value)·√(s_e² + s_ŷ²)

Without s_e² under the square root in the prediction interval formula, we would have
the confidence interval formula. This implies that the prediction interval (PI) is wider
than the confidence interval (CI)—often much wider because s_e² is frequently much
larger than s_ŷ². The prediction level for the interval is interpreted in the same way that
a confidence level was previously interpreted. If a prediction level of 95% is used in
calculating interval after interval from different samples, in the long run about 95% of
the calculated intervals will include the value y* that is being predicted. Of course, we
will not know whether the single interval that we have calculated is one of the good 95%
until we have observed y*.
Example 11.8 Let’s return to the carbonation depth–strength data of Example 11.7 and calculate a
95% prediction interval for a strength value that would result from selecting a single
core specimen whose carbonation depth is 45 mm. Relevant quantities from that
example are
ŷ = 13.79    s_ŷ = .7582    s_e = 2.8640

For a prediction level of 95% based on n − 2 = 16 df, the t critical value is 2.120,
exactly what we previously used for a 95% confidence level. The prediction interval
is then

13.79 ± (2.120)·√((2.8640)² + (.7582)²) = 13.79 ± (2.120)(2.9627) = 13.79 ± 6.28 = (7.51, 20.07)
Plausible values for a single observation on strength when depth is 45 mm are (at
the 95% prediction level) between 7.51 MPa and 20.07 MPa. The 95% confidence
interval for mean strength when depth is 45 was (12.18, 15.40). The prediction in-
terval is much wider than this because of the extra (2.8640)2 under the square root.
Figure 11.10, the Minitab output in Example 11.7, shows this interval as well as the
confidence interval.
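The interval calculations of Examples 11.7 and 11.8 can be reproduced directly from the summary quantities. The following sketch mirrors the hand computations; the variable names are mine, and it assumes SciPy is available for the t critical value:

```python
import math
from scipy import stats

# Summary quantities from Example 11.7
n = 18
xbar = 36.61111
Sxx = 4840.7778
a, b = 27.182936, -0.297561   # least squares intercept and slope
se = 2.8640                   # residual standard deviation

x_star = 45.0
y_hat = a + b * x_star                                    # point estimate/prediction
s_yhat = se * math.sqrt(1/n + (x_star - xbar)**2 / Sxx)   # estimated sd of y_hat
t_crit = stats.t.ppf(0.975, df=n - 2)                     # 16 df, approximately 2.120

# 95% CI for the mean strength when depth is 45 mm
ci = (y_hat - t_crit * s_yhat, y_hat + t_crit * s_yhat)

# 95% PI for the strength of a single specimen: extra se**2 under the root
s_pred = math.sqrt(se**2 + s_yhat**2)
pi = (y_hat - t_crit * s_pred, y_hat + t_crit * s_pred)
```

Running this recovers the intervals (12.18, 15.40) and (7.51, 20.07) up to rounding, and makes the role of the extra s_e² term in the PI explicit.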
11.3 Exercises 531
Simultaneous Intervals
Suppose we wish to calculate a confidence interval for the mean y value or a prediction
interval for a future y value both when x = x₁* and also when x = x₂*, two different values
of the predictor variable. If the confidence or prediction level for each individual inter-
val is 95%, then the joint or simultaneous level of confidence for both intervals will be
smaller than 95%. For example, from Examples 11.7 and 11.8, we can be 95% confident
that a y value to be observed when x 5 45 will be in the interval (7.51, 20.07) and also
95% confident that a y value to be observed when x 5 35 will lie in the interval (10.53,
23.01). The degree of confidence in the simultaneous statements
must be less than 95%. It is very difficult to say exactly what the degree of simulta-
neous confidence is, because the two intervals are not based on independent data
sets [if they were, the simultaneous level would be 100(.95)2 90%]. What can be
said is that the simultaneous confidence level will be at least 100(1 2 2(.05))%, that
is, at least 90%. More generally, if k different intervals are calculated, each using a
confidence or prediction level of 100(1 2 )%, then the simultaneous confidence
or prediction level for all k intervals will be at least 100(1 2 k)%. Thus if three
different 99% confidence intervals were computed, the simultaneous confidence
level would be at least 97%. There is a special table of t critical values for which the
simultaneous level for k intervals is at least 95% (k = 2, 3, 4, . . .) and another such
table for at least 99%; the tabulated numbers are called Bonferroni t critical values
after the mathematician whose inequality justifies the “at least” statement. If more
than two or three of these intervals are calculated, they will have to be quite wide to
guarantee at least the desired level.
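The Bonferroni bound lends itself to a one-line calculation. In the sketch below (the function names are illustrative, not from the text), the first function gives the guaranteed joint level for k intervals, and the second gives the adjusted t critical value that makes the simultaneous level at least the target:

```python
from scipy import stats

def joint_lower_bound(k, individual_level):
    """Bonferroni guarantee: k intervals, each at the given individual
    confidence/prediction level, have joint level >= 1 - k*alpha."""
    return max(0.0, 1 - k * (1 - individual_level))

def bonferroni_t(k, df, simultaneous_level=0.95):
    """t critical value so that k intervals are jointly at least
    100*simultaneous_level%: each interval uses level 1 - alpha/k."""
    alpha = 1 - simultaneous_level
    return stats.t.ppf(1 - alpha / (2 * k), df)
```

For example, joint_lower_bound(3, 0.99) returns 0.97, matching the three-interval illustration above, and bonferroni_t(2, 16) is larger than the usual 2.120, reflecting the wider intervals needed for a simultaneous guarantee.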
herbicide manufacturing, and fiberglass manufacturing. These compounds are toxic, carcinogenic, and have contributed over the past decades to environmental pollution of aquatic environments. In one study reported in “Photolysis, Biodegradation, and Sorption Behavior of Three Selected Phenolic Compounds on the Surface and Sediment of Rivers” (J. of Envir. Engr., 2011: 1114–1121), the authors examined the sorption characteristics of three selected phenolic compounds. The following data on y = sorbed concentration (μg/g) and x = equilibrium concentration (μg/mL) of 2,4-Dinitrophenol (DNP) in a particular natural river sediment was read from a graph in the article.
x: 0.11 0.13 0.14 0.18 0.29 0.44 0.67 0.78 0.93
y: 1.72 2.17 2.33 3.00 5.17 7.61 11.17 12.72 14.78
a. Calculate point estimates of the slope and intercept of the population regression line.
b. Using the simple linear regression model fit to this data, confirm that ŷ = 3.404, s_ŷ = .107 when x = .2, and ŷ = 6.616, s_ŷ = .088 when x = .4. Explain why s_ŷ is larger when x = .2 than when x = .4.
c. Calculate a confidence interval with a confidence level of 95% for the true average DNP sorbed concentration of all river sediment specimens using an equilibrium concentration of .4.
d. Calculate a prediction interval with a prediction level of 95% for the DNP sorbed concentration of a single river sediment specimen using an equilibrium concentration of .4.
e. If a 95% CI is calculated for true average DNP sorbed concentration when equilibrium concentration is .2, what will be the simultaneous confidence level for both this interval and the interval calculated in part (c)?
23. Refer to Exercise 6 of Section 11.1.
a. Predict oxygen diffusivity for a single observation to be made when temperature is 1500°F, and do so in a way that conveys information about reliability and precision.
b. Would a prediction interval for diffusivity when temperature is 1200°F using the same prediction level as in part (a) be wider or narrower than the interval of part (a)? Answer without computing this second interval.
24. The simple linear regression model provides a very good fit to the data on rainfall and runoff volume given in Exercise 4 of Section 11.1. The equation of the least squares line is ŷ = −1.128 + .82697x, r² = .975, and s_e = 5.24. Use the fact that s_ŷ = 1.44 when rainfall volume is 40 m³ to predict runoff in a way that conveys information about reliability and precision. Does the resulting interval suggest that precise information about the value of runoff for this future observation is available? Explain your reasoning.
25. The article “Root Dentine Transparency: Age Determination of Human Teeth Using Computerized Densitometric Analysis” (Amer. J. of Physical Anthro., 1991: 25–30) reported on an investigation of methods for age determination based on tooth characteristics. A single observation on y = age (yr) was made for each of the following values of x = % of root with transparent dentine: 15, 19, 31, 39, 41, 44, 47, 48, 55, 64. Consider the following six intervals based on the resulting data: (i) a 95% CI for mean age when x = 35; (ii) a 95% PI for age when x = 35; (iii) a 95% CI for mean age when x = 42; (iv) a 95% PI for age when x = 42; (v) a 99% CI for mean age when x = 42; (vi) a 99% PI for age when x = 42. Without computing any of these intervals, what can be said about their relative widths?
26. During oil drilling operations, components of the drilling assembly may suffer from sulfide stress cracking. The article “Composition Optimization of High-Strength Steels for Sulfide Cracking Resistance Improvement” (Corrosion Sci., 2009: 2878–2884) reported on a study in which the composition of a standard grade of steel was analyzed. The following data on y = threshold stress (% SMYS) and x = yield strength (MPa) was read from a graph in the article (which also included the equation of the least squares line).
x: 635 644 711 708 836 820 810 870 856 923 878 937 948
y: 100 93 88 84 77 75 74 63 57 55 47 43 38
a. Does a scatterplot support the use of the simple linear regression model for relating y to x?
11.4 Multiple Regression Models 533
b. What proportion of observed variation in stress can be attributed to the approximate linear relationship between the two variables?
c. Determine a 90% confidence interval for the true average threshold stress of all similar steel specimens whose yield strength is 800 MPa.
d. Determine a 90% prediction interval for the threshold stress of a single steel specimen whose yield strength is 800 MPa.
27. Milk is an important source of protein. How does the amount of protein in milk from a cow vary with milk production? The article “Metabolites of Nucleic Acids in Bovine Milk” (J. of Dairy Science, 1984: 723–728) reported the accompanying data on x = milk production (kg/day) and y = milk protein (kg/day) for Holstein-Friesan cows:
x: 42.7 40.2 38.2 37.6 32.2 32.2 28.0 27.2 26.6 23.0 22.7 21.8 21.3 20.2
y: 1.20 1.16 1.07 1.13 .96 1.07 .85 .87 .77 .74 .76 .69 .72 .64
Relevant calculated values include Sxx = 762.012, b = .024576, a = .175576, SSTo = .48144, and SSResid = .02120.
a. Does the simple linear regression model specify a useful relationship between production and protein?
b. Estimate true average protein for all cows whose production is 30 kg/day; use a confidence interval with a confidence level of 99%. Does the resulting interval suggest that this mean value has been precisely estimated? Explain your reasoning.
c. Calculate a 99% prediction interval for the protein from a single cow whose production is 30 kg/day.
28. Obtain an expression for s_a, the estimated standard deviation of the intercept a of the least squares line. Then use the fact that t = (a − α)/s_a has a t distribution with n − 2 df to test H₀: α = 0 for the data in Exercise 27 (this null hypothesis says that the population regression line passes through the origin). Hint: When x = 0, ŷ = a + b(0) = a, and we have a general expression for s_ŷ.
definitions A general additive multiple regression model, which relates a dependent vari-
able y to k predictor variables x₁, x₂, . . . , xₖ, is given by the model equation

y = α + β₁x₁ + β₂x₂ + … + βₖxₖ + e

The random deviation e is assumed to be normally distributed with mean value
0 and variance σ² for any particular values of the predictors, and the e’s resulting
from different observations are assumed to be independent of one another. The
βᵢ’s are called population regression coefficients, and the deterministic portion
α + β₁x₁ + … + βₖxₖ is the population regression function.
Let x₁*, x₂*, . . . , xₖ* denote particular values of the predictors. Then the model equa-
tion and assumptions about e imply that

(mean y value when the predictors are x₁*, . . . , xₖ*) = α + β₁x₁* + … + βₖxₖ*
Example 11.9 Cardiorespiratory fitness is widely recognized as a major component of overall physi-
cal well-being. Direct measurement of maximal oxygen uptake (VO2max) is the
single best measure of such fitness, but direct measurement is time-consuming and
expensive. It is therefore desirable to have a prediction equation for VO2max in terms
of easily obtained quantities. Consider the variables
y = VO2max (L/min)    x₁ = weight (kg)    x₂ = age (yr)
x₃ = time necessary to walk 1 mile (min)
x₄ = heart rate at the end of the walk (beats/min)
Here is one possible model, for male students, consistent with the information given
in the article “Validation of the Rockport Fitness Walking Test in College Males and
Females” (Research Quarterly for Exercise and Sport, 1994: 152–158):

y = 5.0 + .01x₁ − .05x₂ − .13x₃ − .01x₄ + e    σ = .4

The population regression function is

(mean y value for fixed x₁, . . . , x₄) = 5.0 + .01x₁ − .05x₂ − .13x₃ − .01x₄
For individuals whose weight is 76 kg, age is 20 yr, walk time is 12 min, and heart
rate is 140 beats/min,
mean value of VO2max = 5.0 + .01(76) − .05(20) − .13(12) − .01(140)
= 1.80 L/min
With 2σ = .80, it is quite likely (a probability of roughly .95) that an actual y value
observed when the xᵢ’s are as stated will be within .80 of the mean value, that is, in
the interval from 1.00 to 2.60.
The value β₂ = −.05 is interpreted as the average change in VO2max (here a
decrease) associated with a 1-year increase in age while weight, walk time, and heart
rate are all held fixed. The three other βᵢ’s associated with predictors have similar
interpretations.
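The arithmetic in Example 11.9 is easy to mirror in code. This sketch (the function name is mine, not the text's) evaluates the population regression function and the rough 95% range μ ± 2σ for a single observation:

```python
# Population regression function from Example 11.9 (coefficients from the text)
def mean_vo2max(weight, age, walk_time, heart_rate):
    """Mean VO2max (L/min) for the stated predictor values."""
    return 5.0 + 0.01*weight - 0.05*age - 0.13*walk_time - 0.01*heart_rate

mu = mean_vo2max(weight=76, age=20, walk_time=12, heart_rate=140)  # 1.80 L/min
sigma = 0.4
interval = (mu - 2*sigma, mu + 2*sigma)  # roughly 95% range for a single y: (1.00, 2.60)
```

Each coefficient can be probed the same way: increasing age by 1 while holding the other arguments fixed decreases the returned mean by .05, which is exactly the interpretation of β₂ given above.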
Consider the model equation

y = α + β₁x + β₂x² + e

If we rewrite this equation with x₁ = x and x₂ = x², a special case of the general
multiple regression model with k = 2 results. Notice that one of the two predictors is a
mathematical function of the other one: x₂ = (x₁)². In general, in a multiple regression
model, it is perfectly legitimate to have one or more of the k predictors be mathemati-
cal functions of other predictors. For example, we will shortly discuss models that include
an interaction predictor of the form x₃ = x₁x₂, a product of two other predictors. In particu-
lar, the general polynomial regression model begins with a single independent variable x
and creates predictors x₁ = x, x₂ = x², . . . , xₖ = x^k for some specified value of k.
The most important special case other than simple linear regression (k = 1) is the
quadratic regression model

y = α + β₁x + β₂x² + e

This model replaces the line of mean y values in simple linear regression with a
parabolic curve of mean values α + β₁x + β₂x². If β₂ < 0, the curve opens down-
ward, as in Figure 11.13(a), whereas it opens upward when β₂ > 0. A less fre-
quently encountered case is that of cubic regression, in which k = 3.
Example 11.10 Researchers have examined a variety of climatic variables in an attempt to gain an
understanding of the mechanisms that govern rainfall runoff. The article “The Appli-
cability of Morton’s and Penman’s Evapotranspiration Estimates in Rainfall-Runoff
Modeling” (Water Resources Bull., 1991: 611–620) reported on a study in which data
on x = cloud cover and y = daily sunshine (hr) was gathered from a number of dif-
ferent locations. The authors used a cubic regression model to relate these variables.
Suppose that the actual model equation for a particular location is

y = 11 − .400x − .250x² + .005x³ + e

Then the regression function is

(mean daily sunshine for given cloud cover x) = 11 − .400x − .250x² + .005x³

For example,

(mean daily sunshine when cloud cover is 4) = 11 − .400(4) − .250(4)² + .005(4)³
= 5.72

If σ = 1, it is quite likely that an observation on daily sunshine made when x = 4
would be between 3.72 and 7.72 hr.
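Example 11.10's cubic regression function can be checked the same way; a brief sketch (the function name is illustrative):

```python
def mean_sunshine(x):
    """Cubic regression function from Example 11.10: mean daily sunshine (hr)
    for a given cloud cover x."""
    return 11 - 0.400*x - 0.250*x**2 + 0.005*x**3

mu = mean_sunshine(4)                     # 5.72 hr
sigma = 1.0
interval = (mu - 2*sigma, mu + 2*sigma)   # roughly 95% range: (3.72, 7.72)
```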
for temperature values between 80 and 100 in combination with pressure values rang-
ing from 50 to 70. The population regression function 1200 + 15x₁ − 35x₂ gives the
mean y value for any particular values of the predictors. Consider this mean y value for
three different particular temperature values:

x₁ = 90:  mean y value = 1200 + 15(90) − 35x₂ = 2550 − 35x₂
x₁ = 95:  mean y value = 2625 − 35x₂
x₁ = 100: mean y value = 2700 − 35x₂

Graphs of these three mean y value functions are shown in Figure 11.14(a). Each graph
is a straight line, and the three lines are parallel, each with a slope of −35. Thus irre-
spective of the fixed value of temperature, the average change in yield associated with a
1-unit increase in pressure is −35.
[Figure: panel (a) shows three parallel mean y value lines; panel (b) shows three lines with different slopes]
Figure 11.14 Graphs of the mean y value for two different models:
(a) 1200 + 15x₁ − 35x₂; (b) −4500 + 75x₁ + 60x₂ − x₁x₂
Since chemical theory suggests that the decline in average yield when pressure x2
increases should be more rapid for a high temperature than for a low temperature, the
chemist now has reason to doubt the appropriateness of the proposed model. Rather
than the lines being parallel, the line for a temperature of 100 should be steeper than
the line for a temperature of 95, and that line in turn should be steeper than the line for
x₁ = 90. A model that has this property includes, in addition to predictors x₁ and x₂, a
third predictor variable, x₃ = x₁x₂. One such model is

y = −4500 + 75x₁ + 60x₂ − x₁x₂ + e

for which the population regression function is −4500 + 75x₁ + 60x₂ − x₁x₂. This gives

x₁ = 90:  mean y value = −4500 + 75(90) + 60x₂ − 90x₂ = 2250 − 30x₂
x₁ = 95:  mean y value = 2625 − 35x₂
x₁ = 100: mean y value = 3000 − 40x₂

These are graphed in Figure 11.14(b), where it is clear that the three slopes are differ-
ent. Now each different value of x₁ yields a line with a different slope, so the average
change in yield associated with a 1-unit increase in x₂ depends on the value of x₁. When
this is the case, the two variables are said to interact.
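The interaction effect is easy to see numerically: for the regression function −4500 + 75x₁ + 60x₂ − x₁x₂ discussed above, the slope with respect to x₂ at a fixed x₁ is 60 − x₁, so it changes with temperature. A sketch (the function names are mine):

```python
# Population regression function with an interaction predictor x3 = x1*x2
def mean_yield(x1, x2):
    return -4500 + 75*x1 + 60*x2 - x1*x2

# Change in mean yield per 1-unit increase in x2, at a fixed x1 (equals 60 - x1)
def slope_in_x2(x1):
    return mean_yield(x1, 1) - mean_yield(x1, 0)
```

Evaluating slope_in_x2 at x₁ = 90, 95, 100 gives −30, −35, and −40: the higher the temperature, the steeper the decline in mean yield with pressure, which is exactly the interaction pattern of Figure 11.14(b).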
definition If the change in the mean y value associated with a 1-unit increase in one inde-
pendent variable depends on the value of a second independent variable, there is
interaction between these two variables. Denoting the two independent variables
by x₁ and x₂, we can model this interaction by including as an additional predictor
x₃ = x₁x₂, the product of the two independent variables.
The general equation for a multiple regression model based on two independent
variables x₁ and x₂ that also includes an interaction predictor is

y = α + β₁x₁ + β₂x₂ + β₃x₃ + e    where x₃ = x₁x₂

When x₁ and x₂ do interact, this model will usually give a much better fit to resulting
data than would the no-interaction model. Failure to consider a model with interaction
too often leads an investigator to conclude incorrectly that the relationship between y
and a set of independent variables is not very substantial.
More than one interaction predictor can be included in the model when more than
two independent variables are available. If, for example, three independent variables x₁,
x₂, and x₃ are available, one possible model is

y = α + β₁x₁ + β₂x₂ + β₃x₃ + β₄x₄ + β₅x₅ + β₆x₆ + e

where x₄ = x₁x₂, x₅ = x₁x₃, and x₆ = x₂x₃.
One could even include a three-way interaction x₇ = x₁x₂x₃, although in practice this
is rarely done. In applied work, quadratic predictors such as x₁² and x₂² are often in-
cluded to model a curved relationship between y and several independent variables. A
frequently used model with k = 5 based on two independent variables x₁ and x₂ is the
full quadratic or complete second-order model

y = α + β₁x₁ + β₂x₂ + β₃x₁x₂ + β₄x₁² + β₅x₂² + e
This model replaces the straight lines of Figure 11.14 with parabolas (each one is
the graph of the population regression function as x2 varies when x1 has a particular
value). Starting with four independent variables x1,…, x4, one could create a model
with four quadratic predictors and six two-way interaction predictor variables. Clearly,
a great many different models can be created from just a small number of independent
variables. In Section 11.6 we briefly discuss methods for selecting one model from a
number of competing models.
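Fitting any of these models reduces to ordinary least squares on an expanded design matrix. The sketch below builds the complete second-order design matrix for two predictors and fits it with NumPy; the simulated data are my own (not from the text) and are generated from a first-order model, so the fitted values should track the true mean closely:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(80, 100, size=40)   # e.g., temperature
x2 = rng.uniform(50, 70, size=40)    # e.g., pressure
y = 1200 + 15*x1 - 35*x2 + rng.normal(0, 5, size=40)  # simulated responses

# Columns: 1, x1, x2, x1*x2, x1^2, x2^2 (complete second-order model, k = 5)
X = np.column_stack([np.ones_like(x1), x1, x2, x1*x2, x1**2, x2**2])

# Least squares estimates of (alpha, beta1, ..., beta5)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
```

The only modeling step is constructing the columns; once the interaction and quadratic predictors are in the matrix, the fit is the same least squares computation used for any multiple regression model.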
Example 11.11 The article “Estimating Urban Travel Times: A Comparative Study” (Trans. Res.,
1980: 173–175) described a study relating the dependent variable y 5 travel time
between locations in a certain city and the independent variable x₂ = distance be-
tween locations. Two types of vehicles, passenger cars and trucks, were used in the
study. Let
x₁ = 1 if the vehicle is a truck, 0 if the vehicle is a passenger car

One possible multiple regression model is

y = α + β₁x₁ + β₂x₂ + e

The mean value of travel time depends on whether a vehicle is a car or a truck:

mean time = α + β₂x₂  when x₁ = 0 (cars)
mean time = α + β₁ + β₂x₂  when x₁ = 1 (trucks)

A second model also includes an interaction predictor:

y = α + β₁x₁ + β₂x₂ + β₃x₃ + e    where x₃ = x₁x₂
For each model, the graph of the mean time versus distance is a straight line for
either type of vehicle, as illustrated in Figure 11.15. The two lines are parallel for
the first (no-interaction) model, but in general they will have different slopes when
the second model is correct. For this latter model, the change in mean travel time
associated with a 1-mile increase in distance depends on which type of vehicle is
involved—the two variables “vehicle type” and “distance” interact. Indeed, data
collected by the authors of the cited article suggested the presence of interaction.
[Figure: two panels of mean travel time vs. distance, with one line for x₁ = 0 and one for x₁ = 1; the lines are parallel in panel (a) and nonparallel in panel (b)]
Figure 11.15 Regression functions for models with one dummy variable (x₁) and
one quantitative variable x₂: (a) no interaction; (b) interaction
You might think that the way to handle a three-category situation is to define a
single numerical variable with coded values such as 0, 1, and 2 corresponding to the Unless otherwise noted, all content on this page is © Cengage Learning.
three categories. This is incorrect, because it imposes an ordering on the categories that
is not necessarily implied by the problem context. The correct approach to incorporat-
ing three categories is to define two different dummy variables. Suppose, for example,
that y is the lifetime of a certain cutting tool, x1 is cutting speed, and there are three
brands of tool being investigated. Then let
x2 = 1 if a brand A tool is used, x2 = 0 otherwise
x3 = 1 if a brand B tool is used, x3 = 0 otherwise
so that x2 and x3 are the indicator variables for brand A and brand B. The no-interaction model would have only the predictors x1, x2, and x3.
The following interaction model allows the mean change in lifetime associated with a
1-unit increase in speed to depend on the brand of tool:
y = α + β1x1 + β2x2 + β3x3 + β4x1x2 + β5x1x3 + e
Construction of a picture like Figure 11.14 with a graph for each of the three possible (x2, x3) pairs gives three nonparallel lines (unless β4 = β5 = 0).
More generally, incorporating a categorical variable with c possible categories into a multiple regression model requires the use of c − 1 indicator variables (e.g., five brands of tools would necessitate using four indicator variables). Thus even one categorical variable can add many predictors to a model.
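The c − 1 indicator-variable scheme is easy to mechanize. As a sketch (the brand labels and function name here are our own, not from the text), three tool brands map to two dummies, with the third brand serving as the baseline:

```python
def dummies_for_brand(brand):
    """Map a three-category brand (A, B, or C) to the two indicator
    variables x2 and x3 described in the text: x2 flags brand A,
    x3 flags brand B, and brand C is the baseline (0, 0)."""
    if brand not in ("A", "B", "C"):
        raise ValueError("unknown brand")
    x2 = 1 if brand == "A" else 0
    x3 = 1 if brand == "B" else 0
    return x2, x3

# Each category gets a distinct (x2, x3) pair; unlike coding the
# categories 0, 1, 2 in a single column, no ordering is imposed.
print(dummies_for_brand("A"))  # (1, 0)
print(dummies_for_brand("C"))  # (0, 0)
```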
11.4 Exercises

29. A trucking company considered a multiple regression model for relating the dependent variable y = total daily travel time for one of its drivers (hours) to the predictors x1 = distance traveled (miles) and x2 = the number of deliveries made. Suppose that the model equation is
y = −.800 + .060x1 + .900x2 + e
a. What is the mean value of travel time when distance traveled is 50 miles and three deliveries are made?
b. How would you interpret β1 = .060, the coefficient of the predictor x1? What is the interpretation of β2 = .900?
c. If σ = .5 hour, what is the probability that travel time will be at most 6 hours when three deliveries are made and the distance traveled is 50 miles?

30. Consider the regression model y = −6.50 + .250x1 + .600x2 − .150x3 + .160x4 + e, where y = gasoline yield (% of crude oil), x1 = crude oil gravity (°API), x2 = crude oil vapor pressure (PSIA), x3 = crude oil ASTM 10% point (°F), and x4 = gasoline end point (°F).
a. Interpret the population regression coefficients β1 and β3.
b. What is the mean yield when x1 = 40, x2 = 5, x3 = 230, and x4 = 360?

31. High-alumina refractory castables have been extensively investigated in recent years because of their significant advantages over other refractory brick of the same class: lower production and application costs, versatility, and performance at high temperatures. The authors of "Processing of Zero-Cement Self-Flow Alumina Castables" (The Amer. Ceramic Soc. Bull., 1998: 60–66) proposed a quadratic regression model to describe the relationship between x = viscosity (MPa ∙ sec) and y = free flow (%). Suppose the actual model is y = −296 + 2.20x − .003x² + e.
a. Graph the true regression function y = −296 + 2.20x − .003x² for x values between 350 and 485.
b. Would mean free flow percentage be higher for a viscosity value of 450 or 470?
c. What is the change in mean free flow percentage when the viscosity increases from 450 to 460? From 460 to 470?

32. Let y = wear life of a bearing, x1 = oil viscosity, and x2 = load. Suppose that the multiple regression model relating life to viscosity and load is
y = 125.0 + 7.750x1 + .0950x2 − .0090x1x2 + e
a. What is the mean value of life when viscosity is 40 and load is 1100?
b. When viscosity is 30, what is the change in mean life associated with an increase of 1 in load? When viscosity is 40, what is the change in mean life associated with an increase of 1 in load?

33. Let y = sales at a fast-food outlet (1000s of $), x1 = number of competing outlets within a 1-mile radius, x2 = population within a 1-mile radius (1000s of people), and x3 be an indicator variable that equals 1 if the outlet has a drive-up window and 0 otherwise. Suppose that the true regression model is
y = 10.00 − 1.2x1 + 6.8x2 + 15.3x3 + e
a. What is the mean value of sales when the number of competing outlets is 2, there are 8000 people within a 1-mile radius, and the outlet has a drive-up window?
b. What is the mean value of sales for an outlet without a drive-up window that has 3 competing outlets and 5000 people within a 1-mile radius?
y = α + β1x1 + … + βkxk + e
discussed in Section 11.4. Estimation of model parameters and other inferences are based on a sample of n observations, each one consisting of k + 1 numbers: a value of x1, a value of x2, . . . , a value of xk, and a value of y. As in simple linear regression, the principle of least squares is used to estimate the population regression coefficients α, β1, . . . , βk. The least squares estimates a, b1, b2, . . . , bk are chosen to minimize the sum of squared deviations:
∑all obs [y − (a + b1x1 + … + bkxk)]²
11.5 Inferences in Multiple Regression 543
Example 11.12 The article "How to Optimize and Control the Wire Bonding Process: Part II" (Solid State Technology, Jan. 1991: 67–72) described an experiment carried out to assess the
impact of the variables x1 = force (g), x2 = power (mW), x3 = temperature (°C), and x4 = time (ms) on y = ball bond shear strength (g). The following data¹ was generated to be consistent with the information given in the article:
¹From the book Statistics Engineering Problem Solving by Stephen Vardeman, an excellent exposition of the territory covered by our book, albeit at a somewhat higher level.
Thus we estimate that .1297 gm is the average change in strength associated with a 1-degree increase in temperature when the other three predictors are held fixed; the other estimated coefficients are interpreted in a similar manner.
The estimated regression equation is
ŷ = −37.48 + .2117x1 + .4983x2 + .1297x3 + .2583x4
A point prediction of strength resulting from a force of 35 g, power of 75 mW, temperature of 200 degrees, and time of 20 ms is
ŷ = −37.48 + (.2117)(35) + (.4983)(75) + (.1297)(200) + (.2583)(20) = 38.41 g
This is also a point estimate of the mean value of strength for the specified values of force, power, temperature, and time.
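The arithmetic of the point prediction is just the intercept plus a dot product of coefficients with predictor values; a minimal sketch using the estimated coefficients above:

```python
# Estimated coefficients from Example 11.12: intercept, then b1..b4.
coefs = [-37.48, 0.2117, 0.4983, 0.1297, 0.2583]
x = [35, 75, 200, 20]   # force (g), power (mW), temperature (deg C), time (ms)

# Point prediction: intercept + sum of coefficient * predictor value.
y_hat = coefs[0] + sum(b * xi for b, xi in zip(coefs[1:], x))
print(round(y_hat, 2))   # 38.41 (g), matching the text
```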
Substituting the values of the predictors from the successive observations into the equation for the estimated regression gives the predicted or fitted values ŷ1, ŷ2, …, ŷn. For example, since the values of the four predictors for the last observation in Example 11.12 are 35, 75, 200, and 20, respectively, the corresponding predicted value is ŷ30 = 38.41. The residuals are the differences y1 − ŷ1, …, yn − ŷn. The last residual in Example 11.12 is 40.3 − 38.41 = 1.89. The closer the residuals are to zero, the better the job our estimated equation is doing in predicting the y values corresponding to values of the predictors in our sample. Squaring these residuals and summing gives the residual or error sum of squares ∑(yi − ŷi)², denoted by SSResid. The number of df associated with SSResid is n − (k + 1). The explanation is that the k + 1 parameters α, β1, …, βk have to be estimated from the data before SSResid can be calculated, resulting in a loss of this many df (in simple linear regression, k = 1 so df = n − 2). The variance σ² of a random deviation e in the model equation is estimated by s²e = SSResid/[n − (k + 1)], and se is the estimate of σ. For the data of Example 11.12, SSResid = 665.12, so s²e = 665.12/[30 − (4 + 1)] = 26.60 and the estimated standard deviation is se = 5.16. We estimate that, roughly speaking, the size of a typical deviation of y from its mean value will be about 5.2 g.
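The degrees-of-freedom bookkeeping can be sketched directly (values from Example 11.12):

```python
import math

ss_resid = 665.12   # residual sum of squares from Example 11.12
n, k = 30, 4        # 30 observations, 4 predictors

df = n - (k + 1)            # df lost: one per estimated parameter
mse = ss_resid / df         # estimate of sigma^2
s_e = math.sqrt(mse)        # estimate of sigma

print(df, round(mse, 2), round(s_e, 2))   # 25 26.6 5.16
```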
Model Utility
A very important quantity introduced in Section 3.5 is the coefficient of multiple
determination, R2, given by
R² = 1 − SSResid/SSTo    where SSTo = ∑(yi − ȳ)²
R2 is interpreted as the proportion of variation in the observed y values that can be attributed
to (or explained by) the model relationship between y and the predictors. The closer R2 is to
1, the more effectively the model has explained variation in y by relating it to the predictors.
The coefficient of multiple determination for the data of Example 11.12 is .714, so some-
what more than 70% of the observed variation in strength can be attributed to the model
relationship between strength and the four predictors force, power, temperature, and time.
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
11.5 Inferences in Multiple Regression 545
The value of R2 cannot decrease when an extra predictor is added to the model,
and it will generally increase. Furthermore, the value of R2 can almost always be made
very close to 1 simply by using a model whose number of predictors is quite close to the
sample size, even if many of these predictors are “frivolous” in the sense that they would
contribute only marginally to explaining variation in y. Because R2 can be misleading in
this way, a quantity called adjusted R2 is included on multiple regression output from
most statistical computer packages. It is defined by
adjusted R² = 1 − (SSResid/[n − (k + 1)]) / (SSTo/(n − 1)) = 1 − [(n − 1)/(n − (k + 1))] · (SSResid/SSTo)
Replacing the expression in brackets on the far right by 1 gives R2 itself. Since this expres-
sion is less than 1, the adjusted R2 is smaller than R2. This downward adjustment will
be small when R2 is reasonably high and this has been achieved by using a model with
relatively few predictors compared to the sample size. For example, adjusted R2 for the
model fit in Example 11.12 is .668, which is not all that much smaller than R2 itself. The
adjustment will be more dramatic when R2 is not so high or when k is large relative to n.
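The downward adjustment can be computed without the sums of squares themselves, since 1 − adjusted R² = [(n − 1)/(n − (k + 1))](1 − R²). A sketch using the Example 11.12 values:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 from ordinary R^2, sample size n, and k predictors.
    Equivalent to 1 - [(n-1)/(n-(k+1))] * (SSResid/SSTo)."""
    return 1 - (n - 1) / (n - (k + 1)) * (1 - r2)

# R^2 = .714 with n = 30 and k = 4 predictors:
print(round(adjusted_r2(0.714, 30, 4), 3))   # 0.668, matching the text
```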
High values of R2 and adjusted R2 certainly suggest that the model fit is a useful one.
But how large should these values be before we draw this conclusion? It is desirable to
have a formal test procedure so that we will not be led astray by intuition. Recall that the
null hypothesis for the model utility test in simple linear regression was that 5 0; its
interpretation was that there is no useful linear relationship between y and the single pre-
dictor x. Here, the null hypothesis states that there is no useful linear relationship between
y and any of the k predictors included in the model. The test is based on F distributions,
which were first encountered in Chapter 9 in connection with the analysis of variance.
The model utility test statistic is
F = MSRegr/MSResid
where
MSRegr = SSRegr/k
MSResid = SSResid/[n − (k + 1)]
SSRegr = SSTo − SSResid
The larger the value of R², the larger the value of F will be, implying that the test is upper-tailed (as were the F tests in ANOVA). When H0 is true, the test statistic has an F distribution based on k numerator and n − (k + 1) denominator df. The P-value for the test is the area under the corresponding F curve to the right of the calculated value of F. Partial information about this P-value can be obtained from the table of F critical values given in Appendix Table VIII. As usual, the null hypothesis is rejected if the P-value is less than or equal to the chosen significance level.
A large value of R2 is no guarantee that the model will be judged useful by the F test. If k is
large relative to n, F will not exceed 0 by a great deal and the P-value will not be very small.
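Because SSRegr = R²·SSTo and SSResid = (1 − R²)·SSTo, the F statistic can be computed from R² alone: F = (R²/k)/[(1 − R²)/(n − (k + 1))]. A sketch using the Example 11.12 summary values (the resulting F value is our own computation, not a number printed in this excerpt):

```python
def model_utility_f(r2, n, k):
    """Model utility F statistic computed from R^2 alone:
    (R^2 / k) divided by ((1 - R^2) / (n - (k + 1)))."""
    return (r2 / k) / ((1 - r2) / (n - (k + 1)))

f = model_utility_f(0.714, 30, 4)
print(round(f, 2))   # 15.6, far beyond the .001 critical value 6.49
```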
Example 11.13 Returning to the bond shear strength data of Example 11.12, a model with k 5 4
predictors was fit, so the relevant hypotheses are
H0: β1 = β2 = β3 = β4 = 0
Ha: at least one of these four β's is not zero
Figure 11.16 shows output from the JMP statistical package. The values of the es-
timated coefficients, se (Root Mean Square Error), R2, and adjusted R2 agree with
those given previously.
This value also appears in the F Ratio column of the ANOVA table in Figure 11.16.
The largest F critical value for 4 numerator and 25 denominator df in our F table is
6.49, which captures an upper-tail area of .001. Thus P-value < .001. The ANOVA
table in the JMP output (Figure 11.16) shows that P-value < .0001. This is a highly
significant result. The null hypothesis should be rejected at any reasonable signifi-
cance level. We conclude that there is a useful linear relationship between y and at
least one of the four predictors in the model. This does not mean that all four predic-
tors are useful; we will say more about this subsequently.
has a t distribution based on n − (k + 1) df. This implies that a confidence interval for βi is
bi ± (t critical value)·sbi
Example 11.14 The JMP output of Figure 11.16 gives b2 = .498333, sb2 = .070191, and error df = n − (k + 1) = 25. The t critical value for a confidence interval for β2 with a confidence level of 95% is 2.060. The confidence interval is
.498333 ± (2.060)(.070191) = .498 ± .145 = (.353, .643)
We therefore estimate with a high degree of confidence that, when the value of power is increased by 1 mW while force, temperature, and time are all held fixed, the associated change in average strength will be between .353 gm and .643 gm.
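The interval arithmetic is a one-liner; note the text rounds the margin to .145 before forming the endpoints, so the third decimal of the lower limit differs slightly from the unrounded computation:

```python
b2, s_b2 = 0.498333, 0.070191
t_crit = 2.060                       # t.025 critical value with 25 df

margin = t_crit * s_b2
lo, hi = b2 - margin, b2 + margin
print(round(lo, 3), round(hi, 3))    # 0.354 0.643
```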
Example 11.15 In Example 3.15 from Section 3.5, we gave a data set consisting of 13 observa-
tions on the variables y 5 adsorption, x1 5 extractable iron, and x2 5 extractable
aluminum. Figure 11.17 is the Minitab output from fitting the model y = α + β1x1 + β2x2 + β3x3 + e, where x3 = x1x2 is an interaction predictor.
Judging from the P-value of .000 for the model utility test, the fitted model
specifies a very useful relationship between y and the predictors. Provided that iron
and aluminum are retained in the model, does the interaction predictor appear to
provide useful information about adsorption? The relevant hypotheses are
H0: β3 = 0
Ha: β3 ≠ 0
The test statistic is the t-ratio b3/sb3, with value .0005278/.0006610 = .80. Our table of t curve tail areas shows that the area under the 13 − (3 + 1) = 9 df curve to the right of .8 is .222 (see Appendix Table VI), so the P-value for the two-tailed test is .444 (.445 according to Minitab). The null hypothesis should not be rejected at any reasonable significance level. It is very plausible that β3 = 0, from which we conclude that the interaction predictor does not appear to provide useful information
beyond what is provided by the predictors iron and aluminum.
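The t-ratio and its df are easy to check in a couple of lines (the tail area itself still comes from Appendix Table VI or software, not from this sketch):

```python
b3, s_b3 = 0.0005278, 0.0006610
n, k = 13, 3                   # 13 observations, 3 predictors

t_ratio = b3 / s_b3            # estimated coefficient over its std. error
df = n - (k + 1)               # error df for the t distribution
print(round(t_ratio, 2), df)   # 0.8 9
```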
More Intervals
Because the individual estimated coefficients vary from sample to sample, so will the value of ŷ = a + b1x1 + … + bkxk for fixed values of x1, . . . , xk. Properties of the sampling distribution of the statistic ŷ can be used to obtain both a confidence interval for a mean y value and a prediction interval for a single y value when the predictors have specified values. Both intervals are based on n − (k + 1) df and have the same general form as in the case of simple linear regression. The CI for a mean y value is
ŷ ± (t critical value)·sŷ
and the PI for a single as-yet-unobserved y value is
ŷ ± (t critical value)·√(s²e + s²ŷ)
where sŷ is the estimated standard deviation of the statistic ŷ. The PI is always wider than the corresponding CI.
Example 11.16 Figure 11.18 shows Minitab output from fitting the model, using only the predictors x1
and x2, to the adsorption data referred to in Example 11.15. About 95% of the observed
variation in adsorption can be attributed to the model relationship. The P-value for
model utility is .000, confirming the utility of the chosen model. The P-values corre-
sponding to t-ratios for the two coefficients are .004 and .000, respectively, indicating
that neither of these predictors should be deleted from the model when the other one is
retained. That is, both predictors appear to provide useful information about y. The last
line of the output gives estimation and prediction information when x1 = 200 and x2 = 40. The values of ŷ and sŷ are 29.16 and 1.76, respectively. The limits of both a 95%
CI for mean adsorption and a 95% PI for a single adsorption value are also displayed.
Notice how much wider the PI is than the CI. Even with a very high R2 value, there
is still a reasonable amount of uncertainty involved in predicting a single value of
adsorption.
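With ŷ = 29.16 and sŷ = 1.76 on 13 − (2 + 1) = 10 df, the two interval widths can be compared directly. The value of se below is a placeholder assumption (the actual se appears in the Minitab output, which is not reproduced in this excerpt); the point of the sketch is only the extra s²e term that makes the PI wider:

```python
import math

y_hat, s_yhat = 29.16, 1.76
t_crit = 2.228            # t.025 critical value with 10 df
s_e = 4.0                 # ASSUMED for illustration; use se from the real output

ci_half = t_crit * s_yhat                         # CI half-width
pi_half = t_crit * math.sqrt(s_e**2 + s_yhat**2)  # PI half-width

print(round(ci_half, 2))      # 3.92
assert pi_half > ci_half      # the PI is always wider than the CI
```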
The test is upper-tailed and is based on the F distribution having g numerator df and
n 2 (k 1 1) denominator df.
Example 11.17 For the bond shear strength data given in Example 11.12, the model with the four
predictors force, power, temperature, and time gave SSResid 5 665.12, R2 5 .714,
and adjusted R2 5 .668. Now consider as the full model the complete second-order
model containing not only x1–x4 but also 4 quadratic predictors and 6 interaction
predictors, for a total of 14 predictors. The estimated regression equation is
strength = 21 − 2.30force − .08power + .836temp − 3.99time
+ .0240for*pow − .0093for*temp + .0755for*time
− .00467pow*temp + .0237pow*time
+ .0007temp*time + .0152forsqd + .00130powsqd
− .00011tempsqd − .0078timesqd
with SSResid(full) 5 426.93, R2 5 .816, adjusted R2 5 .645, and P-value 5 .002 for
the model utility F test. Should any of the second-order predictors be retained in the
model? The relevant null hypothesis is
H0: β5 = β6 = … = β14 = 0
whereas the alternative hypothesis states that at least one of these β's is not zero (that is, there is at least one useful second-order predictor). The number of predictors in the subset being considered for deletion is g = 10, which is numerator df; denominator df is 30 − (14 + 1) = 15. The test statistic value is
F = [(665.12 − 426.93)/10] / (426.93/15) = .84
for which P-value > .10. The null hypothesis should not be rejected at any reason-
able significance level. None of the second-order predictors appears to provide useful
information beyond what is contained in the four first-order predictors.
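This "group of predictors" F statistic generalizes readily; a minimal sketch with the Example 11.17 numbers:

```python
def partial_f(ss_resid_reduced, ss_resid_full, g, n, k):
    """F statistic for testing whether a group of g predictors can be
    dropped from a full model with k predictors fit to n observations:
    numerator df = g, denominator df = n - (k + 1)."""
    num = (ss_resid_reduced - ss_resid_full) / g
    den = ss_resid_full / (n - (k + 1))
    return num / den

f = partial_f(665.12, 426.93, g=10, n=30, k=14)
print(round(f, 2))   # 0.84, matching the text
```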
11.5 Exercises

34. The article "Validation of the Rockport Fitness Walking Test in College Males and Females" (Research Quarterly for Exercise and Sport, 1994: 152–158) recommended the following estimated regression equation for relating y = VO2max (L/min, a measure of cardiorespiratory fitness) to the predictors x1 = gender (female = 0, male = 1), x2 = weight (lb), x3 = 1-mile walk time (min), and x4 = heart rate at the end of the walk (beats/min):
ŷ = 3.5959 + .6566x1 + .0096x2 − .0996x3
d. Using SSResid = 30.1033 and SSTo = 102.3922, what proportion of observed variation in VO2max can be attributed to the model relationship?

35. Exercise 35 of Section 3.5 gave data on x1 = wire feed rate, x2 = welding speed, and y = deposition rate of a welding process. Minitab output from fitting the multiple regression model with x1 and x2 as predictors is given here.
The regression equation is
DepRate = 0.0558 + 0.375 FeedRate + 0.00278 WeldSpd
c. When x1 = 11.5 and x2 = 40, the estimated standard deviation of ŷ is sŷ = .02438. Calculate a 95% confidence interval for true average deposition rate for the given values of x1 and x2.
d. Calculate a 95% prediction interval for the deposition rate resulting from a single experimental run with x1 = 11.5 and x2 = 40.

36. Exercise 37 of Section 3.5 gave R output for a regression of y = deposition over a specified time period on two complex predictors x1 and x2 defined in terms of PAH air concentrations for various species, total time, and total amount of precipitation. Use the output in that exercise to answer the following:
a. Does there appear to be a useful linear relationship between y and at least one of the predictors?
b. The estimated standard deviation of ŷ when x1 is 20,000 and x2 is .002 is sŷ = 21.7. Calculate a 95% confidence interval for the mean value of deposition under these circumstances.
c. Fitting the model with predictors x1 and x2 gave SSResid = 27,454, whereas fitting with x1, x2, and x3 = x1x2 resulted in SSResid = 20,519. Using α = .01, can we conclude that the x1x2 term adds useful information to a 'reduced' model containing only x1 and x2? Note: when g = 1, the resulting F test gives the same conclusion as the t-test for whether a single variable (here, x1x2) contributes useful information to a model.

37. The article "Analysis of the Modeling Methodologies for Predicting the Strength of Air-Jet Spun Yarns" (Textile Res. J., 1997: 39–44) reported on a study carried out to relate yarn tenacity (y, in g/tex) to yarn count (x1, in tex), percentage polyester (x2), first nozzle pressure (x3, in kg/cm²), and second nozzle pressure (x4, in kg/cm²). The estimate of the constant term in the corresponding multiple regression equation was 6.121. The estimated coefficients for the four predictors were −.082, .113, .256, and −.219, respectively, and the coefficient of multiple determination was .946.
a. Assuming that the sample size was n = 25, state and test the appropriate hypotheses to decide whether the fitted model specifies a useful linear relationship between the dependent variable and at least one of the four model predictors.
b. Again using n = 25, calculate the value of adjusted R².
c. Calculate a 99% confidence interval for true mean yarn tenacity when yarn count is 16.5, yarn contains 50% polyester, first nozzle pressure is 3, and second nozzle pressure is 5 if the estimated standard deviation of predicted tenacity under these circumstances is .350.

38. A regression analysis carried out to relate y = repair time for a water filtration system (hr) to x1 = elapsed time since the previous service (months) and x2 = type of repair (1 if electrical and 0 if mechanical) yielded the following model based on n = 12 observations: ŷ = .950 + .400x1 + 1.250x2. In addition, SSTo = 12.72, SSResid = 2.09, and sb2 = .312.
a. Does there appear to be a useful linear relationship between repair time and the two model predictors? Carry out a test of the appropriate hypotheses using a significance level of .05.
b. Given that elapsed time since the last service remains in the model, does type of repair provide useful information about repair time? State and test the appropriate hypotheses using a significance level of .01.
c. Calculate and interpret a 95% confidence interval for β2.
d. The estimated standard deviation of a prediction for repair time when elapsed time is 6 months and the repair is electrical is .192. Predict repair time under these circumstances by calculating a prediction interval with a 99% prediction level. Does the resulting interval suggest that the estimated model will give an accurate prediction? Why or why not?

39. The accompanying data on x = frequency (MHz) and y = power (W) for a certain laser configuration was read from a graph in the article "Frequency Dependence in RF Discharge Excited Waveguide CO2 Lasers" (IEEE J. of Quantum Electronics, 1984: 509–514):
x: 60 63 77 100 125 157 186 222
y: 16 17 19 21 22 20 15 5
Fitting a quadratic regression model to this data yielded the following summary quantities: a = −1.5127, b1 = .391902, b2 = −.00163141, SSResid = .29, SSTo = 202.87, and sb2 = .00003391.
a. Why is b2 negative rather than positive?
b. What proportion of observed variation in output power can be attributed to the model relationship between power and frequency?
c. Carry out a test of hypotheses to decide whether the quadratic regression model is useful.
d. Carry out a test of hypotheses to decide whether the quadratic predictor should be retained in the model.
e. When x = 150, the estimated standard deviation of ŷ is sŷ = .1410. Calculate a 99% confidence interval for true average power when frequency is 150, and also a 99% prediction interval for a single output power observation to be made when frequency is 150.

40. The article "Sensitivity Analysis of a 2.5 kW Proton Exchange Membrane Fuel Cell Stack by Statistical Method" (J. of Fuel Cell Sci. and Tech., 2009: 1–6) used regression methodology to investigate the relationship between fuel cell power (W) and the independent variables x1 = H2 pressure (psi), x2 = H2 flow (stoc), x3 = air pressure (psi), and x4 = airflow (stoc). Here is the Minitab output from fitting the model with the aforementioned independent variables as predictors (also fit by the authors of the cited article):

Predictor      Coef    SE Coef      T      p
Constant     1507.3      206.8   7.29  0.000
x1           -4.282      4.969  -0.86  0.407
x2             7.46      62.11   0.12  0.907
x3          -0.9162     0.6227  -1.47  0.169
x4            90.60      24.84   3.65  0.004

s = 49.6885   R-sq = 59.6%   R-sq(adj) = 44.9%

SOURCE       DF     SS     MS     F      p
Regression    4  40048  10012  4.06  0.029
Res.Error    11  27158   2469
Total        15  67206

a. Does there appear to be a useful relationship between power and at least one of the predictors? Carry out a formal test of hypotheses.
b. Fitting the model with predictors x3, x4, and the interaction x3x4 gave R² = .834. Does this model appear to be useful? Can an F test be used to compare this model to the model of part (a)? Explain.
c. Fitting the model with all 4 predictors as well as all second-order interactions gave R² = .960 (this model was also fit by the investigators). Does it appear that at least one of the interaction predictors provides useful information about power over and above what is provided by the first-order predictors? State and test the appropriate hypotheses using a significance level of .05.

41. The article "The Undrained Strength of Some Thawed Permafrost Soils" (Canadian Geotechnical J., 1979: 420–427) reported the following data on undrained shear strength of sandy soil (y, in kPa), depth (x1, in m), and water content (x2, in %):

Obs  Depth  Watcont  Shstren
 1     8.9    31.5     14.7
 2    36.6    27.0     48.0
 3    36.8    25.9     25.6
 4     6.1    39.1     10.0
 5     6.9    39.2     16.0
 6     6.9    38.3     16.8
 7     7.3    33.9     20.7
 8     8.4    33.8     38.8
 9     6.5    27.9     16.9
10     8.0    33.1     27.0
11     4.5    26.3     16.0
12     9.9    37.8     24.9
13     2.9    34.6      7.3
14     2.0    36.4     12.8

Fitting the model with predictors x1 and x2 only gave SSResid = 894.95, whereas fitting the complete second-order model with predictors x1, x2, x1², x2², and x1x2 resulted in SSResid = 390.64. Carry out a test at significance level .01 to decide whether at least one of the second-order predictors provides useful information about shear strength.

42. Soluble dietary fiber (SDF) can provide health benefits by lowering blood cholesterol and glucose levels. The article "Effects of Twin-Screw Extrusion on Soluble Dietary Fiber and Physicochemical Properties of Soybean Residue" (Food Chemistry, 2013: 884–889) reported the following data on y = SDF content (%) in soybean residue and the three predictors x1 = extrusion temperature (in °C), x2 = feed moisture (in %), and x3 = screw speed (in rpm) of a twin-screw extrusion process.
554 chapter 11 Inferential Methods in Regression and Correlation
11.6 Further Aspects of Regression Analysis 555
Obs   cont   lngth   grad    vel
 1    0.0      0     0.400   0.027
 2    0.0      0     0.716   0.050
 3    0.0      0     0.925   0.080
 4    0.0      0     1.098   0.099
 5    0.0      0     1.226   0.107
 6    0.0      0     1.427   0.140
 7    0.0      0     1.709   0.178
 8    0.0      0     1.872   0.200
 9    0.5     50     0.380   0.022
10    0.5     50     0.774   0.040
11    0.5     50     1.056   0.060
12    0.5     50     1.329   0.111
13    0.5     50     1.598   0.158
14    0.5     50     1.799   0.188
15    1.0     50     0.410   0.026
16    1.0     50     0.577   0.038
17    1.0     50     0.748   0.049
18    1.0     50     0.927   0.060
19    1.0     50     1.090   0.070
20    1.0     50     1.239   0.088
21    1.0     50     1.496   0.111
22    1.0     50     1.744   0.134
23    1.0     50     1.915   0.145
24    1.5     50     0.444   0.014
25    1.5     50     0.821   0.037
26    1.5     50     1.141   0.058
27    1.5     50     1.474   0.082
28    1.5     50     1.581   0.112
29    1.5     50     1.983   0.144
30    1.0     25     0.462   0.028
31    1.0     25     0.705   0.059
32    1.0     25     0.987   0.084
33    1.0     25     1.154   0.101
34    1.0     25     1.479   0.150
35    1.0     25     1.786   0.194
36    1.0     25     1.957   0.218
37    1.0     40     0.419   0.030
38    1.0     40     0.705   0.050
39    1.0     40     0.979   0.068
40    1.0     40     1.226   0.091
41    1.0     40     1.470   0.126
42    1.0     40     1.744   0.168
43    1.0     60     0.436   0.034
44    1.0     60     0.650   0.051
45    1.0     60     0.889   0.068
46    1.0     60     1.222   0.093
47    1.0     60     1.477   0.112
48    1.0     60     1.726   0.139
49    1.0     60     1.983   0.173
a. Here is output from fitting the model with the three xi's as predictors:

Predictor    Coef         SE Coef      T       p
Constant     -0.002997    0.007639    -0.39    0.697
fib cont     -0.012125    0.007454    -1.63    0.111
fib lngth    -0.0003020   0.0001676   -1.80    0.078
hyd grad      0.102489    0.004711    21.76    0.000

s = 0.0162355   R-sq = 91.6%   R-sq(adj) = 91.1%

Source            DF   SS         MS         F        p
Regression         3   0.129898   0.043299   164.27   0.000
Residual Error    45   0.011862   0.000264
Total             48   0.141760

How would you interpret the number -.0003020 in the Coef column on the output?

b. Does fiber content appear to provide useful information about velocity provided that fiber length and hydraulic gradient remain in the model? Carry out a test of hypotheses at α = .05.

c. Fitting the model with just fiber length and hydraulic gradient as predictors gave the estimated regression coefficients a = -.005315, b1 = -.0004968, and b2 = .102204 (the t-ratios for these two predictors are both highly significant). In addition, s_ŷ = .00286 when fiber length = 25 and hydraulic gradient = 1.2. Is there convincing evidence that true average velocity is something other than .1 in this situation? Carry out a test using a significance level of .05.

d. Fitting the complete second-order model (as did the article's authors) resulted in SSResid = .003579. Does it appear that at least one of the second-order predictors provides useful information over and above what is provided by the three first-order predictors? Test the relevant hypotheses at α = .05.
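The nested-model comparisons in parts like (d) above all use the same model-comparison F statistic; a minimal Python sketch (the function name and the illustrative numbers are ours, not the exercise data) is:

```python
# Model-comparison (partial) F statistic for testing whether a group of extra
# predictors improves on a reduced model:
#   F = [(SSResid_red - SSResid_full) / (k_full - k_red)]
#       / [SSResid_full / (n - k_full - 1)]
def partial_f(ss_resid_red, ss_resid_full, k_red, k_full, n):
    """F statistic for H0: the extra k_full - k_red coefficients are all zero."""
    numerator = (ss_resid_red - ss_resid_full) / (k_full - k_red)
    denominator = ss_resid_full / (n - k_full - 1)
    return numerator / denominator

# Hypothetical numbers: reduced model with 2 predictors, full model with 4,
# n = 25 observations.
print(partial_f(100.0, 50.0, 2, 4, 25))   # 10.0
```

The statistic is then referred to an F distribution with k_full - k_red numerator and n - k_full - 1 denominator degrees of freedom.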
relationship among the predictors, and a model appropriate when y is a 0–1 variable
corresponding to a success–failure dichotomy.
The most popular statistical packages will produce these standardized residuals on request.
Example 11.18  The adsorption data introduced in Example 3.15, repeated here, is used in several examples in the previous section. The residuals are based on the model with the two predictors x1 = iron content and x2 = aluminum content.

                                               Estimated
                                               standard     Standardized
Obs   Iron   Aluminum   Adsorption   Residual  deviation    residual
 1     61       13          4        -.06305    3.64425      -.01730
 2    175       21         18       -1.70661    3.72079      -.45867
 3    111       24         14         .46130    4.02690       .11455
 4    124       23         18        3.34477    4.04931       .82601
 5    130       64         26       -3.64064    3.50644     -1.03827
 6    173       38         26         .58585    4.14741       .14126
 7    169       33         21       -2.21821    4.09222      -.54206
 8    169       61         30       -2.99022    4.07688      -.73346
 9    160       39         28        3.70238    4.18323       .88505
10    244       71         36       -8.93520    4.03193     -2.21611
11    257      112         65        4.29026    2.98776      1.43595
12    333       88         62        1.09857    2.99775       .36647
13    199       54         40        6.07079    4.18560      1.45040
Notice that the estimated standard deviations for the 11th and 12th observations
are much smaller than those of most other observations. This is because the x1 and
x2 values for these two observations are quite far from the center of the data. This is
analogous to the least squares line in simple linear regression being pulled toward an
observation whose x value is far to the left or right of the other x values; there is less
variability in the corresponding residual than for the other observations. The only
unusually large residual here is for the 10th observation; because the standardized
residual is -2.22, the residual -8.94 is more than 2 standard deviations smaller than
what would be expected if the correct model had been fit.
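Standardized residuals like those in the table come from the hat matrix; here is a small sketch using synthetic data (the toy data and variable names are ours, not the adsorption values):

```python
import numpy as np

# Standardized residual = residual / [s * sqrt(1 - h_ii)], where the h_ii are
# the diagonal entries of the hat matrix H = X(X'X)^{-1}X'. Toy data for
# illustration only (these are not the adsorption values).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(13), rng.uniform(50, 350, 13), rng.uniform(10, 120, 13)])
y = X @ np.array([1.0, 0.1, 0.3]) + rng.normal(0, 3, 13)

H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)                          # leverages h_ii
resid = y - H @ y                       # residuals y_i - yhat_i
n, kp1 = X.shape                        # kp1 = k + 1 estimated parameters
s = np.sqrt(resid @ resid / (n - kp1))  # residual standard deviation
std_resid = resid / (s * np.sqrt(1 - h))

print(round(h.sum(), 6))                # 3.0: the h_ii always sum to k + 1
```

Observations far from the center of the predictors get larger h_ii and hence smaller residual standard deviations, exactly the behavior noted for observations 11 and 12.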
Example 11.19 Figure 11.19 shows a normal quantile plot of the standardized residuals for the ad-
sorption data given in Example 11.18. The straightness of the plot casts little doubt
on the assumption that the random deviation e is normally distributed.
[Figure 11.19: normal quantile plot of the standardized residuals]
incorrectly specified. It would then be necessary to try transforming one or more of the vari-
ables or introducing new predictors, for example, quadratic predictors. Some statisticians
suggest replacing plots of the standardized residuals (or residuals) versus each predictor by
a single omnibus plot of the standardized residuals (or residuals) versus the predicted values
(ŷ's). Again, any marked deviation from randomness is a call for remedial action. A plot of
ŷ versus y gives a visual impression of how well the model is predicting for the observations
in the sample. The closer the points in this plot are to a 45° line, the better the predictions;
the vertical deviations from this line are just the residuals. Finally, if the observations were
obtained in time sequence, the standardized residuals should be plotted in time order to
see whether there is an effect over time. Such an effect might indicate that the e’s for suc-
cessive observations are not independent, necessitating a more complex model.
Example 11.20 Figure 11.20 shows the suggested plots for the adsorption data. Given that there are
only 13 observations in the data set, there is not much evidence of a pattern in any
of the first three plots other than randomness. The point at the bottom of each of
these three plots corresponds to the observation with the large residual. We will say
more about such observations subsequently. For the moment, there is no compelling
reason for remedial action.
[Figure 11.20: four scatterplots; the standardized residuals span roughly -2.5 to 1.5, predicted values run from 0 to 60, and adsorption from 0 to 70]

Figure 11.20 Diagnostic plots for the adsorption data: (a) standardized residual versus x1, (b) standardized residual versus x2, (c) standardized residual versus ŷ, and (d) ŷ versus y
The hij coefficients depend only on the values of the predictors for the various observa-
tions and not on the resulting y values. The coefficient h11 is the weight given to y1 in
computing the corresponding predicted value, and an analogous interpretation applies
to h22, . . . , hnn. Intuitively, a large value of hii for any particular i identifies an observa-
tion that is heavily weighted in calculating the corresponding predicted value. The first
observation is said to have high leverage—high potential influence—if h11 is large rela-
tive to the other hii’s. The influence is only potential because whether an observation is
actually influential depends on its y value as well as the values of the predictors. Minitab
will flag any observation whose hii exceeds 3(k + 1)/n (Σhii = k + 1, so an observation
is flagged if its hii is three times the average of all the hii's). The hii's for the adsorption
data are as follows:
.308 .278 .154 .145 .359 .103 .127 .133 .088 .152 .535 .531 .087
Since 3(k + 1)/n = 3(3)/13 = .692, no observation can be characterized as having high
leverage.
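The flag rule is easy to check directly against the leverages just listed:

```python
# Minitab's rule flags observation i when h_ii > 3(k + 1)/n. The leverages
# below are the adsorption values listed in the text (k = 2 predictors).
h = [.308, .278, .154, .145, .359, .103, .127, .133, .088, .152, .535, .531, .087]
k, n = 2, len(h)
cutoff = 3 * (k + 1) / n                   # 3(3)/13 = .692
flagged = [i + 1 for i, hii in enumerate(h) if hii > cutoff]
print(round(cutoff, 3), flagged)           # 0.692 [] -- no high-leverage points
```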
A commonly used strategy for assessing the impact of an “unusual” observation—
either a large standardized residual or high leverage—is to remove the observation from
the data set and refit the same model using the remaining observations. If any of the
calculated quantities, such as the bi's, R^2, and s_e, change substantially from their values
before deletion of the unusual observation, the regression analysis is unstable. When the
observation with the large standardized residual was removed from the adsorption data,
estimated coefficients and other quantities changed very little. When large changes do
occur, one possibility is to use a “robust” fitting technique for which estimated coef-
ficients are not so heavily affected by unusual observations as they may be for a least
squares fit. Consult one of the chapter references for more information on these matters.
Model Selection
An investigator has obtained data on a response variable y and a “candidate pool” of p
predictors (some of which may be mathematical functions of others, such as interaction
or quadratic predictors) and wishes to fit a multiple regression model. Frequently, some
of these p predictors are only weakly related to y or contain information that duplicates
information provided by some of the other predictors. So the issue is how to select a
subset of predictors from the candidate pool to obtain an effective model.
One type of model selection strategy involves fitting all possible models, computing
one or more summary quantities from each fit, and comparing these quantities to identify
the most satisfactory model. With p predictors in the pool, there are 2^p possible models
when the model that contains none of the predictors is counted (because there are
two possibilities for each predictor: it could be included in the model or not).
When p exceeds 5, it is obviously time-consuming to sit in front of a computer and
explicitly request that each possible model in turn be fit. Several of the most powerful
statistical computer packages have an all-subsets option, which will give limited output
from several of the best (according to criteria discussed shortly) models of each different
size. Once the field has been narrowed, the fit of each finalist can then be examined in
more detail. Minitab can be used for this purpose as long as p ≤ 31 (for p = 31, over
2 billion models are under consideration).
Suppose that p is small enough for the all-subsets option to be feasible. What crite-
ria can be used to select a winner? An obvious and appealing choice is the coefficient of
multiple determination, R2. Certainly for two models containing the same number of
predictors, if the corresponding R2 values are quite different, the model with the larger
value should be preferred to the one with the smaller value. However, using R2 as a
basis for choosing between models that contain different numbers of predictors is not
so straightforward. The reason is that adding a predictor to a model can never result in
a decrease in R2; there is almost always an increase, though it may be quite small. In
particular, let

Ri^2 = largest R^2 value for any model containing i predictors (i = 1, 2, . . . , p)

Then R1^2 ≤ R2^2 ≤ ··· ≤ Rp^2. The objective then is not simply to find the model with the
largest R^2 value; the model with all p predictors from the candidate pool does that. Instead,
we should look for a model that contains relatively few predictors but has a large
R^2 value. The model should be such that no other model containing more predictors
yields much of an improvement in R^2 value. Suppose, for example, that p = 5 and that

R1^2 = .427   R2^2 = .733   R3^2 = .885   R4^2 = .898   R5^2 = .901
The best three-predictor model seems to be a good choice, since it substantially im-
proves on the best one- and two-predictor models, whereas very little is gained by using
more than three predictors.
A small increase in R2 resulting from the addition of a predictor to a model may be
offset by the increased complexity of the new model and the reduction in df associated
with SSResid (resulting in less precise estimates and predictions). This is the rationale
for adjusted R2, which can either decrease or increase when a predictor is added to the
model. We can then think of identifying the model whose adjusted R2 is largest and then
consider only this model and any others whose adjusted R2 values are nearly as large.
When considering models containing some fixed number of predictors, for exam-
ple, k = 8, there may be several different models whose R^2 and adjusted R^2 values are
rather close to one another. By focusing only on the model with the highest values of
these two criterion measures, we may miss out on other good models that are easier
to interpret and use for estimation and prediction. For this reason, most all-subsets
procedures allow the analyst to specify some number of models c of each given size
(e.g., c = 3) for which output should be provided.
One other criterion for model selection that has been used with increasing
frequency in recent years is Mallows' Cp. Let μi denote the mean or expected value
of yi, which is the value of the response variable for the ith observation in our sample.
Then after fitting any particular model, ŷi calculated from the fit provides an estimate
of μi, and the total expected estimation error for all observations in the data set is
Σ E[(ŷi - μi)^2]. Mallows' Cp is an estimate of this total expected estimation error normalized
in a certain way. It is desirable to choose a model for which Cp is small. One
additional consideration is that to protect against possible biases in estimates of population
regression coefficients, it is desirable to have Cp ≈ k + 1 when the model under
consideration has k predictors.
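An all-subsets comparison using R^2, adjusted R^2, and Cp can be sketched as follows. This uses synthetic data, and the normalization Cp = SSResid/MSResid_full - (n - 2(k + 1)) is one common form of Mallows' statistic; all names are ours:

```python
import itertools

import numpy as np

# All-subsets comparison: for each subset size, find the best subset by
# SSResid and report R^2, adjusted R^2, and Mallows' Cp.
# Synthetic data for illustration; columns 0 and 1 carry the signal.
rng = np.random.default_rng(1)
n, p = 40, 4
X = rng.normal(size=(n, p))
y = 2 * X[:, 0] - X[:, 1] + rng.normal(0, 0.5, n)

def sse(cols):
    """Residual sum of squares for the model using the listed columns."""
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return resid @ resid

ss_tot = ((y - y.mean()) ** 2).sum()
ms_full = sse(range(p)) / (n - p - 1)    # MSResid for the full model

for k in range(1, p + 1):
    best = min(itertools.combinations(range(p), k), key=sse)
    s = sse(best)
    r2 = 1 - s / ss_tot
    adj = 1 - (s / (n - k - 1)) / (ss_tot / (n - 1))
    cp = s / ms_full - (n - 2 * (k + 1))
    print(k, best, round(r2, 3), round(adj, 3), round(cp, 1))
```

Note that for the full model Cp is always exactly p + 1; a good smaller model is one whose Cp is close to its own k + 1.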
Example 11.21 The bond shear strength data introduced in Section 11.5 contains values of four
different independent variables x1–x4. We found that the model with only these four
variables as predictors was useful and that there was no compelling reason to con-
sider the inclusion of second-order predictors. Figure 11.21 is the Minitab output
that results from a request to identify the two best models of each given size.
The best two-predictor model, with predictors power and temperature, seems to
be a very good choice on all counts: R2 is significantly higher than for models with
fewer predictors yet almost as large as for any larger models, adjusted R2 is almost at
[Figure 11.21: Minitab best-subsets output, two best models of each size]
The choice of a “best” model in Example 11.21 seemed reasonably clear-cut. This
is often not the case. More typically, there will be several different models that are more
or less equally appealing in terms of the criteria discussed here. These finalists would
then have to be examined in more detail to choose the best model.
If the number of predictors in the candidate pool is too large or if suitable software
is not available, an alternative to an all-subsets or best subsets regression approach is to use an
automatic selection procedure. The most easily understood such procedure is backward
elimination. First, fit the model containing all predictors in the candidate pool, then
eliminate predictors one by one until at some point all remaining predictors seem im-
portant. This involves looking at the t-ratios bi/s_bi on all coefficients for predictors in the
model at each stage of the process. The obvious candidate for elimination is the predictor
corresponding to the t-ratio closest to zero. The most frequently used rule of thumb
in practice is to stop eliminating predictors when all t-ratios either exceed 2 or are less
than -2. Some packages use F ratios, which are the squares of t-ratios, with a cutoff of 4.
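The elimination loop just described can be sketched in a few lines (synthetic data; the names are ours):

```python
import numpy as np

# Backward elimination by t-ratio: repeatedly drop the predictor whose
# t = b_i / s_{b_i} is closest to zero, stopping once every remaining
# |t-ratio| exceeds 2. Synthetic data; columns 0 and 2 carry the signal.
rng = np.random.default_rng(2)
n = 50
X = rng.normal(size=(n, 4))
y = 3 * X[:, 0] + 2 * X[:, 2] + rng.normal(0, 1, n)

def t_ratios(cols):
    """t-ratios for the slope coefficients of the model using `cols`."""
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    b = np.linalg.lstsq(A, y, rcond=None)[0]
    resid = y - A @ b
    s2 = resid @ resid / (n - A.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(A.T @ A)))
    return b[1:] / se[1:]               # skip the intercept

cols = [0, 1, 2, 3]
while True:
    t = t_ratios(cols)
    worst = int(np.argmin(np.abs(t)))
    if abs(t[worst]) >= 2:              # all t-ratios now exceed 2 in magnitude
        break
    cols.pop(worst)                     # eliminate that predictor and refit
print(cols)                             # the signal predictors 0 and 2 survive
```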
Example 11.22 Figure 11.22 shows Minitab output from the backward elimination procedure ap-
plied to the bond shear strength data (this was done within Minitab’s Stepwise op-
tion). At the first stage, the t-ratio closest to zero was 1.01 for the coefficient corresponding
to the predictor force. Since this t-ratio is between -2 and 2, force is
eliminated (this would also have been the case if the t-ratio had been -1.01). At
the next stage, the model with the three remaining predictors was fit. The predictor
time now qualifies for elimination, since the corresponding t-ratio 1.23 is closest
to 0 and between -2 and 2. When the model with the two remaining predictors is
fit, both the corresponding t-ratios exceed 2 in absolute value, and the procedure is
terminated. The resulting model is the same one that we suggested previously based
on all-subsets considerations.
Multicollinearity
When the values of the single predictor x in a simple linear regression analysis are all
quite close to one another, s_b will usually be quite large, indicating that the slope coeffi-
cient has been imprecisely estimated. The analogous situation in multiple regression is
referred to as multicollinearity. When the model to be fit includes the k predictors x1, . . . ,
xk, there is said to be multicollinearity if there is a strong linear relationship between these
predictors (so multicollinearity has nothing to do with the response variable y). Severe
multicollinearity leads to poorly estimated population regression coefficients and various
other problems. The most straightforward way to recognize the presence of multicol-
linearity is to fit k different regression models, each of which has one of the x variables as
the dependent variable and the other k - 1 predictors as the independent variables (e.g.,
if k = 5, there would be five regressions, the first with x1 as the dependent variable, the
second with x2 playing this role, and so on). If one or more of the resulting R2 values is
close to 1, multicollinearity exists. If you use Minitab to regress y against the k predictors,
a warning message will appear if any of these R2’s exceeds .99, and the package will not
allow you to include all predictors if any R2 exceeds .9999. Many analysts would be more
conservative and say that multicollinearity is a problem if any R2 exceeds .9.
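The k auxiliary regressions described above can be sketched as follows (synthetic data in which one predictor is nearly a linear combination of the other two; all names are ours):

```python
import numpy as np

# Diagnose multicollinearity: regress each predictor on the other k - 1
# predictors and look for R^2 values close to 1. Synthetic data in which
# column 2 is nearly a linear combination of columns 0 and 1.
rng = np.random.default_rng(3)
n = 60
x0, x1 = rng.normal(size=n), rng.normal(size=n)
x2 = x0 + x1 + rng.normal(0, 0.01, n)    # almost collinear
X = np.column_stack([x0, x1, x2])

def r_squared(j):
    """R^2 from regressing predictor j on the remaining predictors."""
    others = np.column_stack([np.ones(n)] + [X[:, m] for m in range(3) if m != j])
    resid = X[:, j] - others @ np.linalg.lstsq(others, X[:, j], rcond=None)[0]
    return 1 - (resid @ resid) / ((X[:, j] - X[:, j].mean()) ** 2).sum()

for j in range(3):
    print(j, round(r_squared(j), 4))     # all three R^2 values are near 1 here
```

Each such R^2 corresponds to a variance inflation factor VIF = 1/(1 - R^2), the form in which many packages report this diagnostic.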
When values of the predictor variable are under the control of the experimenter,
as was the case in the bond shear strength example, a careful choice of values will
preclude multicollinearity from arising. It is, however, often a problem in social sci-
ence or business applications, where data results simply from observation rather than
from intervention by an investigator. Statisticians have proposed various remedies
for the problems associated with multicollinearity, but a discussion would take us
beyond the scope of this book (after all, we want to leave something for your next
statistics course!).
Logistic Regression
The simple linear regression model is appropriate for relating a quantitative response
variable y to a quantitative predictor x. Suppose that y is a dichotomous variable with
possible values 1 and 0 corresponding to success and failure. Let π = P(S) = P(y = 1).
Frequently, the value of π will depend on the value of some quantitative variable x. For
example, the probability that a car needs warranty service of a certain kind might well
depend on the car's mileage, or the probability of avoiding an infection of a certain type
might depend on the dosage in an inoculation. Instead of using just the symbol π for the
success probability, we now use π(x) to emphasize the dependence of this probability
Whereas π(x) is a probability and therefore must be between 0 and 1, α + βx need not
be in this range.

Instead of letting the mean value of y be a linear function of x, we now consider a
model in which some function of the mean value of y is a linear function of x. In other
words, we allow π(x) to be a function of α + βx rather than α + βx itself. A function that
has been found quite useful in many applications is the logit function,

π(x) = e^(α + βx) / (1 + e^(α + βx))

Figure 11.23 shows a graph of π(x) for particular values of α and β with β > 0. As x
increases, the probability of success increases. For β negative, the success probability
would be a decreasing function of x.
[Figure 11.23: graph of π(x) rising from near 0 toward 1 as x increases from 10 to 80]
Logistic regression means assuming that π(x) is related to x by the logit function.
Straightforward algebra shows that

π(x) / [1 - π(x)] = e^(α + βx)

The expression on the left-hand side is called the odds. Suppose, for example, that
π(60)/[1 - π(60)] = 3. Then when x = 60 a success is three times as likely as a failure.
We now see that the logarithm of the odds is a linear function of the predictor. In
particular, the slope parameter β is the change in the log odds associated with a 1-unit
increase in x. This implies that the odds itself changes by the multiplicative factor e^β
when x increases by 1 unit.
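A quick numerical check of this odds property, with hypothetical parameter values:

```python
import math

# The logit function pi(x) = e^(a + bx) / (1 + e^(a + bx)) and the resulting
# odds pi(x)/[1 - pi(x)] = e^(a + bx). Parameter values here are hypothetical.
a, b = -4.0, 0.1

def pi(x):
    z = math.exp(a + b * x)
    return z / (1 + z)

def odds(x):
    return pi(x) / (1 - pi(x))

# Increasing x by 1 unit multiplies the odds by e^b, whatever the starting x:
print(round(odds(61) / odds(60), 6) == round(math.exp(b), 6))   # True
```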
Fitting the logistic regression to sample data requires that the parameters α and β
be estimated. This is usually done using the maximum likelihood technique described
in Chapter 7. The details are quite involved, but fortunately the most popular statistical
computer packages will do this on request and provide quantitative and pictorial indica-
tions of how well the model fits.
Example 11.23 Here is data on launch temperature and the incidence of failure for O-rings in
24 space shuttle launches prior to the Challenger disaster of 1986.
Figure 11.24 shows Minitab output for a logistic regression analysis and a graph of
the estimated logit function from the R software. We have chosen to let π denote
the probability of failure. The graph of π decreases as temperature increases because
failures tended to occur at lower temperatures than did successes. The estimate of β
is b = -.232, and the estimated standard deviation of b is s_b = .1082. Provided
that n is large enough, and we assume it is in this case, b has approximately a normal
distribution. If β = 0 (temperature does not affect the likelihood of O-ring failure),
z = b/s_b has approximately a standard normal distribution. The value of this z-ratio is
-2.14, and the P-value for a two-tailed test is .032 (some packages report a chi-square
value, which is just z^2, with the same P-value). At significance level .05, we reject the
null hypothesis of no temperature effect.
The estimated odds of failure for any particular temperature value x is

π(x) / [1 - π(x)] = e^(15.0429 - .232163x)
This implies that the odds ratio, the odds of failure at a temperature of x + 1 divided
by the odds of failure at a temperature of x, is e^(-.232163) ≈ .79.
[Figure 11.24: (a) Minitab logistic regression output (not reproduced); (b) plot of predicted probability of failure versus temperature (55 to 80 °F), with failures (Y) concentrated at the lower temperatures and non-failures (N) at the higher ones]
Figure 11.24 (a) Logistic regression output from Minitab for Example 11.23;
(b) graph of estimated logistic function from R
The interpretation is that for each additional degree of temperature, we estimate that
the odds of failure will decrease by a factor of .79 (21%). A 95% CI for the true odds
ratio also appears on the output.
The launch temperature for the Challenger mission was only 31°F. This temper-
ature is much smaller than any value in the sample, so it is dangerous to extrapolate
the estimated relationship. Nevertheless, it appears that O-ring failure is virtually a
sure thing for a temperature this small.
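The estimated function from the example can be evaluated directly; the coefficients below are those on the output, and the extrapolation caveat above still applies:

```python
import math

# Estimated fit from the example: odds(x) = e^(15.0429 - .232163x), where x is
# launch temperature in degrees F.
a, b = 15.0429, -0.232163

def pi(x):
    z = math.exp(a + b * x)
    return z / (1 + z)

print(round(math.exp(b), 2))   # 0.79: each extra degree cuts the odds by 21%
print(round(pi(31), 4))        # 0.9996: failure nearly certain at 31 degrees F
```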
π(x1, . . . , xi + 1, . . . , xk) / [1 - π(x1, . . . , xi + 1, . . . , xk)] = e^(α + β1x1 + ··· + βixi + ··· + βkxk + βi)

so the ratio of this odds to the odds

π(x1, . . . , xk) / [1 - π(x1, . . . , xk)]

is e^βi.
Again, statistical software must be used to estimate parameters, calculate relevant stan-
dard deviations, and provide other inferential information.
Example 11.24 Data was obtained from 189 women who gave birth during a particular period
at the Baystate Medical Center in Springfield, Massachusetts, in order to iden-
tify factors associated with low birth weight. The accompanying Minitab output
resulted from a logistic regression in which the dependent variable indicated
whether (1) or not (0) a child had low birth weight (<2500 g), and predictors were
weight of the mother at her last menstrual period, age of the mother, and an indi-
cator variable for whether (1) or not (0) the mother had smoked during pregnancy.
Logistic Regression Table
Odds 95% CI
Predictor Coef SE Coef z p Ratio Lower Upper
Constant 2.06239 1.09516 1.88 0.060
Wt -0.01701 0.00686 -2.48 0.013 0.98 0.97 1.00
Age -0.04478 0.03391 -1.32 0.187 0.96 0.89 1.02
Smoke 0.65480 0.33297 1.97 0.049 1.92 1.00 3.70
It appears that age is not an important predictor of low birth weight, provided that the two other predictors are retained. The other two predictors do appear to be informative. The point estimate of the odds ratio associated with smoking status is 1.92 (the ratio of the odds of low birth weight for a smoker to the odds for a nonsmoker); at the 95% confidence level, the odds of a low-birth-weight child could be as much as 3.7 times higher for a smoker than for a nonsmoker.
Please see one of the chapter references for more information on logistic regres-
sion, including methods for assessing model effectiveness and adequacy.
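The "Odds Ratio" and 95% CI columns in the Minitab table are simple transformations of the coefficient and its standard error. As an illustration (Minitab does this internally; the sketch below just reproduces the arithmetic), here is the Smoke row recomputed:

```python
import math

# Reproduce the Odds Ratio and 95% CI for the Smoke predictor from the
# coefficient and standard error in the logistic regression table above.
coef, se = 0.65480, 0.33297
z = 1.96  # standard normal critical value for 95% confidence

odds_ratio = math.exp(coef)
lower = math.exp(coef - z * se)
upper = math.exp(coef + z * se)

print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))
# matches the printed 1.92 with CI (1.00, 3.70)
```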
We have reached the end of our exposition, but hopefully this is not the end of your
statistical education. Our hope is that you have enjoyed the journey through statistics
thus far and that you will find many opportunities to apply the concepts and methods
in the near future. Enjoy!!
568 chapter 11 Inferential Methods in Regression and Correlation
11.6 Exercises 569
Obs  Velocity  Viscosity  Mesh size  Response  Standardized residual      hii
 1     2.14      10.00       .34       28.9        −.01721             .202242
 2     4.14      10.00       .34       26.1        1.34706             .066929
 3     8.15      10.00       .34       22.8         .96537             .274393
 4     2.14       2.63       .34       24.2        1.29177             .224518
 5     4.14       2.63       .34       15.7        −.68311             .079651
 6     8.15       2.63       .34       18.3         .23785             .267959
 7     5.60       1.25       .34       18.1         .06456             .076001
 8     4.30       2.63       .34       19.1         .13131             .074927
 9     4.30       2.63       .34       15.4        −.74091             .074927
10     5.60      10.10       .25       12.0       −1.38857             .152317
11     5.60      10.10       .34       19.8        −.03585             .068468
12     4.30      10.10       .34       18.6        −.40699             .062849
13     2.40      10.10       .34       13.2       −1.92274             .175421
14     5.60      10.00       .55       22.8       −1.07990             .712933
15     2.14     112.00       .34       41.8       −1.19311             .516298
16     4.14     112.00       .34       48.6        1.21302             .513214
17     5.60      10.10       .25       19.2         .38451             .152317
18     5.60      10.10       .25       18.4         .18750             .152317
19     5.60      10.10       .25       15.0        −.64979             .152317
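The hii column above contains the leverage values for the fit. A common rule of thumb (an illustration, not a statement from this exercise) flags observation i as high-leverage when hii exceeds 2(k + 1)/n, where k is the number of predictors:

```python
# Leverage screen for the table above: k = 3 predictors (velocity,
# viscosity, mesh size) and n = 19 observations.
h = [.202242, .066929, .274393, .224518, .079651, .267959, .076001,
     .074927, .074927, .152317, .068468, .062849, .175421, .712933,
     .516298, .513214, .152317, .152317, .152317]

k, n = 3, len(h)
threshold = 2 * (k + 1) / n          # = 8/19, about .421
flagged = [i + 1 for i, hii in enumerate(h) if hii > threshold]
print(round(threshold, 3), flagged)  # observations 14, 15, and 16 stand out
```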
53. The article "Response Surface Methodology for Protein Extraction Optimization of Red Pepper Seed" (Food Sci. and Tech., 2010: 226–231) gave data on the response variable y = protein yield (%) and the independent variables x1 = temperature (°C), x2 = pH, x3 = extraction time (min), and x4 = solvent/meal ratio.
a. Fitting the model with the four xi's as predictors yielded the following output:

Predictor   Coef      SE Coef   T      P
Constant   -4.586     2.542    -1.80   0.084
x1          0.01317   0.02707   0.49   0.631
x2          1.6350    0.2707    6.04   0.000
x3          0.02883   0.01353   2.13   0.044
x4          0.05400   0.02707   1.99   0.058

Source       DF   SS       MS      F      P
Regression    4   19.8882  4.9721  11.31  0.000
Res. Error   24   10.5513  0.4396
Total        28   30.4395

b. Fitting the complete second-order model (14 predictors) yielded the following output:

S = 0.268703  R-Sq = 96.7%  R-Sq(adj) = 93.4%

Analysis of Variance
Source       DF   SS       MS      F      P
Regression   14   29.4287  2.1020  29.11  0.000
Res. Error   14   1.0108   0.0722
Total        28   30.4395

Does at least one of the second-order predictors appear to be useful? Carry out an appropriate test of hypotheses.
c. From the output in part (b), we conjecture that none of the predictors involving x1 are providing useful information. When these predictors were eliminated, the value of SSResid for the reduced regression model is 1.1887. Does this support the conjecture?
d. Here is output from Minitab's best subsets option, with just the single best subset of each size identified. Which model(s) would you consider using (subject to checking model adequacy)?
(The 14 predictor columns are, left to right: x1, x2, x3, x4, x1sqd, x2sqd, x3sqd, x4sqd, x1x2, x1x3, x1x4, x2x3, x2x4, x3x4.)
Vars  R-Sq  R-Sq(adj)  Mallows Cp  S
1 52.7 50.9 174.4 0.73030 X
2 67.9 65.4 112.5 0.61349 X X
3 77.5 75.0 73.1 0.52124 X X X
4 83.4 80.7 50.8 0.45835 X X X X
5 90.9 88.9 21.4 0.34731 X X X X X
6 94.6 93.1 7.9 0.27422 X X X X X X
7 95.8 94.4 4.7 0.24683 X X X X X X X
8 96.2 94.6 5.1 0.24137 X X X X X X X X
9 96.4 94.7 6.1 0.23962 X X X X X X X X X
10 96.6 94.6 7.5 0.24132 X X X X X X X X X X
11 96.6 94.4 9.4 0.24716 X X X X X X X X X X X
12 96.6 94.1 11.2 0.25328 X X X X X X X X X X X X
13 96.7 93.8 13.1 0.26041 X X X X X X X X X X X X X
14 96.7 93.4 15.0 0.26870 X X X X X X X X X X X X X X
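The F statistic asked for in part (b)/(c) compares SSResid for the reduced (first-order) model with SSResid for the full second-order model. As an illustrative sketch using the numbers printed in the two outputs above:

```python
# Group-of-predictors F test for Exercise 53(b):
# F = [(SSResid_reduced - SSResid_full) / (extra predictors)] / MSResid_full
sse_reduced, df_reduced = 10.5513, 24   # first-order model, 4 predictors, n = 29
sse_full, df_full = 1.0108, 14          # complete second-order model, 14 predictors

num = (sse_reduced - sse_full) / (df_reduced - df_full)  # 10 extra predictors
den = sse_full / df_full
F = num / den
print(round(F, 2))  # compare with an F(10, 14) critical value
```

A value this large relative to the F(10, 14) distribution indicates that at least one second-order predictor is useful.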
54. It seems reasonable that the size of a cancerous tumor should be related to the likelihood that the cancer will spread (metastasize) to another site. The article "Molecular Detection of p16 Promoter Methylation in the Serum of Patients with Esophageal Squamous Cell Carcinoma" (Cancer Res., 2001: 3135–3138) investigated the spread of esophageal cancer to the lymph nodes. With x = size of a tumor (cm) and y = 1 if the cancer does spread, consider the logistic regression model with α = −2 and β = .5 (values suggested by data in the article).
a. Tabulate values of x, π(x), the odds π(x)/[1 − π(x)], and the log odds for x = 0, 1, 2, . . . , 10.
b. Explain what happens to the odds when x is increased by 1. Your explanation should involve the .5 that appears in the formula for π(x).
c. For what value of x are the odds 1? 5? 10?

55. Kyphosis refers to severe forward flexion of the spine following corrective spinal surgery. A study carried out to determine risk factors for kyphosis reported the accompanying ages (months) for 40 subjects at the time of the operation; the first 18 subjects did have kyphosis and the remaining 22 did not.

Kyphosis:    12 15 42 52 59 73 82 91 96 105 114 120 121 128 130 139 139 157
No kyphosis: 1 1 2 8 11 18 22 31 37 61 72 81 97 112 118 127 131 140 151 159 177 206

Use the Minitab logistic regression output below to decide whether age appears to have a significant impact on the presence of kyphosis.

56. The following data resulted from a study commissioned by a large management consulting company to investigate the relationship between amount of job experience (months) for a junior consultant
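The tabulation asked for in Exercise 54(a) is mechanical; the sketch below works it out for the stated model (this is an illustration of the calculation, not the book's solution). It also confirms parts (b) and (c): each unit increase in x multiplies the odds by e^{.5}, and the odds equal 1 exactly at x = 4.

```python
import math

# Exercise 54(a) sketch: logistic model with alpha = -2, beta = .5.
alpha, beta = -2.0, 0.5

odds_list = [math.exp(alpha + beta * x) for x in range(0, 11)]
for x, odds in enumerate(odds_list):
    pi = odds / (1 + odds)          # pi(x) = odds / (1 + odds)
    print(x, round(pi, 4), round(odds, 4), round(math.log(odds), 4))

# Each unit increase in x multiplies the odds by exp(beta) = exp(.5) ~ 1.65,
# and the odds equal 1 when alpha + beta*x = 0, i.e., at x = 4.
ratio = odds_list[5] / odds_list[4]
```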
Supplementary Exercises
58. Suppose data was collected on y = bulk density (kg/m³) and x = moisture content (%) for a sample of six seeds of a particular type, resulting in the accompanying scatterplot.

[Scatterplot: bulk density (approximately 400–460 kg/m³) versus moisture content (5.0–22.5%)]

Here is the Minitab output from a request to fit a simple linear regression model of y on x:

Noticing the relatively small P-value for the moisture predictor, a fellow student concludes that, based on the model utility test, there is a useful linear relationship between the two variables. Comment on the validity of this conclusion. How useful is this Minitab output (keeping in mind the scatterplot of the data)?
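The caution behind Exercise 58 can be demonstrated with made-up data (none of the values below come from the seed study): a clearly curved relationship can still produce a strongly "significant" slope, which is exactly why the scatterplot and residuals must be examined alongside the model utility test.

```python
# Illustration only: deterministic curved data still yields a steep fitted
# line, but the residual signs run in blocks, revealing the curvature.
xs = [5, 7.5, 10, 12.5, 15, 17.5, 20, 22.5]
ys = [(x - 22) ** 2 for x in xs]   # invented parabolic "response"

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
b1 = sxy / sxx                     # least squares slope
b0 = ybar - b1 * xbar

resid = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
signs = "".join("+" if r > 0 else "-" for r in resid)
print(round(b1, 2), signs)  # sign pattern ++----++ : a curved residual plot
```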
59. The accompanying data was read from a scatterplot in the article "Urban Emissions Measured with Aircraft" (J. of the Air and Waste Mgmt. Assoc., 1998: 16–25). The response variable is ΔNOy and the explanatory variable is ΔCO.

ΔCO:  50  60  95 108 135 210 214 315 720
ΔNOy: 2.3 4.5 4.0 3.7 8.2 5.4 7.2 13.8 32.1

a. Fit an appropriate model to the data and judge the utility of the model.
b. Predict the value of ΔNOy that would result from making one more observation when ΔCO is 400, and do so in a way that conveys information about precision and reliability. Does it appear that ΔNOy can be accurately predicted? Explain.
c. The largest value of ΔCO is much greater than the other values. Does this observation appear to have had a substantial impact on the fitted equation?

60. Astringency is the quality in a wine that makes a wine drinker's mouth feel slightly rough, dry, and puckery. The paper "Analysis of Tannins in Red Wine Using Multiple Methods: Correlation with Perceived Astringency" (Amer. J. Enol. Vitic., 2006: 481–485) reported on an investigation to assess the relationship between perceived astringency and tannin concentration using various analytic methods. Here is data provided by the authors on x = tannin concentration by protein precipitation and y = perceived astringency as determined by a panel of tasters.

x: 0.718  0.808  0.924  1.000
y: 0.428  0.480  0.493  0.978
x: 0.667  0.529  0.514  0.559
y: 0.318  0.298 −0.224  0.198
x: 0.766  0.470  0.726  0.762
y: 0.326 −0.336  0.765  0.190
x: 0.666  0.562  0.378  0.779
y: 0.066 −0.221 −0.898  0.836
x: 0.674  0.858  0.406  0.927
y: 0.126  0.305 −0.577  0.779
x: 0.311  0.319  0.518  0.687
y: −0.707 −0.610 −0.648 −0.145
x: 0.907  0.638  0.234  0.781
y: 1.007 −0.090 −1.132  0.538
x: 0.326  0.433  0.319  0.238
y: −1.098 −0.581 −0.862 −0.551

Relevant summary quantities are as follows:

Σxi = 19.404,  Σyi = 2.549,  Σxi² = 13.248032
Σyi² = 11.835795,  Σxiyi = 3.497811
Sxx = 13.248032 − (19.404)²/32 = 1.48193150,
Syy = 11.82637622
Sxy = 3.497811 − (19.404)(2.549)/32 = 3.83071088

a. Fit the simple linear regression model to this data. Then determine the proportion of observed variation in astringency that can be attributed to the model relationship between astringency and tannin concentration.
b. Calculate and interpret a confidence interval for the slope of the true regression line.
c. Estimate true average astringency when tannin concentration is .6, and do so in a way that conveys information about reliability and precision.
d. Predict astringency for a single wine sample whose tannin concentration is .6, and do so in a way that conveys information about reliability and precision.

61. In a discussion of the article "Tensile Behavior of Slurry Infiltrated Mat Concrete (SIMCON)" (ACI Materials J., 1998: 77–79), the discussant presented data on y = toughness (psi) and x = aspect ratio. He stated that "a (simple linear) regression analysis clearly shows that the aspect ratio is not a reliable variable that can be used to predict toughness." The following observations were read from a graph in the article:

x: 500 500 500 500 500 715 715 715 715 715
y:  33  34  35  38  40  35  36  37  39  44

a. Why is the relationship between these two variables clearly not deterministic?
b. Fit the simple linear regression model, and state whether you agree with the discussant's assessment.
c. Even if the y values had been much closer together, so that the model could be judged useful, would there be any way to check model adequacy to decide whether a quadratic regression model would be more appropriate? Explain your reasoning.

62. The accompanying data on y = energy output (W) and x = temperature difference (K) was provided by the authors of the article "Comparison of Energy
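The printed value of Sxx in Exercise 60 follows from the computational formula Sxx = Σxi² − (Σxi)²/n. As a quick check (illustration only, using the summary quantities given above with n = 32 wine samples):

```python
# Exercise 60: verify Sxx from the summary quantities with n = 32.
n = 32
sum_x, sum_x2 = 19.404, 13.248032

sxx = sum_x2 - sum_x ** 2 / n
print(round(sxx, 8))  # matches the printed 1.48193150
```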
and Exergy Efficiency for Solar Box and Parabolic Cookers" (J. of Energy Engr., 2007: 53–62).

x: 23.20 23.50 23.52 24.30 25.10 26.20
y:  3.78  4.12  4.24  5.35  5.87  6.02
x: 27.40 28.10 29.30 30.60 31.50 32.01
y:  6.12  6.41  6.62  6.43  6.13  5.92
x: 32.63 33.23 33.62 34.18 35.43 35.62
y:  5.64  5.45  5.21  4.98  4.65  4.50
x: 36.16 36.23 36.89 37.90 39.10 41.66
y:  4.34  4.03  3.92  3.65  3.02  2.89

The article's authors fit a cubic regression model to the data. Here is Minitab output from such a fit.

The regression equation is
y = -134 + 12.7 x - 0.377 x**2 + 0.00359 x**3

Predictor  Coef        SE Coef    T       P
Constant   -133.787    8.048     -16.62   0.000
x           12.7423    0.7750     16.44   0.000
x**2       -0.37652    0.02444   -15.41   0.000
x**3        0.0035861  0.0002529  14.18   0.000

s = 0.168354  R-Sq = 98.0%  R-Sq(adj) = 97.7%

Analysis of Variance
Source       DF   SS       MS      F       P
Regression    3   27.9744  9.3248  329.00  0.000
Res. Error   20   0.5669   0.0283
Total        23   28.5413

a. What proportion of observed variation in energy output can be attributed to the model relationship?
b. Fitting a quadratic model to the data results in R² = .780. Calculate adjusted R² for this model and compare to adjusted R² for the cubic model.
c. Does the cubic predictor appear to provide useful information about y over and above that provided by the linear and quadratic predictors? State and test the appropriate hypotheses.
d. When x = 30, s_Ŷ = .0611. Calculate a 95% CI for true average energy output in this case, and also a 95% PI for a single energy output to be observed when temperature difference is 30.

63. Secondary settling tanks play an important role in the performance of suspended-growth activated-sludge processes. The article "Sludge Volume Index Settleability Measures" (Water Environ. Research, 1998: 87–93) included a scatterplot of y = final settled height fraction versus x = initial solids concentration (g/L), from which the following data was read:

x: .5  .9  1.1 1.7 2.0 2.2 2.7 3.0 3.3 4.2
y: .06 .08 .10 .13 .15 .16 .18 .17 .15 .27
x: 4.5 5.3 5.8 5.9 6.2 6.8 7.2 9.1 9.4 10.4
y: .30 .25 .31 .32 .48 .43 .32 .40 .61 .57

Summary quantities include n = 20,
Σxi = 92.2   Σxi² = 591.46
Σyi = 5.44   Σyi² = 1.9674
Σxiyi = 33.577

a. The article included the statement "the linear correlation coefficient, r² = .89." Is this entire statement correct? If not, why, and what part is correct?
b. Carry out a test of appropriate hypotheses to see whether there is in fact a linear relationship between the two variables.
c. The standardized residuals from fitting the simple linear regression model are (in increasing order of x values) −.04, −.05, .14, .13, .22, .21, .11, −.37, −1.04, .36, .63, −1.08, −.43, −.34, −.40, .88, −1.62, −2.04, 1.90, and .05. Does a plot of the standardized residuals versus x show a disturbing pattern? Explain.

64. The use of microorganisms to dissolve metals from ores has offered an ecologically friendly and less expensive alternative to traditional methods. The dissolution of metals by this method can be done in a two-stage bioleaching process: (1) microorganisms are grown in culture to produce metabolites (e.g., organic acids), and (2) ore is added to the culture medium to initiate leaching. The article "Two-Stage Fungal Leaching of Vanadium from Uranium Ore Residue of the Leaching Stage using Statistical Experimental Design" (Annals of Nuclear Energy, 2013: 48–52) reported on a two-stage bioleaching process of vanadium by using the fungus Aspergillus niger. In one study, the authors examined the impact of the variables x1 = pH, x2 = sucrose concentration (g/L), and x3 = spore population (10⁶ cells/ml) on y = oxalic acid
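The adjusted-R² comparison requested in Exercise 62(b) uses R²adj = 1 − [(n − 1)/(n − (k + 1))](1 − R²). A sketch of the arithmetic (illustration only, with n = 24 observations from the ANOVA table above):

```python
# Exercise 62(b) sketch: adjusted R^2 for the quadratic (k = 2) and
# cubic (k = 3) fits of the energy-output data.
def adj_r2(r2, n, k):
    return 1 - (n - 1) / (n - (k + 1)) * (1 - r2)

n = 24
cubic = adj_r2(0.980, n, 3)      # output above reports R-Sq(adj) = 97.7%
quadratic = adj_r2(0.780, n, 2)

print(round(cubic, 3), round(quadratic, 3))
```

The cubic model's adjusted R² (.977) agrees with the Minitab output, and the gap from the quadratic model's value quantifies the payoff from the cubic predictor.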
production (mg/L). The accompanying SAS output resulted from a request to fit the model with predictors x1, x2, and x3 only.

Source     DF   Sum of Squares   Mean Square   F Value   Pr > F
Model       3        5861301        1953767      7.53    0.0052
Error      11        2855951         259632
C. Total   14        8717252

Fitting the complete second-order model resulted in SSResid = 541,632. Carry out a test at significance level .01 to decide whether at least one of the second-order predictors provides useful information about oxalic acid production.

65. The article cited in Exercise 64 also examined the effect of x1 = pH, x2 = sucrose concentration (g/L), and x3 = spore population (10⁶ cells/ml) on y = gluconic acid production (mg/L). The accompanying SAS output resulted from a request to fit the model with predictors x1, x2, and x3 only.

Source     DF   Sum of Squares   Mean Square   F Value   Pr > F
Model       3       74027925       24675975    178.18    <.0001
Error      11        1523351         138486
C. Total   14       75551276

Fitting the complete second-order model resulted in SSResid = 805,534. Carry out a test at significance level .01 to decide whether at least one of the second-order predictors provides useful information about gluconic acid production.

66. The accompanying data was taken from the article "Applying Stepwise Multiple Regression Analysis to the Reaction of Formaldehyde with Cotton Cellulose" (Textile Research J., 1984: 157–165). The dependent variable is durable press rating, a quantitative measure of wrinkle resistance, and the four independent variables are formaldehyde concentration, catalyst ratio, curing temperature, and curing time, respectively.
a. Fitting the model with the four independent variables as predictors resulted in the following Minitab output. Does the fitted model appear to be useful?

The regression equation is
durpr = –0.912 + 0.161 formconc + 0.220 catratio + 0.0112 temp + 0.102 time

Predictor  Coef       StDev      T       p
Constant   –0.9122    0.8755    –1.04    0.307
formconc    0.16073   0.06617    2.43    0.023
catratio    0.21978   0.03406    6.45    0.000
temp        0.011226  0.004973   2.26    0.033
time        0.10197   0.05874    1.74    0.095

S = 0.8365  R-Sq = 69.2%  R-Sq(adj) = 64.3%

Analysis of Variance
Source       DF   SS       MS      F      P
Regression    4   39.3769  9.8442  14.07  0.000
Error        25   17.4951  0.6998
Total        29   56.8720

b. Estimate, in a way that conveys information about precision and reliability, the average change in durable press rating associated with a 1-degree increase in curing temperature when formaldehyde concentration, catalyst ratio, and curing time all remain fixed.
c. Given that catalyst ratio, curing temperature, and curing time all remain in the model, do you think that formaldehyde concentration provides useful information about durable press rating?
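The test requested in Exercise 64 is another group-of-predictors F test. A sketch of the arithmetic (illustration only; the complete second-order model for three factors has 9 predictors, so with n = 15 its error df is 5):

```python
# Exercise 64 sketch: H0 says all second-order coefficients are zero.
# Compare first-order SSResid with the full second-order SSResid.
sse_first, sse_full = 2855951, 541632
extra_df = 6        # 9 - 3 additional predictors in the second-order model
full_error_df = 5   # n - (9 + 1) = 15 - 10

F = ((sse_first - sse_full) / extra_df) / (sse_full / full_error_df)
print(round(F, 2))  # compare with the F(6, 5) critical value at level .01
```

Whether H0 is rejected then depends on comparing this value with the F(6, 5) critical value for the chosen significance level.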
d. Now consider models based not only on these four independent variables but also on second-order predictors (four xi² predictors and six xixj predictors). Use a statistical computer package to identify a good model based on this candidate pool of predictors.

67. A study was carried out to investigate the relationship between brightness of finished paper (y) and the variables percentage of H2O2 by weight, percentage of NaOH by weight, percentage of silicate by weight, and process temperature ("Advantages of CEHDP Bleaching for High Brightness Kraft Pulp Production," TAPPI, 1964: 170A–173A). Each independent variable was allowed to assume five different values, and these values were coded for regression analysis as follows:

Coded value:   −2   −1    0    1    2
H2O2:          .1   .2   .3   .4   .5
NaOH:          .1   .2   .3   .4   .5
Silicate:      .5  1.5  2.5  3.5  4.5
Temperature:  130  145  160  175  190
The data follow:

Obs  H2O2  NaOH  Silicate  Temperature  Brightness
 1    −1    −1     −1         −1          83.9
 2     1    −1     −1         −1          84.9
 3    −1     1     −1         −1          83.4
 4     1     1     −1         −1          84.2
 5    −1    −1      1         −1          83.8
 6     1    −1      1         −1          84.7
 7    −1     1      1         −1          84.0
 8     1     1      1         −1          84.8
 9    −1    −1     −1          1          84.5
10     1    −1     −1          1          86.0
11    −1     1     −1          1          82.6
12     1     1     −1          1          85.1
13    −1    −1      1          1          84.5
14     1    −1      1          1          86.0
15    −1     1      1          1          84.0
16     1     1      1          1          85.4
17    −2     0      0          0          82.9
18     2     0      0          0          85.5
19     0    −2      0          0          85.2
20     0     2      0          0          84.5
21     0     0     −2          0          84.7
22     0     0      2          0          85.0
23     0     0      0         −2          84.9
24     0     0      0          2          84.0
25     0     0      0          0          84.5
26     0     0      0          0          84.7
27     0     0      0          0          84.6
28     0     0      0          0          84.9
29     0     0      0          0          84.9
30     0     0      0          0          84.5
31     0     0      0          0          84.6

a. When the complete second-order coded model was fit, the estimate of the constant term was 84.67; the estimated coefficients of the linear predictors were .650, −.258, .133, and .108, respectively; the estimated quadratic coefficients were −.135, .028, .028, and −.072, respectively; and the estimated coefficients of the interaction predictors were .038, −.075, .213, .200, −.188, and .050, respectively. Calculate a point prediction of brightness when H2O2 is .4%, NaOH is .4%, silicate is 3.5%, and temperature is 175. What are the values of the residuals for the observations made with these values of the independent variables?
b. Express the estimated regression in uncoded form.
c. SSTo = 17.2567 and R² for the model of part (a) is .885. When a model that includes only the four independent variables as predictors is fit, R² = .721. Carry out a test at level .05 to decide whether at least one of the second-order predictors provides useful information about brightness.

68. Three sets of journal bearing tests were run on a Mil-L-8937-type film at each combination of three loads (psi) and three speeds (rpm). The wear life (hr) was recorded for each run, resulting in the following data ("Accelerated Testing of Solid Film Lubricants," Lubrication Engr., 1972: 365–372):

Speed  Load (1000s)  Life      Speed  Load (1000s)  Life
 20        3         300.2      60       6           65.9
 20        3         310.8      60      10           10.7
 20        3         333.0      60      10           34.1
 20        6          99.6      60      10           39.1
 20        6         136.2     100       3           26.5
 20        6         142.4     100       3           22.3
 20       10          20.2     100       3           34.8
 20       10          28.2     100       6           32.8
 20       10         102.7     100       6           25.6
 60        3          67.3     100       6           32.7
 60        3          77.9     100      10            2.3
 60        3          93.9     100      10            4.4
 60        6          43.0     100      10            5.8
 60        6          44.5

a. With w = wear life, s = speed, and l = load (in 1000s), fit the model with dependent variable w and predictors s and l, and assess the utility of the fitted model.
b. The cited article contains the comment that a lognormal distribution is appropriate for wear life, since ln(w) is known to follow a normal law. The suggested model is w = [α/(s^β · l^γ)]ε, where ε denotes a random deviation and α, β, and γ are parameters. Estimate the model parameters, and obtain a prediction interval for wear life when speed is 60 rpm and load is 6000 psi. (Hint: Transform the model equation so it has
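For Exercise 67(a), the stated settings (H2O2 = .4, NaOH = .4, silicate = 3.5, temperature = 175) all code to +1, so every linear, quadratic, and interaction predictor equals 1 and the prediction is just the sum of all the estimated coefficients. A sketch of that arithmetic (illustration only):

```python
# Exercise 67(a) sketch: point prediction from the coded second-order fit.
# At coded values (1, 1, 1, 1), every predictor equals 1.
const = 84.67
linear = [0.650, -0.258, 0.133, 0.108]
quadratic = [-0.135, 0.028, 0.028, -0.072]
interaction = [0.038, -0.075, 0.213, 0.200, -0.188, 0.050]

y_hat = const + sum(linear) + sum(quadratic) + sum(interaction)
print(round(y_hat, 2))  # observation 16 has these coded values (brightness 85.4)
```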
the appearance of the general additive multiple regression model equation.)

69. Normal hatchery processes in aquaculture inevitably produce stress in fish, which may negatively impact growth, reproduction, flesh quality, and susceptibility to disease. Such stress manifests itself in elevated and sustained corticosteroid levels. The article "Evaluation of Simple Instruments for the Measurement of Blood Glucose and Lactate, and Plasma Protein as Stress Indicators in Fish" (J. of the World Aquaculture Society, 1999: 276–284) described an experiment in which fish were subjected to a stress protocol and then removed and tested at various times after the protocol had been applied. The accompanying data on x = time (min) and y = blood glucose level (mmol/L) was read from a plot:

x:  2   2   5   7  12  13  17  18  23  24  26  28
y: 4.0 3.6 3.7 4.0 3.8 4.0 5.1 3.9 4.4 4.3 4.3 4.4
x: 29  30  34  36  40  41  44  56  56  57  60  60
y: 5.8 4.3 5.5 5.6 5.1 5.7 6.1 5.1 5.9 6.8 4.9 5.7

Use the methods developed in this chapter to analyze the data, and write a brief report summarizing your conclusions (assume that the investigators are particularly interested in glucose level 30 min after stress).

70. The article "Evaluating the BOD POD for Assessing Body Fat in Collegiate Football Players" (Medicine and Science in Sports and Exercise, 1999: 1350–1356) reports on a new air displacement device for measuring body fat. The customary procedure utilizes the hydrostatic weighing device, which measures the percentage of body fat by means of water displacement. Here is representative data read from a graph in the paper.

Obs  BOD   HW     Obs  BOD   HW
 1   2.5   8.0    11  12.2  15.3
 2   4.0   6.2    12  12.6  14.8
 3   4.1   9.2    13  14.2  14.3
 4   6.2   6.4    14  14.4  16.3
 5   7.1   8.6    15  15.1  17.9
 6   7.0  12.2    16  15.2  19.5
 7   8.3   7.2    17  16.3  17.5
 8   9.2  12.0    18  17.1  14.3
 9   9.3  14.9    19  17.9  18.3
10  12.0  12.1    20  17.9  16.2

a. Use various techniques to decide whether it is plausible that the two techniques measure on average the same amount of fat.
b. Use the data to develop a way of predicting an HW measurement from a BOD POD measurement, and investigate the effectiveness of such predictions.

71. Curing concrete is known to be vulnerable to shock vibrations, which may cause cracking or hidden damage to the material. As part of a study of vibration phenomena, the paper "Shock Vibration Test of Concrete" (ACI Materials J., 2002: 361–370) reported the accompanying data on peak particle velocity (mm/sec) and ratio of ultrasonic pulse velocity after impact to that before impact in concrete prisms:

Obs   ppv  Ratio    Obs   ppv  Ratio
 1    160  .996     16    708  .990
 2    164  .996     17    806  .984
 3    178  .999     18    884  .986
 4    252  .997     19    526  .991
 5    293  .993     20    490  .993
 6    289  .997     21    598  .993
 7    415  .999     22    505  .993
 8    478  .997     23    525  .990
 9    391  .992     24    675  .991
10    486  .985     25   1211  .981
11    604  .995     26   1036  .986
12    528  .995     27   1000  .984
13    749  .994     28   1151  .982
14    772  .994     29   1144  .962
15    532  .987     30   1068  .986

Transverse cracks appeared in the last 12 prisms, whereas there was no observed cracking in the first 18 prisms.
a. Construct a comparative boxplot of ppv for the cracked and uncracked prisms, and comment. Then estimate the difference between true average ppv for cracked and uncracked prisms in a way that conveys information about precision and reliability.
b. The investigators fit the simple linear regression model to the entire data set consisting of 30 observations, with ppv as the independent variable and
ratio as the dependent variable. Use a statistical software package to fit several different regression models, and draw appropriate inferences.

72. Have you ever wondered whether soccer players suffer adverse effects from hitting "headers"? The authors of the article "No Evidence of Impaired Neurocognitive Performance in Collegiate Soccer Players" (The Amer. J. of Sports Medicine, 2002: 157–162) investigated this issue from several perspectives.
a. The paper reported that 45 of the 91 soccer players in their sample had suffered at least one concussion, 28 of 96 nonsoccer athletes had suffered at least one concussion, and only 8 of 53 student controls had suffered at least one concussion. Analyze this data and draw appropriate conclusions.
b. For the soccer players, the sample correlation coefficient calculated from the values of x = soccer exposure (total number of competitive seasons played prior to enrollment in the study) and y = score on an immediate memory recall test was r = −.220. Interpret this result.
c. Here is summary information on score on a controlled oral word association test for the soccer and nonsoccer athletes:

n1 = 26   x̄1 = 37.50   s1 = 9.13
n2 = 56   x̄2 = 39.63   s2 = 10.19

Analyze this data and draw appropriate conclusions.
d. Considering the number of prior nonsoccer concussions, the values of mean ± sd for the three groups were .30 ± .67, .49 ± .87, and .19 ± .48. Analyze this data and draw appropriate conclusions.
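For part (c), the comparison of the two group means can be sketched with an unpooled two-sample t statistic built from the summary quantities (an illustration of the calculation, not the book's prescribed solution):

```python
import math

# Exercise 72(c) sketch: unpooled two-sample t statistic for the
# word-association scores of soccer vs. nonsoccer athletes.
n1, xbar1, s1 = 26, 37.50, 9.13
n2, xbar2, s2 = 56, 39.63, 10.19

se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)   # estimated SE of xbar1 - xbar2
t = (xbar1 - xbar2) / se
print(round(t, 2))  # small |t|: no convincing difference between the groups
```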
Bibliography
Please see the bibliography for Chapter 3.
Appendix Tables
Standard normal curve areas: Φ(z) = P(Z ≤ z)
[figure: standard normal curve with the area to the left of z shaded]
(This panel gives lower-tail areas for negative z: the row labeled 3.0, column .00, is Φ(−3.00) = .0013, and so on.)
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
3.8 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0000
3.7 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001
3.6 .0002 .0002 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001
3.5 .0002 .0002 .0002 .0002 .0002 .0002 .0002 .0002 .0002 .0002
3.4 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0002
3.3 .0005 .0005 .0005 .0004 .0004 .0004 .0004 .0004 .0004 .0003
3.2 .0007 .0007 .0006 .0006 .0006 .0006 .0006 .0005 .0005 .0005
3.1 .0010 .0009 .0009 .0009 .0008 .0008 .0008 .0008 .0007 .0007
3.0 .0013 .0013 .0013 .0012 .0012 .0011 .0011 .0011 .0010 .0010
2.9 .0019 .0018 .0018 .0017 .0016 .0016 .0015 .0015 .0014 .0014
2.8 .0026 .0025 .0024 .0023 .0023 .0022 .0021 .0021 .0020 .0019
2.7 .0035 .0034 .0033 .0032 .0031 .0030 .0029 .0028 .0027 .0026
2.6 .0047 .0045 .0044 .0043 .0041 .0040 .0039 .0038 .0037 .0036
2.5 .0062 .0060 .0059 .0057 .0055 .0054 .0052 .0051 .0049 .0048
2.4 .0082 .0080 .0078 .0075 .0073 .0071 .0069 .0068 .0066 .0064
2.3 .0107 .0104 .0102 .0099 .0096 .0094 .0091 .0089 .0087 .0084
2.2 .0139 .0136 .0132 .0129 .0125 .0122 .0119 .0116 .0113 .0110
2.1 .0179 .0174 .0170 .0166 .0162 .0158 .0154 .0150 .0146 .0143
2.0 .0228 .0222 .0217 .0212 .0207 .0202 .0197 .0192 .0188 .0183
1.9 .0287 .0281 .0274 .0268 .0262 .0256 .0250 .0244 .0239 .0233
1.8 .0359 .0351 .0344 .0336 .0329 .0322 .0314 .0307 .0301 .0294
1.7 .0446 .0436 .0427 .0418 .0409 .0401 .0392 .0384 .0375 .0367
1.6 .0548 .0537 .0526 .0516 .0505 .0495 .0485 .0475 .0465 .0455
1.5 .0668 .0655 .0643 .0630 .0618 .0606 .0594 .0582 .0571 .0559
1.4 .0808 .0793 .0778 .0764 .0749 .0735 .0721 .0708 .0694 .0681
1.3 .0968 .0951 .0934 .0918 .0901 .0885 .0869 .0853 .0838 .0823
1.2 .1151 .1131 .1112 .1093 .1075 .1056 .1038 .1020 .1003 .0985
1.1 .1357 .1335 .1314 .1292 .1271 .1251 .1230 .1210 .1190 .1170
1.0 .1587 .1562 .1539 .1515 .1492 .1469 .1446 .1423 .1401 .1379
0.9 .1841 .1814 .1788 .1762 .1736 .1711 .1685 .1660 .1635 .1611
0.8 .2119 .2090 .2061 .2033 .2005 .1977 .1949 .1922 .1894 .1867
0.7 .2420 .2389 .2358 .2327 .2296 .2266 .2236 .2206 .2177 .2148
0.6 .2743 .2709 .2676 .2643 .2611 .2578 .2546 .2514 .2483 .2451
0.5 .3085 .3050 .3015 .2981 .2946 .2912 .2877 .2843 .2810 .2776
0.4 .3446 .3409 .3372 .3336 .3300 .3264 .3228 .3192 .3156 .3121
0.3 .3821 .3783 .3745 .3707 .3669 .3632 .3594 .3557 .3520 .3483
0.2 .4207 .4168 .4129 .4090 .4052 .4013 .3974 .3936 .3897 .3859
0.1 .4602 .4562 .4522 .4483 .4443 .4404 .4364 .4325 .4286 .4247
0.0 .5000 .4960 .4920 .4880 .4840 .4801 .4761 .4721 .4681 .4641
Standard normal curve areas: Φ(z) = P(Z ≤ z) (continued): z ≥ 0
[figure: standard normal curve with the area to the left of z shaded]
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
0.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
0.3 .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
0.4 .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879
0.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
0.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
0.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
0.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
0.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
1.0 .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
1.1 .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
1.2 .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
1.3 .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
1.4 .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319
1.5 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6 .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7 .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8 .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9 .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
2.0 .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
2.1 .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
2.2 .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
2.3 .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
2.4 .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936
2.5 .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
2.9 .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986
3.0 .9987 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990
3.1 .9990 .9991 .9991 .9991 .9992 .9992 .9992 .9992 .9993 .9993
3.2 .9993 .9993 .9994 .9994 .9994 .9994 .9994 .9995 .9995 .9995
3.3 .9995 .9995 .9995 .9996 .9996 .9996 .9996 .9996 .9996 .9997
3.4 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9998
3.5 .9998 .9998 .9998 .9998 .9998 .9998 .9998 .9998 .9998 .9998
3.6 .9998 .9998 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999
3.7 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999
3.8 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999 1.0000
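These cumulative areas can be reproduced from the error function; a minimal stdlib-Python sketch:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cdf Φ(z) = P(Z <= z), via the error function:
    Φ(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Matches the table: Φ(1.96) from the z >= 0 panel,
# Φ(-3.00) from the negative-z panel
print(round(phi(1.96), 4), round(phi(-3.00), 4))  # 0.975 0.0013
```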
Binomial probabilities: entry = P(X = x) for a binomial distribution with the indicated n and p
n = 5
x \ p 0.05 0.1 0.2 0.25 0.3 0.4 0.5 0.6 0.7 0.75 0.8 0.9 0.95
0 .774 .590 .328 .237 .168 .078 .031 .010 .002 .001 .000 .000 .000
1 .203 .329 .409 .396 .360 .259 .157 .077 .029 .015 .007 .000 .000
2 .022 .072 .205 .263 .309 .346 .312 .230 .132 .088 .051 .009 .001
3 .001 .009 .051 .088 .132 .230 .312 .346 .309 .263 .205 .072 .022
4 .000 .000 .007 .015 .029 .077 .157 .259 .360 .396 .409 .329 .203
5 .000 .000 .000 .001 .002 .010 .031 .078 .168 .237 .328 .590 .774
n = 10
x \ p 0.05 0.1 0.2 0.25 0.3 0.4 0.5 0.6 0.7 0.75 0.8 0.9 0.95
0 .599 .349 .107 .056 .028 .006 .001 .000 .000 .000 .000 .000 .000
1 .315 .387 .268 .188 .121 .040 .010 .002 .000 .000 .000 .000 .000
2 .075 .194 .302 .282 .233 .121 .044 .011 .001 .000 .000 .000 .000
3 .010 .057 .201 .250 .267 .215 .117 .042 .009 .003 .001 .000 .000
4 .001 .011 .088 .146 .200 .251 .205 .111 .037 .016 .006 .000 .000
5 .000 .001 .026 .058 .103 .201 .246 .201 .103 .058 .026 .001 .000
6 .000 .000 .006 .016 .037 .111 .205 .251 .200 .146 .088 .011 .001
7 .000 .000 .001 .003 .009 .042 .117 .215 .267 .250 .201 .057 .010
8 .000 .000 .000 .000 .001 .011 .044 .121 .233 .282 .302 .194 .075
9 .000 .000 .000 .000 .000 .002 .010 .040 .121 .188 .268 .387 .315
10 .000 .000 .000 .000 .000 .000 .001 .006 .028 .056 .107 .349 .599
n = 15
x \ p 0.05 0.1 0.2 0.25 0.3 0.4 0.5 0.6 0.7 0.75 0.8 0.9 0.95
0 .463 .206 .035 .013 .005 .000 .000 .000 .000 .000 .000 .000 .000
1 .366 .343 .132 .067 .030 .005 .000 .000 .000 .000 .000 .000 .000
2 .135 .267 .231 .156 .092 .022 .004 .000 .000 .000 .000 .000 .000
3 .031 .128 .250 .225 .170 .064 .014 .002 .000 .000 .000 .000 .000
4 .004 .043 .188 .225 .218 .126 .041 .007 .001 .000 .000 .000 .000
5 .001 .011 .103 .166 .207 .196 .092 .025 .003 .001 .000 .000 .000
6 .000 .002 .043 .091 .147 .207 .153 .061 .011 .003 .001 .000 .000
7 .000 .000 .014 .040 .081 .177 .196 .118 .035 .013 .003 .000 .000
8 .000 .000 .003 .013 .035 .118 .196 .177 .081 .040 .014 .000 .000
9 .000 .000 .001 .003 .011 .061 .153 .207 .147 .091 .043 .002 .000
10 .000 .000 .000 .001 .003 .025 .092 .196 .207 .166 .103 .011 .001
11 .000 .000 .000 .000 .001 .007 .041 .126 .218 .225 .188 .043 .004
12 .000 .000 .000 .000 .000 .002 .014 .064 .170 .225 .250 .128 .031
13 .000 .000 .000 .000 .000 .000 .004 .022 .092 .156 .231 .267 .135
14 .000 .000 .000 .000 .000 .000 .000 .005 .030 .067 .132 .343 .366
15 .000 .000 .000 .000 .000 .000 .000 .000 .005 .013 .035 .206 .463
n = 20
x \ p 0.05 0.1 0.2 0.25 0.3 0.4 0.5 0.6 0.7 0.75 0.8 0.9 0.95
0 .358 .122 .012 .003 .001 .000 .000 .000 .000 .000 .000 .000 .000
1 .377 .270 .058 .021 .007 .000 .000 .000 .000 .000 .000 .000 .000
2 .189 .285 .137 .067 .028 .003 .000 .000 .000 .000 .000 .000 .000
3 .060 .190 .205 .134 .072 .012 .001 .000 .000 .000 .000 .000 .000
4 .013 .090 .218 .190 .130 .035 .005 .000 .000 .000 .000 .000 .000
5 .002 .032 .175 .202 .179 .075 .015 .001 .000 .000 .000 .000 .000
6 .000 .009 .109 .169 .192 .124 .037 .005 .000 .000 .000 .000 .000
7 .000 .002 .055 .112 .164 .166 .074 .015 .001 .000 .000 .000 .000
8 .000 .000 .022 .061 .114 .180 .120 .035 .004 .001 .000 .000 .000
9 .000 .000 .007 .027 .065 .160 .160 .071 .012 .003 .000 .000 .000
10 .000 .000 .002 .010 .031 .117 .176 .117 .031 .010 .002 .000 .000
11 .000 .000 .000 .003 .012 .071 .160 .160 .065 .027 .007 .000 .000
12 .000 .000 .000 .001 .004 .035 .120 .180 .114 .061 .022 .000 .000
13 .000 .000 .000 .000 .001 .015 .074 .166 .164 .112 .055 .002 .000
14 .000 .000 .000 .000 .000 .005 .037 .124 .192 .169 .109 .009 .000
15 .000 .000 .000 .000 .000 .001 .015 .075 .179 .202 .175 .032 .002
16 .000 .000 .000 .000 .000 .000 .005 .035 .130 .190 .218 .090 .013
17 .000 .000 .000 .000 .000 .000 .001 .012 .072 .134 .205 .190 .060
18 .000 .000 .000 .000 .000 .000 .000 .003 .028 .067 .137 .285 .189
19 .000 .000 .000 .000 .000 .000 .000 .000 .007 .021 .058 .270 .377
20 .000 .000 .000 .000 .000 .000 .000 .000 .001 .003 .012 .122 .358
n = 25
x \ p 0.05 0.1 0.2 0.25 0.3 0.4 0.5 0.6 0.7 0.75 0.8 0.9 0.95
0 .277 .072 .004 .001 .000 .000 .000 .000 .000 .000 .000 .000 .000
1 .365 .199 .023 .006 .002 .000 .000 .000 .000 .000 .000 .000 .000
2 .231 .266 .071 .025 .007 .000 .000 .000 .000 .000 .000 .000 .000
3 .093 .227 .136 .064 .024 .002 .000 .000 .000 .000 .000 .000 .000
4 .027 .138 .187 .118 .057 .007 .000 .000 .000 .000 .000 .000 .000
5 .006 .065 .196 .164 .103 .020 .002 .000 .000 .000 .000 .000 .000
6 .001 .024 .163 .183 .148 .045 .005 .000 .000 .000 .000 .000 .000
7 .000 .007 .111 .166 .171 .080 .015 .001 .000 .000 .000 .000 .000
8 .000 .002 .062 .124 .165 .120 .032 .003 .000 .000 .000 .000 .000
9 .000 .000 .030 .078 .134 .151 .061 .009 .000 .000 .000 .000 .000
10 .000 .000 .011 .042 .091 .161 .097 .021 .002 .000 .000 .000 .000
11 .000 .000 .004 .019 .054 .146 .133 .044 .004 .001 .000 .000 .000
12 .000 .000 .002 .007 .027 .114 .155 .076 .011 .002 .000 .000 .000
13 .000 .000 .000 .002 .011 .076 .155 .114 .027 .007 .002 .000 .000
14 .000 .000 .000 .001 .004 .044 .133 .146 .054 .019 .004 .000 .000
15 .000 .000 .000 .000 .002 .021 .097 .161 .091 .042 .011 .000 .000
16 .000 .000 .000 .000 .000 .009 .061 .151 .134 .078 .030 .000 .000
17 .000 .000 .000 .000 .000 .003 .032 .120 .165 .124 .062 .002 .000
18 .000 .000 .000 .000 .000 .001 .015 .080 .171 .166 .111 .007 .000
19 .000 .000 .000 .000 .000 .000 .005 .045 .148 .183 .163 .024 .001
20 .000 .000 .000 .000 .000 .000 .002 .020 .103 .164 .196 .065 .006
21 .000 .000 .000 .000 .000 .000 .000 .007 .057 .118 .187 .138 .027
22 .000 .000 .000 .000 .000 .000 .000 .002 .024 .064 .136 .227 .093
23 .000 .000 .000 .000 .000 .000 .000 .000 .007 .025 .071 .266 .231
24 .000 .000 .000 .000 .000 .000 .000 .000 .002 .006 .023 .199 .365
25 .000 .000 .000 .000 .000 .000 .000 .000 .000 .001 .004 .072 .277
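Each entry in these binomial tables can be reproduced from the pmf b(x; n, p) = C(n, x) pˣ(1 − p)ⁿ⁻ˣ; a quick stdlib-Python check:

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial probability b(x; n, p) = C(n, x) p^x (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Agrees with the n = 5 table: b(3; 5, .4) is tabulated as .230
print(round(binom_pmf(3, 5, 0.4), 4))  # 0.2304
```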
Poisson probabilities: entry = P(X = x) for a Poisson distribution with the indicated mean μ (entries omitted in the original table are probabilities less than .0005)
x \ μ .1 .2 .3 .4 .5 .6 .7 .8 .9 1.0
0 .905 .819 .741 .670 .607 .549 .497 .449 .407 .368
1 .090 .164 .222 .268 .303 .329 .348 .359 .366 .368
2 .005 .016 .033 .054 .076 .099 .122 .144 .165 .184
3 .001 .003 .007 .013 .020 .028 .038 .049 .061
4 .001 .002 .003 .005 .008 .011 .015
5 .001 .001 .002 .003
6 .001
x \ μ 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 15.0 20.0
0 .135 .050 .018 .007 .002 .001 .000 .000 .000 .000 .000
1 .271 .149 .073 .034 .015 .006 .003 .001 .000 .000 .000
2 .271 .224 .147 .084 .045 .022 .011 .005 .002 .000 .000
3 .180 .224 .195 .140 .089 .052 .029 .015 .008 .000 .000
4 .090 .168 .195 .175 .134 .091 .057 .034 .019 .001 .000
5 .036 .101 .156 .175 .161 .128 .092 .061 .038 .002 .000
6 .012 .050 .104 .146 .161 .149 .122 .091 .063 .005 .000
7 .003 .022 .060 .104 .138 .149 .140 .117 .090 .010 .001
8 .001 .008 .030 .065 .103 .130 .140 .132 .113 .019 .001
9 .003 .013 .036 .069 .101 .124 .132 .125 .032 .003
10 .001 .005 .018 .041 .071 .099 .119 .125 .049 .006
11 .002 .008 .023 .045 .072 .097 .114 .066 .011
12 .001 .003 .011 .026 .048 .073 .095 .083 .018
13 .001 .005 .014 .030 .050 .073 .096 .027
14 .002 .007 .017 .032 .052 .102 .039
15 .001 .003 .009 .019 .035 .102 .052
16 .001 .005 .011 .022 .096 .065
17 .001 .002 .006 .013 .085 .076
18 .001 .003 .007 .071 .084
19 .001 .004 .056 .089
20 .001 .002 .042 .089
21 .001 .030 .085
22 .020 .077
23 .013 .067
24 .008 .056
25 .005 .045
26 .003 .034
27 .002 .025
28 .001 .018
29 .013
30 .008
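The Poisson entries come directly from the pmf p(x; μ) = e^(−μ)μˣ/x!; a stdlib-Python check:

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """Poisson probability p(x; mu) = e^(-mu) mu^x / x!."""
    return exp(-mu) * mu**x / factorial(x)

# Agrees with the table: P(X = 2) when mu = 1.0 is tabulated as .184
print(round(poisson_pmf(2, 1.0), 4))  # 0.1839
```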
[figure: distribution curves with shaded tail areas marked “critical value”; the accompanying table of critical values is not recoverable from this scan]
Table V Tolerance critical values for normal population distributions
(Rows give the sample size n; a tabulated value k is used as x̄ ± ks. The four three-column blocks are: two-sided intervals with 95% confidence, two-sided intervals with 99% confidence, one-sided bounds with 95% confidence, one-sided bounds with 99% confidence.)
% of population captured: 90% 95% 99% | 90% 95% 99% | 90% 95% 99% | 90% 95% 99%
2 32.019 37.674 48.430 160.193 188.491 242.300 20.581 26.260 37.094 103.029 131.426 185.617
3 8.380 9.916 12.861 18.930 22.401 29.055 6.156 7.656 10.553 13.995 17.370 23.896
4 5.369 6.370 8.299 9.398 11.150 14.527 4.162 5.144 7.042 7.380 9.083 12.387
5 4.275 5.079 6.634 6.612 7.855 10.260 3.407 4.203 5.741 5.362 6.578 8.939
6 3.712 4.414 5.775 5.337 6.345 8.301 3.006 3.708 5.062 4.411 5.406 7.335
7 3.369 4.007 5.248 4.613 5.488 7.187 2.756 3.400 4.642 3.859 4.728 6.412
8 3.136 3.732 4.891 4.147 4.936 6.468 2.582 3.187 4.354 3.497 4.285 5.812
9 2.967 3.532 4.631 3.822 4.550 5.966 2.454 3.031 4.143 3.241 3.972 5.389
10 2.839 3.379 4.433 3.582 4.265 5.594 2.355 2.911 3.981 3.048 3.738 5.074
11 2.737 3.259 4.277 3.397 4.045 5.308 2.275 2.815 3.852 2.898 3.556 4.829
12 2.655 3.162 4.150 3.250 3.870 5.079 2.210 2.736 3.747 2.777 3.410 4.633
13 2.587 3.081 4.044 3.130 3.727 4.893 2.155 2.671 3.659 2.677 3.290 4.472
14 2.529 3.012 3.955 3.029 3.608 4.737 2.109 2.615 3.585 2.593 3.189 4.337
15 2.480 2.954 3.878 2.945 3.507 4.605 2.068 2.566 3.520 2.522 3.102 4.222
16 2.437 2.903 3.812 2.872 3.421 4.492 2.033 2.524 3.464 2.460 3.028 4.123
Sample size n
17 2.400 2.858 3.754 2.808 3.345 4.393 2.002 2.486 3.414 2.405 2.963 4.037
18 2.366 2.819 3.702 2.753 3.279 4.307 1.974 2.453 3.370 2.357 2.905 3.960
19 2.337 2.784 3.656 2.703 3.221 4.230 1.949 2.423 3.331 2.314 2.854 3.892
20 2.310 2.752 3.615 2.659 3.168 4.161 1.926 2.396 3.295 2.276 2.808 3.832
25 2.208 2.631 3.457 2.494 2.972 3.904 1.838 2.292 3.158 2.129 2.633 3.601
30 2.140 2.549 3.350 2.385 2.841 3.733 1.777 2.220 3.064 2.030 2.516 3.447
35 2.090 2.490 3.272 2.306 2.748 3.611 1.732 2.167 2.995 1.957 2.430 3.334
40 2.052 2.445 3.213 2.247 2.677 3.518 1.697 2.126 2.941 1.902 2.364 3.249
45 2.021 2.408 3.165 2.200 2.621 3.444 1.669 2.092 2.898 1.857 2.312 3.180
50 1.996 2.379 3.126 2.162 2.576 3.385 1.646 2.065 2.863 1.821 2.269 3.125
60 1.958 2.333 3.066 2.103 2.506 3.293 1.609 2.022 2.807 1.764 2.202 3.038
70 1.929 2.299 3.021 2.060 2.454 3.225 1.581 1.990 2.765 1.722 2.153 2.974
80 1.907 2.272 2.986 2.026 2.414 3.173 1.559 1.965 2.733 1.688 2.114 2.924
90 1.889 2.251 2.958 1.999 2.382 3.130 1.542 1.944 2.706 1.661 2.082 2.883
100 1.874 2.233 2.934 1.977 2.355 3.096 1.527 1.927 2.684 1.639 2.056 2.850
150 1.825 2.175 2.859 1.905 2.270 2.983 1.478 1.870 2.611 1.566 1.971 2.741
200 1.798 2.143 2.816 1.865 2.222 2.921 1.450 1.837 2.570 1.524 1.923 2.679
250 1.780 2.121 2.788 1.839 2.191 2.880 1.431 1.815 2.542 1.496 1.891 2.638
300 1.767 2.106 2.767 1.820 2.169 2.850 1.417 1.800 2.522 1.476 1.868 2.608
∞ 1.645 1.960 2.576 1.645 1.960 2.576 1.282 1.645 2.326 1.282 1.645 2.326
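A tabulated tolerance critical value k converts a sample mean and standard deviation into an interval x̄ ± ks that captures the stated fraction of the population with the stated confidence. A sketch with a made-up n = 20 sample (the data values are hypothetical, used only to illustrate the arithmetic):

```python
import math

def tolerance_interval(data, k):
    """Two-sided normal tolerance interval xbar +/- k*s, with k read
    from the tolerance critical value table for the given n."""
    n = len(data)
    xbar = sum(data) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))
    return xbar - k * s, xbar + k * s

# Hypothetical n = 20 sample; from the table, k = 2.752 captures at
# least 95% of the population with 95% confidence (two-sided, n = 20).
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9,
        10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.1, 10.0]
lo, hi = tolerance_interval(data, 2.752)
print(round(lo, 2), round(hi, 2))
```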
t curve tail areas: entry = area under the t curve to the right of t, for the indicated df
t \ df 1 2 3 4 5 6 7 8 9 10 11 12
0.0 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500
0.1 .468 .465 .463 .463 .462 .462 .462 .461 .461 .461 .461 .461
0.2 .437 .430 .427 .426 .425 .424 .424 .423 .423 .423 .423 .422
0.3 .407 .396 .392 .390 .388 .387 .386 .386 .386 .385 .385 .385
0.4 .379 .364 .358 .355 .353 .352 .351 .350 .349 .349 .348 .348
0.5 .352 .333 .326 .322 .319 .317 .316 .315 .315 .314 .313 .313
0.6 .328 .305 .295 .290 .287 .285 .284 .283 .282 .281 .280 .280
0.7 .306 .278 .267 .261 .258 .255 .253 .252 .251 .250 .249 .249
0.8 .285 .254 .241 .234 .230 .227 .225 .223 .222 .221 .220 .220
0.9 .267 .232 .217 .210 .205 .201 .199 .197 .196 .195 .194 .193
1.0 .250 .211 .196 .187 .182 .178 .175 .173 .172 .170 .169 .169
1.1 .235 .193 .176 .167 .162 .157 .154 .152 .150 .149 .147 .146
1.2 .221 .177 .158 .148 .142 .138 .135 .132 .130 .129 .128 .127
1.3 .209 .162 .142 .132 .125 .121 .117 .115 .113 .111 .110 .109
1.4 .197 .148 .128 .117 .110 .106 .102 .100 .098 .096 .095 .093
1.5 .187 .136 .115 .104 .097 .092 .089 .086 .084 .082 .081 .080
1.6 .178 .125 .104 .092 .085 .080 .077 .074 .072 .070 .069 .068
1.7 .169 .116 .094 .082 .075 .070 .066 .064 .062 .060 .059 .057
1.8 .161 .107 .085 .073 .066 .061 .057 .055 .053 .051 .050 .049
1.9 .154 .099 .077 .065 .058 .053 .050 .047 .045 .043 .042 .041
2.0 .148 .092 .070 .058 .051 .046 .043 .040 .038 .037 .035 .034
2.1 .141 .085 .063 .052 .045 .040 .037 .034 .033 .031 .030 .029
2.2 .136 .079 .058 .046 .040 .035 .032 .029 .028 .026 .025 .024
2.3 .131 .074 .052 .041 .035 .031 .027 .025 .023 .022 .021 .020
2.4 .126 .069 .048 .037 .031 .027 .024 .022 .020 .019 .018 .017
2.5 .121 .065 .044 .033 .027 .023 .020 .018 .017 .016 .015 .014
2.6 .117 .061 .040 .030 .024 .020 .018 .016 .014 .013 .012 .012
2.7 .113 .057 .037 .027 .021 .018 .015 .014 .012 .011 .010 .010
2.8 .109 .054 .034 .024 .019 .016 .013 .012 .010 .009 .009 .008
2.9 .106 .051 .031 .022 .017 .014 .011 .010 .009 .008 .007 .007
3.0 .102 .048 .029 .020 .015 .012 .010 .009 .007 .007 .006 .006
3.1 .099 .045 .027 .018 .013 .011 .009 .007 .006 .006 .005 .005
3.2 .096 .043 .025 .016 .012 .009 .008 .006 .005 .005 .004 .004
3.3 .094 .040 .023 .015 .011 .008 .007 .005 .005 .004 .004 .003
3.4 .091 .038 .021 .014 .010 .007 .006 .005 .004 .003 .003 .003
3.5 .089 .036 .020 .012 .009 .006 .005 .004 .003 .003 .002 .002
3.6 .086 .035 .018 .011 .008 .006 .004 .004 .003 .002 .002 .002
3.7 .084 .033 .017 .010 .007 .005 .004 .003 .002 .002 .002 .002
3.8 .082 .031 .016 .010 .006 .004 .003 .003 .002 .002 .001 .001
3.9 .080 .030 .015 .009 .006 .004 .003 .002 .002 .001 .001 .001
4.0 .078 .029 .014 .008 .005 .004 .003 .002 .002 .001 .001 .001
t curve tail areas (continued)
t \ df 13 14 15 16 17 18 19 20 21 22 23 24
0.0 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500
0.1 .461 .461 .461 .461 .461 .461 .461 .461 .461 .461 .461 .461
0.2 .422 .422 .422 .422 .422 .422 .422 .422 .422 .422 .422 .422
0.3 .384 .384 .384 .384 .384 .384 .384 .384 .384 .383 .383 .383
0.4 .348 .347 .347 .347 .347 .347 .347 .347 .347 .347 .346 .346
0.5 .313 .312 .312 .312 .312 .312 .311 .311 .311 .311 .311 .311
0.6 .279 .279 .279 .278 .278 .278 .278 .278 .278 .277 .277 .277
0.7 .248 .247 .247 .247 .247 .246 .246 .246 .246 .246 .245 .245
0.8 .219 .218 .218 .218 .217 .217 .217 .217 .216 .216 .216 .216
0.9 .192 .191 .191 .191 .190 .190 .190 .189 .189 .189 .189 .189
1.0 .168 .167 .167 .166 .166 .165 .165 .165 .164 .164 .164 .164
1.1 .146 .144 .144 .144 .143 .143 .143 .142 .142 .142 .141 .141
1.2 .126 .124 .124 .124 .123 .123 .122 .122 .122 .121 .121 .121
1.3 .108 .107 .107 .106 .105 .105 .105 .104 .104 .104 .103 .103
1.4 .092 .091 .091 .090 .090 .089 .089 .089 .088 .088 .087 .087
1.5 .079 .077 .077 .077 .076 .075 .075 .075 .074 .074 .074 .073
1.6 .067 .065 .065 .065 .064 .064 .063 .063 .062 .062 .062 .061
1.7 .056 .055 .055 .054 .054 .053 .053 .052 .052 .052 .051 .051
1.8 .048 .046 .046 .045 .045 .044 .044 .043 .043 .043 .042 .042
1.9 .040 .038 .038 .038 .037 .037 .036 .036 .036 .035 .035 .035
2.0 .033 .032 .032 .031 .031 .030 .030 .030 .029 .029 .029 .028
2.1 .028 .027 .027 .026 .025 .025 .025 .024 .024 .024 .023 .023
2.2 .023 .022 .022 .021 .021 .021 .020 .020 .020 .019 .019 .019
2.3 .019 .018 .018 .018 .017 .017 .016 .016 .016 .016 .015 .015
2.4 .016 .015 .015 .014 .014 .014 .013 .013 .013 .013 .012 .012
2.5 .013 .012 .012 .012 .011 .011 .011 .011 .010 .010 .010 .010
2.6 .011 .010 .010 .010 .009 .009 .009 .009 .008 .008 .008 .008
2.7 .009 .008 .008 .008 .008 .007 .007 .007 .007 .007 .006 .006
2.8 .008 .007 .007 .006 .006 .006 .006 .006 .005 .005 .005 .005
2.9 .006 .005 .005 .005 .005 .005 .005 .004 .004 .004 .004 .004
3.0 .005 .004 .004 .004 .004 .004 .004 .004 .003 .003 .003 .003
3.1 .004 .004 .004 .003 .003 .003 .003 .003 .003 .003 .003 .002
3.2 .003 .003 .003 .003 .003 .002 .002 .002 .002 .002 .002 .002
3.3 .003 .002 .002 .002 .002 .002 .002 .002 .002 .002 .002 .001
3.4 .002 .002 .002 .002 .002 .002 .002 .001 .001 .001 .001 .001
3.5 .002 .002 .002 .001 .001 .001 .001 .001 .001 .001 .001 .001
3.6 .002 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001
3.7 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001
3.8 .001 .001 .001 .001 .001 .001 .001 .001 .001 .000 .000 .000
3.9 .001 .001 .001 .001 .001 .001 .000 .000 .000 .000 .000 .000
4.0 .001 .001 .001 .001 .000 .000 .000 .000 .000 .000 .000 .000
t curve tail areas (continued)
t \ df 25 26 27 28 29 30 35 40 60 120 ∞ (= z)
0.0 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500
0.1 .461 .461 .461 .461 .461 .461 .460 .460 .460 .460 .460
0.2 .422 .422 .421 .421 .421 .421 .421 .421 .421 .421 .421
0.3 .383 .383 .383 .383 .383 .383 .383 .383 .383 .382 .382
0.4 .346 .346 .346 .346 .346 .346 .346 .346 .345 .345 .345
0.5 .311 .311 .311 .310 .310 .310 .310 .310 .309 .309 .309
0.6 .277 .277 .277 .277 .277 .277 .276 .276 .275 .275 .274
0.7 .245 .245 .245 .245 .245 .245 .244 .244 .243 .243 .242
0.8 .216 .215 .215 .215 .215 .215 .215 .214 .213 .213 .212
0.9 .188 .188 .188 .188 .188 .188 .187 .187 .186 .185 .184
1.0 .163 .163 .163 .163 .163 .163 .162 .162 .161 .160 .159
1.1 .141 .141 .141 .140 .140 .140 .139 .139 .138 .137 .136
1.2 .121 .120 .120 .120 .120 .120 .119 .119 .117 .116 .115
1.3 .103 .103 .102 .102 .102 .102 .101 .101 .099 .098 .097
1.4 .087 .087 .086 .086 .086 .086 .085 .085 .083 .082 .081
1.5 .073 .073 .073 .072 .072 .072 .071 .071 .069 .068 .067
1.6 .061 .061 .061 .060 .060 .060 .059 .059 .057 .056 .055
1.7 .051 .051 .050 .050 .050 .050 .049 .048 .047 .046 .045
1.8 .042 .042 .042 .041 .041 .041 .040 .040 .038 .037 .036
1.9 .035 .034 .034 .034 .034 .034 .033 .032 .031 .030 .029
2.0 .028 .028 .028 .028 .027 .027 .027 .026 .025 .024 .023
2.1 .023 .023 .023 .022 .022 .022 .022 .021 .020 .019 .018
2.2 .019 .018 .018 .018 .018 .018 .017 .017 .016 .015 .014
2.3 .015 .015 .015 .015 .014 .014 .014 .013 .012 .012 .011
2.4 .012 .012 .012 .012 .012 .011 .011 .011 .010 .009 .008
2.5 .010 .010 .009 .009 .009 .009 .009 .008 .008 .007 .006
2.6 .008 .008 .007 .007 .007 .007 .007 .007 .006 .005 .005
2.7 .006 .006 .006 .006 .006 .006 .005 .005 .004 .004 .003
2.8 .005 .005 .005 .005 .005 .004 .004 .004 .003 .003 .003
2.9 .004 .004 .004 .004 .004 .003 .003 .003 .003 .002 .002
3.0 .003 .003 .003 .003 .003 .003 .002 .002 .002 .002 .001
3.1 .002 .002 .002 .002 .002 .002 .002 .002 .001 .001 .001
3.2 .002 .002 .002 .002 .002 .002 .001 .001 .001 .001 .001
3.3 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .000
3.4 .001 .001 .001 .001 .001 .001 .001 .001 .001 .000 .000
3.5 .001 .001 .001 .001 .001 .001 .001 .001 .000 .000 .000
3.6 .001 .001 .001 .001 .001 .001 .000 .000 .000 .000 .000
3.7 .001 .001 .000 .000 .000 .000 .000 .000 .000 .000 .000
3.8 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000
3.9 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000
4.0 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000
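These tail areas can be reproduced by integrating the t density numerically; a stdlib-Python sketch using Simpson's rule over a finite range (truncating a far tail that is negligible for moderate df):

```python
from math import gamma, sqrt, pi

def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_tail_area(t, df, upper=60.0, n=4000):
    """P(T > t) via Simpson's rule on [t, upper]; the area beyond
    `upper` is negligible for moderate df."""
    h = (upper - t) / n
    s = t_density(t, df) + t_density(upper, df)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * t_density(t + i * h, df)
    return s * h / 3

# Agrees with the table: the area to the right of t = 2.0
# for df = 10 is tabulated as .037
print(round(t_tail_area(2.0, 10), 3))  # 0.037
```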
[Table of critical values by right-tail area for df = 1 through 5; the table body is not recoverable from this scan]
[Table of critical values by right-tail area for df = 11 through 15; the table body is not recoverable from this scan]
F distribution critical values: entry = the value capturing the indicated upper-tail area under the F curve with the given numerator and denominator df
[figure: F curve with the upper-tail Area shaded and the critical Value marked on the axis]
Numerator df
Area 1 2 3 4 5 6 7 8 9 10
1 .100 39.86 49.50 53.59 55.83 57.24 58.20 58.91 59.44 59.86 60.19
.050 161.40 199.50 215.70 224.60 230.20 234.00 236.80 238.90 240.50 241.90
.010 4052.00 5000.00 5403.00 5625.00 5764.00 5859.00 5928.00 5981.00 6022.00 6056.00
2 .100 8.53 9.00 9.16 9.24 9.29 9.33 9.35 9.37 9.38 9.39
.050 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40
.010 98.50 99.00 99.17 99.25 99.30 99.33 99.36 99.37 99.39 99.40
.001 998.50 999.00 999.20 999.20 999.30 999.30 999.40 999.40 999.40 999.40
3 .100 5.54 5.46 5.39 5.34 5.31 5.28 5.27 5.25 5.24 5.23
.050 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79
.010 34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.35 27.23
.001 167.00 148.50 141.10 137.10 134.60 132.80 131.60 130.60 129.90 129.20
4 .100 4.54 4.32 4.19 4.11 4.05 4.01 3.98 3.95 3.94 3.92
.050 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96
.010 21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66 14.55
.001 74.14 61.25 56.18 53.44 51.71 50.53 49.66 49.00 48.47 48.05
Denominator df
5 .100 4.06 3.78 3.62 3.52 3.45 3.40 3.37 3.34 3.32 3.30
.050 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74
.010 16.26 13.27 12.06 11.39 10.97 10.67 10.46 10.29 10.16 10.05
.001 47.18 37.12 33.20 31.09 29.75 28.83 28.16 27.65 27.24 26.92
6 .100 3.78 3.46 3.29 3.18 3.11 3.05 3.01 2.98 2.96 2.94
.050 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06
.010 13.75 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98 7.87
.001 35.51 27.00 23.70 21.92 20.80 20.03 19.46 19.03 18.69 18.41
7 .100 3.59 3.26 3.07 2.96 2.88 2.83 2.78 2.75 2.72 2.70
.050 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64
.010 12.25 9.55 8.45 7.85 7.46 7.19 6.99 6.84 6.72 6.62
.001 29.25 21.69 18.77 17.20 16.21 15.52 15.02 14.63 14.33 14.08
8 .100 3.46 3.11 2.92 2.81 2.73 2.67 2.62 2.59 2.56 2.54
.050 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35
.010 11.26 8.65 7.59 7.01 6.63 6.37 6.18 6.03 5.91 5.81
.001 25.41 18.49 15.83 14.39 13.48 12.86 12.40 12.05 11.77 11.54
9 .100 3.36 3.01 2.81 2.69 2.61 2.55 2.51 2.47 2.44 2.42
.050 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14
.010 10.56 8.02 6.99 6.42 6.06 5.80 5.61 5.47 5.35 5.26
.001 22.86 16.39 13.90 12.56 11.71 11.13 10.70 10.37 10.11 9.89
Numerator df
Area 1 2 3 4 5 6 7 8 9 10
10 .100 3.29 2.92 2.73 2.61 2.52 2.46 2.41 2.38 2.35 2.32
.050 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98
.010 10.04 7.56 6.55 5.99 5.64 5.39 5.20 5.06 4.94 4.85
.001 21.04 14.91 12.55 11.28 10.48 9.93 9.52 9.20 8.96 8.75
11 .100 3.23 2.86 2.66 2.54 2.45 2.39 2.34 2.30 2.27 2.25
.050 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85
.010 9.65 7.21 6.22 5.67 5.32 5.07 4.89 4.74 4.63 4.54
.001 19.69 13.81 11.56 10.35 9.58 9.05 8.66 8.35 8.12 7.92
12 .100 3.18 2.81 2.61 2.48 2.39 2.33 2.28 2.24 2.21 2.19
.050 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75
.010 9.33 6.93 5.95 5.41 5.06 4.82 4.64 4.50 4.39 4.30
.001 18.64 12.97 10.80 9.63 8.89 8.38 8.00 7.71 7.48 7.29
13 .100 3.14 2.76 2.56 2.43 2.35 2.28 2.23 2.20 2.16 2.14
.050 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67
.010 9.07 6.70 5.74 5.21 4.86 4.62 4.44 4.30 4.19 4.10
.001 17.82 12.31 10.21 9.07 8.35 7.86 7.49 7.21 6.98 6.80
14 .100 3.10 2.73 2.52 2.39 2.31 2.24 2.19 2.15 2.12 2.10
.050 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60
.010 8.86 6.51 5.56 5.04 4.69 4.46 4.28 4.14 4.03 3.94
.001 17.14 11.78 9.73 8.62 7.92 7.44 7.08 6.80 6.58 6.40
15 .100 3.07 2.70 2.49 2.36 2.27 2.21 2.16 2.12 2.09 2.06
.050 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54
.010 8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 3.80
.001 16.59 11.34 9.34 8.25 7.57 7.09 6.74 6.47 6.26 6.08
16 .100 3.05 2.67 2.46 2.33 2.24 2.18 2.13 2.09 2.06 2.03
.050 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49
.010 8.53 6.23 5.29 4.77 4.44 4.20 4.03 3.89 3.78 3.69
.001 16.12 10.97 9.01 7.94 7.27 6.80 6.46 6.19 5.98 5.81
17 .100 3.03 2.64 2.44 2.31 2.22 2.15 2.10 2.06 2.03 2.00
.050 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49 2.45
.010 8.40 6.11 5.18 4.67 4.34 4.10 3.93 3.79 3.68 3.59
.001 15.72 10.66 8.73 7.68 7.02 6.56 6.22 5.96 5.75 5.58
18 .100 3.01 2.62 2.42 2.29 2.20 2.13 2.08 2.04 2.00 1.98
.050 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41
.010 8.29 6.01 5.09 4.58 4.25 4.01 3.84 3.71 3.60 3.51
.001 15.38 10.39 8.49 7.46 6.81 6.35 6.02 5.76 5.56 5.39
19 .100 2.99 2.61 2.40 2.27 2.18 2.11 2.06 2.02 1.98 1.96
.050 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 2.38
.010 8.18 5.93 5.01 4.50 4.17 3.94 3.77 3.63 3.52 3.43
.001 15.08 10.16 8.28 7.27 6.62 6.18 5.85 5.59 5.39 5.22
Numerator df
Area 1 2 3 4 5 6 7 8 9 10
20 .100 2.97 2.59 2.38 2.25 2.16 2.09 2.04 2.00 1.96 1.94
.050 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35
.010 8.10 5.85 4.94 4.43 4.10 3.87 3.70 3.56 3.46 3.37
.001 14.82 9.95 8.10 7.10 6.46 6.02 5.69 5.44 5.24 5.08
21 .100 2.96 2.57 2.36 2.23 2.14 2.08 2.02 1.98 1.95 1.92
.050 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32
.010 8.02 5.78 4.87 4.37 4.04 3.81 3.64 3.51 3.40 3.31
.001 14.59 9.77 7.94 6.95 6.32 5.88 5.56 5.31 5.11 4.95
22 .100 2.95 2.56 2.35 2.22 2.13 2.06 2.01 1.97 1.93 1.90
.050 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 2.30
.010 7.95 5.72 4.82 4.31 3.99 3.76 3.59 3.45 3.35 3.26
.001 14.38 9.61 7.80 6.81 6.19 5.76 5.44 5.19 4.99 4.83
23 .100 2.94 2.55 2.34 2.21 2.11 2.05 1.99 1.95 1.92 1.89
.050 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32 2.27
.010 7.88 5.66 4.76 4.26 3.94 3.71 3.54 3.41 3.30 3.21
.001 14.20 9.47 7.67 6.70 6.08 5.65 5.33 5.09 4.89 4.73
24 .100 2.93 2.54 2.33 2.19 2.10 2.04 1.98 1.94 1.91 1.88
.050 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25
.010 7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.26 3.17
.001 14.03 9.34 7.55 6.59 5.98 5.55 5.23 4.99 4.80 4.64
25 .100 2.92 2.53 2.32 2.18 2.09 2.02 1.97 1.93 1.89 1.87
.050 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28 2.24
.010 7.77 5.57 4.68 4.18 3.85 3.63 3.46 3.32 3.22 3.13
.001 13.88 9.22 7.45 6.49 5.89 5.46 5.15 4.91 4.71 4.56
26 .100 2.91 2.52 2.31 2.17 2.08 2.01 1.96 1.92 1.88 1.86
.050 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22
.010 7.72 5.53 4.64 4.14 3.82 3.59 3.42 3.29 3.18 3.09
.001 13.74 9.12 7.36 6.41 5.80 5.38 5.07 4.83 4.64 4.48
27 .100 2.90 2.51 2.30 2.17 2.07 2.00 1.95 1.91 1.87 1.85
.050 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25 2.20
.010 7.68 5.49 4.60 4.11 3.78 3.56 3.39 3.26 3.15 3.06
.001 13.61 9.02 7.27 6.33 5.73 5.31 5.00 4.76 4.57 4.41
28 .100 2.89 2.50 2.29 2.16 2.06 2.00 1.94 1.90 1.87 1.84
.050 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24 2.19
.010 7.64 5.45 4.57 4.07 3.75 3.53 3.36 3.23 3.12 3.03
.001 13.50 8.93 7.19 6.25 5.66 5.24 4.93 4.69 4.50 4.35
29 .100 2.89 2.50 2.28 2.15 2.06 1.99 1.93 1.89 1.86 1.83
.050 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22 2.18
.010 7.60 5.42 4.54 4.04 3.73 3.50 3.33 3.20 3.09 3.00
.001 13.39 8.85 7.12 6.19 5.59 5.18 4.87 4.64 4.45 4.29
Numerator df
Area 1 2 3 4 5 6 7 8 9 10
30
.100 2.88 2.49 2.28 2.14 2.05 1.98 1.93 1.88 1.85 1.82
.050 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16
.010 7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.07 2.98
.001 13.29 8.77 7.05 6.12 5.53 5.12 4.82 4.58 4.39 4.24
40
.100 2.84 2.44 2.23 2.09 2.00 1.93 1.87 1.83 1.79 1.76
.050 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.08
.010 7.31 5.18 4.31 3.83 3.51 3.29 3.12 2.99 2.89 2.80
.001 12.61 8.25 6.59 5.70 5.13 4.73 4.44 4.21 4.02 3.87
60
.100 2.79 2.39 2.18 2.04 1.95 1.87 1.82 1.77 1.74 1.71
.050 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.99
.010 7.08 4.98 4.13 3.65 3.34 3.12 2.95 2.82 2.72 2.63
.001 11.97 7.77 6.17 5.31 4.76 4.37 4.09 3.86 3.69 3.54
Denominator df
90
.100 2.76 2.36 2.15 2.01 1.91 1.84 1.78 1.74 1.70 1.67
.050 3.95 3.10 2.71 2.47 2.32 2.20 2.11 2.04 1.99 1.94
.010 6.93 4.85 4.01 3.53 3.23 3.01 2.84 2.72 2.61 2.52
.001 11.57 7.47 5.91 5.06 4.53 4.15 3.87 3.65 3.48 3.34
120
.100 2.75 2.35 2.13 1.99 1.90 1.82 1.77 1.72 1.68 1.65
.050 3.92 3.07 2.68 2.45 2.29 2.18 2.09 2.02 1.96 1.91
.010 6.85 4.79 3.95 3.48 3.17 2.96 2.79 2.66 2.56 2.47
.001 11.38 7.32 5.78 4.95 4.42 4.04 3.77 3.55 3.38 3.24
240
.100 2.73 2.32 2.10 1.97 1.87 1.80 1.74 1.70 1.65 1.63
.050 3.88 3.03 2.64 2.41 2.25 2.14 2.04 1.98 1.92 1.87
.010 6.74 4.69 3.86 3.40 3.09 2.88 2.71 2.59 2.48 2.40
.001 11.10 7.11 5.60 4.78 4.25 3.89 3.62 3.41 3.24 3.09
∞
.100 2.71 2.30 2.08 1.94 1.85 1.77 1.72 1.67 1.63 1.60
.050 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.83
.010 6.63 4.61 3.78 3.32 3.02 2.80 2.64 2.51 2.41 2.32
.001 10.83 6.91 5.42 4.62 4.10 3.74 3.47 3.27 3.10 2.96
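These entries are upper-tail quantiles of the F distribution, so individual values can be spot-checked in software. A minimal sketch, assuming SciPy is available (a third-party dependency, not part of the original tables):

```python
# Spot-check F table entries: the tabled value for upper-tail area a and
# degrees of freedom (v1, v2) is the (1 - a) quantile of the F distribution.
from scipy.stats import f

def f_critical(area, num_df, den_df):
    """Upper-tail F critical value, rounded to two decimals as in the table."""
    return round(f.ppf(1 - area, num_df, den_df), 2)

print(f_critical(0.05, 1, 10))   # df = (1, 10), area .050 -> 4.96
print(f_critical(0.01, 2, 5))    # df = (2, 5),  area .010 -> 13.27
```

The two calls reproduce the tabled entries 4.96 (denominator df 10, area .050, numerator df 1) and 13.27 (denominator df 5, area .010, numerator df 2).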
Table IX(a) Studentized range critical values (α = .05)
k
Error
df 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 18.00 27.00 32.80 37.10 40.40 43.10 45.40 47.40 49.10 50.60 52.00 53.20 54.30 55.40 56.30 57.20 58.00 58.80 59.60
2 6.08 8.33 9.80 10.90 11.70 12.40 13.00 13.50 14.00 14.40 14.70 15.10 15.40 15.70 15.90 16.10 16.40 16.60 16.80
3 4.50 5.91 6.82 7.50 8.04 8.48 8.85 9.18 9.46 9.72 9.95 10.20 10.30 10.50 10.70 10.80 11.00 11.10 11.20
4 3.93 5.04 5.76 6.29 6.71 7.05 7.35 7.60 7.83 8.03 8.21 8.37 8.52 8.66 8.79 8.91 9.03 9.13 9.23
5 3.64 4.60 5.22 5.67 6.03 6.33 6.58 6.80 6.99 7.17 7.32 7.47 7.60 7.72 7.83 7.93 8.03 8.12 8.21
6 3.46 4.34 4.90 5.30 5.63 5.90 6.12 6.32 6.49 6.65 6.79 6.92 7.03 7.14 7.24 7.34 7.43 7.51 7.59
7 3.34 4.16 4.68 5.06 5.36 5.61 5.82 6.00 6.16 6.30 6.43 6.55 6.66 6.76 6.85 6.94 7.02 7.10 7.17
8 3.26 4.04 4.53 4.89 5.17 5.40 5.60 5.77 5.92 6.05 6.18 6.29 6.39 6.48 6.57 6.65 6.73 6.80 6.87
9 3.20 3.95 4.41 4.76 5.02 5.24 5.43 5.59 5.74 5.87 5.98 6.09 6.19 6.28 6.36 6.44 6.51 6.58 6.64
10 3.15 3.88 4.33 4.65 4.91 5.12 5.30 5.46 5.60 5.72 5.83 5.93 6.03 6.11 6.19 6.27 6.34 6.40 6.47
11 3.11 3.82 4.26 4.57 4.82 5.03 5.20 5.35 5.49 5.61 5.71 5.81 5.90 5.98 6.06 6.13 6.20 6.27 6.33
12 3.08 3.77 4.20 4.51 4.75 4.95 5.12 5.27 5.39 5.51 5.61 5.71 5.80 5.88 5.95 6.02 6.09 6.15 6.21
13 3.06 3.73 4.15 4.45 4.69 4.88 5.05 5.19 5.32 5.43 5.53 5.63 5.71 5.79 5.86 5.93 5.99 6.05 6.11
14 3.03 3.70 4.11 4.41 4.64 4.83 4.99 5.13 5.25 5.36 5.46 5.55 5.64 5.71 5.79 5.85 5.91 5.97 6.03
15 3.01 3.67 4.08 4.37 4.59 4.78 4.94 5.08 5.20 5.31 5.40 5.49 5.57 5.65 5.72 5.78 5.85 5.90 5.96
16 3.00 3.65 4.05 4.33 4.56 4.74 4.90 5.03 5.15 5.26 5.35 5.44 5.52 5.59 5.66 5.73 5.79 5.84 5.90
17 2.98 3.63 4.02 4.30 4.52 4.70 4.86 4.99 5.11 5.21 5.31 5.39 5.47 5.54 5.61 5.67 5.73 5.79 5.84
18 2.97 3.61 4.00 4.28 4.49 4.67 4.82 4.96 5.07 5.17 5.27 5.35 5.43 5.50 5.57 5.63 5.69 5.74 5.79
19 2.96 3.59 3.98 4.25 4.47 4.65 4.79 4.92 5.04 5.14 5.23 5.31 5.39 5.46 5.53 5.59 5.65 5.70 5.75
20 2.95 3.58 3.96 4.23 4.45 4.62 4.77 4.90 5.01 5.11 5.20 5.28 5.36 5.43 5.49 5.55 5.61 5.66 5.71
24 2.92 3.53 3.90 4.17 4.37 4.54 4.68 4.81 4.92 5.01 5.10 5.18 5.25 5.32 5.38 5.44 5.49 5.55 5.59
30 2.89 3.49 3.85 4.10 4.30 4.46 4.60 4.72 4.82 4.92 5.00 5.08 5.15 5.21 5.27 5.33 5.38 5.43 5.47
40 2.86 3.44 3.79 4.04 4.23 4.39 4.52 4.63 4.73 4.82 4.90 4.98 5.04 5.11 5.16 5.22 5.27 5.31 5.36
60 2.83 3.40 3.74 3.98 4.16 4.31 4.44 4.55 4.65 4.73 4.81 4.88 4.94 5.00 5.06 5.11 5.15 5.20 5.24
120 2.80 3.36 3.68 3.92 4.10 4.24 4.36 4.47 4.56 4.64 4.71 4.78 4.84 4.90 4.95 5.00 5.04 5.09 5.13
∞ 2.77 3.31 3.63 3.86 4.03 4.17 4.29 4.39 4.47 4.55 4.62 4.68 4.74 4.80 4.85 4.89 4.93 4.97 5.01
Table IX(b) Studentized range critical values (α = .01)
k
Error
df 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 90.00 135.00 164.80 186.10 202.40 216.00 227.00 237.00 246.00 253.00 260.00 266.00 272.00 277.00 282.00 286.00 290.00 294.00 298.00
2 14.00 19.00 22.30 24.70 26.60 28.20 29.50 30.70 31.70 32.60 33.40 34.10 34.80 35.40 36.00 36.50 37.00 37.50 37.90
3 8.26 10.60 12.20 13.30 14.20 15.00 15.60 16.20 16.70 17.10 17.50 17.90 18.20 18.50 18.80 19.10 19.30 19.50 19.80
4 6.51 8.12 9.17 9.96 10.60 11.10 11.50 11.90 12.30 12.60 12.80 13.10 13.30 13.50 13.70 13.90 14.10 14.20 14.40
5 5.70 6.97 7.80 8.42 8.91 9.32 9.67 9.97 10.20 10.50 10.70 10.90 11.10 11.20 11.40 11.60 11.70 11.80 11.90
6 5.24 6.33 7.03 7.56 7.97 8.32 8.61 8.87 9.10 9.30 9.49 9.65 9.81 9.95 10.10 10.20 10.30 10.40 10.50
7 4.95 5.92 6.54 7.01 7.37 7.68 7.94 8.17 8.37 8.55 8.71 8.86 9.00 9.12 9.24 9.35 9.46 9.55 9.65
8 4.74 5.63 6.20 6.63 6.96 7.24 7.47 7.68 7.87 8.03 8.18 8.31 8.44 8.55 8.66 8.76 8.85 8.94 9.03
9 4.60 5.43 5.96 6.35 6.66 6.91 7.13 7.32 7.49 7.65 7.78 7.91 8.03 8.13 8.23 8.32 8.41 8.49 8.57
10 4.48 5.27 5.77 6.14 6.43 6.67 6.87 7.05 7.21 7.36 7.48 7.60 7.71 7.81 7.91 7.99 8.07 8.15 8.22
11 4.39 5.14 5.62 5.97 6.25 6.48 6.67 6.84 6.99 7.13 7.25 7.36 7.46 7.56 7.65 7.73 7.81 7.88 7.95
12 4.32 5.04 5.50 5.84 6.10 6.32 6.51 6.67 6.81 6.94 7.06 7.17 7.26 7.36 7.44 7.52 7.59 7.66 7.73
13 4.26 4.96 5.40 5.73 5.98 6.19 6.37 6.53 6.67 6.79 6.90 7.01 7.10 7.19 7.27 7.34 7.42 7.48 7.55
14 4.21 4.89 5.32 5.63 5.88 6.08 6.26 6.41 6.54 6.66 6.77 6.87 6.96 7.05 7.12 7.20 7.27 7.33 7.39
15 4.17 4.83 5.25 5.56 5.80 5.99 6.16 6.31 6.44 6.55 6.66 6.76 6.84 6.93 7.00 7.07 7.14 7.20 7.26
16 4.13 4.78 5.19 5.49 5.72 5.92 6.08 6.22 6.35 6.46 6.56 6.66 6.74 6.82 6.90 6.97 7.03 7.09 7.15
17 4.10 4.74 5.14 5.43 5.66 5.85 6.01 6.15 6.27 6.38 6.48 6.57 6.66 6.73 6.80 6.87 6.94 7.00 7.05
18 4.07 4.70 5.09 5.38 5.60 5.79 5.94 6.08 6.20 6.31 6.41 6.50 6.58 6.65 6.72 6.79 6.85 6.91 6.96
19 4.05 4.67 5.05 5.33 5.55 5.73 5.89 6.02 6.14 6.25 6.34 6.43 6.51 6.58 6.65 6.72 6.78 6.84 6.89
20 4.02 4.64 5.02 5.29 5.51 5.69 5.84 5.97 6.09 6.19 6.29 6.37 6.45 6.52 6.59 6.65 6.71 6.76 6.82
24 3.96 4.54 4.91 5.17 5.37 5.54 5.69 5.81 5.92 6.02 6.11 6.19 6.26 6.33 6.39 6.45 6.51 6.56 6.61
30 3.89 4.45 4.80 5.05 5.24 5.40 5.54 5.65 5.76 5.85 5.93 6.01 6.08 6.14 6.20 6.26 6.31 6.36 6.41
40 3.82 4.37 4.70 4.93 5.11 5.27 5.39 5.50 5.60 5.69 5.77 5.84 5.90 5.96 6.02 6.07 6.12 6.17 6.21
60 3.76 4.28 4.60 4.82 4.99 5.13 5.25 5.36 5.45 5.53 5.60 5.67 5.73 5.79 5.84 5.89 5.93 5.98 6.02
120 3.70 4.20 4.50 4.71 4.87 5.01 5.12 5.21 5.30 5.38 5.44 5.51 5.56 5.61 5.66 5.71 5.75 5.79 5.83
∞ 3.64 4.12 4.40 4.60 4.76 4.88 4.99 5.08 5.16 5.23 5.29 5.35 5.40 5.45 5.49 5.54 5.57 5.61 5.65
Source: From E. S. Pearson and H. O. Hartley, Biometrika Tables for Statisticians, 1: 176–77. Reproduced by permission of the Biometrika Trustees.
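Entries of Tables IX(a) and IX(b) can be spot-checked with SciPy's `studentized_range` distribution (an assumption: SciPy ≥ 1.7 must be installed; it is not part of the original tables):

```python
# Spot-check Studentized range critical values q(alpha; k, df).
# ppf takes the lower-tail probability, so use 1 - alpha.
from scipy.stats import studentized_range

q05 = studentized_range.ppf(0.95, 3, 10)   # k = 3 means, 10 error df
q01 = studentized_range.ppf(0.99, 2, 5)    # k = 2 means, 5 error df
print(round(q05, 2), round(q01, 2))        # should match 3.88 and 5.70
```

The results agree with the tabled values 3.88 in Table IX(a) and 5.70 in Table IX(b) to two decimals.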
Two-sided comparisons
Error df ν \ k   1 2 3 4 5 6 7 8 9
5 2.57 3.03 3.29 3.48 3.62 3.73 3.82 3.90 3.97
6 2.45 2.86 3.10 3.26 3.39 3.49 3.57 3.64 3.71
7 2.36 2.75 2.97 3.12 3.24 3.33 3.41 3.47 3.53
8 2.31 2.67 2.88 3.02 3.13 3.22 3.29 3.35 3.41
9 2.26 2.61 2.81 2.95 3.05 3.14 3.20 3.26 3.32
10 2.23 2.57 2.76 2.89 2.99 3.07 3.14 3.19 3.24
11 2.20 2.53 2.72 2.84 2.94 3.02 3.08 3.14 3.19
12 2.18 2.50 2.68 2.81 2.90 2.98 3.04 3.09 3.14
13 2.16 2.48 2.65 2.78 2.87 2.94 3.00 3.06 3.10
14 2.14 2.46 2.63 2.75 2.84 2.91 2.97 3.02 3.07
15 2.13 2.44 2.61 2.73 2.82 2.89 2.95 3.00 3.04
16 2.12 2.42 2.59 2.71 2.80 2.87 2.92 2.97 3.02
17 2.11 2.41 2.58 2.69 2.78 2.85 2.90 2.95 3.00
18 2.10 2.40 2.56 2.68 2.76 2.83 2.89 2.94 2.98
19 2.09 2.39 2.55 2.66 2.75 2.81 2.87 2.92 2.96
20 2.09 2.38 2.54 2.65 2.73 2.80 2.86 2.90 2.95
24 2.06 2.35 2.51 2.61 2.70 2.76 2.81 2.86 2.90
30 2.04 2.32 2.47 2.58 2.66 2.72 2.77 2.82 2.86
40 2.02 2.29 2.44 2.54 2.62 2.68 2.73 2.77 2.81
60 2.00 2.27 2.41 2.51 2.58 2.64 2.69 2.73 2.77
120 1.98 2.24 2.38 2.47 2.55 2.60 2.65 2.69 2.73
∞ 1.96 2.21 2.35 2.44 2.51 2.57 2.61 2.65 2.69
ᵃReproduced with permission from C. W. Dunnett, “New Tables for Multiple Comparison with a Control,” Biometrics, Vol. 20, No. 3, 1964, and from C. W. Dunnett, “A Multiple Comparison Procedure for Comparing Several Treatments with a Control,” Journal of the American Statistical Association, Vol. 50, 1955.
Sample size (n)   D3 D4 B3 B4 A2 A3 A6 A7 d2 c4 d3
2 0.000 3.267 0.000 3.267 1.880 2.659 1.880 1.880 1.128 0.7979 0.853
3 0.000 2.574 0.000 2.568 1.023 1.954 1.187 1.067 1.693 0.8862 0.888
4 0.000 2.282 0.000 2.266 0.729 1.628 0.796 0.796 2.059 0.9213 0.880
5 0.000 2.114 0.000 2.089 0.577 1.427 0.691 0.660 2.326 0.9400 0.864
6 0.000 2.004 0.030 1.970 0.483 1.287 0.549 0.580 2.534 0.9515 0.848
7 0.076 1.924 0.118 1.882 0.419 1.182 0.509 0.521 2.704 0.9594 0.833
8 0.136 1.864 0.185 1.815 0.373 1.099 0.434 0.477 2.847 0.9650 0.820
9 0.184 1.816 0.239 1.761 0.337 1.032 0.412 0.444 2.970 0.9693 0.808
10 0.223 1.777 0.284 1.716 0.308 0.975 0.365 0.419 3.078 0.9727 0.797
11 0.256 1.744 0.321 1.679 0.285 0.927 0.350 0.399 3.173 0.9754 0.787
12 0.283 1.717 0.354 1.646 0.266 0.886 0.317 0.382 3.258 0.9776 0.778
13 0.307 1.693 0.382 1.618 0.249 0.850 0.306 0.368 3.336 0.9794 0.770
14 0.328 1.672 0.406 1.594 0.235 0.817 0.282 0.356 3.407 0.9810 0.763
15 0.347 1.653 0.428 1.572 0.223 0.789 0.274 0.346 3.472 0.9823 0.756
16 0.363 1.637 0.448 1.552 0.212 0.763 0.257 0.337 3.532 0.9835 0.750
17 0.378 1.622 0.466 1.534 0.203 0.739 0.250 0.329 3.588 0.9845 0.744
18 0.391 1.608 0.482 1.518 0.194 0.718 0.237 0.322 3.640 0.9854 0.739
19 0.403 1.597 0.497 1.503 0.187 0.698 0.231 0.315 3.689 0.9862 0.734
20 0.415 1.585 0.510 1.490 0.180 0.680 0.218 0.308 3.735 0.9869 0.729
21 0.425 1.575 0.523 1.477 0.173 0.663 0.215 0.303 3.778 0.9876 0.724
22 0.434 1.566 0.534 1.466 0.167 0.647 0.204 0.298 3.819 0.9882 0.720
23 0.443 1.557 0.545 1.455 0.162 0.633 0.202 0.292 3.858 0.9887 0.716
24 0.451 1.548 0.555 1.445 0.157 0.619 0.192 0.288 3.895 0.9892 0.712
25 0.459 1.541 0.565 1.435 0.153 0.606 0.191 0.284 3.931 0.9896 0.708
ᵃValues in this table were generated using MathCAD version 3.1 software.
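Several of these control chart constants follow from standard gamma-function formulas, so the table can be regenerated in a few lines. A sketch (the formulas for c4, A3, B3, and B4 are the usual ones; this check is illustrative, not from the original text):

```python
# Control chart constants from the standard formulas:
#   c4(n) = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2)
#   A3(n) = 3 / (c4(n) * sqrt(n)),  B3/B4 = 1 -/+ 3*sqrt(1 - c4^2)/c4
from math import gamma, sqrt

def c4(n):
    return sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)

def A3(n):
    return 3 / (c4(n) * sqrt(n))

def B3(n):
    return max(0.0, 1 - 3 * sqrt(1 - c4(n) ** 2) / c4(n))  # floored at 0

def B4(n):
    return 1 + 3 * sqrt(1 - c4(n) ** 2) / c4(n)

# n = 5 row of the table: c4 = 0.9400, A3 = 1.427, B3 = 0.000, B4 = 2.089
print(round(c4(5), 4), round(A3(5), 3), round(B3(5), 3), round(B4(5), 3))
```

The computed values match the n = 5 and n = 6 rows of the table to the precision shown.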
Answers to Odd-Numbered Exercises
A specific gravity of roughly .45 is typical. The data spreads out quite a bit about this typical value. There is asymmetry in the distribution of values. The observation .75 appears at first glance to be a

The second display is essentially as informative as the first. With 200 observations, the first display would be very cumbersome.
x   Freq.   Rel. Freq.
0     7      .117
1    12      .200
2    13      .217
3    14      .233
4     6      .100
5     3      .050
6     3      .050
7     1      .017
8     1      .017
     60     1.001
b. .917, .867, 1 − .867 = .133
c. The histogram has a substantial positive skew. It is centered somewhere between 2 and 3 and spreads out quite a bit about its center.
[Histogram: Percent (0–50) versus Herd size (1–31)]
9. a. .99 (99%), .71 (71%)
b. .64 (64%), .44 (44%)
c. Strictly speaking, the histogram is not unimodal, but is close to being so with a moderate positive skew. A much larger sample size would likely give a smoother picture.
11. a. Stem-and-leaf display (stem: thousands, leaf: hundreds):
0 | 3 3 9 5 5 9 4 5 1 5 2 3
1 | 2 2 0 0 3 2 1 8 6 8 4
2 | 1 4 1 2 3 4 4 7 7 1
3 | 0 3 3 3 8 1 1
4 | 3 7
5 | 3 7 2 8 7
A typical value is one in the low 2000s; there is
Class         Freq.   Rel. Freq.
0–<1000        12      .255
1000–<2000     11      .234
2000–<3000     10      .213
3000–<4000      7      .149
4000–<5000      2      .043
5000–<6000      5      .106
               47     1.000
.489, .149; see the description in (a)
13. a. 589/1570 = .3752
b. 1 − (589 + 190 + 176 + 157 + 115)/1570 = .2185
c. (115 + 89 + 57 + 55 + 33 + 31)/1570 = .2420
d. The shape of this histogram is positively skewed.
15. a. Yes, .518.
b. .152.
c. .408.
d. The distribution is heavily positively skewed. Though angles can range from 0° to 90°, approximately 85% of all angles are less than 30°.
[Histogram of Angle: Density (0.00–0.04) versus Angle (0°–90°)]
17. Class         Freq.   Rel. Freq.
4000–<4200      1      .01
4200–<4400      2      .02
4400–<4600      9      .09
4600–<4800     13      .13
4800–<5000     18      .18
5000–<5200     22      .22
5200–<5400     20      .20
5400–<5600      7      .07
5600–<5800      7      .07
5800–<6000      1      .01
              100     1.00
Due to the strong positive skew, the sample mean will be greater than the sample median.
b. x̄ = 3.654, x̃ = 3.35
c. By any amount. By no more than 6.7.
5. Due to the unusually large observation 59.31, the sample mean will be greater than the sample median. Since the mean can be inflated when an unusually large observation exists, the median (31.28) appears to be a more representative value.
7. x̃ = 68.0, 20% x̄tr = 66.2, 30% x̄tr = 67.5
9. a. 4/3 because of skewness
b. 1.414, so μ < μ̃ because of negative skewness.
c. .615, .707
11. μ = 1.614, μ̃ = 1.64, .032 (a bit more than 3% of all weeks)
13. 1.8
15. a. x̄ = 1939.367, and the deviations are 66.733, 125.833, 179.533, −252.767, 27.533, and −146.867.
b. 27747.695, 166.576
17. a. Group 1 has mean = 9.86, SD = 2.67. Group 2 has mean = 8.93, SD = 2.37.
b. Group 1 has range = 7, Group 2 has range = 8.
c. [Comparative dotplots of Group 1 and Group 2 on a common scale from 4.8 to 13.2]
d. The standard deviation measures spread by incorporating the deviation of each observation from the sample mean. Many observations of Group 2 are clustered near its sample mean of 8.93, whereas the observations of Group 1 are farther away from its sample mean of 9.86. So, although Group 2 data exhibits a larger range, it also yields the smaller standard deviation.
19. The sample mean of 17.67 can be considered a representative value for this data. The standard deviation is 6.41. In general, the size of a typical deviation from the sample mean is about 6.41. Some observations may deviate from 17.67 by a little more than this, some by less.
21. 76,683 and 76,910
23. a. .785
b. .688
25. a. 1.72
b. .3, 0
27. .423
29. σ² = nπ(1 − π) = (25)(.20)(.80) = 4, σ = 2;
P(x > μ + 2σ) = P(x > 5 + 2(2)) = P(x > 9) = .017
31. .135
33. a. Lower quartile = 122, upper quartile = 135, IQR = 13
b. The proximity of the upper quartile to the median suggests a negative skew. The variation seems quite large and there do not appear to be any outliers.
c. Observations less than 102.5 and greater than 154.5 would be outliers, and observations less than 83 and greater than 174 would be extreme outliers.
d. Decrease the maximum by any amount and the IQR remains unchanged.
35. μ = 3.51, σ = .146
37. min = 16; lower quartile = 87; median = 140; upper quartile = 210; max = 403. A mild high outlier is above 394.5 N and an extreme high outlier is above 579 N. The value 403 N is a mild outlier. The distribution has positive skew.
39. The most noticeable feature of the comparative boxplots is that machine 2's sample values have considerably more variation than do machine 1's sample values. However, a typical value, as measured by the median, seems to be about the same for the two machines. The only outlier that exists is from machine 1.
41. The endotoxin concentration in urban homes generally exceeds that in farm homes. The range of endotoxin concentrations for urban homes exceeds that for farm homes. For the urban homes data, there is one mild outlier (1) and one extreme outlier (104). For the farm homes data there is one mild outlier (64).
43. a. IQR = (qu − ql) = (133.44 − 97.43) = 36.01
b. IQR = (13.34 − 9.74) = 3.6
45. The general pattern is reasonably straight and a departure from linearity is not clear-cut. One should not rule out normality of the tension distribution.
47. The plot shows some nontrivial departures from linearity, especially in the lower tail of the distribution. This indicates a normal distribution might not be a good fit to the population distribution of clubhead velocities for female golfers.
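Several of these answers (for example, 33 and 37) use the standard boxplot fences: mild outliers lie more than 1.5 × IQR beyond a quartile, extreme outliers more than 3 × IQR. A quick sketch of that computation for the quartiles in exercise 33 (the function name `fences` is ours, purely illustrative):

```python
# Boxplot outlier fences: mild beyond 1.5*IQR, extreme beyond 3*IQR.
def fences(ql, qu):
    iqr = qu - ql
    mild = (ql - 1.5 * iqr, qu + 1.5 * iqr)
    extreme = (ql - 3 * iqr, qu + 3 * iqr)
    return iqr, mild, extreme

# Exercise 33: lower quartile 122, upper quartile 135
iqr, mild, extreme = fences(122, 135)
print(iqr, mild, extreme)   # IQR 13, mild fences (102.5, 154.5), extreme (83, 174)
```

These reproduce the cutoffs 102.5/154.5 and 83/174 quoted in answer 33(c).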
49. The corresponding probability plot appears sufficiently straight to lead us to agree with the argument that the distribution of fracture toughness in concrete specimens could well be modeled by a Weibull distribution.
51. Clearly, the variable IDT is not normally distributed, since its normal quantile plot is nonlinear. IDT is likely to be lognormally distributed since the normal quantile plot of ln(IDT) is quite linear.
53. a. Clearly, the variable, hourly median power, is not normally distributed, as the normal quantile plot is curvilinear.
b. By taking the natural logarithm of the variable and constructing a normal quantile plot, the plot looks quite linear, indicating that it is plausible that these observations were sampled from a lognormal distribution.
55. The corresponding histogram shows the noise distribution is bimodal (but close to unimodal) with a positive skew and no outliers. The mean noise level is 64.89 dB and the median noise level is 64.7 dB. The IQR of the noise measurements is about 70.4 − 57.8 = 12.6 dB.
57. b. x̄₁₆ = 12.53125, s₁₆ = .532
59. a. The initial Se concentrations in the treatment and control groups are not that different. The median initial Se concentrations for the treatment and control groups are 10.3 mg/L and 10.5 mg/L, respectively, each with IQR of about 1.25 mg/L. So, the two groups of cows are comparable at the beginning of the study.
b. The final Se concentrations of the two groups are extremely different. The median final Se concentration for the control group is 9.3 mg/L; the median Se concentration in the treatment group is now 103.9 mg/L, nearly a 10-fold increase.
61. a.
Within k SDs    Chebyshev's Rule    Empirical Rule
k = 1           No statement        About 68%
k = 2           At least 75%        About 95%
k = 3           At least 89%        About 99.7%
Chebyshev's inequality is more conservative than is the empirical rule.
b.
Within k SDs    Chebyshev's Rule    Exponential
k = 1           No statement        86.47%
k = 2           At least 75%        95.02%
k = 3           At least 89%        98.17%
c. Chebyshev's inequality may not accurately estimate any particular distribution as it must accommodate all distributions.
63. a. x̄tr(6.7) = 10.67, x̄tr(13.3) = 10.58
b. x̄tr(10) = (10.67 + 10.58)/2 = 10.625
c. Interpolate between x̄tr(6.25) and x̄tr(12.5) to obtain x̄tr(10).
65. The mean and the midrange are sensitive to outliers. The median, the trimmed mean, and the midhinge are not sensitive to outliers.
67. a. Aortic root diameters for males have mean 3.64 cm, median 3.70 cm, standard deviation 0.269 cm, and IQR 0.40. The corresponding values for females are x̄ = 3.28 cm, x̃ = 3.15 cm, s = 0.478 cm, and IQR = 0.50 cm. Aortic root diameters are typically smaller for females than for males, and females show more variability. The distribution for males is negatively skewed, while the distribution for females is positively skewed.
b. For females (n = 10), the 10% trimmed mean is the average of the middle 8 observations: x̄tr(10) = 3.24 cm. For males (n = 13), the 1/13 trimmed mean is 40.2/11 = 3.6545, and the 2/13 trimmed mean is 32.8/9 = 3.6444. Interpolating, the 10% trimmed mean is x̄tr(10) = 0.7(3.6545) + 0.3(3.6444) = 3.65 cm.
69. .0228, .1587

Chapter 3
1. The scatterplot exhibits a negative linear association between the variables.
3. The scatterplot exhibits a positive linear association between the variables. One unusual observation (with # beds = 68) deviates from the linear pattern.
5. b. Yes
c. There appears to be an appropriate quadratic relationship (points fall closest to a parabola).
7. The scatterplot exhibits a negative linear association between the variables.
9. a. Positive b. Negative c. Positive d. Little or none e. Negative f. Little or none
11. r = .4806, a weak to moderate linear correlation exists
13. If, for example, 18 is the minimum age of eligibility, then for most people y ≈ x − 18.
15. −.9
17. a. .733 b. .9985
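The exponential percentages in 61(b) are P(|X − μ| ≤ kσ) for an exponential distribution, where μ = σ = 1/λ; taking λ = 1 (the choice of λ does not matter), this is F(1 + k) − F(max(0, 1 − k)). A short check of those figures against Chebyshev's bound 1 − 1/k²:

```python
# Percentage of an exponential(lambda = 1) distribution within k standard
# deviations of its mean (mu = sigma = 1), versus Chebyshev's lower bound.
from math import exp

def exp_within(k):
    lo = max(0.0, 1 - k)                    # exponential support starts at 0
    return (1 - exp(-(1 + k))) - (1 - exp(-lo))

def chebyshev(k):
    return 1 - 1 / k**2 if k > 1 else 0.0   # "no statement" for k = 1

for k in (1, 2, 3):
    print(k, round(chebyshev(k), 4), round(exp_within(k), 4))
```

This reproduces 86.47%, 95.02%, and 98.17% for k = 1, 2, 3, each above the corresponding Chebyshev bound.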
19. a. ŷ = −305.88 + 9.96x. The coefficient of determination is .124, which is quite low. The linear regression model accounts for only 12.4% of the variability of colony density.
    b. ŷ = 34.37 + .78x. The coefficient of determination is .024, which is much lower than before. The linear regression model accounts for only 2.4% of the variability of colony density. The elimination of the observation has a drastic impact on the regression model.
21. a. The scatterplot reveals a roughly positive linear relationship.
    b. ŷ = −31.80 + .987x. A one-MPa increase in cube strength is associated with a .987 MPa increase in the predicted axial strength for these asphalt samples.
    c. r² = .630. That is, 63.0% of the observed variation in axial strength of asphalt samples of this type can be attributed to its linear relationship with cube strength.
    d. se = 6.625.
23. a. ŷ = 11.013 − .448x. A one percent increase in fiber weight is associated with a .448 MPa decrease in the predicted compressive strength.
    b. .694
    c. ŷ = 8.101 MPa
    d. The observed range for x was 0 to 10%. 25% is well outside this range and the extrapolated prediction could be unreliable. For x = 25, ŷ = −.187 MPa, a nonsensical value.
25. a. No; if the values of Cc were perfectly linearly related to the e0 values, then one line would exactly satisfy all points in the scatterplot.
    b. ŷ = −.144 + .337x
    c. .874
    d. ŷ = .227 when x = 1.10. Predicting y when x = .80 would not be advisable, as this is an example of extrapolation.
27. Data set #1: scatterplot yields a rough linear relationship. Data set #2: scatterplot reveals a quadratic relationship, so a linear relationship does not hold. Data set #3: scatterplot shows a clear outlier. Without this observation, a linear relationship holds very well. Data set #4: scatterplot (containing a clear outlier) shows a linear relationship does not hold.
29. a. It is not appropriate to fit a straight line to these data, as there is clear curvature in the scatterplot.
    b. A scatterplot of (x, 1/y) yields rough linearity. The least squares line is 1/ŷ = .105 − 21.02x, with corresponding r² = .868.
31. b. The ln(x) versus y transformation seems to do the best job, though it yields a somewhat low r² = .497.
    c. ŷ = .0197 − .0013·ln(5000) = .0086
33. a. No, there is a quadratic relationship between strength and thickness, so a quadratic model should be fit.
    b. ŷ = 14.521 + .0432x − .00006x². At x = 500, ŷ = 21.121. The residual plot shows no unusual pattern and R² = .780. The quadratic fit seems adequate.
35. a. ŷ = 4.479. Residual = 4.454 − 4.479 = −.025
    b. 1 − .03836/5.1109 = .9925.
37. a. 92.34% of the observed variability in hydrocarbon deposition can be attributed to the given multiple regression model involving x1 and x2.
    b. ŷ = 37.476
    c. Yes, it is legitimate to interpret b2 in this way.
39. a. For ŷ = a + b1x1 + b2x2 + b3x3, R² = .0165. The second model gives R² = .9866. Clearly, the second model yields a superior fit to the data.
    b. ŷ = .3569, residual = −.1549.
    c. ŷ = .1801, residual = −.0219.
    d. The larger residual magnitude based on ŷ = a + b1x1 + b2x2 + b3x3 is reasonable given the corresponding low coefficient of determination.
41. a. a = 89.111, b1 = −.050, b2 = 6.564, b3 = −27.418, R² = .9175
    b. a = 55.703, b1 = .018, b2 = 8.719, b3 = −11.313, b4 = −.005, b5 = −.033, b6 = .105, R² = .9237
    c. a = 81.233, b1 = .123, b2 = −6.837, b3 = −42.035, b4 = −.005, b5 = −.033, b6 = .105, b7 = −.0001, b8 = 1.945, b9 = 10.241, R² = .9679
43. a. .030 b. .120 c. .105 d. 2.80 e. 4.90
45. 9375, .302, no
47. b. 35, 5, 26 c. .632
49. a. ŷ = 1.6932 + .0805x
    b. ŷ = −20.0514 + 12.1149x
    c. .975 for both regressions
51. a. 109.07 b. R² = .893 c. ŷ < 0, which is ridiculous.
53. a. No
    b. ln(y) = −7.2557 + 8328.4/x, ŷ = 74.6, r² = .953
55. ln(y) = −3.7372 − .12395·ln(x), r² = 46.9%, y = .00829 when x = 5000
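As an arithmetic check of the fitted quadratic in 33(b), the prediction at x = 500 can be evaluated directly (a sketch, not from the text):

```python
def quad_pred(x):
    """Fitted quadratic from 33(b): yhat = 14.521 + .0432x - .00006x^2."""
    return 14.521 + 0.0432 * x - 0.00006 * x**2

print(round(quad_pred(500), 3))  # 21.121
```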
attributed to the chromite ore plants and not to variation in the measurement system.
    c. The location at which an air sample is taken can be considered an experimental factor (i.e., independent variable). The six sampling periods illustrate the experimental principle of replication. Distinguishing between wet and dry days constitutes blocking.

Chapter 5

1. a. There are 10 possible such samples of size 3: {a, b, c}, {a, b, d}, {a, b, e}, {a, c, d}, {a, c, e}, {a, d, e}, {b, c, d}, {b, c, e}, {b, d, e}, {c, d, e}
   b. A = {{a, b, c}, {a, c, d}, {a, c, e}}
   c. A′ = {{a, b, d}, {a, b, e}, {a, d, e}, {b, c, d}, {b, c, e}, {b, d, e}, {c, d, e}}
3. a. A and B is the event "either 4 or 5 defectives in the sample."
   b. A or B is the event "there is at least one defective in the sample."
   c. A′ is the event "there are at most 3 defectives in the sample."
5. [Tree diagram: a crimp either meets standards or does not; a crimp that does not meet standards is readjusted, after which it either meets standards or is scrapped.]
7. The event A and B is the shaded area where A and B overlap in a Venn diagram. Its complement consists of all events that are either not in A or not in B (or not in both). That is, the complement can be expressed as A′ or B′.
9. a. 1159 distinct joints were identified by the inspectors together.
   b. A and B′ contains 724 − 316 = 408 solder joints.
11. P(A1 or A2 or A3 or … or Ak) ≤ P(A1) + P(A2) + P(A3) + ⋯ + P(Ak) = .01 + .01 + ⋯ + .01 = 10(.01) = .10
13. a. P(A | E′) = P(A and E′)/P(E′) = P(A)/P(E′) = .20/(1 − .10) = .20/.90 = 20/90.
       P(B | E′) = P(B and E′)/P(E′) = P(B)/P(E′) = .25/(1 − .10) = .25/.90 = 25/90.
       P(C | E′) = P(C and E′)/P(E′) = P(C)/P(E′) = .15/(1 − .10) = .15/.90 = 15/90.
       P(D | E′) = P(D and E′)/P(E′) = P(D)/P(E′) = .30/(1 − .10) = .30/.90 = 30/90.
    b. P(A | B, D, E not chosen) = P(A)/(1 − (.25 + .30 + .10)) = .20/.35 = 20/35.
       P(C | B, D, E not chosen) = P(C)/(1 − (.25 + .30 + .10)) = .15/.35 = 15/35.
15. a. (.80)(.60) = .48
    b. .95 + (.05)(.80)(.60) = .974
    c. P(F | I) = P(F and I)/P(I) = .95/.974 = .9754
17. The probabilities of independent events A and B must satisfy the equation P(A and B) = P(A) · P(B). If A and B were also mutually exclusive, then P(A and B) would equal 0, which would mean that P(A) · P(B) = P(A and B) = 0. But P(A) · P(B) = .5 · .6 = .3. So A and B cannot be mutually exclusive.
19. a. (.42)(.42) = .1764 b. .01, .0016, .1936
    c. .1764 + .01 + .0016 + .1936 = .3816
    d. 1 − (.3816) = .6184
21. .81 + .99 − .8019 = .9981
23. a. .9042 b. .7660
25. Using the addition law for exclusive events, P(B) = P(A and B) + P(A′ and B), which can be rearranged as P(A′ and B) = P(B) − P(A and B). Using the fact that A and B are independent, P(A and B) = P(A) · P(B), so P(A′ and B) = P(B) − P(A and B) = P(B) − P(A) · P(B) = [1 − P(A)] · P(B) = P(A′) · P(B), which shows that A′ and B are independent.
27. a. Discrete b. Continuous c. Discrete d. Discrete e. Continuous f. Continuous g. Discrete
29. a. 2.3 b. .81 c. 88.5 lb
31. a. k = 1/15 b. .40 c. 11/3 = 3.667 d. 1.2472
33. a. Mean = 2.85; standard deviation = 1.6797
    b. .05702 c. .77883
35. a. Binomial; mean = 50
    b. Normal approximation (with continuity correction) to binomial gives .0287.
37. a. 1 − .736 = .264
    b. 1, because there will be no defectives in any sample
    c. .086 (for 5%); .624 (for 20%); .989 (for 50%)
39. a. Binomial with n = 25, π = 1/5
    b. Mean = 5; standard deviation = 2
    c. Closest integer score S that satisfies P(x ≥ S) = .01 is S = 11.
41. a. Median = 346.57 hours
    b. Median is smaller than mean.
    c. Median = −ln(.50)/λ = .693/λ
43. a. P(x = 5) = .40; P(x = 6) = .35; P(x = 7) = .25
    b. P(y = 10) = .40; P(y = 15) = .40; P(y = 20) = .20
    c. No, because P(x = 5 and y = 10) = .20 ≠ (.40)(.40) = P(x = 5)P(y = 10)
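The algebra in 25 can be checked numerically with the probabilities from 17, P(A) = .5 and P(B) = .6 (illustrative sketch, not from the text):

```python
p_a, p_b = 0.5, 0.6                # P(A), P(B) from exercise 17
p_ab = p_a * p_b                   # independence: P(A and B) = P(A)P(B)
p_acomp_b = p_b - p_ab             # addition law: P(A' and B) = P(B) - P(A and B)
# Factoring gives [1 - P(A)]P(B) = P(A')P(B), so A' and B are independent too
print(abs(p_acomp_b - (1 - p_a) * p_b) < 1e-12)  # True
```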
45. a. 0, because x cannot take values between .2 and .3
    b. .36498
    c. .3544 (without continuity correction); .5098 (with continuity correction)
47. a. Mean of sampling distribution should be closer to 4.
    b. Mean of sampling distribution based on n = 100 will be closer to 4.
    c. Variance of sampling distribution based on n = 2 will be larger.
49. a. Mean = .80; standard deviation = .08
    b. Mean = .20; standard deviation = .08
    c. Mean = .80; standard deviation = .04
51. a. .6826 b. .9544
53. a. .0228 b. .0228; same as in (a)
    c. 8.8225 hours d. $0.15 per package
55. a. .9803; .4803 b. 31.91, or n = 32
57. a. Sampling distribution is approximately normal with mean .02 and standard deviation .014
    b. .7611
59. a. μx = exp(μ + σ²/2) = .099308 b. .2643
61. a. 2/3 b. 7/9
63. As long as P(A) and P(B) are both positive, mutually exclusive events A and B cannot be independent.
65. P(A or B) = 1 − P(A′)P(B′) if A and B are independent.
67. a. b = 2 b. μ = 4/3
    c. σ² = 32/243, so σ = .36289
69. a. .1396 b. .8604 c. .0099
71. a. The shape of the histogram should be symmetric and bell shaped.
    b. The shape of the histogram should be positively skewed.
    c. For the uniform distribution, a sample size of 10 is sufficiently large to produce a reasonably normal sampling distribution of x̄. However, for the exponential distribution, a sample size of 10 is not yet sufficiently large to produce a normal sampling distribution of x̄.
73. a. 0 b. .0038 c. 6
75. For flights coming into DC: P(1 | late) = .4918, P(2 | late) = .2459, P(3 | late) = .2623
    For flights coming into LA: P(1 | late) = .3125, P(2 | late) = .375, P(3 | late) = .3125
77. P(A) = .45, P(B) = .32

Chapter 6

1. Tolerance = ±(.05)(560) = 28 ohm, so LSL = 532 and USL = 588.
3. a. The envelope puts an upper specification limit of 4.00 inches on the width of a folded letter.
   b. Possible penalties: refold letter (rework), bend letter to fit envelope (lower quality), reprint and fold new letter (scrap and rework).
5. a. Attributes data b. Variables data c. Attributes data d. Attributes data e. Attributes data f. Variables data g. Attributes data h. Variables data i. Variables data
7. Some unacceptable parts whose true lengths are .02 inch or less below the LSL will give measured lengths above the LSL (and will then be incorrectly classified as acceptable). Conversely, some acceptable parts whose true lengths are less than .02 inch below the USL will have measured lengths above the USL (which incorrectly classifies them as unacceptable).
9. Method 2 would be a better rational subgrouping scheme.
11. a. P(z > 3) = .0013 b. P(z > 3.09) = .001
13. Chart #1: Test #3 is found [six points in a row are steadily increasing, starting with point #3].
    Chart #2: Even though there are no tests found, Test #7 (which requires that 15 points in a row be in zone C) seems likely to occur.
    Chart #3: Test #2 is found [nine points in a row on one side of the centerline, starting with point #2].
    Chart #4: Both Tests #5 and #6 are found, starting with point #1.
15. Centerline R̄ = 85.2/30 = 2.84
    UCL_R = D4·R̄ = (2.282)(2.84) = 6.48
    LCL_R = D3·R̄ = (0)(2.84) = 0
17. a. On the s chart no rules for statistical control are broken. So, we would conclude that the process variation is in statistical control.
    b. The control limits of Exercise 16(b) are based on a different formula compared to that used in 17(b). However, the control limits in both exercises are similar in value.
19. a. Centerline = 1.2642, UCL = 2.4905, LCL = .0379
    b. Centerline = 96.503, UCL = 98.1300, LCL = 94.8760
21. a. If each xi value is transformed into yi = b(xi − a), where a and b are constants and b > 0, then for any set of n values, ȳ = b(x̄ − a) and Ry = bRx. From these two relationships,
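The R chart limits in 15 follow the standard pattern UCL = D4·R̄ and LCL = D3·R̄; a minimal sketch of the arithmetic (D3 = 0 and D4 = 2.282 are the usual tabled constants for subgroups of size 4):

```python
r_sum, k = 85.2, 30          # total of the 30 subgroup ranges
d3, d4 = 0.0, 2.282          # R-chart constants for subgroups of size 4
r_bar = r_sum / k            # centerline
ucl, lcl = d4 * r_bar, d3 * r_bar
print(round(r_bar, 2), round(ucl, 2), round(lcl, 2))  # 2.84 6.48 0.0
```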
from the transformed data should be used. So, x̄ = .1833, R̄ = 4.200. Also, the process speci-
    b. R(800,000) = e^−(800,000/600,000)^4 = .0424
    c. R(600,000) = e^−(600,000/600,000)^4 = 1/e = .3679
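The reliabilities in (b) and (c) above correspond to a Weibull model R(t) = exp[−(t/600,000)⁴]; a sketch of the evaluation (illustrative, not from the text):

```python
import math

def weibull_rel(t, scale=600_000.0, shape=4.0):
    """Weibull reliability R(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))

print(round(weibull_rel(800_000), 4))  # 0.0424
print(round(weibull_rel(600_000), 4))  # 0.3679
```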
55. As the drill wears out it may not be able to drill the hole diameters properly. On a control chart this problem will likely manifest itself as a slow trend down in the hole diameters that are being drilled. That is, the hole diameters may get smaller and smaller. Test #3 on the conditions for an "out of control" process may occur.
57. b. When analyzing the control chart we do not see any "out of control" conditions. We conclude that the milling process is in statistical control.
59. a. Cp = 1.33, Cpu = 1, Cpl = 1.67, Cpk = 1.
61. It can be shown that P(x > t1 + t2 | x > t1) = P(x > t1 + t2)/P(x > t1) = e^−λt2 = P(x > t2).
63. a. Since this system is connected in series, the overall reliability is R(t) = R1(t) · R2(t). Note that R1(t) ≤ 1 and R2(t) ≤ 1, and so R(t) ≤ R1(t) and R(t) ≤ R2(t). Thus, the overall reliability never exceeds the reliability of any of its individual components. That is, R(t) ≤ min{R1(t), R2(t)}.
    b. In the case where the components are not necessarily independent, R(t) = P(T > t) = P(both components last longer than t) = P(T1 > t and T2 > t). Since {T1 > t and T2 > t} is the intersection of the two events {T1 > t} and {T2 > t}, its probability cannot exceed P(T1 > t) or P(T2 > t). That is, R(t) ≤ min[P(T1 > t), P(T2 > t)] = min[R1(t), R2(t)].

Chapter 7

1. Yes, because the length x can also be thought of as a sample average based on a sample size of n = 1.
3. a. .4714
   b. .8414 (n = 50); .9544 (n = 100); approximately 1 (n = 1000)
   c. The probability that the sample mean lies within ±1 unit of μ increases as the sample size n increases.
5. a. √n = 2(1.645), so n ≥ 11
   b. 80%: √n = 2(1.282), so n ≥ 7
      95%: √n = 2(1.960), so n ≥ 16
      99%: √n = 2(2.576), so n ≥ 27
   c. Increasing the probability that x̄ lies within 1 unit of μ requires corresponding increases in the sample size n.
7. a. 99.8% b. 99.5% c. 85% d. 68%
9. a. Increased interval width
   b. Decreased interval width
   c. Increased interval width
11. a. Narrower b. No c. No d. No
13. a. (12.69, 14.97). We are 99% confident the average backpack weight of 6th graders is between 12.69 and 14.97 pounds.
    b. (13.26, 16.25).
    c. The average backpack weight as a percentage of body weight of 6th graders seems well above the recommendation, as 10% is well outside the interval (13.26, 16.25).
15. a. (1398.90, 1455.10). We are 95% confident that the true average FEV1 level for the given population is between 1398.90 and 1455.10 ml.
    b. 158
17. s ± (z critical value)·(s/√(2n)) = (3.332, 4.128)
19. 390.74 min
21. 4.062 kip
23. a. (.50, .56). We are 99% confident the proportion of all adult Americans who have watched streamed programming is between 50 and 56%.
    b. 664
25. a. .042
    b. If we were to sample repeatedly, the calculation method in (a) is such that π will exceed the calculated lower confidence bound for 95% of all possible random samples of n = 143 individuals who received ceramic hips.
27. a. p1 − p2 ± (z critical value)·√(p1(1 − p1)/n1 + p2(1 − p2)/n2)
    b. (−.118, .136), no
    c. (−.118, .135)
29. a. (A, B) = ln(p1/p2) ± (z critical value)·√((n1 − u)/(n1u) + (n2 − v)/(n2v)); the resulting interval is (e^A, e^B)
    b. (.970, 1.349), yes
31. 271
33. (.012, .056), using a 95% confidence level
35. 4.3, no
37. a. 2.228 b. 2.086 c. 2.845 d. 2.680 e. 2.485 f. 2.571
39. a. 1.812 b. 1.753 c. 2.602 d. 3.747 e. 2.1716 (from Minitab) f. Roughly 2.43
41. a. Yes, a normal quantile plot shows a somewhat linear relationship.
    b. (106.4, 109.1). Based on this interval, 107 is a plausible value but 110 is not plausible for the true average work of adhesion.
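The sample sizes in 5 come from requiring z·σ/√n ≤ 1, i.e., √n ≥ 2z with σ = 2 (σ = 2 is assumed from the exercise setup); a sketch of that arithmetic:

```python
import math

def min_n(z, sigma=2.0, half_width=1.0):
    """Smallest n with z * sigma / sqrt(n) <= half_width."""
    return math.ceil((z * sigma / half_width) ** 2)

# 90%, 80%, 95%, 99% critical values
print([min_n(z) for z in (1.645, 1.282, 1.960, 2.576)])  # [11, 7, 16, 27]
```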
43. a. We are 95% confident that the true average mileage is between 46,145.4 and 86,296.8.
    b. We are 95% confident that the mileage for a single vehicle is between 0 and 148,995.4. This interval is much wider than the interval from part (a).
45. a. Using a normal probability plot, we ascertain that it is plausible that this sample was taken from a normal population distribution.
    c. 38.78
    d. 42.29, a higher upper bound than that found in part (c).
47. a. A 95% prediction interval for the amount of warpage of a single piece of laminate is .0635 ± .0137.
    b. (.0464, .0806)
49. (3.43, 4.13). Thus, with 95% confidence, we can say that the true average firmness for zero-day apples exceeds that of 20-day apples by between 3.43 and 4.13 N.
51. a. The most notable feature of these boxplots is the larger amount of variation present in the mid-range data as compared to the high-range data. Otherwise, both boxplots look reasonably symmetric and there are no outliers present.
    b. A 95% confidence interval for (μmid range − μhigh range) is (−7.84, 9.54). Since plausible values for (μ1 − μ2) are both positive and negative (i.e., the interval spans zero), we would conclude that there is not sufficient evidence to suggest that μ1 and μ2 differ.
53. Assuming sample "1" corresponds to the lab method, the CI says we're 95% confident that the true mean arsenic concentration measurement using the lab method is between 6.498 μg/L and 11.102 μg/L higher than using the field method.
55. a. A 95% confidence interval for μd is (−14.83, 26.50). Since this interval contains negative and positive values, there is not sufficient evidence to suggest that μd is different from zero.
    b. A 95% prediction interval for the difference d is (−48.85, 60.51).
57. a. (2.03, 6.10). We are 95% confident that the true mean difference between dominant and nondominant arm translation for pitchers is between 2.03 and 6.10.
    b. (−.54, 1.01). We are 95% confident that the true mean difference between dominant and nondominant arm translation for position players is between −.54 and 1.01.
    c. Let μ1 and μ2 represent the true mean differences in side-to-side AP translation for pitchers and position players, respectively. To generate a confidence interval for μ1 − μ2, we use the differences utilized in parts (a) and (b). A 95% confidence interval for μ1 − μ2 is (1.69, 5.98). Since both endpoints are positive, we concur with the authors' assessment that this difference is greater, on average, in pitchers than in position players.
59. a. (−3.85, 11.35)
    b. (7.02, 10.06)
61. a. The 95% bootstrap interval is (431.82, 445.65), based on 200 bootstrap replications. (Note that all bootstrap intervals will differ slightly from one another.)
    b. t interval: (430.51, 446.08); bootstrap interval: (431.82, 445.65)
63. a. MLE for π is x/n.
    b. x/n is an unbiased estimator of π.
    c. MLE for (1 − π)^5 is (1 − x/n)^5.
65. a. MLE is x̄ + 1.645·σ̂, where σ̂ equals √((n − 1)/n)·s and s is the sample standard deviation of the data.
    b. 403.3
67. a. θ̂ = min(x1, x2, …, xn); λ̂ = 1/(x̄ − θ̂)
    b. θ̂ = .64; λ̂ = 1/(5.58 − .64) = .202
69. λ = 2 is too large; the resulting kernel density will not show much detail in the data.
71. a. The kernel density graph will have a very choppy appearance.
    b. Larger values of λ will result in smoother kernel density curves.
73. λ will have to be raised.
75. a. > 134.78
    b. Tensile strengths should be normally distributed.
    c. A histogram of the data appears approximately bell-shaped, so the normality assumption is a good one for these data.
    d. > 127.81
77. (−299.3, 1517.9)
79. (1024.0, 1336.0), yes
81. a. A normal probability plot shows it is reasonable to assume the sample was taken from a normal population distribution.
    b. Letting d = peak ER velocity − peak IR velocity, a 95% confidence interval for μd is (34.1, 130.9). Since both endpoints are positive, we conclude that IR and ER differ significantly, with ER being the higher of the two.
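A percentile bootstrap interval like the one described in 61(a) can be sketched as follows; the data and seed here are hypothetical, and (as the answer notes) exact endpoints vary from run to run:

```python
import random

def boot_ci_mean(data, b=200, level=0.95, seed=1):
    """Percentile bootstrap confidence interval for a population mean."""
    rng = random.Random(seed)
    n = len(data)
    means = sorted(sum(rng.choices(data, k=n)) / n for _ in range(b))
    tail = int(round((1 - level) / 2 * b))   # replicates cut from each tail
    return means[tail], means[b - 1 - tail]

sample = [436, 448, 426, 451, 430, 438, 445, 441, 433, 429]  # hypothetical data
lo, hi = boot_ci_mean(sample)
print(lo < sum(sample) / len(sample) < hi)  # True
```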
83. a. .5 b. .25 c. (.5)^n d. (.5)^n
    e. 1 − 2(.5)^n, 100[1 − 2(.5)^n]
    f. (28.7, 42.0), 99.8%
    g. (28.62, 40.28), narrower than (f)
85. a. 1/2 b. 1/3 c. 1/(n + 1), 1/(n + 1)
    d. 1 − 2/(n + 1), 100[1 − 2/(n + 1)]%, (28.7, 42.0), 81.8%
87. No, (69.80, 88.80), 99.97%
89. a. (38.46, 38.84)
    b. (Answers will vary): For a simulation programmed in R using 1000 bootstrapped means, a 95% bootstrap interval for the population mean was (38.51, 38.81). This interval agrees closely with the interval from part (a).
91. a. (.296, .324)
    b. Since the interval value dips below 30%, we cannot conclude that the 2002 percentage is more than 1.5 times the 1998 percentage.

Chapter 8

1. a. Yes b. No c. Yes d. No
   e. Yes f. Yes g. Yes h. Yes
   i. Yes j. Yes
3. H0: μ = 40 versus Ha: μ ≠ 40
5. H0: μ = 120 versus Ha: μ < 120. Type I: Conclude that the new system does reduce average distance when in fact it does not. Type II: Conclude that the new system does not reduce average distance when in fact it does.
7. With μ1 for regular and μ2 for special, H0: μ1 − μ2 = 0 versus Ha: μ1 − μ2 > 0. Type I: Conclude that the special outperforms the regular laminate when this is not the case. Type II: Conclude that the regular laminate is at least as good as the special laminate when in fact the special does yield an improvement.
9. a. Reject H0. b. Reject H0.
   c. Don't reject H0. d. Reject H0.
   e. Do not reject H0.
11. a. 1.83, .0336 b. 4.22, approximately 0
    c. 1.33, .0918
13. a. H0: π = .85 versus Ha: π ≠ .85
    b. Don't reject H0, because P-value > α. Same conclusion and reason.
15. H0: μ = 55 versus Ha: μ ≠ 55, z = −5.25, P-value ≈ 0, reject H0.
17. a. Using software, x̄ = 0.75, x̃ = 0.64, s = .3025, IQR = .505. These summary statistics, as well as a boxplot (not shown), indicate substantial positive skewness, but no outliers.
    b. No, it is not plausible from the results in part (a) that the variable ALD is normal. However, since n = 49, normality is not required for the use of z inference procedures.
    c. H0: μ = 1.0 versus Ha: μ < 1.0. z = −5.79; at any reasonable significance level, we reject the null hypothesis. Yes, the data provide strong evidence that the true average ALD is less than 1.0.
19. a. P-value = P(t > 3.2) = .003, reject H0.
    b. P-value = P(t > 1.8) = .055, do not reject H0.
    c. P-value = P(t > −.2) = .578, do not reject H0.
21. a. P-value = 2P(t > 1.6) = 2(.068) = .136, do not reject H0.
    b. P-value = 2P(t < −1.6) = 2(.068) = .136, do not reject H0.
    c. P-value = 2P(t < −2.6) = 2(.008) = .016, do not reject H0.
    d. P-value = 2P(t < −3.9) ≈ 2(0) ≈ 0, reject H0.
23. H0: μ = 30 versus Ha: μ < 30. t = −0.84, P-value = .209, do not reject H0.
25. H0: μ = 181 versus Ha: μ > 181. t = 1.91, P-value = .041, reject H0.
27. a. 17 b. 21 c. 18 d. 26
29. H0: (μ1 − μ2) = 0 versus Ha: (μ1 − μ2) < 0. t = −2.46 ≈ −2.5, df = 15, P-value = .012. Do not reject H0.
31. a. Normal quantile plots show sufficient linearity for each data set. Therefore, it is plausible that both samples have been selected from normal population distributions.
    b. The comparative boxplot does not suggest a difference between average extensibility for the two types of fabrics.
    c. H0: (μH − μP) = 0 versus Ha: (μH − μP) ≠ 0. t = −.38, df = 10, P-value = .71. Do not reject H0.
33. H0: (μ1 − μ2) = 0 versus Ha: (μ1 − μ2) > 0. When assuming unequal variances, t = 3.6362, the corresponding df is 37.5, and the P-value for our upper-tailed test would be [(.0008)/2] = .0004. (Note: P-value = P(t > 3.6362) = .0004.) Reject H0. We could have committed a Type I error.
35. H0: (μH − μNH) = 0 versus Ha: (μH − μNH) > 0. t = 2.09, df = 17, P-value = .026. Do not reject H0.
37. a. Use t = [(x̄1 − x̄2) − Δ]/[sp√(1/n1 + 1/n2)] with corresponding df = n1 + n2 − 2. sp is defined in Exercise 54 in Chapter 7.
71. a. The corresponding probability plot suggests the data is consistent with a normally distributed population. So, we are comfortable proceeding with the t procedure.
    b. H0: μ = 0.6 versus Ha: μ < 0.6, t = −2.14, P-value = .0495, reject H0 when α = 5%, do not reject H0 when α = 1%.
    c. In this context, a Type I error would be to conclude that less than 10% of the tube's contents remain after squeezing, on average, when in fact 10% (or more) actually remains. When we rejected H0 at the 5% level, we may have committed a Type I error. A Type II error occurs if we fail to recognize that less than 10% of a tube's contents remains, on average, when that's actually true (i.e., we fail to reject the false null hypothesis of μ = 0.6 oz). When we failed to reject H0 at the 1% level, we may have committed a Type II error.
73. a. Since the mean, x̄ = 215, is so much lower than the midrange (about 585), one would suspect the distribution is positively skewed. However, it is not necessary to assume normality if the sample size is "large enough," due to the central limit theorem. Since n = 47, we can proceed with a test of hypothesis about the true mean consumption.
    b. H0: μ = 200 versus Ha: μ > 200, z = .44, P-value = .33, do not reject H0.
75. H0: μ = 1.75 versus Ha: μ ≠ 1.75, t = 1.70, P-value = .102, do not reject H0.
77. H0: π = 1/3 versus Ha: π < 1/3, z = −1.35, P-value = .0885, do not reject H0.
79. H0: (π1 − π2) = 0 versus Ha: (π1 − π2) ≠ 0, z = 2.25, P-value = .0244, reject H0.
81. a. H0: (μ1 − μ2) = 0 versus Ha: (μ1 − μ2) ≠ 0. Using the unpooled t-test statistic we have t = 2.84 and df = 18. This results in a P-value = 2[P(t > 2.8)] = 2(.006) = .012. These values differ slightly from t = 2.51 and P-value = .019.
    b. H0: (μ1 − μ2) = 25 versus Ha: (μ1 − μ2) > 25; the unpooled t-test statistic value is .556, P-value = .278, do not reject H0.
83. a. H0: μ37,dry − μ22,dry = 100 v. Ha: μ37,dry − μ22,dry > 100. The relevant test statistic value is t = 2.58, df = 9, P-value = .015, reject H0 when α = 5%.
    b. H0: μ22,wet − μ37,wet = 100 v. Ha: μ22,wet − μ37,wet > 50. The relevant test statistic value is t = .46, df = 9, P-value = .328, do not reject H0.
85. H0: μd = 0 versus Ha: μd > 0, d̄ = .821, sd = 2.52, t = 1.22, P-value = .126, do not reject H0.
87. H0: μ1 − μ2 = 10 versus Ha: μ1 − μ2 > 10, t = 2.49, df = 5, P-value = .027, reject H0 when α = 5%.
89. a. H0: π1 = .27477, π2 = .20834, π3 = .15429, π4 = .3626. The alternative hypothesis is that at least one of these proportions is incorrect. χ² = 9.02, .01 < P-value < .05. Reject H0 when α = 5%. Thus, the above model is questionable.
    b. H0: π1 = .45883, π2 = .18813, π3 = .11032, π4 = .24272. The alternative hypothesis is that at least one of these proportions is incorrect. χ² = .157, P-value > .10. Do not reject H0. Thus, the proposed model appears to fit the data quite well.

Chapter 9

1. a. H0: μA = μB = μC, where μi = average strength of wood of Type i.
   b. Use either Type A or B, but choose the less expensive of the two types.
   c. Choose the least expensive of the three types.
3. The two ANOVA tests will give identical conclusions.
5. There is no way of knowing whether there is a statistically significant difference between the means. When there is no difference, the "pick the winner" strategy doesn't allow you to choose between the populations based on other criteria (e.g., cost, time, etc.).
7. F.05(5, 8) ≠ F.05(8, 5) and F.01(5, 8) ≠ F.01(8, 5)
9. F = 4.12 exceeds F.05(3, 30), so conclude that there is a difference among the means.
11. a. Source       df    SS        MS       F
       Treatments    5    3575.065  715.013  51.3
       Error       150    2089.350   13.929
       Total       155    5664.415
    b. H0: μ1 = ⋯ = μ6 versus Ha: at least two of the μi's are different.
    c. P-value is P(F5,150 ≥ 51.3) ≈ 0, reject H0.
13. a. H0: μ1 = ⋯ = μ5 versus Ha: at least two of the μi's are different.
    b. F = 4.14; using software, P-value = .0061, reject H0.
15. H0: μ1 = μ2 = μ3 = μ4 versus Ha: at least two of the μi's are different. F = 2.31, P-value > .10, do not reject H0.
17. a. SST, SSTr, and SSE are each multiplied by a factor of (2.54)², but the F ratio does not change.
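The mean squares and F ratio in the ANOVA table of 11(a) are just SS/df and MSTr/MSE; a sketch of that arithmetic (illustrative, not from the text):

```python
ss_tr, df_tr = 3575.065, 5      # treatments row
ss_e, df_e = 2089.350, 150      # error row
ms_tr, ms_e = ss_tr / df_tr, ss_e / df_e
f_ratio = ms_tr / ms_e
print(round(ms_tr, 3), round(ms_e, 3), round(f_ratio, 1))  # 715.013 13.929 51.3
```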
    b. Changing the units of measurement will change the sum of squares column in the ANOVA table, but the degrees of freedom and F ratio will remain unchanged.
19. SSTr = .2164; SSE = .3870; F ratio = 4.194 is significant at the 5% level. There is evidence of a difference between the means.
21. a. SSTr = 25.80; SSE = 115.48; SST = 141.28; F ratio = 2.35 is not significant at the 5% level. There is no evidence of a difference between the means.
    b. Favorable, because pegs have the same strength regardless of positioning.
23. a. The ANOVA table entries will be unchanged by the calibration error.
    b. If all data points are shifted (up or down) by the same amount, this will not affect any of the entries in the ANOVA table; however, the means of each sample will shift by that same amount.
25. An effects plot does not show the within-samples variance, only the between-groups variation.
27. q.05(5, 15) = 4.37, so T = 36.09. Ordered means: 437.5  462.0  469.3  512.8  532.1
29. T = 36.09 as in Problem 27. Ordered means: 427.5  462.0  469.3  502.8  532.1
31. q.05(6, 150) ≈ q.05(6, 120) = 4.10, T = 3.00. Ordered means: 14.18  17.94  18.00  18.00  25.74  27.67
33. q.05(3, 6) = 4.34, T = 7.92. There are 2 distinct sets: Set 1 (42.67, 43.33), Set 2 (53.67). Ordered means: 42.67  43.33  53.67
35. a. F ratio for Brands = 95.57 is significant at α = 1%. There is a difference between the brands.
    b. F ratio for Humidity = 278.20 is significant at α = 1%. Humidity levels do affect power consumption, so it was wise to use humidity as a blocking factor.
37. a. F ratio for Brand = 8.96 is significant at α = 5%. There is a difference among lathe brands.
    b. F ratio for Operators = 10.78 is significant at α = 5%. There is a difference among operators.
39. a.
               Df   Sum Sq   Mean Sq     F-value   Pr(F)
    DESIGN      3   519515   173171.67   35.46     <.0001
    PERSON     20   100460     5023.00    1.03     0.445
    Residuals  60   293009     4883.48
    b. Yes, P-value < .0001.
    c. Corresponding F ratio = 1.03 with P-value = .445. The person-to-person differences in RPN are not confirmed by the data.
41. a. F ratio for Methods = 8.69 is significant at α = 5%. Curing methods do have differing effects on strength.
    b. F ratio for Batches = 7.22 is significant at α = 5%. Different batches do have an effect on strength.
    c. F ratio for Methods = 2.83, which is not significant at α = .05. Conclusion: Curing method does not have an effect on strength.
43. a. H0: μ1 = μ2 = μ3 versus Ha: at least two of the population means differ
    b.
    Source   df   SS        MS      F
    Factor    2    591.20   295.6   1.3
    Error    21   4773.3    227.3
    Total    23   5364.50
    c. Corresponding P-value > .10. Do not reject H0.
45. a. F.05(1, 10) = 4.96 and t.025(10) = 2.228. (2.228)² ≈ 4.96; the equality is approximate because the F and t table entries are rounded.
    b. F.05(1, df2) approaches 3.8416, the square of 1.96.
47. MSTr = 140, so if F ratio > F.05(2, 12) = 3.89, then MSE < 140/3.89 = 35.99. q.05(3, 12) = 3.77, so to have T > 10 (the difference between the largest and smallest means), MSE must exceed (5)(10/3.77)² = 35.18. Therefore, if 35.18 < MSE < 35.99, the two conditions will be satisfied. In terms of SSE, 422.16 < SSE < 431.88.
49. For condition 1 to be satisfied it can be shown that SSE < 385.60. For condition 2 to be satisfied it can be shown that SSE > 422.15. Therefore, no SSE value exists that can satisfy both conditions.
51. a.
    Source          df   SS       MS      F       P-value
    Drying method    4   14.962   3.741   36.70   0.000
    Fabric type      8    9.696   1.212   11.89   0.000
    Error           32    3.262   0.102
    Total           44   27.920
    b. The null hypothesis of interest is H0: there are no differences in mean smoothness scores for the five drying methods. The F ratio for "drying method" is F = 36.7, P-value < .001. Reject H0.

Chapter 10
1. Replication allows you to obtain an estimate of the experimental error.
3. a. The surface is a dome over the x–y plane.
   b. The maximum occurs at x = 2, y = 5.
   c. Contours are circles centered at (x, y) = (2, 5).
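The ANOVA quantities quoted in these answers all follow from a few identities: MS = SS/df, F = MSTr/MSE, and SSTo = SSTr + SSE. A reader can verify the arithmetic of the table in Exercise 43(b), for example, with a short Python sketch (the numbers are taken from the answer above; this is an illustration, not part of the book's solutions):

```python
# One-way ANOVA table from Exercise 43(b):
# Source  df  SS       MS     F
# Factor   2   591.20  295.6  1.3
# Error   21  4773.3   227.3
# Total   23  5364.50

ss_factor, df_factor = 591.20, 2
ss_error, df_error = 4773.3, 21

ms_factor = ss_factor / df_factor   # mean square = SS / df
ms_error = ss_error / df_error
f_ratio = ms_factor / ms_error      # F = MSTr / MSE

print(round(ms_factor, 1))             # 295.6
print(round(ms_error, 1))              # 227.3
print(round(f_ratio, 1))               # 1.3
print(round(ss_factor + ss_error, 1))  # 5364.5, matching the Total line
```

The same three identities reproduce every ANOVA table in this chapter's answers.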
    Term   Effect
    A       2.625
    B       9.625
    C      −4.625
    AB       .175
    AC      1.125
    BC     −1.825
    ABC     2.525
    b. Factors B and C appear to be significant.
    c. B at its high level and C at its low level.
    d. ŷ = 14.313 + 4.813xB − 2.313xC
    e.
    Term   Effect
    A        .002
    B      −2.693
    C        .067
    AB      −.132
    AC       .108
    BC       .063
    ABC     −.058
    Only factor B appears to be significant. Set factor B at its low level. The prediction equation is ŷ = 4.371 − 1.347xB.
33. a. Let A = time, B = current, C = EC area, D = volume, E = arsenic.
    Term   Effect    Term   Effect   Term   Effect
    A      20.019    AB     −3.169   BD      2.906
    B      26.119    AC     −1.181   BE     −1.456
    C       2.131    AD      2.131   CD      1.069
    D     −17.531    AE     −1.281   CE     −0.594
    E      −2.519    BC     −1.256   DE     −1.331
    b. The important effects appear to be the main effects A (time), B (current), and D (volume).
    c. A (time) and B (current) should be set to their high values. D (volume) should be set to its low value.
    d. Grand mean = 71.953, coefficient for A = (20.019/2) = 10.010, coefficient for B = (26.119/2) = 13.060, coefficient for D = (−17.531/2) = −8.766. So the prediction equation is:
       ŷ = 71.953 + 10.010xA + 13.060xB − 8.766xD
35. The eight runs of the 2^(5−2) design (columns A, B, C, D, E):
    −1  −1  −1  −1   1
     1  −1  −1   1  −1
    −1   1  −1   1   1
     1   1  −1  −1  −1
    −1  −1   1   1  −1
     1  −1   1  −1   1
    −1   1   1  −1  −1
     1   1   1   1   1
37. By multiplying each of the effects of the 2^(5−2) design through by the defining relation I = ACE = BDE = ABCD you obtain the following alias structure:
    A = CE = BCD = ABDE, B = DE = ACD = ABCE, C = AE = ABD = BCDE, D = BE = ABC = ACDE, E = AC = BD = ABCDE, AB = CD = ADE = BCE, AD = BC = ABE = CDE
39. a. k = 5 and p = 1
    b. Let A = temp, B = pH, C = yeast, D = Tryptone, and E = Nitsch. A = BCDE, B = ACDE, C = ABDE, D = ABCE, E = ABCD, AB = CDE, AC = BDE, AD = BCE, AE = BCD, BC = ADE, BD = ACE, BE = ACD, CD = ABE, CE = ABD, DE = ABC
    c. The four-way interactions are confounded with the main effects. The two-way interactions are confounded with the three-way interactions. So, if all interactions consisting of three or more factors are negligible, none of the estimates of the remaining effects will be confounded with one another.
41. a. k = 3 and p = 1
    b. Let A = anode height, B = board orientation, C = anode placement. The design generator in this design is C = −AB. The alias structure is: A = −BC, B = −AC, C = −AB
    c.
    Term   Effect
    A      −3.135
    B      −1.135
    C      −4.925
    d. SSE = (−4.925)² = 24.26, SSTo = (s²)(3) = (3.4338)²(3) = 35.37, SSA = (−3.135)² = 9.83, SSB = (−1.135)² = 1.29. When testing at α = .05, neither factor A nor factor B is important, since their corresponding P-values (.639 and .856) are so large.
    e. Based on our analysis in part (d), we cannot conclude that factors A or B are significant. Also, we assumed factor C was not significant in order to test for the significance of factors A and B.
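The alias structure in Exercise 37 is obtained mechanically: multiply the effect of interest by each word of the defining relation, cancelling any letter that appears twice (since x² = I). A small Python sketch of that rule, using the defining words given above (an illustration only, not the book's own procedure):

```python
def multiply(word1, word2):
    """Multiply two effect words: letters appearing in both cancel (x^2 = I)."""
    return "".join(sorted(set(word1) ^ set(word2)))

# Defining relation of the 2^(5-2) design in Exercise 37: I = ACE = BDE = ABCD
words = ["ACE", "BDE", "ABCD"]

def aliases(effect):
    """All aliases of an effect under the defining relation, shortest first."""
    return sorted({multiply(effect, w) for w in words},
                  key=lambda s: (len(s), s))

print(aliases("A"))   # ['CE', 'BCD', 'ABDE']
print(aliases("E"))   # ['AC', 'BD', 'ABCDE']
print(aliases("AB"))  # ['CD', 'ADE', 'BCE']
```

Running this for every main effect and two-way interaction reproduces the alias groups listed in the answer.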
Chapter 11
1. a. .095   b. .475   c. .830, 1.305   d. .4207
3. a. V = γ0 · γ1^(1/T) · ε   b. 26.341
5. a. Yes, a linear model seems appropriate.
   b. ŷ = .1012 + .4607x
   c. .3085
   d. .0011
7. a. Yes, a linear model seems appropriate for each pair of variables.
   b. ŷ(0%) = 123.501 − 8.711x, ŷ(20%) = 158.570 − 13.562x, ŷ(40%) = 167.282 − 17.113x
   c. As timber damage increases, the linear relationship between pile length and critical rating becomes increasingly negative.
   d. se = .45 at 0%, se = 3.01 at 20%, se = 4.7 at 40%. As timber damage increases, the estimated value of se also increases.
9. a. For a one-unit increase in inverse foil thickness, one would expect a .260-unit increase in flux. 98% of the observed variation in flux can be attributed to the simple linear regression relationship between flux and inverse foil thickness.
   b. 5.712
   c. 11.302
11. H0: α = 0 versus Ha: α ≠ 0, t = a/sa = −1.128/2.368 = −.48, P-value = .642, do not reject H0.
13. a. Method 1: Hypothesis test. H0: β = 0 versus Ha: β ≠ 0, t = 54.56, P-value < .0001; reject H0 and conclude that there is a useful linear relationship between these two variables. Method 2: A confidence interval for β is b ± (t critical value)·sb. A 95% confidence interval for β is: .87825 ± (2.179)(.01610) = (0.8432, 0.9133), using the t critical value for df = (n − 2) = (14 − 2) = 12. The plausible values are all positive, so we conclude there is a useful linear relationship between the two variables.
    b. The t ratio for testing model utility would be the same value regardless of which of the two variables was defined to be the independent variable. This can easily be seen by looking at the t test statistic for testing whether the population correlation coefficient is equal to zero. In that equation the only values required are the sample size (n) and the sample correlation coefficient (r). Neither r nor n depends on which variable was chosen as the independent variable.
15. As we saw in Exercise 13(b), the t ratio for testing model utility depends only on the sample size and the sample correlation coefficient. Neither of these quantities is unit dependent. So multiplying the dependent variable by a constant will have no effect on the t test statistic.
17. H0: ρ = 0 versus Ha: ρ > 0, t = 5.25, P-value < .0001, reject H0.
19. a. b = 1.378. There is, on average, a 1.378% increase in reported nausea for each unit increase in motion sickness dose.
    b. t = 3.422. Yes, there is a useful relationship between the two variables.
    c. It would be possible, but not advisable, because x = 5 is outside the range of the x data.
    d. b = 1.424
21. a. The scatterplot appears to be quite linear.
    b. .931
    c. If increasing velocity by 900 cm/sec results in an average change in the response of .6, then the true population slope coefficient is β = .6/900 = 6.667 × 10⁻⁴. H0: β = 6.667 × 10⁻⁴ versus Ha: β < 6.667 × 10⁻⁴, t = −.6016, P-value > .10, do not reject H0.
    d. We are 95% confident that the true average change in mist associated with a 1 cm/sec increase in velocity is between 4.26 × 10⁻⁴ and 8.159 × 10⁻⁴.
23. a. A 95% prediction interval is (3.2833, 3.6067).
    b. The interval when the temperature is 1200 degrees will be wider than when the temperature is 1500 degrees. This is because 1200 degrees is 200 degrees away from the mean temperature of 1400 degrees, whereas 1500 degrees is only 100 degrees away from the mean temperature.
25. The mean x value is 40.3. Intervals with x values farther away from this mean are wider. Also, prediction intervals are wider than confidence intervals. And 99% intervals are wider than 95% intervals. Therefore, (i) will be wider than (iii), (i) will be narrower than (ii), (ii) will be wider than (iv), and (iii) will be narrower than (iv) and (v).
27. a. t = 16.2, P-value ≈ 0; conclude there is a useful linear relationship.
    b. (.879, .947)   c. (.780, 1.046)
29. a. 4.9 hr
    b. When number of deliveries is held fixed, the average change in travel time associated with a 1-mile increase in distance traveled is .060 hr.
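Exercises 13(b) and 15 both rest on the identity t = r√(n − 2)/√(1 − r²): the model-utility t statistic is a function of r and n alone, so rescaling either variable cannot change it. A Python sketch with made-up illustrative data (the data values below are not from any exercise):

```python
import math

def corr(x, y):
    """Pearson sample correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def model_utility_t(x, y):
    """t statistic for H0: rho = 0, identical to the slope's t ratio."""
    r, n = corr(x, y), len(x)
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.2, 5.8]

t1 = model_utility_t(x, y)
t2 = model_utility_t(x, [100 * v for v in y])  # rescale the dependent variable
print(abs(t1 - t2) < 1e-9)  # True: the t ratio is unit-free
```

Swapping the roles of x and y leaves the statistic unchanged for the same reason, which is the point of 13(b).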
    c. The large ΔCO value has extremely high leverage. The least squares line that is obtained when excluding the value is ŷ = 1.00 + .0346x. The R² value with the value included is 96% and is reduced to 75% when the value is excluded. The value of se with the value included is 2.024 and with the value excluded is 1.96. So the large ΔCO value does appear to affect our analysis in a substantial way.
61. a. Same x values yet different y values.
    b. b = .01023, sb = .009577, t = 1.1, P-value ≈ .3; the model cannot be judged useful.
63. a. The statement is incorrect. r² is not the "linear correlation coefficient"; r² is the coefficient of determination. The linear correlation coefficient is r, and r = √.89 = .9434.
    b. H0: ρ = 0 versus Ha: ρ ≠ 0. The value of the t test statistic equals 12.06. The corresponding P-value is extremely small. So we reject the null hypothesis and conclude that there is a linear relationship between the two variables.
    c. As x increases, so does the variation in the standardized residuals. This is inconsistent with the constant variance assumption of a least squares regression analysis.
65. a. The full model contains k = 9 predictors. The reduced model contains 3 predictors. H0: β4 = β5 = ⋯ = β9 = 0 versus Ha: at least one of these β's is not zero.
    F = [(1523351 − 805534)/6] / [(805534)/(15 − 10)] = .743, P-value > .10; do not reject H0. There is not sufficient evidence to claim that the second-order predictors provide useful information beyond what is contained in the three first-order predictors.
67. a. ŷ = 84.67 + .650 − .258 + .133 + .108 − .135 + .028 + .028 − .072 + .038 − .075 + .213 + .200 − .188 + .050 = 85.39
    The value of the residual for the one observation made under the specified conditions is: (y − ŷ) = (85.4 − 85.39) = .01
    b. Let z1, z2, z3, z4 denote the uncoded variables. Then z1 = .1x1 + .3, z2 = .1x2 + .3, z3 = x3 + 2.5, z4 = 15x4 + 160. Equivalently, x1 = 10z1 − 3, x2 = 10z2 − 3, x3 = z3 − 2.5, x4 = (z4 − 160)/15. Substitution yields the following least squares regression coefficients:
    Term       Coefficient
    Constant     76.437
    z1           −7.35
    z2            9.61
    z3            −.915
    z4             .09632
    z1²         −13.452
    z2²           −.798
    z3²            .02798
    z4²           −.0003201
    z1z2           3.750
    z1z3           −.7500
    z1z4            .14167
    z2z3           2.000
    z2z4           −.1250
    z3z4            .00333
    c. The full model contains k = 14 variables. The reduced model contains 4 variables. H0: β5 = β6 = ⋯ = β14 = 0 versus Ha: at least one of these β's is not zero. SSResid(full) = 1.9845, SSResid(reduced) = 4.8146. The value of the test statistic is:
    F = [(4.8146 − 1.9845)/10] / [(1.9845)/(31 − 15)] = 2.28
    .05 < P-value < .10; do not reject the null hypothesis. There is not sufficient evidence at the 5% level to claim that the second-order predictors provide useful information beyond what is contained in the four first-order predictors.
69. A plot of y versus x suggests that a simple linear regression model may be appropriate, but a graph of the residuals versus fitted values calls the validity of a simple linear regression model into question. Fitting higher-order models (such as second- and third-order) may be more appropriate. The second-order model has R² = 65.3% and adjusted R² = 62%, whereas the third-order model has R² = 70.7% and adjusted R² = 66.3%. Comparing adjusted R² values, the third-order model seems to perform slightly better. From the second-order model, we predict y (at x = 30) to be 3.45 + .0618(30) − .000377(30²) = 4.9647. For the third-order model, our estimate for y is 3.94 − .045(30) + .0041(30²) − .000048(30³) = 4.984. Both models give roughly the same estimate.
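The partial F tests in Exercises 65 and 67(c) both use F = [(SSResid(reduced) − SSResid(full))/(k − l)] / [SSResid(full)/(n − (k + 1))], where the full model has k predictors and the reduced model has l. A Python sketch of that formula, checked against the numbers quoted in the two answers (an illustration, not part of the book's solutions):

```python
def partial_f(ss_resid_reduced, ss_resid_full, n, k_full, k_reduced):
    """Partial F statistic for dropping (k_full - k_reduced) predictors."""
    num = (ss_resid_reduced - ss_resid_full) / (k_full - k_reduced)
    den = ss_resid_full / (n - (k_full + 1))
    return num / den

# Exercise 67(c): n = 31, full model k = 14, reduced model l = 4
print(round(partial_f(4.8146, 1.9845, 31, 14, 4), 2))   # 2.28

# Exercise 65(a): n = 15, full model k = 9, reduced model l = 3
print(round(partial_f(1523351, 805534, 15, 9, 3), 3))   # 0.743
```

The same function reproduces any "reduced versus full model" F ratio in this chapter.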
71. a. The boxplot shows that the shapes of the ppv for the cracked and uncracked prisms appear to be fairly symmetric. The boxplot further suggests that the ppv for the cracked prisms tend to be greater than the ppv for the uncracked prisms. Let μ1 = the true mean ppv for uncracked prisms and μ2 = the true mean ppv for cracked prisms. A 95% confidence interval for (μ1 − μ2), using the critical t value = 2.093 based on 19 df, is:
    (482.7 − 827.4) ± 2.093√(233.7²/18 + 295.3²/12) = −344.7 ± 2.093(101.494), or (−557.127, −132.273).
    b. Using Minitab, we can use the best subsets option with PPV, PPV², the indicator variable Crack? (0 if there is no crack present and 1 if there is a crack), and the interaction term PPV*Crack?. The best subsets regression suggests that the quadratic term PPV² is the single most useful predictor. The quadratic regression model, which has an R² value of 61.2%, has the equation ŷ = .996719 − .00000001(PPV)². The next most useful single predictor is the PPV term. This simple linear regression model, which has an R² value of 57.7%, has the equation ŷ = 1.00161 − .000018(PPV). Models involving more than one term don't appear to explain the ratio variable significantly better, since the R² values of such models are not much different from those of the models that simply use PPV² or PPV.
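The interval in 71(a) is the standard two-sample t interval, (x̄1 − x̄2) ± t·√(s1²/n1 + s2²/n2). Its arithmetic can be checked with a short Python sketch using the summary statistics quoted in the answer:

```python
import math

def two_sample_t_ci(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """Two-sample t confidence interval for mu1 - mu2 (unpooled SE)."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - t_crit * se, diff + t_crit * se

# Exercise 71(a): uncracked (mean 482.7, sd 233.7, n = 18),
#                 cracked   (mean 827.4, sd 295.3, n = 12), t.025(19) = 2.093
lo, hi = two_sample_t_ci(482.7, 233.7, 18, 827.4, 295.3, 12, 2.093)
print(round(lo, 1), round(hi, 1))  # -557.1 -132.3
```

Both endpoints being negative is what supports the conclusion that cracked prisms have the larger mean ppv.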
Index
Frequency, 14
  cumulative, 60
  relative, 14–15
Frequentist approach to probability, 202–203
Full factorial designs, 489
Full model, 550
Full quadratic model, 145, 539

G
Galton, Francis, 121–122
General addition rule, 206
General additive fit, 146–148
General additive model, 534
Goodness-of-fit tests, 119
Grand mean, 257

H
Half fraction design, 491
Hazard functions, 285–287
High level, 472
High leverage, 559
High-leverage observations, 129–130
Histograms, 14–23, 247
  of process data, 249–250
  shapes of, 21–22
Homogeneity, test for, 385–389
Hypergeometric distribution, 390
Hypothesis, 353
  and test procedures, 353–363
Hypothesis testing. See Test of hypotheses

I
Implicit null hypothesis, 359
Independence
  of joint distributions, 155–156
  of random variables, 222
  and the sample mean, 233
  test for, 388–389
Independent events, 209–213
Independent variables, 115, 155–156, 181, 446. See also Predictor variables
  ANOVA problems, 415
Indicator variables, 484, 539
Inferential statistics, 6, 195
Influential observations, 559
Inspection units, 278
Interaction effects, 454–456, 477
Interaction predictors, 145
Interactions, 447–448
  and degrees of freedom, 458
  of factors, 455
  multifactor designs, 464
  two-factor designs, 454–456
  between variables, 537–539
Intercept of a line, 118
Interlaboratory comparisons, 189–191
International Organization for Standardization (ISO), 165
Interquartile range (IQR), 80–83
  and boxplots, 83–86
Interval estimates, 298
Invariance property, 339

J
Joint density function, 153
Joint distributions, 151–157
  mean values of, 154
Joint mass function, 152
Joint probabilities, 222

K
Kernel density estimation, 340–342
Kernel function, 340
k-out-of-n system, 60

L
Laplace, Pierre Simon de, 201–202
Large-sample confidence intervals, 298–317
Leaf, 10
Least squares, weighted, 137
Least squares coefficients, 135, 143
Least squares estimates, 509
  in multiple regression, 542
Least squares fit, 119
Least squares line, 119–121, 508–509
  assessing the fit of, 122–124
  polynomial functions, 135–136
  and residual plots, 127
  standard deviation about, 125–126
Level of significance, 356
Levels, 414, 446, 472
Leverage, 559
Likelihood functions, 336
Likelihood ratio principles, 405
Likelihood ratio test statistics, 405–406
Logistic regression, 563–567
Logit functions, 564
Lognormal distribution, 46–47, 242
  mean value, 69
  and quantile plots, 93
  variance of, 77
Low level, 472
Lower capability index, 271
Lower confidence bound, 305
Lower control limit (LCL), 253
Lower quartile, 80–81
Lower specification limit (LSL), 248
Lower-tailed test, 360, 363–364
LOWESS, 137, 138, 551

M
Main effects, 458
  of a factor, 475
  multifactor designs, 464
  two-factor designs, 454–456
Mallows' CP, 561
Mann-Whitney test, 403–404
Marginal distribution, 153, 156
Mass function, 33
  binomial distribution, 52
  joint, 152
  Poisson distribution, 54
  probability, 218, 222
Matched pairs, 331
Maximum likelihood estimation, 335–339
Mean. See also specific distributions
  bootstrap confidence intervals, 343–344
One-sample t test, 402–403
  test of hypotheses, 365–367
One-sided confidence intervals, 304–305
One-sided tolerances, 247
One-tailed t test, 401–402
One-way ANOVA, 415
  F test, 421
Operating characteristic curve, 57
Operational definitions, 162–166
Out of control rules, 253–254
Outliers, 11
  in boxplots, 85–86
  and sample means, 63
  and sample medians, 63, 64
  and trimmed means, 64

P
p charts, 240, 273–278
P-values, 358
  for a chi-squared test, 381
  test of hypotheses, 357–361
  for t tests, 363
Paired data, 327, 371–374
  confidence interval from, 329–331
Paired t intervals, 330
Paired t test, 372
Parallel systems, 212–213, 288–289
Pareto diagrams, 23
Pearson's sample correlation coefficient, 108–111
  and the coefficient of determination, 124
  properties of, 111–114
Percentiles, 86–87
  of a normal distribution, 40
Plots for checking model adequacy, 556, 557, 558
Point estimates, 293–294
Point estimation, 294–298
  maximum likelihood, 335
  unbiased, 295–297
Poisson distribution, 54–56
  approximation to binomial, 55–56
  mean value, 67
  for nonconformities, 279
  table, 587
  variance of, 76
Polynomial functions, 135–137
Polynomial regression, 135–137, 535–537
Pooled t confidence intervals, 329
Pooled t test, 378
Population, 3, 166
Population correlation coefficient, 114–115, 522
Population proportions, 309
  estimating, 175–177
Population regression coefficients, 534
Population regression functions, 513, 534
Population regression line, 505
Positive skew, 16, 22
Power function relationship, 135
Power of a test, 402
Power transformations, 132–135
Practical significance, 400
Precision of measuring data, 187
Predictable controlled process, 265
Predicted values, 122, 484, 510, 544
  sampling distribution, 526
Prediction bounds, 322
Prediction errors, 529
Prediction intervals
  and confidence intervals, 530
  in multiple regression, 549
  one-sample, 318, 321–323
  in simple linear regression, 529–530
Predictor variables, 117, 140–151
  qualitative, 539–541
Predictors
  creating new, 145–146
  eliminating a group of, 550–551
  interaction, 145
  model selection, 559–563
  quadratic, 145
Probabilistic models, 504
Probability
  concepts of, 201–208
  conditional, 208–215
  joint, 222
  mass function, 218, 222
  of a match, 214
Probability density functions, 218
  joint, 222
Probability distributions, 218–220. See also individual distributions
Probability plots. See Quantile plots
Procedures, 170
Process, 9
Process capability, 265–273
  indexes, 268–272
  nonconformance rates, 266–268
Process control activities, 248
Process mean, 266
Process spread, 266
Process variation, 266
Product rule for probabilities, 210
Professional standards, 163–165
Proportion
  distribution of sample, 238–240
  population, estimate of, 175–177
Proportional allocation, 173

Q
Quadratic predictors, 145
Quadratic regression, 135–137, 535
  model, 536
Quadrats, 174, 177
Qualitative predictor variables, 539–541
Quantile, 87
Quantile plots, 90–97
  for normal distributions, 91–93, 557
  sample quantiles, 90
  Weibull distribution, 93–94
Quartiles, 80–83

R
R charts, 257–258, 263
R software, 4
Random deviation (error), 504
Random effects, 433–434
Random effects model, 459
Random experiments, 195
Random factors, 433
Random number generator, 168
Random sampling, 167, 168–170, 194
  and nonrandom samples, 170–171
  and sampling distributions, 228
  stratified sampling, 171–177
Random variables, 215–227
Randomization, 182–183, 446, 474
Randomized block design, 436
Randomized block experiments, 435–441
Range, 72
Rank, 403
Rational subgroups, 253–255
Rayleigh distribution, 59
Reduced models, 550
Redundancy, 288
Regression, 121–122
  and analysis of variance (ANOVA), 521–522
  cubic, 536
  exponential, 513–514
  line, 505, 525–533
  logistic, 563–567
  model selection, 559–563
  multiple, 533–542
  nonlinear, 541
  polynomial, 135–137, 535–537
  quadratic, 535
  simple linear, 505, 507
  single independent variable, 504–517
  slope of a line, 501, 505, 517–525
  unusual observations in, 559
Regression analysis, 117, 555–573
Regression coefficients, 547, 549
Regression sum of squares (SSRegr), 511
Relative error, 191
Relative frequency, 14–15
Reliability, 283–291
  and hazard functions, 285–287
  system, 287–289
  at time t, 285
Repeatability, 188–189
Repeated stems, 12
Replicate, 463
Replication, 182, 446, 447
Reproducibility, 188–189
Resampling procedures, 343
Research hypothesis, 354
Residual plots, 126–129, 557–558
Residual sum of squares (SSResid), 122–123, 136, 144, 510
  multiple regression, 544
Residuals, 122, 510
  multiple regression, 544
  standardized, 558
Resistant line, 129
Response surface, 449
Response variables, 117, 162, 181, 414, 446
  predicting, 484–486
Robust interval, 324
Ryan-Joiner test, 395
  for normality, 395, 603

S
s charts, 260–263
Sample mean, 62–63
  sampling distribution of, 233–238
Sample median, 63–64
Sample proportion, 175
  sampling distribution of, 238–240, 308
Sample regression line, 119–121
Sample size determination
  and confidence intervals, 303–304, 310–311
  estimation, 173
Sample space, 196
Samples, 3, 166
Sampling, cluster, 177
Sampling data, 166–179
  with or without replacement, 168
  stratified, 171
Sampling distributions, 228–232
  of the chart statistic, 253
  of the difference between two means, 312–313
  of the estimated slope, 517, 547
  of a predicted value, 526
  of a sample mean, 233–238
  of a sample proportion, 238–240, 308
Sampling frames, 9, 168
Sampling inspection, 200
Sampling plans, 162
SAS software, 4
Scatterplot matrix, 142
Scatterplots, 102–107, 508
  and correlation, 108
  monotonic pattern, 513
  of nonlinear relationships, 132–140
  smoothing, 137–138
  Youden plots, 189–191
Screening designs, 494
Series system, 211, 212–213, 287–289
Shewhart, W.A., 252
Shewhart chart, 254, 256
Shifted Weibull distribution, 50
Significance, statistical vs. practical, 400
Significance level, 356
Simple events, 196
Simple linear regression model, 505, 507, 525
Simple random sampling (SRS), 171
Simultaneous confidence intervals, 531
Single-factor ANOVA, 419–427
  and boxplots, 85
  degrees of freedom, 420
  notation, 419
  test of hypotheses, 420–423
Six sigma, 248
Skewed histogram, 22
Skewed Weibull density curves, 48
Slope of a line, 118
  in multiple regression, 547
  in regression, 501, 505, 517–525
Small-sample intervals, 318–327
Smoothed histograms, 21–22
Smoothing a scatterplot, 137–138
Smoothing parameters, 340–341
Software packages, 4
Special causes, 252
Specification limits, 247–248
Standard deviation
  about the least squares line, 125–126
  of a continuous distribution, 76–78
  of a discrete distribution, 74–76
  of the normal distribution, 76–77
  sample, 73
  of the sample mean, 233
  of sampling statistics, 526, 528
Standard error, 173–174, 233
  of the sample proportion, 239
Standard normal distribution, 38–41
  table, 582–583
  table of values, 38
Standard order, 473–474
Standardized limits, 41
Standardized residuals, 558
  plot, 557–558
Standardized variables, 41
Standards, 163–165
  professional, 161
Statistic, 194
Statistical control, 253
Statistical hypothesis, 353
Statistical inferences, 194, 195
Statistical process control (SPC), 248
Statistical significance, 400
Statistically significant results, 400
Statistics, 1
  descriptive, 4
  inferential, 6
  scope of, 6–8
Stem, 10
Stem-and-leaf displays, 10–13
  comparative, 13
Stepwise regression, 563
Straight line, fitting to, 118–121
Strata, 171
Stratified sampling, 171–177
Studentized range distribution, 429
  table, 429, 599–600
Subgroups, 252
  rational, 253–255
Sum of squares, 436–437, 444–445, 456–457
  balanced three-factor ANOVA, 465
  block sum of squares (SSB), 437
  effect, 482
  error. See Error sum of squares (SSE)
  regression, 511
  residual. See Residual sum of squares (SSResid)
  total. See Total sum of squares (SSTo)
  treatment, 419, 437
Symmetric histogram, 22
System reliability, 287–289

T

t confidence interval, 318–321, 327, 329
t critical values, 319
t distributions, 318–321
t table, 319, 364, 588, 590–592
t test, 365–367, 368–369
  one sample, 365, 402–403
  one-tailed, 401–402
  paired, 372
  P-values, 363
  two sample, 368–369
Target value, 247
Test of hypotheses, 353, 437
  about categorical populations, 380–394
  about means, 363–380
  bootstrap, 405
  chi-squared tests, 396–397
  and confidence intervals, 404–405
  difference between means, 367–374, 371–374, 403–404
  distribution, form of, 394–399
  errors in, 355–357
  for a group of predictors, 550–551
  homogeneity, 385–388
  hypothesis testing, 437
  independence, 388–389
  large-sample, 359–361
  Mann-Whitney, 403–404
  means, 365–367
  model utility, 520, 545
  multifactor designs, 466–469
  normal distribution, 359–360
  one-sample t test, 365–367
  paired t test, 372
  procedures for, 405–406
  P-values, 357–361
  single-factor ANOVA, 420–423
  steps, 366–367
Test procedures, 355
  confidence intervals, 404–405
  hypothesis, 353–363
Test statistics, 357–361
Tolerance critical value, 323, 589
Tolerance intervals, 318, 323–324
Tolerances, 247
Topological reliability, 211, 287
Total degrees of freedom, 457, 466
Total quality management (TQM), 248
Total sum of squares (SSTo), 122, 437, 456–457
  ANOVA, 420
  regression, 136, 144
Transformations, 132–135, 514
  of data, 26
Treatment levels, 181, 414
Treatment sum of squares (SSTr), 419, 437
Treatments, 414
Tree diagrams, 197
Trimmed mean, 64–65
Trimming percentage, 64–65
True regression line, 505
Truncation, 12
Tukey, John, 428
Tukey’s method, 428–432
2k designs, 472–489
  analyzing experiments, 479–484
  fraction of, 490
  models, fitting, 484–486
Two-factor ANOVA, 458
Two-factor designs, 453–463, 457–461
Two-factor interaction effects, 456, 475
Two-sample bootstrap intervals, 344
Two-sample t interval, 327–329