Mistakes in Quality Statistics - Sample Chapter

Mistakes in Quality Statistics

and How to Fix Them

Donald W. Benbow

ASQ Quality Press
Milwaukee, Wisconsin

Mistakes in Quality Statistics and How to Fix Them
© 2021 by Donald W. Benbow
All rights reserved. Published 2021
Printed in the United States of America
24 23 22 21 SWY 5 4 3 2 1

Publisher’s Cataloging-in-Publication data

Names: Benbow, Donald W., 1936–, author.


Title: Mistakes in quality statistics and how to fix them / by Donald
W. Benbow.
Description: Includes bibliographical references. | Milwaukee, WI: Quality
Press, 2021.
Identifiers: LCCN: 2021935792 | ISBN: 978-1-63694-000-7 (paperback) |
978-1-63694-001-4 (epub)
Subjects: LCSH Statistics. | Statistics—Evaluation. | Statistics—
Methodology. | Auditing. | Quality control. | BISAC BUSINESS &
ECONOMICS / Auditing | MATHEMATICS / Probability & Statistics /
General
Classification: LCC QA276 .B46 2021| DDC 519.5—dc23

No part of this book may be reproduced in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise, without the
prior written permission of the publisher.

ASQ advances individual, organizational, and community excellence
worldwide through learning, quality improvement, and knowledge
exchange.

Attention bookstores, wholesalers, schools, and corporations: Quality Press
books are available at quality discounts with bulk purchases for business,
trade, or educational uses. For information, please contact Quality Press at
800-248-1946 or books@asq.org.

To place orders or browse the selection of ASQ Excellence and Quality
Press titles, visit our website at http://www.asq.org/quality-press.

Printed on acid-free paper.


Table of Contents

List of Figures and Tables
Preface

Chapter 1 Hypothesis Tests
1.1 Hypothesis Test for Two Population Means
1.2 Hypothesis Test for a Population Proportion
1.3 Hypothesis Test for Population Standard Deviation

Chapter 2 Correlation and Causation
2.1 Lurking Variable
2.2 Correlation or Causation?
2.3 Correlation and Prediction

Chapter 3 Margin of Error and Confidence Intervals
3.1 Margin of Error and Confidence Intervals for Population Means
3.2 Hypothesis Test for Two Population Means
3.3 Margin of Error and Confidence Intervals for Proportions

Chapter 4 Control Charts
4.1 Pre-control Charts
4.2 Calculations for a Process with a Normal Population

Chapter 5 Measurement Systems
5.1 Measurement System Variation
5.2 Evaluating a Measuring System for Continuous Data
5.3 Sample Procedure for Measurement System Analysis
5.4 Diagnosis Using the GRR
5.5 Stability
5.6 Measurement Systems and Capability Analysis

Chapter 6 Designed Experiments
6.1 Main Effects
6.2 Interaction Effects
6.3 Fractional Factorial Designs
6.4 Balanced Designs

Chapter 7 Sampling Plans
7.1 Attribute Sampling Plans
7.2 Double Sampling Plans
7.3 Evaluating Sampling Plans
7.4 Variables Sampling Plans

Chapter 8 Probability
8.1 Summary of Rules

Chapter 9 Some General Caveats

Appendix
Notes
Further Study
Recommended Reading
About the Author
List of Figures and Tables

FIGURES
Figure 1.1 Graph of the data from Example 1.1
Figure 1.2 Graph of the data from Example 1.2
Figure 1.3 Graph of the data from increased sample sizes
Figure 1.4 The distribution of hole sizes showing standard deviation boundaries and specification limits
Figure 1.5 χ² distribution for df = 29 with left tail 0.05 shaded
Figure 2.1 Spot-welder diagram with strength data
Figure 2.2 Graph of spot-weld strength data
Figure 2.3 Regression line plotted with data points
Figure 2.4 Graph of the data from Example 2.2
Figure 2.5 Examples of various values of the linear correlation coefficient
Figure 2.6 Graph of data from Table 2.3
Figure 2.7 Graph of data from Table 2.4
Figure 2.8 Fermentation tank with beer well
Figure 2.9 Graph of data with beer well temperature set at 140°
Figure 2.10 Sketch for Example 2.5
Figure 2.11 Plot of data from Table 2.5
Figure 2.12 Scatter diagram for the data in Table 2.6
Figure 2.13 Comparing predicted with actual DE values
Figure 4.1 A control chart showing instability
Figure 4.2 Control chart without indicators of instability
Figure 4.3 Histogram showing three distinct humps
Figure 4.4 Control chart for improved process
Figure 4.5 Histogram showing the data after process improvement
Figure 4.6 Five-by-eight arrays of chocolates on a waxed-paper-covered conveyor
Figure 4.7 Two methods for selecting five chocolates to weigh
Figure 4.8 Individuals control chart
Figure 4.9 X-bar and R chart for a batch process
Figure 4.10 X-MR control chart using average values of percent solids as the individual values
Figure 4.11 Scatter plot of the data in Table 4.1
Figure 4.12 Scatter plot of the points in Table 4.2
Figure 4.13 X-bar and R chart using stratified data
Figure 4.14 Histogram of the readings in Figure 4.13
Figure 4.15 Illustration of typical pre-control limits
Figure 4.16 Pre-control chart for Example 4.7 with four samples
Figure 4.17 Illustration for Example 4.10
Figure 4.18 Illustrating the specification limits in terms of standard deviations from the process mean
Figure 4.19 Normal distribution, with the portion of production that violates the lower specification limit shaded
Figure 4.20 NORMSDIST with a z-value of −1.48
Figure 4.21 NORMSDIST value with 5.93 entered
Figure 4.22 Illustrating a centered process
Figure 4.23 Excel function NORMSDIST with −3.705 input
Figure 6.1 Graphical representation of the average responses
Figure 7.1 Example of an acceptance sampling plan
Figure 7.2 Example of a double sampling plan
Figure 7.3 Example of sampling plan that uses arrows
Figure 7.4 Examples of OC curves
Figure 7.5 OC curve for 1.5 AQL
Figure 7.6 Details for an OC curve for 1.5 AQL
Figure 7.7 Ideal OC curve for an AQL = 1.5%
Figure 7.8(a) Example of an OC curve for AQL = 1.5
Figure 7.8(b) Another example of an OC curve for AQL = 1.5
Figure 7.9 Table for converting Q-values to percent defective

TABLES
Table 1.1 Partial χ² table
Table 2.1 Humidity and density data
Table 2.2 Estimated densities using the regression equation
Table 2.3 Data relating CO2 concentration to growth rate
Table 2.4 Data for 18 batches and correlations with DE
Table 2.5 Fracture data for Example 2.5
Table 2.6 Data relating orifice diameter and malleability
Table 2.7 Annealing data
Table 2.8 Annealing data segregated by gauge
Table 4.1 Data points for Example 4.5
Table 4.2 Values from Table 4.1 compared with lag 1 values
Table 4.3 Illustrating batch means for batch size n = 8
Table 4.4 Values of d2 for several sample sizes
Table 5.1 d2 for small numbers of subgroups
Table 6.1 A 2³ factorial data collection sheet for Example 6.1
Table 6.2 A 2³ factorial data collection sheet with run averages
Table 6.3 A 2³ full factorial design using the + and − format
Table 6.4 A 2³ full factorial design showing interaction columns
Table 6.5 Half fraction of 2³ (also called a 2³⁻¹ design)
Table 6.6 Half fraction of 2³ with completed interaction columns

Preface
Preface

This book began as a presentation titled “Caveats Regarding the Use of
Statistics” that I made during the May 2009 World Quality Conference.
It is not intended to serve as a statistics textbook. In fact, it assumes
the reader has some familiarity with the subject, perhaps through an
introductory course. Texts and courses tend to emphasize how to perform
statistical analysis and give little attention to errors that can occur
in the process. The purpose of this book is to show readers how to avoid
pitfalls. The examples and case studies, although based on similar
events, do not include actual data.

1
Hypothesis Tests

Inferential statistics uses analysis of samples from a population
to generate conclusions about that population. It provides techniques
for studying variation that assist in quality improvement. This chapter
uses examples of hypothesis testing as a standard aid for
decision-making.

FOR EXAMPLE:

Example 1.1
A quality improvement team has been asked whether Method A or
Method B provides a higher mean value for the response R.
(Methods A and B could be two different advertising methods, two
different vat agitation rates, two different oven temperatures, etc.)
The team uses Method A and Method B once each and finds that
the resulting R-value is higher for Method B. The team prepares a report
with the graph shown in Figure 1.1.


FOR EXAMPLE:

Example 1.2
Example 1.2 is the same scenario as Example 1.1, but the team decides to
use the two methods five times each. The results are shown in Figure 1.2.

The team recognizes that variation exists in the data. They realize
that to make a reasonable estimate of the mean value for each method,
they will need many more values.

Figure 1.1 Graph of the data from Example 1.1. (Method A and Method B
values plotted along an axis of increasing response R.)

Figure 1.2 Graph of the data from Example 1.2. (Method A and Method B
values plotted along an axis of increasing response R.)

CAVEAT 1.1
It is important to seek out variation so it can be analyzed.

The team conducts 35 repetitions of Method A and 32 of Method B. The
resulting data are graphed in Figure 1.3.
The graph in Figure 1.3 shows the samples from the two populations.
The mean of the population made by Method A appears to be larger than
the mean of the population made by Method B.

Figure 1.3 Graph of the data from increased sample sizes. (Method A and
Method B values plotted along an axis of increasing response R.)

CAVEAT 1.2
Always review the data collection process to look for data entry,
coding, and editing errors. Statisticians also speak of sampling
error, which is the unavoidable error that occurs anytime sampling
is used as a basis for a decision about the population from which
the sample was drawn. Sampling error can be reduced by using a
larger sample.

The team calculates the mean and standard deviation of each sample
and gets the following results:

x̄A = 14.04   sA = 3.52   x̄B = 9.18   sB = 3.44

These sample statistics are estimates for the population parameters
µA, σA, µB, and σB, respectively.
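
Caveat 1.2 states that sampling error can be reduced by using a larger sample. As a rough illustration, the sketch below computes the estimated standard error of each sample mean, s/√n, from the statistics above. The five-run comparison is hypothetical: it assumes the sample standard deviation would stay at 3.52 if only five values had been collected, as in Example 1.2.

```python
import math

# Sample statistics quoted in the text.
s_a, n_a = 3.52, 35
s_b, n_b = 3.44, 32


def standard_error(s, n):
    """Estimated standard error of a sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)


se_a = standard_error(s_a, n_a)
se_b = standard_error(s_b, n_b)

# Hypothetical comparison: the same spread but only five runs.
se_a_five_runs = standard_error(s_a, 5)

print(f"Standard error, Method A (n = 35): {se_a:.3f}")
print(f"Standard error, Method B (n = 32): {se_b:.3f}")
print(f"Standard error, Method A if n = 5: {se_a_five_runs:.3f}")
```

Under that assumption, five runs would leave the mean more than twice as uncertain as 35 runs do.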

1.1 HYPOTHESIS TEST FOR TWO POPULATION MEANS
Because of the amount of variation and the relative closeness
of the two sample means, the team decides to use a hypothesis
test for two population means, which will guide them in reaching
their recommendation. They proceed through textbook steps
for this test:

1 Verify that the conditions for using the test have been met.
The samples are independent.
The populations are normally distributed or each sample
size ≥30.

2 Determine the significance level. In this case, the team
decides on 0.05.

3 State the null hypothesis. In this case, it is H0: μA = μB. The
alternative hypothesis is Ha: μA > μB.
This is a right tail test.

4 Calculate the degrees of freedom:

\[
df = \frac{\left( \dfrac{s_A^2}{n_A} + \dfrac{s_B^2}{n_B} \right)^2}{\dfrac{\left( s_A^2 / n_A \right)^2}{n_A - 1} + \dfrac{\left( s_B^2 / n_B \right)^2}{n_B - 1}} \approx 64.7
\]

The critical value from a t-table or Excel is approximately 1.67,
so the reject region is t > 1.67 (see the appendix at the end of the
book for the Excel function TINV).

5 Calculate the test statistic:

\[
t = \frac{\bar{x}_A - \bar{x}_B}{\sqrt{s_A^2 / n_A + s_B^2 / n_B}} \approx 5.712
\]

6 Determine whether to reject or not reject the null hypothesis.
In this case, the null hypothesis is rejected since the
test statistic is in the reject region.

7 State the conclusion: the data indicate that the mean
response for Method A is larger than the mean response
for Method B at the 0.05 significance level.
A significance level of α = 0.05 means that, if the null hypothesis
were true, a test conducted this way would reject it incorrectly
about 5% of the time.
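
The arithmetic in steps 4 through 6 can be checked with a short script. This is a minimal sketch using only the sample statistics given in the text; the critical value 1.67 is taken from the text rather than recomputed, and the variable names are illustrative.

```python
import math

# Sample statistics from the text (Methods A and B).
mean_a, s_a, n_a = 14.04, 3.52, 35
mean_b, s_b, n_b = 9.18, 3.44, 32

# Estimated variance of each sample mean, s^2 / n.
var_a = s_a**2 / n_a
var_b = s_b**2 / n_b

# Step 4: degrees of freedom (the Welch-Satterthwaite formula).
df = (var_a + var_b) ** 2 / (var_a**2 / (n_a - 1) + var_b**2 / (n_b - 1))

# Step 5: test statistic.
t_stat = (mean_a - mean_b) / math.sqrt(var_a + var_b)

# Step 6: right tail test against the critical value quoted in the text.
T_CRITICAL = 1.67
reject_h0 = t_stat > T_CRITICAL

print(f"df = {df:.1f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

The script reproduces the values in the steps above: about 64.7 degrees of freedom and a test statistic of about 5.712, well inside the reject region.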

CAVEAT 1.3
The hypothesis test does not prove that the mean response for
Method A is larger than the mean response for Method B.

Significance Levels
A hypothesis test does not always provide the correct decision.
Two types of errors can occur:
• Type I: Rejection of a true null hypothesis
• Type II: Failure to reject a false null hypothesis

The probability that a hypothesis test results in a Type I error is
denoted by α and is called the significance level. The probability
that a hypothesis test results in a Type II error is denoted by β.
It is possible to decrease both α and β by using a larger
sample size.

CAVEAT 1.4
When using a hypothesis test to analyze a particular data set,
specifying a smaller significance level, α, increases the probability
of a Type II error, β.
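
The trade-off in Caveat 1.4 can be illustrated numerically. The sketch below is not one of the book's examples: it assumes a right tail one-sample z-test with known standard deviation, and a hypothetical true shift of half a standard deviation. It computes β for two significance levels at the same sample size, then for the smaller α with a larger sample.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def type_ii_error(alpha, effect_size, n):
    """Beta for a right tail one-sample z-test with known sigma.

    effect_size is the true shift in units of sigma (hypothetical input).
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is centered at
    # effect_size * sqrt(n) instead of 0.
    return STD_NORMAL.cdf(z_crit - effect_size * n**0.5)


beta_alpha_05 = type_ii_error(alpha=0.05, effect_size=0.5, n=20)
beta_alpha_01 = type_ii_error(alpha=0.01, effect_size=0.5, n=20)
beta_alpha_01_larger_n = type_ii_error(alpha=0.01, effect_size=0.5, n=60)

print(f"alpha = 0.05, n = 20: beta = {beta_alpha_05:.3f}")
print(f"alpha = 0.01, n = 20: beta = {beta_alpha_01:.3f}")
print(f"alpha = 0.01, n = 60: beta = {beta_alpha_01_larger_n:.3f}")
```

With the sample size held fixed, lowering α raises β. Raising the sample size as well brings β back down, which is the sense in which a larger sample can reduce both error probabilities.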

FOR EXAMPLE:

Example 1.3
A quality improvement team is tasked with reducing the percentage of
units that are rejected due to surface scratches. The current reject rate is
15%. One possible cause of scratches is the design of a holding fixture.
The prototype for a new holding fixture design is used to produce a sample
of 1000 items. The sample is inspected and 137 units are rejected for
scratches. The team uses a hypothesis test to determine whether there
has been a reduction in the rejection rate at the 0.10 significance level.

1.2 HYPOTHESIS TEST FOR A POPULATION PROPORTION
Use the following notation:
n = The sample size = 1000
p0 = old defect rate = 0.15
p̂ = sample defect rate = 0.137
p = the new population reject rate

1 Verify that the conditions for using the test have been met.
n(p0) > 5 and n(1 − p0) > 5 are both true in this example.

2 State the null hypothesis. In this case, it is H0: p = 0.15. The
alternative hypothesis is Ha: p < 0.15.

3 α = 0.10.

4 Critical value is Z0.10 = −1.28 for the left tail test.² The null
hypothesis can be rejected if the test statistic is < −1.28.

5 Calculate the test statistic:

\[
z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}} = \frac{0.137 - 0.15}{\sqrt{0.15(0.85)/1000}} \approx -1.15
\]

6 Determine whether to reject or not reject the null hypothesis.
Since the test statistic is not in the reject region, the null
hypothesis can’t be rejected at the 0.10 significance level.

7 State the conclusion: the data do not support the rejection
of the null hypothesis that the defect level is unchanged at
the 0.10 significance level.
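
The steps above can be sketched in a few lines of Python. The numbers are those of Example 1.3; the variable names are illustrative, and the p-value line is an addition that the steps themselves do not use.

```python
import math
from statistics import NormalDist

n = 1000          # sample size
p0 = 0.15         # old reject rate
p_hat = 137 / n   # sample reject rate
alpha = 0.10

# Step 1: conditions for using the normal approximation.
conditions_met = n * p0 > 5 and n * (1 - p0) > 5

# Step 4: left tail critical value, about -1.28.
z_crit = NormalDist().inv_cdf(alpha)

# Step 5: test statistic.
z_stat = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Step 6: reject only if the statistic falls in the left tail.
reject_h0 = z_stat < z_crit

# Left tail p-value, shown for reference only.
p_value = NormalDist().cdf(z_stat)

print(f"z = {z_stat:.2f}, critical value = {z_crit:.2f}, "
      f"reject H0: {reject_h0}")
```

Because the statistic of about −1.15 lies to the right of −1.28, the script reaches the same decision as step 6.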
The hypothesis test implies that the team can’t conclude
that the new fixture design reduced the number of scratches at
the 0.10 significance level. Therefore, the team decides not to
commit the resources required to produce a new fixture. However,
looking at the units that have been rejected for scratches
over the past year, they develop five categories of scratches.
The categories and the approximate percentage of rejected units
with those scratches are as follows:
Tapered scratches 24%
Vertical scratches 27%
Diagonal scratches 23%
Parallel pair scratches 17%
J-­shaped scratches 9%
(No units had more than one type of scratch.)
The team then reinspects the 1000 units produced with
the prototype holding fixture and discovers that those units
had no J-shaped scratches. In other words, it appears that the
new fixture design completely eliminated the J scratches. The
hypothesis test would, of course, have revealed this fact if
the J scratches had been studied independently. By aggregating
all scratches as one defect type, the analysis indicated that the
new fixture design did not significantly reduce defects when
in fact it did.
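
To see how the J-shaped scratches would have fared on their own, the same proportion test can be rerun for that category. This sketch rests on an assumption not stated in the example: that the historical reject rate for J-shaped scratches is 9% of the overall 15% rate, or 0.0135. It also assumes the team would keep the 0.10 significance level.

```python
import math
from statistics import NormalDist

n = 1000
overall_reject_rate = 0.15
j_share_of_rejects = 0.09

# Assumed historical rate for J-shaped scratches alone.
p0_j = overall_reject_rate * j_share_of_rejects  # 0.0135

p_hat_j = 0 / n   # no J-shaped scratches found in the 1000 units
alpha = 0.10

# Conditions for the test: both expected counts exceed 5.
conditions_met = n * p0_j > 5 and n * (1 - p0_j) > 5

z_crit = NormalDist().inv_cdf(alpha)
z_stat_j = (p_hat_j - p0_j) / math.sqrt(p0_j * (1 - p0_j) / n)
reject_h0_j = z_stat_j < z_crit

print(f"z = {z_stat_j:.2f}, critical value = {z_crit:.2f}, "
      f"reject H0: {reject_h0_j}")
```

Under these assumptions the test statistic falls far into the reject region, in contrast to the aggregated test.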



CAVEAT 1.5
Dividing defects into categories leads to new ways to analyze data and
may show new paths toward root causes and process improvement.

It makes one wonder if they’ll someday discover that, say,
arthritis is really several diseases with similar symptoms, and
several cures have been rejected over the years because of their
failure to pass the hypothesis test.
Process improvement often requires a reduction in process
variation.

FOR EXAMPLE:

Example 1.4
A punch operation produces a hole in a sheet metal part. The tolerance
limits on the diameter are 1.000 ± 0.005. The population of hole sizes
produced by the punch is normally distributed. The population of diameters
has μ0 = 1.0014 and σ0 = 0.0028.

The diagram in Figure 1.4 shows the standard deviation boundaries
along the horizontal axis. The specification limits are illustrated
with dashed lines. The diagram indicates that the process
has too much variation. In an effort to reduce the variation, the
ram velocity is reduced. A sample of 30 parts using the new
speed has x̄ = 1.0012 and s = 0.0018. Do the data indicate that
the variation has been reduced at the 0.05 significance level?
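
The statement that the process has too much variation can be quantified. As a sketch, assuming the normal population given in Example 1.4 and reading the tolerance as specification limits of 0.995 and 1.005, the fraction of holes outside the limits is:

```python
from statistics import NormalDist

# Population of hole diameters from Example 1.4.
process = NormalDist(mu=1.0014, sigma=0.0028)

# Specification limits implied by the tolerance 1.000 +/- 0.005.
lower_spec, upper_spec = 0.995, 1.005

fraction_below = process.cdf(lower_spec)
fraction_above = 1 - process.cdf(upper_spec)
fraction_out_of_spec = fraction_below + fraction_above

print(f"Below lower limit: {fraction_below:.4f}")
print(f"Above upper limit: {fraction_above:.4f}")
print(f"Total out of spec: {fraction_out_of_spec:.4f}")
```

Roughly one hole in nine falls outside the limits, most of them above the upper limit because the process mean sits above the nominal 1.000.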

1.3 HYPOTHESIS TEST FOR POPULATION STANDARD DEVIATION

1 Verify that the condition(s) for using the hypothesis test
have been met.
The population is normal.

2 Determine the significance level. In this case, the team
decides on 0.05.

3 State the null hypothesis. In this case, it is H0: σ = 0.0028.
The alternative hypothesis is Ha: σ < 0.0028.

Figure 1.4 The distribution of hole sizes showing standard deviation
boundaries and specification limits. (Horizontal axis marked at 0.993,
0.9958, 0.9986, μ = 1.0014, 1.0042, 1.007, and 1.0098.)

This hypothesis test uses the χ² distribution illustrated
in Figure 1.5. The distribution uses the fraction s²/σ0² and has
a different shape for different sample sizes. The shape is
identified by its degrees of freedom (df), which in this case
is n − 1. As Figure 1.5 shows, the distribution is not symmetric,
so care must be used when using the χ² table.

CAVEAT 1.6
Although the hole diameters are normally distributed, the ratio used
in this test is not.


4 Calculate the degrees of freedom. In this case, it is n − 1 = 29.
A partial row of a χ² table is illustrated in Table 1.1.
Note that for this test we use 1 − α for the subscript.

Figure 1.5 χ² distribution for df = 29 with left tail 0.05 shaded.
(Shaded left tail area α = 0.05; boundary at χ²_0.95 = 17.708.)

Table 1.1 Partial χ² table.

df    χ²_0.995   χ²_0.99   χ²_0.975   χ²_0.95   χ²_0.90
29    13.121     14.256    16.047     17.708    19.768

The critical value is found in the χ² table at χ²_0.95 (see
the appendix at the end of the book for the Excel function
CHIINV).

χ²_0.95 = 17.708 for df = 29. The reject region is χ² < 17.708.


5 Calculate the test statistic.

\[
\chi^2 = (n - 1)\frac{s^2}{\sigma_0^2} = 29 \cdot \frac{0.0018^2}{0.0028^2} \approx 12
\]

6 Determine whether to reject or not reject the null hypothesis.
In this case, the test statistic is in the reject region, so
H0 is rejected.

7 State the conclusion: the data indicate that the standard
deviation has been reduced at the 0.05 significance level.
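
The test statistic and decision can be verified with a short calculation. This sketch uses the values from Example 1.4 and takes the critical value 17.708 directly from Table 1.1 rather than computing it.

```python
n = 30            # sample size
s = 0.0018        # sample standard deviation at the new ram velocity
sigma_0 = 0.0028  # standard deviation under the null hypothesis

# Step 4: degrees of freedom.
df = n - 1

# Step 5: test statistic (n - 1) * s^2 / sigma_0^2.
chi2_stat = df * s**2 / sigma_0**2

# Critical value for df = 29, left tail 0.05, as read from Table 1.1.
CHI2_CRITICAL = 17.708

# Step 6: left tail test, so small values of the statistic reject H0.
reject_h0 = chi2_stat < CHI2_CRITICAL

print(f"chi-square = {chi2_stat:.2f}, reject H0: {reject_h0}")
```

The statistic is just under 12, which matches step 5 and lies in the reject region.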



CAVEAT 1.7
A brief note about using spreadsheet software such as Excel to
conduct hypothesis tests: Most hypothesis tests have conditions
that must be met in order for the test to be valid. See, for example,
step 1 of the hypothesis test used in Example 1.3: the conditions are
that both np and n(1 − p) are greater than 5. In step 1 of the
hypothesis test used in Example 1.4, the condition is that the variable
is normally distributed. Software packages often don’t emphasize these
conditions enough.

CAVEAT 1.8
As a result of a hypothesis test, a null hypothesis may be rejected or
not rejected but it cannot be accepted.

Misinterpreting the results of a hypothesis test can lead to
false conclusions. For example, consider the “Correlation Test for
Normality” frequently taught in first-year statistics courses. A
reader might think this test could be used to determine whether a
variable is normally distributed. The null hypothesis for this test is
H0: The variable is normally distributed
So the test may be used to determine, with a stated significance
level, whether the null hypothesis may be rejected. Rejection would
indicate that the variable is not normally distributed. Failure to
reject, however, does not show that the variable is normally
distributed, because a null hypothesis cannot be accepted (Caveat 1.8).
Therefore, the test can provide evidence against normality, but it
may not be used to establish that a variable is normally distributed.
The control chart discussed later in the book is a type of
hypothesis test. The null hypothesis is that the process is stable.
A point outside the control limits constitutes evidence that the
null hypothesis can be rejected. The significance level is determined
by the control limit formulas.
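
As an illustration of how control limits fix a significance level, the sketch below assumes limits placed three standard deviations on either side of the center line and a normally distributed plotted statistic. Neither assumption is stated in this chapter, and other limit formulas would give a different value.

```python
from statistics import NormalDist

# Assumed: control limits at +/- 3 standard deviations, normal statistic.
k = 3.0

# Probability that a single point from a stable process falls outside
# either limit, that is, the Type I error rate per plotted point.
alpha_per_point = 2 * (1 - NormalDist().cdf(k))

# Average number of points plotted per false alarm.
points_per_false_alarm = 1 / alpha_per_point

print(f"alpha per point = {alpha_per_point:.4f}")
print(f"about one false alarm per {points_per_false_alarm:.0f} points")
```

Under these assumptions the significance level for each plotted point is about 0.0027, far smaller than the 0.05 and 0.10 levels used in the examples above.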

Significance versus Importance


FOR EXAMPLE:

Example 1.5
A sales representative presents the results of a hypothesis test that
indicates that the mean thickness of a surface coating material from vendor
A is greater than that from vendor B. As the purchasing department
prepares to certify vendor A as the preferred supplier, it consults quality,
reliability, and warranty data. These data show that the thinner coating
is as good as or better than the thicker coating.

CAVEAT 1.9
Recognize that a statistically significant distinction may not result in
a practical difference. It may be a “distinction without a difference.”
Statistical significance should not, in itself, dictate decision-making.

Here are a few other concerns to keep in mind when using
hypothesis tests.

CAVEAT 1.10
Begin by deciding what you want to learn. Avoid untestable goals
such as
improve the product,
improve the user experience,
increase customer engagement.

CAVEAT 1.11
Decide in advance on a randomization scheme, the amount of data
to collect, and the significance level. Resist the temptation to stop
collecting data when enough data have been collected to reject the
null hypothesis at some level of significance.
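
The risk described in Caveat 1.11 can be demonstrated by simulation. The sketch below is hypothetical throughout: it draws data for which the null hypothesis is true, applies a two-sided z-test with known standard deviation, and compares a fixed sample size of 100 with a rule that stops at the first of ten interim looks that rejects.

```python
import math
import random
from statistics import NormalDist


def simulate_type_i_rates(n_trials=2000, n_max=100, look_every=10,
                          alpha=0.05, seed=1):
    """Estimate Type I error rates when the null hypothesis is true.

    Each trial draws n_max values from a standard normal population
    (true mean 0) and tests H0: mu = 0 with sigma known to be 1.
    All settings are hypothetical.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    fixed_rejections = 0
    peeking_rejections = 0
    for _ in range(n_trials):
        total = 0.0
        rejected_at_a_look = False
        for i in range(1, n_max + 1):
            total += rng.gauss(0.0, 1.0)
            # Interim look: test the data collected so far.
            if i % look_every == 0 and abs(total / math.sqrt(i)) > z_crit:
                rejected_at_a_look = True
        # Planned analysis: one test after all n_max values are in.
        if abs(total / math.sqrt(n_max)) > z_crit:
            fixed_rejections += 1
        # Stopping at the first look that rejects.
        if rejected_at_a_look:
            peeking_rejections += 1
    return fixed_rejections / n_trials, peeking_rejections / n_trials


fixed_rate, peeking_rate = simulate_type_i_rates()
print(f"Fixed sample size:       {fixed_rate:.3f}")
print(f"Stop at first rejection: {peeking_rate:.3f}")
```

The fixed-sample rate should land near the nominal 0.05, while the stop-when-significant rule rejects a true null hypothesis several times as often. That is why the amount of data should be decided in advance.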

CAVEAT 1.12
When using the results of published studies, keep in mind that studies
that failed to reject a null hypothesis are less likely to be published.
This fact, called the “file drawer syndrome,” tends to bias analysis.
