Mistakes in Quality Statistics - Sample Chapter
Donald W. Benbow
Appendix
Notes
Further Study
Recommended Reading
About the Author
List of Figures and Tables
FIGURES
Figure 1.1 Graph of the data from Example 1.1
Figure 1.2 Graph of the data from Example 1.2
Figure 1.3 Graph of the data from increased sample sizes
Figure 1.4 The distribution of hole sizes showing standard deviation boundaries and specification limits
Figure 1.5 χ² distribution for df = 29 with left tail 0.05 shaded
TABLES
Table 1.1 Partial χ² table
Table 2.1 Humidity and density data
Table 2.2 Estimated densities using the regression equation
Table 2.3 Data relating CO2 concentration to growth rate
Table 2.4 Data for 18 batches and correlations with DE
Table 2.5 Fracture data for Example 2.5
Table 2.6 Data relating orifice diameter and malleability
Table 2.7 Annealing data
Table 2.8 Annealing data segregated by gauge
Table 4.1 Data points for Example 4.5
1
Hypothesis Tests
FOR EXAMPLE:
Example 1.1
A quality improvement team has been asked whether Method A or
Method B provides a higher mean value for the response R.
(Methods A and B could be two different advertising methods, two
different vat agitation rates, two different oven temperatures, etc.)
The team uses Method A and Method B once each and finds that
the resulting r-value is higher for Method B. The team prepares a report
with the graph shown in Figure 1.1.
✧
FOR EXAMPLE:
Example 1.2
Example 1.2 is the same scenario as Example 1.1, but the team decides to
use the two methods five times each. The results are shown in Figure 1.2.
The team recognizes that variation exists in the data. They
realize that to make a reasonable estimate of the mean value for
each method, they will need many more values.
[Figure 1.1 Graph of the data from Example 1.1. Legend: Method A, Method B.]
[Figure 1.2 Graph of the data from Example 1.2. Legend: Method A, Method B.]
CAVEAT 1.1
It is important to seek out variation so it can be analyzed.
CAVEAT 1.2
Always review the data collection process to look for data entry,
coding, and editing errors. Statisticians also speak of sampling
error, which is the unavoidable error that occurs anytime sampling
is used as a basis for a decision about the population from which
the sample was drawn. Sampling error can be reduced by using a
larger sample.
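The effect of sample size on sampling error can be illustrated with the standard error of a sample mean, σ/√n. The sketch below is only an illustration: it assumes independent random samples, and the value σ = 2.0 and the sample sizes are hypothetical, not taken from the examples.

```python
from math import sqrt


def standard_error(sigma: float, n: int) -> float:
    """Standard error of a sample mean for population std dev sigma and sample size n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma / sqrt(n)


# Hypothetical population standard deviation, chosen only for illustration.
SIGMA = 2.0
errors = {n: standard_error(SIGMA, n) for n in (1, 5, 25, 100)}
for n, se in errors.items():
    print(f"n = {n:3d}  standard error = {se:.3f}")
```

Under these assumptions, quadrupling the sample size halves the standard error, which is why the gain from each additional observation shrinks as the sample grows.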
2. Determine the significance level. In this case, the team
decides on 0.05.
3. State the null hypothesis. In this case, it is H0: μA = μB. The
alternative hypothesis is Ha: μA > μB.
This is a right tail test.
4. Calculate the degrees of freedom:
\[
df = \frac{\left(\dfrac{s_A^2}{n_A} + \dfrac{s_B^2}{n_B}\right)^2}
{\dfrac{\left(s_A^2/n_A\right)^2}{n_A - 1} + \dfrac{\left(s_B^2/n_B\right)^2}{n_B - 1}}
\approx 64.7
\]
The critical value from a t-table or Excel is approximately
1.67, so the reject region is t > 1.67 (see the appendix at
the end of the book for the Excel function TINV).
5. Calculate the test statistic:
\[
t = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{s_A^2/n_A + s_B^2/n_B}} \approx 5.712
\]
6. Determine whether to reject or not reject the null hypothesis.
In this case, the null hypothesis is rejected since the
test statistic is in the reject region.
7. State the conclusion: the data indicate that the mean [...]
CAVEAT 1.3
The hypothesis test does not prove that the mean response for
Method A is larger than the mean response for Method B.
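The degrees-of-freedom and test-statistic calculations in steps 4 and 5 can be sketched in a few lines. The sample statistics behind the values 64.7 and 5.712 are not given in this excerpt, so the inputs below are hypothetical; the function names `welch_df` and `welch_t` are also illustrative choices, not names used in the book.

```python
from math import sqrt


def welch_df(s_a: float, n_a: int, s_b: float, n_b: int) -> float:
    """Approximate degrees of freedom for two samples with unequal variances."""
    va = s_a**2 / n_a
    vb = s_b**2 / n_b
    return (va + vb) ** 2 / (va**2 / (n_a - 1) + vb**2 / (n_b - 1))


def welch_t(mean_a: float, s_a: float, n_a: int,
            mean_b: float, s_b: float, n_b: int) -> float:
    """Two-sample t statistic using separate variance estimates."""
    return (mean_a - mean_b) / sqrt(s_a**2 / n_a + s_b**2 / n_b)


# Hypothetical summary statistics, chosen only for illustration.
df = welch_df(2.0, 10, 2.0, 10)
t = welch_t(12.0, 2.0, 10, 10.0, 2.0, 10)

# Right-tail critical value for df = 18 and alpha = 0.05, read from a t-table.
T_CRITICAL = 1.734
print(f"df = {df:.1f}, t = {t:.3f}, reject H0: {t > T_CRITICAL}")
```

With equal standard deviations and equal sample sizes, the formula reduces to df = 2(n − 1), which gives a quick sanity check on the implementation.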
Significance Levels
A hypothesis test does not always provide the correct decision.
Two types of errors can occur:
• Type I: Rejection of a true null hypothesis
• Type II: Failure to reject a false null hypothesis
CAVEAT 1.4
When using a hypothesis test to analyze a particular data set,
specifying a smaller significance level, α, increases the probability
of a Type II error, β.
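The trade-off in Caveat 1.4 can be made concrete for a left-tail test of a proportion. The sketch below uses the normal approximation; the null proportion 0.15, sample size 1000, and especially the "true" proportion 0.13 are assumptions chosen for illustration, since β can only be computed for a specified alternative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def type_ii_error(alpha: float, p0: float, p1: float, n: int) -> float:
    """Beta for a left-tail one-proportion z-test when the true proportion is p1."""
    se0 = sqrt(p0 * (1 - p0) / n)  # standard error under the null hypothesis
    se1 = sqrt(p1 * (1 - p1) / n)  # standard error under the assumed truth
    cutoff = p0 + STD_NORMAL.inv_cdf(alpha) * se0  # reject when p-hat < cutoff
    # Probability that p-hat lands at or above the cutoff, so H0 is not rejected.
    return 1 - STD_NORMAL.cdf((cutoff - p1) / se1)


# Illustrative values; the true proportion 0.13 is hypothetical.
betas = {a: type_ii_error(a, p0=0.15, p1=0.13, n=1000) for a in (0.10, 0.05, 0.01)}
for a, b in betas.items():
    print(f"alpha = {a:.2f}  beta = {b:.3f}")
```

For these assumed values, β rises as α is tightened from 0.10 to 0.01, which is the behavior the caveat describes.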
FOR EXAMPLE:
Example 1.3
A quality improvement team is tasked with reducing the percentage of
units that are rejected due to surface scratches. The current reject rate is
15%. One possible cause of scratches is the design of a holding fixture.
The prototype for a new holding fixture design is used to produce a
sample of 1000 items. The sample is inspected and 137 units are rejected for
scratches. The team uses a hypothesis test to determine whether there
has been a reduction in the rejection rate at the 0.10 significance level.
4. The critical value is z0.10 = −1.28 for the left tail test.² The null
hypothesis can be rejected if the test statistic is < −1.28.
5. Calculate the test statistic:
\[
z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}}
  = \frac{0.137 - 0.15}{\sqrt{0.15(0.85)/1000}} \approx -1.15
\]
6. Determine whether to reject or not reject the null hypothesis.
Since the test statistic is not in the reject region, the null
hypothesis can’t be rejected at the 0.10 significance level.
7. State the conclusion: the data do not support the rejection
of the null hypothesis that the defect level is unchanged at
the 0.10 significance level.
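The arithmetic in steps 4 through 6 can be checked with a short script. This is a minimal sketch using the figures from Example 1.3 (137 rejects in 1000 units, a null proportion of 0.15, and α = 0.10); the function name `one_proportion_z` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(rejects: int, n: int, p0: float) -> float:
    """z statistic for a one-proportion test using the normal approximation."""
    p_hat = rejects / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


ALPHA = 0.10
z = one_proportion_z(rejects=137, n=1000, p0=0.15)
critical = NormalDist().inv_cdf(ALPHA)  # left-tail critical value, about -1.28
p_value = NormalDist().cdf(z)           # left-tail p-value

print(f"z = {z:.2f}, critical value = {critical:.2f}, p-value = {p_value:.3f}")
print("reject H0" if z < critical else "do not reject H0")
```

The left-tail p-value is larger than 0.10, which agrees with the decision in step 6 not to reject the null hypothesis.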
The hypothesis test implies that the team can’t conclude
that the new fixture design reduced the number of scratches at
the 0.10 significance level. Therefore, the team decides not to
commit the resources required to produce a new fixture. However,
looking at the units that have been rejected for scratches
over the past year, they develop five categories of scratches.
The categories and the approximate percentage of rejected units
with those scratches are as follows:
Tapered scratches 24%
Vertical scratches 27%
Diagonal scratches 23%
Parallel pair scratches 17%
J-shaped scratches 9%
(No units had more than one type of scratch.)
The team then reinspects the 1000 units produced with
the prototype holding fixture and discovers that those units
had no J-shaped scratches. In other words, it appears that the
new fixture design completely eliminated the J scratches. The
hypothesis test would, of course, have revealed this fact if
the J scratches had been studied independently. By aggregating
all scratches as one defect type, the analysis indicated that the
new fixture design did not significantly reduce defects when
in fact it did.
✧
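The claim that a separate test on J-shaped scratches would have detected the change can be sketched numerically. The calculation below assumes that the historical J-scratch reject rate is 9% of the 15% overall reject rate (0.0135 of all units) and that the normal approximation is adequate because n·p0 = 13.5 meets the common rule of thumb of at least 5; both are assumptions of this sketch rather than statements from the example.

```python
from math import sqrt
from statistics import NormalDist

N = 1000
OVERALL_REJECT_RATE = 0.15
J_SHARE_OF_REJECTS = 0.09  # J-shaped scratches as a share of rejected units

# Assumed baseline proportion of all units rejected for J-shaped scratches.
p0 = OVERALL_REJECT_RATE * J_SHARE_OF_REJECTS
p_hat = 0 / N  # no J-shaped scratches were found in the prototype run

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / N)
critical = NormalDist().inv_cdf(0.10)  # left-tail critical value at alpha = 0.10

print(f"baseline J rate = {p0:.4f}, z = {z:.2f}, critical value = {critical:.2f}")
print("reject H0" if z < critical else "do not reject H0")
```

Under these assumptions the test statistic is about −3.70, well inside the reject region, whereas the aggregated test gave −1.15.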
CAVEAT 1.5
Dividing defects into categories leads to new ways to analyze data and
may show new paths toward root causes and process improvement.
FOR EXAMPLE:
Example 1.4
A punch operation produces a hole in a sheet metal part. The tolerance
limits on the diameter are 1.000 ± 0.005. The population of hole sizes
produced by the punch is normally distributed. The population of diameters
has μ0 = 1.0014 and σ0 = 0.0028.
decides on 0.05.
3. State the null hypothesis. In this case, it is H0: σ = 0.0028.
The alternative hypothesis is Ha: σ < 0.0028.
[Figure 1.4 The distribution of hole sizes showing standard deviation boundaries and specification limits.]
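The relationship between the hole-size distribution and the specification limits can be quantified directly from the values stated in Example 1.4. The sketch below assumes the population is exactly normal with mean 1.0014 and standard deviation 0.0028, and computes the fraction of holes expected to fall outside 1.000 ± 0.005.

```python
from statistics import NormalDist

# Population parameters and tolerance limits as stated in Example 1.4.
holes = NormalDist(mu=1.0014, sigma=0.0028)
LOWER_LIMIT = 1.000 - 0.005
UPPER_LIMIT = 1.000 + 0.005

below = holes.cdf(LOWER_LIMIT)      # fraction of holes that are undersized
above = 1 - holes.cdf(UPPER_LIMIT)  # fraction of holes that are oversized
total = below + above

print(f"below lower limit: {below:.4f}")
print(f"above upper limit: {above:.4f}")
print(f"total out of specification: {total:.4f}")
```

Under the normality assumption, roughly 11% of holes fall outside the limits, most of them on the high side because the mean sits above the nominal 1.000.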
CAVEAT 1.6
Although the hole diameters are normally distributed, the ratio used
in this test is not.
4. Calculate the degrees of freedom. In this case, it is n − 1 = 29.
A partial row of a χ² table is illustrated in Table 1.1.
Note that for this test we use 1 − α for the subscript.
[Figure 1.5 χ² distribution for df = 29 with left tail 0.05 shaded. Annotations: α = 0.05, χ²_0.95 = 17.708.]
[Table 1.1 Partial χ² table. Column headings: df, χ²_0.995, χ²_0.99, χ²_0.975, χ²_0.95, χ²_0.90.]
χ²_0.95 = 17.708 for df = 29. The reject region is χ² < 17.708.
5. Calculate the test statistic:
\[
\chi^2 = (n - 1)\,\frac{s^2}{\sigma^2} = 29 \cdot \frac{0.0018^2}{0.0028^2} \approx 12
\]
6. Determine whether to reject or not reject the null hypothesis.
In this case, the test statistic is in the reject region, so
H0 is rejected.
7. State the conclusion: the data indicate that the standard
deviation has been reduced at the 0.05 significance level.
✧
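The test statistic and decision in steps 5 and 6 of Example 1.4 can be reproduced as follows. This sketch takes n = 30 (from n − 1 = 29), s = 0.0018, and σ0 = 0.0028 from the example, and uses the tabled critical value 17.708 rather than computing it; the function name is an illustrative choice.

```python
def chi_square_variance_statistic(n: int, s: float, sigma0: float) -> float:
    """Chi-square statistic for testing a standard deviation against sigma0."""
    return (n - 1) * s**2 / sigma0**2


# Left-tail critical value for df = 29 and alpha = 0.05, as given in the text.
CHI_SQUARE_CRITICAL = 17.708

statistic = chi_square_variance_statistic(n=30, s=0.0018, sigma0=0.0028)
reject = statistic < CHI_SQUARE_CRITICAL  # left tail test: small values reject H0

print(f"chi-square = {statistic:.3f}, critical value = {CHI_SQUARE_CRITICAL}")
print("reject H0" if reject else "do not reject H0")
```

The statistic is about 11.98, which rounds to the value of 12 shown in step 5 and falls below 17.708.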
CAVEAT 1.7
A brief note about using spreadsheet software such as Excel to
conduct hypothesis tests: Most hypothesis tests have conditions
that must be met in order for the test to be valid. See, for example,
step 1 of the hypothesis test used in Example 1.3: the conditions are
[...] hypothesis test used in Example 1.4, the condition is that the
variable is normally distributed. Software packages often don’t
emphasize these conditions enough.
CAVEAT 1.8
As a result of a hypothesis test, a null hypothesis may be rejected or
not rejected, but it cannot be accepted.
Example 1.5
A sales representative presents the results of a hypothesis test that indi-
cates that the mean thickness of a surface coating material from vendor
CAVEAT 1.9
Recognize that a statistically significant distinction may not result in
a practical difference. It may be a “distinction without a difference.”
Statistical significance should not, in itself, dictate decision-making.
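The gap between statistical and practical significance is easiest to see by holding a small difference fixed and letting the sample size grow. The sketch below uses a two-sample z statistic with a known common standard deviation; the difference of 0.01, the standard deviation of 1.0, and the sample sizes are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(diff: float, sigma: float, n: int) -> float:
    """z statistic for a difference in means with n per group and known sigma."""
    return diff / (sigma * sqrt(2 / n))


# Hypothetical values: a difference of 1% of one standard deviation.
DIFF = 0.01
SIGMA = 1.0
critical = NormalDist().inv_cdf(0.95)  # right-tail critical value at alpha = 0.05

results = {n: two_sample_z(DIFF, SIGMA, n) for n in (100, 10_000, 1_000_000)}
for n, z in results.items():
    verdict = "significant" if z > critical else "not significant"
    print(f"n per group = {n:>9,}  z = {z:6.3f}  {verdict}")
```

The observed difference never changes, yet it becomes statistically significant once the samples are large enough. Whether a difference of that size matters is an engineering or business judgment that the test cannot make.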
CAVEAT 1.10
Begin by deciding what you want to learn. Avoid untestable goals
such as:
• improve the product,
• improve the user experience,
• increase customer engagement.
CAVEAT 1.11
Decide in advance on a randomization scheme, the amount of data
to collect, and the significance level. Resist the temptation to stop
collecting data when enough data have been collected to reject the
null hypothesis at some level of significance.
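The danger of stopping as soon as the data look significant can be shown with a small simulation. In the sketch below the null hypothesis is true by construction, so every rejection is a Type I error. The design (a two-sided z-test with known σ = 1, a look after every 10 observations, a maximum of 200 observations, 2000 trials, and a fixed random seed) is an assumption made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_stopping(n_max=200, look_every=10, alpha=0.05, trials=2000, seed=1):
    """Estimate Type I error rates for a fixed-n test and for repeated looks."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    fixed_rejections = 0
    peeking_rejections = 0
    for _ in range(trials):
        total = 0.0
        rejected_at_some_look = False
        for i in range(1, n_max + 1):
            total += rng.gauss(0.0, 1.0)  # data drawn with true mean 0
            # With sigma = 1, the z statistic for the running mean is total / sqrt(i).
            if i % look_every == 0 and abs(total / sqrt(i)) > z_crit:
                rejected_at_some_look = True
        if abs(total / sqrt(n_max)) > z_crit:
            fixed_rejections += 1
        if rejected_at_some_look:
            peeking_rejections += 1
    return fixed_rejections / trials, peeking_rejections / trials


fixed_rate, peeking_rate = simulate_stopping()
print(f"Type I error rate, sample size fixed in advance: {fixed_rate:.3f}")
print(f"Type I error rate, stop at first significant look: {peeking_rate:.3f}")
```

In this setup the fixed-sample test rejects at close to the nominal 5% rate, while stopping at the first significant look rejects several times as often.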
CAVEAT 1.12
When using the results of published studies, keep in mind that studies
that failed to reject a null hypothesis are less likely to be published.
This fact, called the “file drawer syndrome,” tends to bias analysis.
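The bias described in Caveat 1.12 can be sketched with a simulation in which many studies estimate the same small effect but only the statistically significant ones are "published." All values below (a true effect of 0.1, σ = 1, 25 observations per study, 2000 studies, a right-tail test at α = 0.05, and the random seed) are hypothetical, and the publication rule is a deliberate simplification.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def simulate_file_drawer(true_effect=0.1, sigma=1.0, n=25, studies=2000,
                         alpha=0.05, seed=1):
    """Compare the mean estimate across all studies with the published subset."""
    rng = random.Random(seed)
    se = sigma / sqrt(n)  # standard error of each study's estimate
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Each study's estimate is drawn from the sampling distribution of the mean.
    estimates = [rng.gauss(true_effect, se) for _ in range(studies)]
    # Simplified rule: only studies that reject H0 (effect = 0) are published.
    published = [e for e in estimates if e / se > z_crit]
    return mean(estimates), mean(published), len(published)


all_mean, published_mean, n_published = simulate_file_drawer()
print(f"mean estimate, all studies: {all_mean:.3f}")
print(f"mean estimate, published studies only ({n_published}): {published_mean:.3f}")
```

Because only estimates above the significance cutoff survive, the published average overstates the true effect, even though each individual study was conducted correctly.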