0% found this document useful (0 votes)
40 views16 pages

Robust Estimation of The Process Standard Deviation For Control Charts (Tatum 1997)

Uploaded by

nicholasfa0120
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
40 views16 pages

Robust Estimation of The Process Standard Deviation For Control Charts (Tatum 1997)

Uploaded by

nicholasfa0120
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 16

Robust Estimation of the Process Standard Deviation for Control Charts

Author(s): Lawrence G. Tatum


Source: Technometrics , May, 1997, Vol. 39, No. 2 (May, 1997), pp. 127-141
Published by: Taylor & Francis, Ltd. on behalf of American Statistical Association and
American Society for Quality

Stable URL: https://www.jstor.org/stable/1270901

REFERENCES
Linked references are available on JSTOR for this article:
https://www.jstor.org/stable/1270901?seq=1&cid=pdf-
reference#references_tab_contents
You may need to log in to JSTOR to access the linked references.

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide
range of content in a trusted digital archive. We use information technology and tools to increase productivity and
facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
https://about.jstor.org/terms

Taylor & Francis, Ltd. , American Statistical Association and are collaborating with JSTOR to
digitize, preserve and extend access to Technometrics

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC
All use subject to https://about.jstor.org/terms
Robust Estimation of the Process Standard
Deviation for Control Charts
Lawrence G. TATUM

Department of Statistics and Computer Information Systems


Baruch College
The City University of New York
New York, NY 10010
([email protected])

Control charts are tools used to detect aberrant behavior in manufacturing processes. The X chart
plots subsample averages as a function of time, and the R chart plots subsample ranges. Both
of these charts rely on an estimate of the standard deviation of the process when it is operating
correctly. The estimate has traditionally been based on the average range of 20-40 subgroups, but
this will produce an estimate that is biased high when outliers are present. One standard solution is
to construct a range chart for the original subgroups and estimate the standard deviation only from
those subgroups within the control limits, repeating the procedure as necessary. Proposals have
also recently been made to use a trimmed mean of the subsample ranges with a fixed percentage
of trimming, as well as the trimmed mean of the subsample interquartile ranges. This article
presents a new approach to robust estimation of the process standard deviation. The procedure first
centers each subsample on its own median and then applies a modified biweight A estimator to
the pooled residuals. This method combines the strengths of the previous methods-the relatively
high efficiency of the range-based methods when no disturbance is present, together with the strong
resistance to disturbances of the trimmed interquartile range method.

KEY WORDS: Biweight; M estimator; Range chart.

1. INTRODUCTION pp. 434-439) recommended the following, more cautious


approach:
A common technique for controlling manufacturing First, estimate a- from the average of the sub-
pro-
cesses is to take measurements on subsamples sample ranges,atR; second, construct the R chart from this
of items
estimate;
regular intervals and to analyze the results using X andthird,Rplot the original subsample ranges on the
charts. The X chart plots the subsample averages chart; fourth, if all ranges are within control, proceed to
as a func-
tion of time, and the R chart plots the subsample the X chart.
ranges.If not, bring the process under control before
movingwhich
The crucial features of each chart are control limits, on.
are horizontal lines used to form a simple decision The subsample range is an excellent choice as a statistic
criterion:
withthen
If the subsample statistic falls outside these limits, which the
to detect disruptions to the core process because
process is deemed to be out of control, but if theit values
is highlyfall
sensitive to outliers and changes in variability.
within the control limits, it is deemed to be in control.
For just those reasons, though, the average subsample range
is not a reliable foundation for estimating the standard de-
When the process is operating under optimal conditions,
viation ofmod-
the measurements taken on the process are typically the core in the teeth of such disturbances. An
eled as independent draws from a Normal distribution
excellentwith
discussion of this point was given by Langenberg
mean p, and standard deviation a. The X and R charts
and Iglewicz are
(1986). Of course, as shown by Duncan, expe-
designed to detect mean shifts, changes in variability, and have always been well aware of the need
rienced operators
to exercise caution
outliers. I shall refer to these phenomena as disturbances of when using the ranges to estimate a.
A standard
the iid N(C, a) core process. To set up X and R charts, alternative is to employ an adaptive trimmer
one
that
must either have a priori knowledge of , and a or iteratively deletes subsample ranges that fall outside the
estimate
these parameters by observing the process over upper acontrol
periodlimit (UCL) for a range chart. For example, if
of time. Such estimates are usually made on the the basis
UCL is set
ofatathe usual level of 33R, then any subsample
range that exceeds
group of subsamples collected during an initialization pe- this UCL is deleted and R is recomputed.
riod. The UCL is again computed, and subsample ranges that
In this article, I focus on offering improvements to the now exceed the limit are deleted. This continues until all
estimation of a, the process standard deviation. I shall not subsample ranges are within the UCL, yielding a trimmed
address the problem of estimating the process mean. average subsample range that will be denoted RATR. A one-
The estimation of a in the control-chart context is ob- step version of this method was studied by Rocke (1989)
viously not a simple matter, given the possible occurrence and found to perform quite well.
of disturbances during the initialization period. The naive
approach is to ignore the possibility of such disturbances
? 1997 American Statistical Association
during the initialization period and to use the average sub-
and the American Society for Quality Control
sample range as the basis for estimating Cr. Duncan (1974, TECHNOMETRICS, MAY 1997, VOL. 39, NO. 2

127

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC
All use subject to https://about.jstor.org/terms
128 LAWRENCE G. TATUM

the model of the operation of a control chart. I view the


Several recent articles have suggested robust or resistant
methods of estimating the process standard deviation. Lan-
range chart and its operating conventions as techniques by
which an estimate of a is translated into a guide for iden-
genberg and Iglewicz (1986) suggested the use of a trimmed
mean of the subsample ranges, and Rocke (1989) suggested
tifying external sources of disturbance in the production
the use of the mean or trimmed mean of the subsampleprocess.
in- There is no inherent reason why control-chart op-
terquartile ranges (IQR's). Both of these earlier methodserating conventions should also provide the best approach
followed traditional control-chart practice in that they to the estimation of a. A control chart is a decision tool,
esti-
mated overall variability by operating on subsample statis- an estimation method. The control chart aims to clas-
not
sify the process in to one of two states, in control or out of
tics. It will be shown in simulation studies that, although
both of these approaches have their strong points, theycontrol.
leave These categories are essential to guiding the search
room for improvement. for assignable causes, but they are not the best model for
To understand the advantages and disadvantages ofdetermining
us- how to weight the data when estimating param-
eters
ing a trimmed mean of the subsample ranges, it is useful to of the core process. High-quality robust estimation is
distinguish between two classes of disturbances, diffusemade
and possible by introducing a smooth downweighting of
discrepant observations, not by giving full weight to some
localized. Diffuse disturbances are those that are equally
likely to perturb any observation, whereas a localizedofdis-
the data and zero weight to the remainder.
A result of the smooth downweighting of extreme ob-
turbance will have an impact on all members of a partic-
servations
ular subsample or subsamples. The ability of estimators to is that the estimator is by necessity relatively
handle a disturbance can depend heavily on whether complicated
the and may present a "black box" appearance to
control-chart
disturbances are scattered throughout the observations or operators. I attempt to compensate for this
concentrated in a few subsamples. drawback by introducing graphical diagnostics that will
guide the operator in better understanding the estimator and
The trimmed mean of the subsample ranges is in a sense
specialized to handle localized disturbances. Consider,the
forprocess under study. I do not offer the new method as
example, the behavior of a 10% trimmed mean range awith replacement for the careful scrutiny of the data. The new
30 subsamples and 5 observations per subsample. Suppose methods will yield charts that more reliably locate prob-
lems
that the process develops a greatly increased variability dur- by providing UCL's that more closely resemble those
ing the time when two of the subsamples are taken, such that would be used if a was known. By partially shifting
the burden of estimation off of the operators, these methods
that the 10 observations in those two subsamples can be rep-
resented as draws from an N(0,8) distribution, whilefreethethem to focus their detective skills and judgment on the
remaining 28 subsamples are drawn from an N(0, 1)suitable
dis- issues of the search for assignable causes. There is
no substitute
tribution. Those two subsamples will have greatly inflated for human intelligence in operating a control
chart or improving a production process; I submit that for
ranges but will be trimmed, so that the perturbed observa-
tions will have little effect on the estimate of a. a well-defined estimation problem there is no substitute for
a well-defined statistical tool.
Suppose, however, that the 10 perturbed observations re-
There are several robust estimators that have been devel-
flect a disturbance to the production process that is not lo-
oped in recent years that could be applied in the present
calized so that the 10 values are randomly sown throughout
the 30 subsamples. It can easily occur that more than context,
3 including the M estimator proposed by Huber
(1964), the A estimators described by Hoaglin, Mosteller,
of the 30 subsample ranges will be wildly inflated by these
and Tukey (1983), the S estimator of Rousseeuw and Yohai
perturbed observations, yielding in turn an inflated estimate
of a. (1984), and the T estimators of Yohai and Zamar (1988).
In addition, estimators based on backward-stepping outlier
In Section 4, I shall examine more closely the differing
detection and removal could be employed as developed by
impacts of localized and diffuse disturbances in a simulation
Simonoff (1987). I have chosen to concentrate my efforts
study. It will be shown that the trimmed range methods
around the biweight A estimator because it is easily imple-
perform well for localized disturbances but not for diffuse
mented and performed well in a variety of circumstances
disturbances, but the trimmed mean of the IQR's gives in a astudy conducted by Lax (1985). I do not intend to im-
consistent if moderate performance. It will also be seen that
ply that equally good results could not be attained by the
the range-based methods are much more efficient when no
application of other methods.
disturbance is present.
My approach is closely related to methods of robust anal-
I shall propose a new approach as follows: Each subsam-
ysis of variance developed by, among others, Schrader and
ple is first centered on its own median, and these subsample
McKean (1977), Schrader and Hettmansperger (1980), and
residuals are then pooled. The standard deviation is then
Rocke (1983, 1991). In each of these articles, a robust es-
estimated by applying a modified form of the biweight A
timator of scale was applied to the pooled residuals as one
estimator to the pooled residuals. This method combines
step of the formation of a ratio of variances. My approach is
the strong points of the earlier methods: It has strong re-
differentiated by explicitly protecting the estimate of scale
sistance to both localized and diffuse disturbances and high
from a local concentration of outliers. In addition to that of
efficiency when no disturbances are present. Lax, seminal work on the biweight A estimator was inde-
The new approach represents a departure from existing pendently done by Shoemaker and Hettmansperger (1982)
methods in that it does not base the estimation of and a onRocke (1983).

TECHNOMETRICS, MAY 1997, VOL. 39, NO. 2

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC
All use subject to https://about.jstor.org/terms
ROBUST ESTIMATION OF THE PROCESS STANDARD DEVIATION 129

This article is organized as follows: The new estimatorofisthe subsample means for n = 5 and g = 30 for the iid
introduced in Section 2. A comparison of the robustnessNormal
of case was roughly a 5% increase in the estimation
the new method to that of existing methods is performed variance.
in
Section 3, based on the concept of breakdown bound. The This is not a trivial loss in efficiency, but it is small com
estimation efficiency of the methods under a wide varietypared to the cost of centering the subsamples, which sa
of situations is studied in Section 4. In Section 5, therifices
per- g - 1 df to protect against the possibility that t
subsamples have differing population means. For examp
formance of range charts based on the estimators is studied.
for n = 5 and g = 30, the standard deviation of the me
Three real datasets are analyzed in Section 6, and Section
7 gives recommendations and conclusions. subsample variance is more than 20% higher than that
the ordinary sample variance for the iid Normal case. Thus
2. THE PROPOSED METHOD my version of the biweight A estimator sacrifices efficien
The new method is constructed around a variant of the under the iid Normal case both by individually centeri
biweight A estimator. As given by Lax (1985), the biweight each subsample and by centering with the subsample m
A estimator of the standard deviation for a sample of size dian rather than the subsample mean.
n is Applying the biweight A to the pooled residuals gives

Sc r ( <n (xi- <l T)2(1 - U2)4) 2 m' (Efk EIuk,il<l Yki( - k,i)4) (5)
(1) c-(m -1)I/2 S i 2
w (n-1)1/2 a Eli< (1 -i)(1 -52) ' ' Z^k \ w here)(I k, )
where
where c is a tuning constant, T is the sample median,
Uk,j = Yk,j/(cM) (6)
ui = (x - T)/(c MAD), (2)
and

and MAD is the median absolute deviation from the med lian.
A = mediank,i |lk,i (7)
The biweight A estimator was well presented by Iglev vicz
(1983) and was shown in a major study by Lax (1985 ) to The process standard deviation, o, is then estimated as
perform well compared to other robust univariate scale es-
(T= S /kn.g-c
c n _q,c (8)
timators. Due to the (x- T)2 term in the numerator, the
biweight A could be termed a squared deviation-based Ies- where kn,g,c is a factor chosen that E(u-) = a. Let B7 denote
timator, as distinguished from the range- and IQR-bc ised S7, the biweight A estimator with tuning constant c = 7,
estimators described in Section 1. One advantage of w( ork- applied to the pooled residuals.
ing with a robust squared deviation-based estimator ra ther It will be shown in simulation studies that B7 performs
than a robust range-based estimator is that the former st arts very well compared to the earlier methods when dealing
with higher efficiency for the iid Normal case. with diffuse disturbances. When a localized disturbance is
The new method begins by removing the median vz alue present, however, the relative performance of B7 suffers be-
of each subsample from that subsample. When n is an e yven cause it does not make use of information on the location
number, this will yield a median-centered kth subsamp] le of the disturbed subsample(s). To find these subsamples,
I could use an estimator that examines and compares the
Yk, -- Xk,z -Xk, i= 1,...,n, (3) subsample ranges or variances, but this would confuse the
handling of localized disturbances with that of diffuse dis-
where xk is the subsample median.
turbances because the sample range and sample variance are
When n is an odd number, subtracting outsensitive
lian highly the to mecoutliers. Instead, I shall make use of the
will produce one zero value, which is dropped, givin
g a subsample IQR's, thus employing a measure of variability
centered subsample of size - 1, that is insensitive to occasional outliers. I shall use Rocke's
definition of the IQR, given in Equation (4).
Yk. X, -Xk, i< (n + 1)/2, ) Information concerning the location and severity of local-
x l Xk,+l-xk, (n + 1)/2 < i < n.
ized disturbances is communicated to the new estimator by
Will including avalues
The total number of median-centered subsample term hk in^the definition of Uk,j that will down-
be rn' = ng when n is even and m' n =
, weight
(n -subsamples
)g whei for which the IQR is unusually large:
is odd.
Uk,j = hkyk,3/(cAI), (9)
Removing the subsample medians as ea-a prelude to m
red where
suring variability ensures that the variability is measu
By subsamples.
within the subsamples and not between the 1, Ek < 4.5,
removing the median rather than the mean,
ing the center
hk= - (Ek 4.5), 4.5 < Ek < 7.5,
step is somewhat protected against outliers.
this Of course, t c, Ek > 7.5, (10)
protection must be paid for by a reduction
ion in the estimati
and,
efficiency for the iid Normal case. In ay, simulation
I stud'
found that the cost of using the subsample
ace medians in pl
Ek = IQRk/M. (11)

TECHNOMETRICS, MAY 1997, VOL. 39, NO. 2

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC
All use subject to https://about.jstor.org/terms
130 LAWRENCE G. TATUM

The second
Table 1. Normalizing Constants for the Modified Biweight A Estimator value, c = 7, gives an estimator that sacri-
fices some efficiency when no disturbances are present to
Tuning constants
gain efficiency when disturbances are present. In my Monte
7 10
Carlo studies, this choice of c gave the best estimator when
judged on overall performance.
Number of sub

Size 20 30 40 20 30 40

4 .835 .835 .836 .849 .849 .849


3. BREAKDOWN BOUNDS
6 .901 .901 .902 .906 .906 .906 One approach to understanding the behavior of an
8 .944 .943 .943 .941 .940 .940
mator is to examine its breakdown bound. The breakdo
10 .963 .963 .962 .957 .956 .956
12 .976 .975 .975 .967 .967 .967 bound is given by b* = p/m, where p is the largest num
14 .984 .984 .983 .974 .974 .973 of individual observations that can be manipulated w
5 1.070 1.069 1.068 1.054 1.054 1.053 still leaving the estimator bounded (Hampel 1971). A
7 1.307 1.295 1.289 1.283 1.274 1.269 developing precise definitions of the estimators, the br
9 1.489 1.487 1.487 1.462 1.462 1.462 down bounds of the estimators will be derived later. The
11 1.653 1.650 1.649 1.624 1.622 1.621
results are summarized in Table 2 for the case of 28 sub-
13 1.782 1.809 1.823 1.753 1.776 1.787
15 1.893 1.927 1.946 1.867 1.896 1.911 samples and varying subsample sizes.
Based on g subsamples of size n, I wish to estimate the
NOTE: The table entries are values of the constant dn,g,c used in Equation (12) to adjust the
modified biweight A method to produce an standardestimator
unbiased deviation ofof
thercore
forprocess,
the lid a. The total number
Normal case. Values
of d are given for tuning constants c = 7, of observations
10, g = 20, 30,is m 40
= gn. The traditionaland
subsamples, estimator of a
subsamples
sizes of n = 4, ...., 15 with even and odd values of n grouped separately. Each entry was
has been based on the average of the subsample ranges.
estimated from 100,000 independent random trials.

The kth subsample consists of Xk,1,Xk,2, .. Xk,n-1, Xk,n,


and Xk,(i) denotes the ith ordered value of subsample k.
Ek measures the size of the IQR
The kth inrange
subsample the is kth subsample
relative to M, the MAD of the subsample-centered obser-
vations. If Ek is unusually large, then hk downweights (13) all
Rk = Xk,(n) - Xk,(l),
the observations in the kth subsample by increasing the val-
ues of uk,i. The downweighting begins
and the average when
of the subsample ranges is Ek > 4.5, and
increases until Ek > 7.5, at which point all observations in
the subsample are given weight 0. A simulation 9 study of
the distribution Ek in the iid NormalR case = Y Rk/g was used to set
(14)
these cutoff values. k=l

The modified biweight A estimator, denoted S*, is then


defined as in (5) but with the altered form of Uk,j given by The process standard deviation is then estimated as R =-
(9). The process standard deviation, or, is estimated as R/D4. The factor D4 is standard control-chart notation for
a factor that gives an unbiased estimator of a in the iid Nor-
- = S/dn,g,c. (12) mal case. I shall denote CR as R. In practice, as described
earlier by Duncan (1974), this estimator must be used with

Values of dn,g,c are given ingreat care because of its obvious sensitivity to disturbances.
Table 1, for n =
The adaptive trimmed range, RATM, is often normalized
4,5,...,15,g = 20,30,40, and c = 7 and 10. The table
with D4, the appropriate factor only for R. This will re-
values are estimated from 100,000 simulations for each en-
sult in a small downward bias in the iid Normal case. For
try, running programs compiled in Turbo C? + + on an
my studies I shall employ a factor that produces an un-
NEC` 100-MHz Pentium4, using the random-number gen-
biased estimator of ca in the iid Normal case. Letting ATR
erator ran2 from Press, Teukolsky, Vetterling, and Flannery
denote the resulting estimator, then ATR = RATR/dATR.
(1992). The standard deviation of the estimates is less than
.0004. Because dATR is somewhat smaller than D4, ATR will

Each value of the tuning constant, c, yields a differentbe somewhat larger than R when no ranges have been
deleted.
estimator. I have studied the behavior of the estimator for
c= 7 and c = 10, denoted as D7 and D10. The motivation
for selecting c = 10 is that it leads to an estimator of a Table 2. Breakdown Bound of Scale Estimators (/%), g = 28

that slightly improves on the efficiency of the traditional -~Subsample ,Scale estimator
estimator, R, under the best of conditions for R-that is,Subsample
size (n) R ATR R25 RM IR IR25 M
when there are no disturbances in the data. The fact that
I can replace a traditional, nonrobust estimator with a ro- 4 .0 11.6 6.3 11.6 .9 13.4 24.1
5 .0 9.3 5.0 9.3 .7 10.7 29.3
bust estimator while improving efficiency in the iid Normal
6 .0 7.7 4.2 7.7 .6 8.9 24.4
case is rather surprising. The explanation is simply that our 7 .0 6.6 3.6 6.6 .5 7.7 28.1
method exploits the inefficiency of range-based estimators 8 .0 5.8 3.1 5.8 .9 10.3 24.6
for the iid Normal case as compared to squared deviation- 9 .0 5.2 2.8 5.2 .8 9.1 27.4
based estimators. 10 .0 4.6 2.5 4.6 .7 8.2 24.6

TECHNOMETRICS, MAY 1997, VOL. 39, NO. 2

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC
All use subject to https://about.jstor.org/terms
ROBUST ESTIMATION OF THE PROCESS STANDARD DEVIATION 131

Langenberg and Iglewicz (1986) suggested estimating ar


where a approaches 50%. More precisely, a = (g
using an a trimmed mean of the subsample ranges, for odd values of g and a = (g - 2)/(2g) for eve
of g. When g is odd, the manipulation of one ob
1
in each of (g + 1)/2 subsamples can force the e
9(1 - 2ce) beyond any bound, so that b* = {[(g + 1)/2] - 1
g-r-1
(g - 1)/(2n). For even values of g, the bound w
duced to b* = [(g/2) - 1]/(ng) = (g - 2)/(2ng
(1 -t)(R(r,+) + R(g))+ E R(i) , (15)
i=r+2
manipulation of one observation in each of g/2 sub
suffices to drive the estimator beyond any bound.
where R(i) is the ith ordered subsample range,Rocke t(1989) proposed estimators based on the
ag - r,
trimmed mean
and r is the integer portion of ag. For simplicity, I shall of the IQR's of the subsamples,
only make use of values of a that result in trimming of thesample is defined as
IQR of the kth
full observations, which requires that ag yield an integer.
For example, if g = 30, R(.1) trims off the three smallest IQRk = xk,(a) - Xk,(b), (17)
and the three largest subsample ranges. I shall denote the
where a = [n/4] + 1 and b = n - a + 1. This means that
resulting estimator of a as R10, where RIO = R(.1)/dR1o.
for 4 < n < 7, the IQR is defined as the difference between
Likewise, I shall let R25 = R(.25)/dR25.
the second smallest and the second largest observations;
The breakdown bound of the average subsample range is
for 8 < n < 11, the IQR is the difference between the third
0 because the manipulation of just one observation can be
smallest and the third largest.
used to drive the estimator beyond any given bound. The
The mean IQR is
maximum amount of trimming that is ordinarily contem-
plated is 25%. For g = 28 and n = 5, R(.25) trims the 9

largest seven and the smallest seven subsample ranges. IIf


= IQRk/g. (18)
one observation in each of eight subsamples is manipulated, k=l

then R(.25) can be forced beyond any given bound so that


I shall denote the corresponding normalized estimator as
the breakdown bound is b* = 7/140 = 5%. When ag equals
IR, where IR = I/dIQR.
an integer, the breakdown bound of R(a) will be
Rocke also proposed to use the 25% trimmed average of
the IQR's for which the normalized version will be denoted
b* = ag/(ng) = a/n. (16)
as IR25. If .259 is an integer, IR25 can be written as
Note that the breakdown bound goes down as the number n--
of observations per subsample goes up.
IR25 = (19)
, IQR(k)/[(g/2)dIR25],
Generally speaking, a breakdown bound of 5% would
k= j+1
not be considered satisfactory for a robust estimator, par-
ticularly because alternative estimators withwhere
substantially
j = .25g and IQR(k) is the kth ordered sub-
higher breakdown bounds exist. In addition, sample
it seems
IQR. rea-
sonable to anticipate disturbances of 5% or more of the
For n < 7, IR can withstand the manipulation of a single
observations when working with an erratic production pro-
observation, but the manipulation of a second observation
cess.
within the same subsample can pull IR beyond any bound.
The breakdown bound of RATM is substantially
Hence, IR has better
a breakdown bound of 1/m for n < 7. For
than that of R. To derive the bound, I begin
8 < nby
< 11, supposing
the estimator will not break down until a third
that k out of the m subsample ranges are equal to 1 within
observation and the
the same subsample is manipulated so
remaining g - k are equal to 0, which gives
thatRATR = k/g.
the breakdown bound becomes 2/m.
This corresponds to the limiting case in whichFor a ggroup
= 28, n = of k trims off the seven largest and the
5, IR25
subsample ranges has been driven infinitely distant from the
seven smallest IQR's. To break down IR25 would then re-
remaining m -g subsample ranges. I shall presume that theof two observations in each of eight
quire the manipulation
UCL is set at D2RATR, where D2 is takensubsamples.
from standard
Therefore, a total of 16 - 1 = 15 observations
tables. To break down the estimator, it is necessary that while
can be manipulated the keeping the estimator bounded,
UCL exceed 1, so D2RATR > 1 or D2k/g > 1. Solving
giving for
a breakdown k of 15/140, or 10.7%, which is
bound
yields k > g/D2. Consequently, the estimator can withstand
more than twice the breakdown bound for R(.25) under the
manipulation of [g/D2] of the subsample ranges, where
same conditions. [x]
Hence, when j = .25g is an integer and
denotes the largest integer less than or equal to x. This can
n < 7, the breakdown bound of IR25 is b* = (2j - l)/(ng).
be accomplished by manipulation of [g/D2] observations,
When 8 < n < 11, three observations are needed in j sub-
one per subsample to be disturbed, so thatsamples
the to breakdown
break down the estimator, so the bound becomes
bound will be b* = [g/D21/(gn). For g = 28b*and = (3j -nl)/(ng).
= 5, the
breakdown bound is [28/2.114]/140 = 13/140 = 9.28%.
The methods based on the biweight A developed in Sec-
The median of the subsample ranges as an tionestimator
2 will inherit theofbreakdown bound of the MAD of
the process standard deviation was earlier proposed by Fer-
the subsample-centered observations, M. The breakdown
rel (1953). This can be viewed as an extreme form
bound of M is of R(a) of both n and g, but it does not
a function

TECHNOMETRICS, MAY 1997, VOL. 39, NO. 2

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC
All use subject to https://about.jstor.org/terms
132 LAWRENCE G. TATUM

IR25-the
fall below (1/4) - 1/m, which is slightly less than 25% of 23.3% trimmed mean of the subsample IQR's
the observations. A proof is given in the Appendix.
Table 2 compares the breakdown bound for g = 28
Note that a = .233 for R25 and IR25 due to the integer
for R, ATR, R25, MR, IR, IR25, and M. The best break-
trimming method used.
down bound is achieved by M, with IR25 a distant second.
To provide an unbiased estimate of a for the iid Normal
Among the range methods, the highest breakdown bound
case, the values provided by each of the nine methods must
was shared (for this value of g) by RM and ATR, followed
be divided by a normalizing constant. The constants for D 10
by R25, R10, and R. It is somewhat surprising that R25
does not achieve the breakdown bound of ATR. Note that and D7 were provided by Table 1 and for R by standard ta-
bles, and the remainder were estimated from 100,000 ran-
the breakdown bounds for the range methods decrease as
dom simulations. The following values were used for the
n increases. The bounds on the IQR methods suffer a sim-
n = 5 case: B7, 1.069; D10, 1.054; D7, 1.069; R, 2.326;
ilar decrease, although there is a small improvement when
ATR, 2.314; R10, 2.297; R25, 2.272; RM, 2.263; IR, .9907;
moving from n = 7 to n = 8.
IR25, .9239. For n = 10, the constants were B7, .963; D10,
The drawback to working with IR25 is that it is a rel-
.956; D7, .963; R, 3.077; ATR, 3.035; R10, 3.057; R25,
atively inefficient estimator of the standard deviation of a
3.037; RM, 3.026; IR, 1.312; IR25, 1.280.
Normal distribution. A simulation study in Section 4 finds
Comparisons were made under four types of distur-
that the mean squared estimation error for IR25 is sev-
bances-
eral times greater than that of the trimmed range methods.
Hence, when no disturbance is present and the core pro- 1. a 95-5 mixture model of symmetric variance dis
cess can be modeled as Normally distributed, the use of bances in which each observation Xk,i has a 95% prob
IR25 will be costly in terms of estimation error compared ity of being drawn from the N(0, 1) distribution and
to other methods. Used alone, M would suffer from theprobability of being drawn from the N(0, a) distribu
same problem, and hence I employ it only as a preliminarywhere a = 1.0,1.5,...,5.5,6.0;
estimator. RM and ATR will be found to highly efficient2. a model for asymmetric variance disturbances in
compared to IR25 for the iid Normal case and quite com- which each observation is drawn from the N(0, 1) distri-
petitive with IR25 in many of the situations studied. Indeed,
bution and then has a 5% probability of having a multiple
in my studies ATR gave one of the strongest performances of a X2(1) random variable added to it (the multiplier takes
of the existing methods. values of 0, .5, 1.0, 1.5, ..., 5.0);
3. a model for localized, symmetric variance distur-
bances in which the observations in one subsample are
4. SIMULATION STUDIES OF
drawn from the N(0, a) distribution, a = 1.0,..., 6.0, and
ESTIMATION EFFICIENCY
the remaining subsamples contain observations drawn from
In the following simulation studies, I shall compare the
the N(0, 1) distribution;
mean squared error (MSE) of the estimators for g4. =a 95-530 mixture model of diffuse mean disturbances
subsamples and for n = 5 and n = 10. The MSE in will
whichbe each observation has a 95% probability of be-
estimated as ing drawn from the N(O, 1) distribution and a 5% prob-
ability of being drawn from an N(b, 1) distribution, with
b = 0,.5,...,9.0.9.5.
MSE1 o)2 The following results are based on 10,000 simulations for
MSE - Z- (i=1i=1
each level of disturbance for each type of disturbance.

Type I: Symmetric Variance Disturbance


where ai denotes the value of the estimator on the ith sim-
ulation trial and N is the number of trials. Figure l(a) shows the MSE plotted against the standard
deviation of the disturbance for each of the 10 methods
I compare the performance of the following estimators:
for subsamples of size 5. When there is no disturbance,
B7-the biweight A estimator applied to the pooled IR and IR25 are substantially less efficient than the other
residuals with c= 7
methods. In compensation, IR25 does not suffer a sharp in-
D10-the modified biweight A estimator with c = 10
crease in MSE. As the standard deviation of the disturbance
D7-the modified biweight A estimator with c = 7 increases, the range-based methods all deteriorate quickly,
R-the average of the subsample ranges although ATR levels off quickly. D7 and B7 give very strong
ATR-an adaptive trimmed average of the subsample performances, overall. Less obvious is that, with the excep-
ranges tion of ATR, D10 dominated the range and IQR methods
R1O--the 10% trimmed average range of the subsamplefor all levels of disturbance, whereas D7 is slightly less
ranges, R(.1) efficient than R and RIO for low levels of disturbance.
R25-the 23.3% trimmed average range of the subsample In Figure l(b) it can be seen that D7 and B7 improve
ranges, R(.233) their relative performance for n = 10. A major change is
RM-the median of the subsample ranges that ATR is now outperformed by D10 for all levels of
IR-the mean of the subsample IQR's disturbance and is outperformed in the mid-levels by IR
TECHNOMETRICS, MAY 1997, VOL. 39, NO. 2

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC
All use subject to https://about.jstor.org/terms
133
ROBUST ESTIMATION OF THE PROCESS STANDARD DEVIATION

.020
.024 -

.020
.016

.016

2 .012

.012

.008

.008

.004 .004

2 3 4 5 6 1 2 3 4

(a) Standard (a) Deviation


Multiplier
of D

R10 R25
.012 - i RM

.010 -

.008 -

w
vt

.006 -

.004 -

.002

1 2 3 4 5 6 0 1 2 3 4 5

(b) Standard Deviation of Disturbance (b) Multiplier

Figure 1. Mean Squared Error of Estimators (MSE) Versus the Stan- Figure 2. Mean
dard Deviation of a Diffuse, Symmetric Variance Disturbance, Modeledple of a Diffus
Mixture
by 95-5 Mixture of Normal(0, 1) and Normal(0, a) Distributions. Values of the
of a are plotted on the horizontal axis. The estimators are D10, a mod-
X2 (1) Distribut
The
ified biweight A with tuning constant c = 10; D7, c = 7; B7, a biweight estimators
A estimator with c = 7; R, the average subsample range; ATR, adaptive
trimmed average of subsample ranges; R10, 10% trimmed average Type of II: Asy
subsample ranges; R25, 25% trimmed ranges; RM, median subsample
range; IR, average subsample IQR; IR25, 25% trimmed average sub- Figure 2, (a
sample IQR. n = 10 for a
quite similar
the strong p
and IR25. The MSE of the range-based methods increases peated. Wit
much more quickly than for n = 5, which appears to reflect
inates all pr
the fact that the breakdown bound of these methods actu- although D7
ally decreases as the number of observations per subsample This catego
goes up. For n = 10, the 5% of observations subject to dis-
monoff (1987
turbance is now equal to the breakdown bound of RM, the well for asy
most robust of the range-based methods. Note also that B7 tion studies.
and D7 gave nearly identical performances. the competin

TECHNOMETRICS, MAY 1997, VOL. 39, NO. 2

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC
All use subject to https://about.jstor.org/terms
134 LAWRENCE G. TATUM

.007
signed to handle. Although B7 and D7 were indistinguish-
able in previous types of disturbances, for localized distur-
bances D7 has a decided advantage over B7.
In Figure 3(b), with n = 10, D10 is able to outperform
R10 and all other methods for almost the entire range of the
standard deviation of the disturbance. Again, B7 performs
.006
relatively poorly compared to D7 for n = 10, and ATR does
not perform as well as with n = 5.

Type IV: Mean Shift Disturbance


ATR
.005 Figure 4, (a) and (b), plots the MSE against a shift in
mean for n = 5 and n = 10. Each observation has a proba-
bility of 5% of being subjected to the mean shift. The best

.080

.070
1 2 3 4 5 6

(a) Standard Deviation


.060 of D

.050
.004

.040

.030

B7
.020
_.- _ -
_ - _- ", _

v: .003 .010

.000

0 1 2 3 4 5 6 7 8 9

(a) Mean of Disturbance

.022

.002

1 2 3 4 5 6
.018
(b) Standard Deviation of Disturbance

Figure 3. Mean Squared Error of Estimators (MSE) Versus a Lo-


calized Symmetric Variance Disturbance, Modeled by Drawing 29 Sub- .014
samples From the Normal(O, 1) Distribution and 1 Subsample From the
Normal(O, a) Distribution. The estimators are as given in Figure 1. (/
C

Type II1: Localized Variance Disturbance .010

Figure 3(a) shows the MSE plotted against the standard


deviation of a disturbance that has been confined to one
.006
subsample. (To be able to distinguish between the methods,
it was necessary to magnify the vertical scale to a degree
that places IR and IR25 off the plot.) It would be antici-
.002
pated that for this type of disturbance the trimmed range
0 1 2 3 4 5 6 7 8 9
methods would excel. A careful examination of the figure
shows that indeed R10 was highly competitive with D10 (b) Mean of Disturbance
and outperformed D7 and B7, and ATR made an outstand-
ing performance. Figure 4. Mean Squared Error of Estimators (MSE) Versus a Mean
Shift Disturbance, Modeled as a 95-5 Mixture of Normal(O, 1) and
This is also the type of disturbance that our modified
Normal(b, 1) Distributions. The magnitude of the mean shift, b, is given
form of the biweight A estimator (D7 and D10) was de-on the horizontal axis. The estimators are as given in Figure 1.

TECHNOMETRICS, MAY 1997, VOL. 39, NO. 2

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC2:34:56 UTC
All use subject to https://about.jstor.org/terms
ROBUST ESTIMATION OF THE PROCESS STANDARD DEVIATION 135

Table 3. Minimum Relative Mean Squared Error (RMSE)

Scale estimator

Type Size D10 D7 B7 R ATR R10 R25 RM IR IR25


I 5 64 91 92 6 75 19 40 47 43 31
10 65 86 86 2 47 3 7 12 48 38
1 5 78 91 90 3 87 21 48 51 34 30
10 82 87 87 1 61 4 12 18 44 38
II I 5 94 90 85 15 90 90 79 60 11 30
10 99 86 75 8 73 88 79 58 7 38
IV 5 37 91 83 1 19 2 4 15 6 30
10 36 86 87 0 5 0 0 0 8 26
Minimum 36 86 75 0 5 0 0 0 6 26
NOTE: Each table entry represents the lowest relative mean squar
row is marked in bold. As an overall measure of performance, the l
localized variance, (IV) mean shift. The estimators are D10, a modifie
ATR, adaptive trimmed average of subsample ranges; R10, 10% tr
IR25, 25% trimmed average subsample IQR.

performances werestantial
given by
improvements D7 methods
over existing andin B7. F
terms of th
n, IR25 outperformed D10
estimation by a
of the underlying scalelarge mar
parameter, but it remai
ate shift in mean, but D10
to be shown performed
that this translates into improved much
control-cha
and large shifts. IR25 also
performance. outperforms
To do so, I examined the performance o
range charts4(a),
brief stretch in Figure based on the various robustwas
which estimatorsthof
happened in my studies.
I have limited the analysis to a subsample size of n =
The only change in the definition of the disturbances
Relative that the Type
Squared Mean III now increases the variance of 3 out of
Error
subsamples rather
To compare the overall performance than just 1 out of 30. This wasof
done
an effort to put the
tors, I found the relative mean squared etrimmed range methods on a stronge
footing.
each level of disturbance for each type of d
RMSE of
estimator The two most
an was common methods
found of comparing the
by per- divid
formance of control
that estimator into the MSE of the methodcharts is to compare the average rate
at which positive signals areThe
best under those conditions. encountered results
and the average are
Table 3 by reportingrun length
the (ARL) until
lowesta positive signal is RMSE
given. A positive achie
signal is deemed to
timator across the levels of disturbance,have occurred if the range of a subsam- by
of n. For example, ple exceeded theentry
the UCL. The weakness ofin
these comparisons
the first
is that they attempt to distinguish the behavior of random
column indicates that for the Type I distur
the RMSE quantities by
did of nota comparisonfall
D10 of the average outcome, thus
below 64%
failing to take into account the
deviation of the disturbance variability of the random from
increased
The best performancequantities involved.
in In terms
eachof the analysis
row done in Sec-of Ta
in boldface and wastion 4, supplied
this would be equivalent to limiting the comparison
either by D
to only the mean error (or bias) of the estimators.
As an overall measure of performance for a
last row 3 My of basis of comparison willthe
reports Table be the meanminimum
squared de-
viation (MSD)
down the columns of the four typesbetween the rate of positive signals for an
of dist
estimator
two subsample sizes. The and the rate
D7 of positive signals that would have
estimator had
occurred if ao was known. That is, I treat the latter rate as
performance withan 86% minimum RMSE,
with a 75% a target rate and measure
minimum RMSE the quality of theand
estimators by D10 a
method had the bestthe performance
squared deviation from that target. Unlike of measuringthe pr
the squared deviation from a fixed target, such as was done
achieving a minimum RMSE of 26%. ATR
performance except using
for r in the previous
the study,Type
here I shall be using
IV a mov-disturb
ing target because
The overall efficiency of the rate
86%of positive signals
for will depend
D7 is r
on the sample realization even when a is known.
sistent with the results of comparative s
weight A estimator The first
as step given
in comparing the estimators
both was to adjust by La
their control limits
Iglewicz (1983). In both ofso that they had approximately
these equal av-
studies,
weight A estimator erage false positive signal rates under the
achieved a iidminimum
Normal case.
across three When a is known, setting the UCL at UCL(a) = 4.918o
distributions.
yields a false positive rate of .602% (see Duncan 1974, p.
436). Consequently, I modified the estimators so that each
5. COMPARISONS OF CONTROL-CHART
PERFORMANCE
had a false positive rate of approximately .602%. This was
accomplished by setting the UCL equal to the following
Section 4 demonstrates that the new methods offer sub-
multiples of the estimators: D10, 4.925; D7, 4.940; B7,
TECHNOMETRICS, MAY 1997, VOL. 39, NO. 2

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC
All use subject to https://about.jstor.org/terms
136
LAWRENCE G. TATUM

1000 -

800 -

600

400

200

1 2 3 4 5 6 0 1 2 3 4 5

(a) (b) Multiplier o


Standard Deviation of Dist

280

230

180

130

80

30

1 2 3 4 5 6 0 1 2 3 4 5 6 7 8 9

(c) Standard (d)Deviation


Mean of Disturbanc
of D

Figure 5. Mean Squared Deviation of Estimator (MSD


Localized Variance, (d) Mean Shift. The MSD measure
and the rate when the control limit is set with know
with tuning constant c = 10; D7, c = 7; B7, a biweigh
of subsample ranges; R10, 10% trimmed average o
subsample IQR; IR25, 25% trimmed average subsample IQR: dashed line, B7; dotted, D10; dot-dash; RM, dot-dot-dash, ATR.

4.940; R, 4.92; ATR, 4.94; R10, 4.98; R25, 5.037; RM, uing the ith trial, 1,000 subsamples of size 5 were generated
5.057; IR, 5.114; IR25, 5.215. These values were arrived from the same type of disturbance and the same level of dis-
at by numerous simulation studies. turbance. Let S(D7, i) denote the number of times that the
The MSD of each of the four types of disturbances subsample range exceeded UCL(D7, i) during the ith trial.
and each level of disturbance was found by the followingIn addition, let S(a, i) denote the number of times that the
method: For each of 1,000 simulation trials, 30 subsamplessubsample range exceeded the "correct" UCL of UCL(a)
of size 5 were generated from a particular type of distur-during the ith trial. The MSD for D7 is then given by
bance and level of disturbance. Estimates of a were then 1,000

formed by each of the 10 estimators, and these estimators MSD(D7) = (1/1, 000) E [S(D7, i)-S(n, i)]2. (21)
i=l
were used to form upper control-chart limits. For example,
let D7i denote the value of D7 on the ith simulation trial, Overall, because 1,000 simulations of 1,00
with corresponding UCL of UCL(D7, i) = 4.94D7i. Contin-were involved, the value of each MSD was based on re-
TECHNOMETRICS, MAY 1997, VOL. 39, NO. 2

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC
All use subject to https://about.jstor.org/terms
ROBUST ESTIMATION OF THE PROCESS STANDARD DEVIATION 137

Table 4. Minimum Relative Mean Squared Deviation (RMSD) showing the residuals from removing the subsample medi-
ans. Singular residual observations are denoted with a cir
Estimator
cle, and multiple residuals are denoted with a square. Two
Type D10 D7 B7 R ATR R10 R25 RM IR IR25 kinds of graphical diagnostics have been added to the resid-
I 71 92 76 13 76 31 47 41 45 27 ual plots: Individual residuals that are more than three stan-
II 87 87 88 12 72 45 61 41 40 27 dard deviations from 0 as estimated by D7 are marked by
III 69 82 66 14 60 82 79 48 10 23
filled circles, and a double-pointed arrow indicates the pres-
IV 26 84 73 0 2 0 1 7 2 14
ence of an excessive IQR within a subsample. The latter are
Min 26 82 66 0 2 0 1 7 2 14
instances in which Ek > 4.5, where Ek is the standardized
NOTE: Each entry represents the IQRlowest
for the kth subsample,mean
relative as given squared
in Equation (16).
devia
the range chart based on an estimator, for a type of disturbance an
One of the difficulties with these datasets is the high level
measures the variation between the rate of positive signals for a UCL b
for a UCL based on knowledge of of discretizing
a. The best of theperformance
data, which I presume inreflects
each mea
ro
an overall measure of performance, the last row gives the minimum o
surement problems rather
of disturbance and the estimators are as given in Table 3.
than data simplification. The dis-
cretizing is of a large magnitude relative to the variability
of the observations, leading to numerous duplications. In
alizations of one million subsamples. In th
each subsample was addition,
given for one dataset the discretizing was heavilyof
a probability bi-
being disturbed. ased toward trailing digits of 0 or 5. Thus, there are sev-
eral problems present in the real data that did not exist
The results for the Type I, II, III, and IV dist
in
shown in Figure 5, (a), my simulation
(b), studies-measurement
(c), and error, (d), rounding
res or
discretizing error, frequent duplication of observations, and
parisons were limited to the case of n = 5
biased rounding.
figures are not identical to those of the M
I first consider a set of measurements of the melt in-
of Section 4, the same general results are fou
dex of a polyethylene compound analyzed by Wadsworth,
D7 and B7 give the best overall performances
Stephens, and Godfrey (1986, pp. 207-209). The authors
I, Type II, and Type IV disturbances, as see
will hereafter be referred to as WSG. The data consist of
(a), (b), and (d). Again, the trimmed range me
well for the 20 subsamples
localized of size 4. The residuals (from
disturbance case removing
in the
subsample medians) are shown in Figure 6(a), and the sub-
though it is difficult to distinguish the meth
5(c), D7 again sample rangeswell
performs are plotted in Figure 6(b) with four
against B7 candidate
fo
UCL's from estimators D7, R, ATR, and IR25.
disturbance. Finally, ATR performs remarkab
The range of the third subsample is an outlier even when
for the Type IV case. In Figure 5(d), the ve
the the control
log of the MSD to limit is given by R.the
improve Using auxiliary
presentrules for de-
tecting out-of-control behavior, WSG identified the fourth

Relative Mean subsample Deviation


Squared range as indicating out-of-control behavior. They
therefore recomputed the average range after deleting the
As an overall measure of performance, Ta
third and fourth ranges. As pointed out by Rocke (1989)
the minimum relative mean squared deviat
in his analysis of this dataset, the range of the fourth sub-
calculation was done as described for the R
sample would be identified as suspicious by any reasonably
worst-case performance
robust method.for each type of d
given either by D7 or B7, although RIO al
The rule used by WSG to justify the deletion of subsam-
D7 for the Type III case. The last row rep
ple range 4 was that out-of-control behavior is suspected
RMSD for each estimator. As found in Sect
if "two out of three successive points are outside two sig-
the best performance, followed by B7, D10
mas" (p. 159). Two more rules are also given in the same
paragraph: A problem is deemed to exist if "four out of five
6. ANALYSIS OF REAL DATASETS
successive points lie outside of one sigma, or eight succes-
In this section I shall analyze three datasets drawn from
sive points on one side of the centerline."
actual production processes, two of which were studied
These may well be excellent tools for identifying prob-
by Rocke (1989). Each of these datasets contains a differ-
lems when running a control chart, but I do not see how they
ent estimation challenge or illustrates a different aspect
canofbe turned into objective rules governing the deletion of
the problem. For each dataset we will examine the residu-
subsamples for estimating the process standard deviation.
als (from removing the subsample medians) and a plot
This of
rule states that a potential problem has been found,
subsample ranges with UCL given by four estimators R, a problem has been found that can be uniquely as-
not that
ATR, D7, and IR25. The first of these, R, is the standard es-
sociated with some subsamples and not others. Two out of
timator, and the other three were among the best performers
three points lying beyond two sigma suggest a problem in
among the three classes of robust methods considered-the
production but do not state that the problem is confined to
trimmed range methods, the IQR methods, and biweight the A
two largest of the three. As an extreme case, how would
methods.
such rules for deletion be employed if eight successive sub-
I offer a graphical display based on D7 intended to aid in
sample ranges fell below the average range? Should all eight
distinguishing between diffuse and localized disturbances.
be deleted? This is not a theoretical concern: It happens that
In addition to the standard range chart, I present for a chart
this very dataset the ranges for nine successive subsam-

TECHNOMETRICS, MAY 1997, VOL. 39, NO. 2

This content downloaded from


115.135.24.84 on Fri, 18 Mar 2022 12:18:53 UTC
All use subject to https://about.jstor.org/terms
138 LAWRENCE G. TATUM

60 - pie are equal in value (as shown by the use of the square
symbol) and the last value seems to be an average distance
4
away. Subsample 4 also seems to have an inflated range
40 -
caused by an outlier. I would also point out that D7 did not
downweight any subsample as having an excessive IQR-a
further suggestion that changes in process variance are not
the problem in this dataset.
20- 0o
o Hence, I have, in this real dataset, evidence that diffuse
0 0 0 0 disturbances exist in actual industrial settings. Given the
o0O0
oo o o
o
0 o o o oo presence of diffuse disturbances, eliminating all observ
0
8 'C 0 0 008T } o ?B tions in a subsample simply because it contains a sing
0 <u 0 outlier will reduce the efficiency of estimators. In addition
0
0 in terms of operating a control chart it would be mislea
-20- ing to treat all inflated subsample ranges as indicating an
increased process variance if in fact some of these reflected
I
the existence of isolated outliers. The sources of the distur-

-40
bances will be different and so will be the remedies.

1 3 5 7 9 11 13 15 1 7 19 In Figure 6(b), I compare the UCL for estimators R, ATR,


D7, and IR25. The values of cr supplied by these estimators
(a) Sample
were 9.11, 7.58, 6.73, and 6.44, respectively. By compar-
ison, WSG gave an estimate of 7.5, quite close to that of
60
o
ATR because both have trimmed the two largest ranges. The
UCL's were placed at D4c, with D4 = 4.698 from table H
50
of WSG.
The range for the third subsample would be identified as
R an outlier even when the UCL is set using R. ATR, D7,
40
o and IR25 identify subsample ranges 3 and 4 as problem-
ATR _ atic, D7 also places subsample range 6 under suspicion,
o
.. ..............- . ....... . . .. ......... ...D7... and IR25 adds subsample range 8. The question of which
30
IR25- of these four estimators has done the best job of estimating
CX

the standard deviation of the core process can obviously


20 - 0 not be resolved from such a limited sample, nor can we
O O 0
o O
~o resolve the question of which method would have provided
0 0
the most useful UCL.
10
o0 0 The second dataset I shall consider was supplied by Grant
o v and Leavenworth (1988, p. 9). The dataset contains 20 sub-
samples of size 5, consisting of actual measurements of
0
pitch diameter of threads for aircraft fittings. The residu-
1 3 5 7 9 11 13 15
17 19 als (from removing the subsample medians) are shown in
(b) Sample Figure 7(a). Again, residuals that are more than three stan-
dard deviations from 0 as estimated by D7 are marked. In
Figure 6. (a) Melt Index of Polyethylene; Residuals F
the Subsample Medians. Squares denote multiple value r sold cRvcis addition, the double-ended arrows mark cases in which the
denote residuals more than 3B57 from 0. (b) Range char t for melt index IQR of the subsample was judged excessive by D7 and the
with upper control limits from R, ATR, D7, and IR25. subsample was downweighted. That is, Ek [see Eq. (15)]

The second dataset I shall consider was supplied by Grant and Leavenworth (1988, p. 9). The dataset contains 20 subsamples of size 5, consisting of actual measurements of pitch diameter of threads for aircraft fittings. The residuals (from removing the subsample medians) are shown in Figure 7(a). Again, residuals that are more than three standard deviations from 0, as estimated by D7, are marked. In addition, the double-ended arrows mark cases in which the IQR of the subsample was judged excessive by D7 and the subsample was downweighted. That is, Ek [see Eq. (15)] exceeded 4.5 for subsamples 9, 10, and 19.

The purpose of checking the IQR in D7 was to help distinguish between diffuse and localized disturbances. The presence of excessively large IQR's for 3 out of 20 subsamples suggests that the variability of the process is in fact undergoing changes. In addition, the excess range found in subsample 13 seems to reflect an increase in variance as well. Thus, in this dataset I seem to be dealing with a distinctly different cause of inflated subsample ranges than the previous dataset, in which isolated outliers were a primary suspect.

Figure 7(b) plots a range chart with UCL's from estimators R, ATR, D7, and IR25. The three robust methods have clustered at a considerable distance from R and have left 3 of the subsample ranges as clear outliers. The estimates of the process standard deviation were R, 2.66; ATR, 1.99; D7, 2.16; and IR25, 2.02. The UCL values were set as 4.918σ̂ using the factor D4 of table H of WSG.
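
The screening used for Figure 7(a), in which residuals are marked when they lie more than three estimated standard deviations from 0 and subsamples are downweighted when their IQR is judged excessive, can be sketched in a few lines. The sketch below is a simplified stand-in rather than the article's D7 procedure: the pooled scale is an ordinary MAD instead of the modified biweight A estimate, the wide-spread screen is a plain IQR-to-σ̂ ratio rather than the Ek statistic of Eq. (15), and the data are invented. Only the cutoffs 3 and 4.5 echo values quoted in the text.

    # Simplified sketch, not the article's D7 estimator: center each subsample on its
    # median, pool the residuals, flag residuals far from 0, and flag subsamples whose
    # spread looks excessive.  The MAD scale and the IQR/sigma screen stand in for the
    # article's biweight-based estimate and its E_k statistic; the data are invented.
    import statistics

    def mad_scale(values):
        # Median absolute deviation, rescaled to estimate sigma under normality.
        med = statistics.median(values)
        return 1.4826 * statistics.median(abs(v - med) for v in values)

    def screen(subsamples, resid_cut=3.0, spread_cut=4.5):
        residuals = []
        for sample in subsamples:
            med = statistics.median(sample)
            residuals.extend(x - med for x in sample)
        sigma_hat = mad_scale(residuals)
        for k, sample in enumerate(subsamples, start=1):
            med = statistics.median(sample)
            outlying = [round(x - med, 2) for x in sample if abs(x - med) > resid_cut * sigma_hat]
            q = sorted(sample)
            iqr = q[-2] - q[1]                   # crude quartile spread for a small subsample
            wide = iqr / sigma_hat > spread_cut  # stand-in for the E_k > 4.5 rule
            print(f"subsample {k:2d}: outlying residuals {outlying}, wide-spread flag: {wide}")
        return sigma_hat

    data = [
        [10.1, 10.3, 9.8, 10.0, 10.2],
        [9.9, 10.4, 10.1, 9.7, 10.0],
        [10.2, 10.0, 9.9, 18.5, 10.1],   # contains a single isolated outlier
        [9.8, 10.1, 10.3, 10.0, 9.9],
        [7.0, 13.2, 9.0, 12.5, 8.1],     # spread inflated throughout the subsample
        [10.0, 10.2, 9.9, 10.1, 10.0],
    ]
    print("pooled sigma-hat:", round(screen(data), 3))

The point of the two flags is the one made above: an isolated outlier produces a large residual without a large IQR, whereas a genuine increase in variance inflates the whole subsample's spread.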


[Figure 7. (a) Pitch Diameter of Threads; Residuals From Removing the Subsample Medians. Squares denote multiple values; solid circles denote residuals more than 3σ̂_B7 from 0. Double-headed arrows locate IQR's judged excessive by D7. (b) Range chart for pitch diameters with UCL's from R, ATR, D7, and IR25.]

My third dataset is a set of measurements of cotton yarn count from pages 229-230 of WSG. This dataset was also studied by Rocke (1989). The data consist of 24 subsamples of size 4. As can be seen in the residuals plot, Figure 8(a), the data do not contain isolated outliers of the type found in Figure 6(a) for the melt index data. And, in contrast to the thread data in Figure 7(a), if any inflation of subsample ranges has occurred it is quite subtle.

The estimates of σ were given as R, .99; ATR, 1.00; D7, .858; and IR25, .534. The estimate provided by IR25 is surprisingly distant from the others and results in an isolated UCL in Figure 8(b). I believe the discrepancy between the values given by the robust estimators is caused by the sensitivity of the IQR to excessive and/or biased grouping of center values. Although it is not obvious in the figures, there is heavy bias in the data toward values ending in digits of 0 and 5; these end digits occur in a remarkable 80 out of the 96 observations. Furthermore, the rounding is at a coarse level relative to the overall variation, so that a total of 20 duplications and triplications take place out of the 96 observations. Naturally, these duplications tend to take place toward the center of the data and resulted in a total of seven subsample IQR's equal to 0. The result seems to be a downward bias in IR25.

[Figure 8. (a) Cotton Yarn Count; Residuals From Removing the Subsample Medians. Squares denote multiple values; solid circles denote residuals more than 3σ̂_B7 from 0. (b) Range chart for cotton yarn count with UCL's from R, ATR, D7, and IR25.]
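
The rounding effect just described is easy to reproduce. The sketch below is illustrative only; the grid, subsample layout, and seed are arbitrary choices rather than the cotton-yarn data, and the pooled-residual standard deviation is merely a smooth stand-in for a biweight-type estimator. Rounding normal data to a grid that is coarse relative to the process variation drives many subsample IQR's to zero and pulls the IQR-based figure down sharply, while the pooled-residual figure changes far less.

    # Illustrative sketch, not from the article: coarse rounding collapses subsample
    # IQRs to zero and biases an IQR-based scale statistic downward, while a smooth
    # statistic computed from the pooled residuals (here simply their standard
    # deviation) changes far less.
    import random
    import statistics

    random.seed(1)

    def mean_subsample_iqr(subsamples):
        spreads = []
        for s in subsamples:
            q = sorted(s)
            spreads.append(q[2] - q[1])          # middle spread of a size-4 subsample
        return statistics.mean(spreads), sum(v == 0 for v in spreads)

    def pooled_residual_sd(subsamples):
        resid = []
        for s in subsamples:
            med = statistics.median(s)
            resid.extend(x - med for x in s)
        return statistics.pstdev(resid)

    raw = [[random.gauss(10.0, 0.25) for _ in range(4)] for _ in range(24)]
    rounded = [[round(x / 0.5) * 0.5 for x in s] for s in raw]   # grid of 0.5 versus sigma = 0.25

    for label, data in (("raw data", raw), ("rounded data", rounded)):
        iqr_mean, zero_count = mean_subsample_iqr(data)
        print(f"{label:12s}: mean subsample IQR = {iqr_mean:.3f} "
              f"({zero_count} zero IQRs), pooled-residual SD = {pooled_residual_sd(data):.3f}")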


It can be shown that the median is much more sensitive to rounding errors than are M estimators of location (see chap. 11D of Hoaglin et al. 1983). Although I shall not offer a formal analysis, I suggest that a parallel situation exists in this context: Like the median, estimators such as IR, IR25, and M will be more sensitive to rounding errors than are estimators based on M estimators or on the asymptotic variance of M estimators.

7. CONCLUSIONS AND RECOMMENDATIONS

The modified version of the biweight A estimator provides estimators that combine the best features of the previous methods: the relatively high efficiency of the range-based methods when no disturbance is present, together with the strong resistance to disturbances of the trimmed IQR method. By focusing attention on the pooled residuals, I am able to achieve a much higher breakdown bound than in previous robust methods. I am also able to gather together sufficient numbers of points to exploit the inherent advantages of a squared deviation-based estimator found when working in a basically Normally distributed setting.

I believe that the types of disturbances studied were reasonable facsimiles of several major types of problems that arise when initializing control charts. Given the strong performance of the new methods under these conditions and the fact that little or no efficiency is lost when there are no disturbances, I would recommend the use of D10 or D7 over previous methods.

In general, I would recommend D7 over D10 because of its superior overall performance. In a situation in which it is known that the production process is seldom out of control, however, D10 would be preferred because it is a better estimator of σ when few or no disturbances are present.

One surprise for me was that the adaptive trimmer ATR, a variant of a venerable technique, generally outperformed the more recently proposed trimmed range methods such as R10 and R25. This approach deserves greater study, although it seems highly unlikely that any subsample range-based method could be competitive when faced with a mean-shift disturbance. One promising line of attack would be a version of the backward-stepping methods of Simonoff (1987), in which one would begin with a presumably reliable central core of subsample ranges and work out from there. This approach avoids the masking problem common to ordinary adaptive trimmers.

A robust estimator such as D7 has the ability to perform well under adverse circumstances, but it is not intended to lessen interest in the causes of those adverse circumstances. Improving a production process requires getting to the bottom of each problem as efficiently as possible; I have demonstrated a robust estimator that gets to the bottom of the question of the variability of the core process. This will not answer the question of what is causing the disturbances, nor will it make the disturbances go away; it will help to locate the problems faster and more reliably and aid in distinguishing diffuse from localized problems.

An underlying theme found in several textbooks is that the variance of the core process should be estimated from the average of the ranges remaining after deleting all out-of-control events for which an assignable cause can be found. A corollary would seem to be that no subsample range should be deleted if no assignable cause can be found. Although the human-directed investigation of out-of-control points is essential to improving a process, it is important to have a measure of the variability of the core process that is independent of the skill or reliability with which operators are able to determine the special cause for each suspected out-of-control event. There must be limits on the investigative skill of operators: Will operators ever fail to find a special cause for a false positive signal? And, if investigation of an unusual subsample range fails to find a special cause, must that range be included in the estimator?

There is no inherent reason why the estimation of σ is best done in imitation of the steps taken when running an established control chart. The classification of events as either in-control or out-of-control encourages the use of estimators based on all-or-nothing techniques of deletion. It is generally true that such abrupt treatment of the data produces estimators that lack the efficiency of methods that allow a smooth downweighting of extreme events; my simulation studies confirm that this holds true in the present context.

ACKNOWLEDGMENTS

I thank the referees, the associate editor, and the editor for their help in improving the original submission.

APPENDIX: BREAKDOWN BOUNDS

The lowest breakdown bound of M is 1/4 - 1/m, given n + g > 3.

Proof: The breakdown bound of M will be b* = p/m, where p is the largest number of observations that can be manipulated to take arbitrary values while M remains bounded. The MAD's will break down, roughly speaking, when half the values are manipulated. The fastest way to break down M is to swamp half the subsamples by manipulating the minimum number of points needed to control the subsample median in each. This, in turn, takes approximately half the points in each such subsample, so that the breakdown bound will be about 25%. More precisely, the bound depends on whether n and g are even or odd, and these four cases are analyzed here. The median of an even number of values will be defined as the average of the middle two ordered observations.

Case I: n Even, g Even. The total number of centered observations, m', will be even, and therefore the median of the absolute values of the centered observations, M, will remain bounded as long as fewer than m'/2 centered observations are manipulated. To control X̃k, the median of the kth subsample, will require the control of n/2 points in that subsample. Thus, manipulation of n/2 points in the kth subsample will suffice to drive the absolute values of all n values of the kth centered subsample beyond any bound. Therefore, m'/2 centered observations can be controlled with (n/2)(g/2) original observations, so that the breakdown bound will be

b* = [(n/2)(g/2) - 1]/m = 1/4 - 1/m.    (A.1)

Case II: n Odd, g Even. Because g is even, control over all the centered observations in g/2 subsamples can be used to drive M beyond any bound. The absolute values of the n - 1 observations in the kth centered subsample, {|Yk,i|, i = 1, ..., n - 1}, can be controlled by the manipulation of the median of the original subsample, X̃k, and this can be accomplished by controlling (n + 1)/2 of the original observations, {Xk,i, i = 1, ..., n}. Thus, M can be driven beyond any bound via control over ((n + 1)/2)(g/2) original observations, so that the breakdown bound is

b* = [((n + 1)/2)(g/2) - 1]/m = 1/4 + 1/(4n) - 1/m.    (A.2)

Case III: n Even, g Odd. M will remain bounded until all the observations in (g - 1)/2 centered subsamples are manipulated, plus n/2 observations in one of the remaining subsamples. Because all n values of a centered subsample can be controlled through the manipulation of n/2 of the original subsample members, M will remain bounded until (n/2) + (n/2)(g - 1)/2 of the original observations are manipulated. Therefore, the breakdown bound will be

b* = [(n/2) + (n/2)(g - 1)/2 - 1]/m = (ng/4 + n/4 - 1)/m = 1/4 + 1/(4g) - 1/m.    (A.3)

Case IV: n Odd, g Odd. There are n - 1 observations in each centered subsample. M will remain bounded until all the observations in (g - 1)/2 centered subsamples are manipulated, plus (n - 1)/2 more observations in other subsamples. Control over all n - 1 observations in a centered subsample requires control over (n + 1)/2 of the original observations in that subsample because n is odd. Hence, M will remain bounded until [(n + 1)/2](g - 1)/2 + (n - 1)/2 observations are manipulated, so that the breakdown bound will be

b* = {[(n + 1)/2](g - 1)/2 + (n - 1)/2 - 1}/m = (ng - n + g - 1 + 2n - 6)/(4m) = 1/4 + 1/(4g) + 1/(4n) - 7/(4m).    (A.4)

The breakdown bound for Case I is lower than for Case II or Case III and will be lower than that of Case IV as long as n + g > 3 because this ensures that 1/(4g) + 1/(4n) - 7/(4m) > -1/m. Therefore, the lowest breakdown bound for M is 1/4 - 1/m, for n + g > 3.

[Received August 1995. Revised May 1996.]

REFERENCES

Duncan, A. J. (1974), Quality Control and Industrial Statistics, Homewood, IL: Richard D. Irwin.
Ferrell, E. B. (1953), "Control Charts Using Midranges and Medians," Industrial Quality Control, 15, 40-44.
Grant, E. L., and Leavenworth, R. S. (1988), Statistical Quality Control (6th ed.), New York: McGraw-Hill.
Hampel, F. R. (1971), "A General Qualitative Definition of Robustness," The Annals of Mathematical Statistics, 42, 1887-1896.
Hoaglin, D. C., Mosteller, F., and Tukey, J. W. (eds.) (1983), Understanding Robust and Exploratory Data Analysis, New York: Wiley.
Huber, P. J. (1964), "Robust Estimation of a Location Parameter," The Annals of Mathematical Statistics, 35, 73-101.
Iglewicz, B. (1983), "Robust Scale Estimators and Confidence Intervals for Location," in Understanding Robust and Exploratory Data Analysis, eds. D. C. Hoaglin, F. Mosteller, and J. W. Tukey, New York: Wiley, pp. 404-431.
Langenberg, P., and Iglewicz, B. (1986), "Trimmed Mean X̄ and R Charts," Journal of Quality Technology, 18, 152-161.
Lax, D. M. (1985), "Robust Estimators of Scale: Finite Sample Performance in Long-Tailed Symmetric Distributions," Journal of the American Statistical Association, 80, 736-741.
Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. (1992), Numerical Recipes in C, New York: Cambridge University Press.
Rocke, D. M. (1983), "Robust Statistical Analysis of Interlaboratory Studies," Biometrika, 70, 421-431.
Rocke, D. M. (1989), "Robust Control Charts," Technometrics, 31, 173-184.
Rocke, D. M. (1991), "Robustness and Balance in the Mixed Model," Biometrics, 47, 303-309.
Rousseeuw, P., and Yohai, V. (1984), "Robust Regression by Means of S-Estimators," in Robust and Nonlinear Time Series Analysis (Lecture Notes in Statistics, Vol. 26), eds. J. Franke, W. Härdle, and R. D. Martin, New York: Springer-Verlag, pp. 256-272.
Schrader, R. A., and Hettmansperger, T. P. (1980), "Robust Analysis of Variance Based Upon a Likelihood Ratio Criterion," Biometrika, 67, 93-101.
Schrader, R. A., and McKean, J. W. (1977), "Robust Analysis of Variance," Communications in Statistics, Part A: Theory and Methods, 6, 879-894.
Shoemaker, L. H., and Hettmansperger, T. P. (1982), "Robust Estimates and Tests for the One- and Two-Sample Scale Models," Biometrika, 69, 47-53.
Simonoff, J. S. (1987), "Outlier Detection and Robust Estimation of Scale," Journal of Statistical Computation and Simulation, 27, 79-92.
Wadsworth, H. M., Stephens, K. S., and Godfrey, A. B. (1986), Modern Methods for Quality Control and Improvement, New York: Wiley.
Yohai, V. J., and Zamar, R. (1988), "High Breakdown-Point Estimates of Regression by Means of the Minimization of an Efficient Scale," Journal of the American Statistical Association, 83, 406-413.