STATISTICAL PROCESS MONITORING AND OPTIMIZATION
STATISTICS: Textbooks and Monographs

Edited by

Sung H. Park
Seoul National University
Seoul, Korea

G. Geoffrey Vining
Virginia Polytechnic Institute and State University
Blacksburg, Virginia

MARCEL DEKKER, INC.
NEW YORK · BASEL
Library of Congress Cataloging-in-Publication Data

Headquarters
Marcel Dekker, Inc.
270 Madison Avenue, New York, NY 10016
tel: 212-696-9000; fax: 212-685-4540

The publisher offers discounts on this book when ordered in bulk quantities. For more information, write to Special Sales/Professional Marketing at the headquarters address above.
Neither this book nor any part may be reproduced or transmitted in any form or by
any means, electronic or mechanical, including photocopying, microfilming, and
recording, or by any information storage and retrieval system, without permission in
writing from the publisher.
Sung H. Park
G. Geoffrey Vining
Contents
Preface iii
Contributors ix
PART 3 MULTIVARIATE PROCESS MONITORING AND CAPABILITY INDICES

PART 5 EMPIRICAL MODEL BUILDING AND PROCESS OPTIMIZATION
Index 483
Contributors
Bo Bergman, Ph.D. Linkoping University, Linkoping, and Department of TQM, School of Technology Management, Chalmers University of Technology, Gothenburg, Sweden
R. Gnanadesikan, Ph.D. Department of Statistics, Rutgers University, New Brunswick, New Jersey

Kai Kristensen, Dr. Merc. Department of Information Science, The Aarhus School of Business, Aarhus, Denmark

Carl Modigh Arkwright Enterprises Ltd., Paris, France

Elart von Collani, Dr. rer. nat., Dr. rer. nat. habil. School of Economics, University of Wuerzburg, Wuerzburg, Germany
1. INTRODUCTION
2. JAPANESE PRODUCTS AND AMERICAN PRODUCTS
Many Japanese read an article on April 17, 1979, on the front page of Asahi Shinbun, one of the most widely circulated newspapers in Japan, regarding a comparison of the quality of color television sets produced by the Sony factory in Japan with that of TVs produced by the Sony factory in San Diego, California. The comparison was made on the basis of the color distribution, which is related to the color balance. Although both factories used the same design, the TVs from the San Diego factory had a bad reputation, and Americans preferred the products from Japan. Based on this fact, Mr. Yamada, the vice president of Sony United States at that time, described the difference in the article.
The difference in the quality characteristic distributions is shown in Figure 1. It is seen from the figure that the color quality of Japanese-made sets is concentrated around the target value m, whereas that of the U.S.-made sets is spread almost uniformly over the tolerance interval.
Figure 1 Distribution of color quality in television sets of Sony U.S. and Sony
Japan.
The process capability index is

C_p = tolerance/(6 × standard deviation)

For a quality characteristic distributed uniformly over the tolerance interval, the standard deviation is tolerance/√12, so that

C_p = tolerance/(6 × tolerance/√12) = 0.577

Let L(y) denote the loss caused when the quality characteristic y deviates from the target m. Expanding L(y) in a Taylor series around m gives

L(y) = L(m) + L'(m)(y - m) + (L''(m)/2!)(y - m)² + ···

where

L(m) = 0     (3)

L'(m) = 0     (4)
The constant and linear terms (differential terms) become zero from Eqs. (3) and (4). If the third-order and following terms can be omitted, the loss function is then

L(y) = k(y - m)²

where the constant k is determined from the loss A of a product that just falls outside the allowance Δ:

k = (loss of disposing of a failed product)/(allowance)² = A/Δ²

Assume that the cost of repairing a failed color TV set is 600 yen. With an allowance of 5, k is then calculated as

k = 600/5² = 24.0 (yen)

so that

L = 24.0(y - m)²     (9)

For a distribution with mean square deviation σ² around the target, the expected loss is

L = kσ²

For the Japanese-made sets, with σ = 10/6, this gives 24.0 × (10/6)² = 66.7 yen; for the U.S.-made sets, distributed uniformly over m ± 5, it gives 24.0 × 5²/3 = 200.0 yen. If failed products are adjusted so that the quality characteristic is uniformly distributed over m ± 10/3, the loss becomes

L = 24.0 × (10/3)²/3 = 88.9 (yen)
This shows that there is a 111.1 (= 200.0 - 88.9) yen improvement, but that the Sony U.S. quality level is still 22.2 (= 88.9 - 66.7) yen worse than that of Sony Japan.
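To make the arithmetic concrete, here is a minimal Python sketch (ours, not part of Taguchi's text; the distributional assumptions are those stated above) that reproduces the three expected losses from L = kσ²:

    # Expected quadratic loss L = k * variance for the Sony TV example.
    # k = A / allowance^2 = 600 / 5^2 = 24.0 yen.
    k = 600 / 5**2

    losses = {
        "Sony Japan (normal, sigma = 10/6)": k * (10 / 6) ** 2,
        "Sony U.S. before (uniform over m +/- 5)": k * 5**2 / 3,
        "Sony U.S. after (uniform over m +/- 10/3)": k * (10 / 3) ** 2 / 3,
    }
    for name, loss in losses.items():
        print(f"{name}: {loss:.1f} yen")   # 66.7, 200.0, 88.9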
If such an improvement were attained by repairing or adjusting failed products whose quality level exceeds m ± 10/3 but lies within m ± 5, comprising 33.3% of the total production as seen from Figure 1, at a cost of 600 yen per unit, then the cost of repair per unit would be

600 × 0.333 = 200 (yen)     (13)

A 111.1 yen quality improvement at a cost of 200 yen is not profitable. The correct solution to this problem is to apply both on-line and off-line quality control techniques.
I visited the Louisville factory of the General Electric Company in September 1989. On the production line, workers were instructed to use

C_p = tolerance/(6 × standard deviation)

and the loss function

L = kσ²
3. WHAT IS ON-LINE QUALITY CONTROL?
quality control system design is the way to keep production lines from falling out of control. It is the objective of this chapter to briefly describe on-line quality control methods and give their theoretical background.
At I Motor Company in the 1970s, there were 28 steps in the truck engine cylinder block production line. Quality control activity is necessary to ensure normal production at each step. One of the steps, called boring by reamers, is explained as an example; it is also described in detail in Ref. 1.

Approximately 10 holes are bored at a time in each cylinder block by reamers. A cylinder block is scrapped as defective if there is any hole bore that is misaligned by more than 10 μm, causing an 8000 yen loss, which is denoted by A. The diagnosis cost to know whether holes are being bored straight, designated by B, is 400 yen, and the diagnosing interval, denoted by n, is 30 units. In the past half-year, 18,000 units were produced, and there were seven quality control problems.
The average problem occurrence interval, denoted by ū, is then

ū = 18,000/7 ≈ 2570 (units)
The quality control cost per unit product is

L = B/n + ((n + 1)/2)(A/ū) + C/ū + lA/ū     (18)

where C is the adjustment cost and l is the time lag of diagnosis in units of product. Putting n = 30, A = 8000 yen, B = 400 yen, C = 20,000 yen, ū = 2570 units, and l = 1 unit in the above equation, the quality control cost per unit product of this example would be

L = 400/30 + ((30 + 1)/2)(8000/2570) + 20,000/2570 + (1 × 8000)/2570
  = 13.3 + 48.2 + 7.8 + 3.1
  = 72.4 (yen)     (19)
The optimum diagnosis interval is

n = [2(ū + l)B/(A - C/ū)]^(1/2)     (20)

Putting in the values,

n = [2(2570 + 1) × 400/(8000 - 20,000/2570)]^(1/2) ≈ 16 (units)     (21)
The quality control cost from Eq. (18) when the diagnosis interval is 16 is

L = 400/16 + ((16 + 1)/2)(8000/2570) + 20,000/2570 + (1 × 8000)/2570
  = 25.0 + 26.5 + 7.8 + 3.1 = 62.4 (yen)     (22)
There is a savings of 72.4 - 62.4 = 10.0 yen per unit product, or 360,000 yen per year. The value of L does not change significantly even when n varies by 20%. When n = 20, for example,
L = 400/20 + ((20 + 1)/2)(8000/2570) + 20,000/2570 + (1 × 8000)/2570
  = 20.0 + 32.7 + 7.8 + 3.1
  = 63.6 (yen)
The difference from Eq. (22) is only 1.2 yen. It is permissible to allow about 20% error in the values of the system parameters A, B, C, ū, and l, or it is permissible to adjust n within a range of 20% after the optimum diagnosis interval is determined.
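The cost and the optimum interval in Eqs. (18)-(22) are easily scripted. The following Python sketch is ours (the function and variable names are our own); it reproduces the numbers above:

    from math import sqrt

    def qc_cost(n, A, B, C, u_bar, lag=1.0):
        """Per-unit quality control cost, Eq. (18)."""
        return B / n + (n + 1) / 2 * A / u_bar + C / u_bar + lag * A / u_bar

    def optimum_interval(A, B, C, u_bar, lag=1.0):
        """Optimum diagnosis interval, Eq. (20)."""
        return sqrt(2 * (u_bar + lag) * B / (A - C / u_bar))

    A, B, C, u_bar = 8000, 400, 20000, 18000 / 7   # yen, yen, yen, units
    print(qc_cost(30, A, B, C, u_bar))             # about 72.4 yen
    print(optimum_interval(A, B, C, u_bar))        # about 16 units
    print(qc_cost(16, A, B, C, u_bar))             # about 62.4 yen
    print(qc_cost(20, A, B, C, u_bar))             # about 63.6 yen: flat near the optimum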
Next, the introduction of a preventive maintenance system is explained. In preventive maintenance activities, there are periodic checks and periodic replacement. In periodic replacement, a component part (which could be the cause of the trouble) is replaced with a new one at a certain interval. For example, a tool with an average life of 3000 units of product is replaced after producing 2000 units, without checking.

Periodic checking is done to inspect products at a certain interval and replace tools if product quality is within specification at the time inspected but there is a possibility that it might go out of specification before the next inspection. In this chapter, periodic replacement is described.
In the case of reamer boring, the majority of the problems are caused by tools. The average problem-causing interval is ū = 2570 units, and periodic replacement is made at an interval of ū′ = 1500 units, which is much shorter than the average life. Therefore, the probability of the process causing trouble becomes very small. Assume that the replacement cost, denoted by C′, is approximately the same as the adjustment cost C, or 18,000 yen. Assume that the probability of the process causing trouble is 0.02. This probability includes the instance of a reamer being bent by pinholes existing in the cylinder block, or some other cause. Then the true average problem-causing interval will be improved from the current 2570 units to

ū = 1500/0.02 = 75,000 (units)
n = [2 × (75,000 + 1) × 400/(8000 - 20,000/75,000)]^(1/2) ≈ 87, which is rounded to 100 (units)
The quality control cost per unit product, now including the periodic replacement cost C′/ū′, is

L = C′/ū′ + B/n + ((n + 1)/2)(A/ū) + C/ū + lA/ū
  = 18,000/1500 + 400/100 + ((100 + 1)/2)(8000/75,000) + 20,000/75,000 + (1 × 8000)/75,000
  = 12.0 + (4.0 + 5.4 + 0.3 + 0.1)
  = 12.0 + 9.8 = 21.8 (yen)
This is an improvement of 63.6 - 21.8 = 41.8 yen per unit compared to the case without preventive maintenance, which is equivalent to 1,500,000 yen per annum. If there were similar improvements in each of the 27 other cylinder block production steps, it would be an improvement of 42 million yen per annum.
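Continuing the sketch above (again our code, reusing qc_cost and optimum_interval from the earlier listing), the preventive maintenance case simply adds the replacement cost C′/ū′ and uses the improved problem interval:

    def qc_cost_pm(n, A, B, C, u_bar, C_repl, repl_interval, lag=1.0):
        """Per-unit cost with periodic tool replacement added."""
        return C_repl / repl_interval + qc_cost(n, A, B, C, u_bar, lag)

    u_improved = 1500 / 0.02                       # 75,000 units
    print(optimum_interval(A, B, C, u_improved))   # about 87, rounded to 100
    print(qc_cost_pm(100, A, B, C, u_improved,
                     C_repl=18000, repl_interval=1500))   # about 21.8 yen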
Such a quality control improvement is equivalent to the savings that might be obtained from extending the average interval between problems 6.3 times without increasing any cost. In other words, this preventive maintenance method has a merit parallel to that of an engineering technology so fantastic that it could extend the problem-causing interval by 6.3 times without increasing any cost. For details, see Chapters 4-6 of Ref. 1.
Equations (18) and (20) may be applied satisfactorily as approximations regardless of the distribution of the production quantity before the problem and despite variations in the fraction defective during the problem period. These statements are proved in Sections 5 and 6.
5. PROOF OF EQUATIONS FOR NONSPECIFIC DISTRIBUTION
Since the first, third, and fourth terms of Eq. (18) are self-explanatory, only the second term, ((n + 1)/2)(A/ū), the expected loss from defectives produced between the occurrence of a problem and its detection, needs to be derived.
Next, the equation for the optimum diagnosis interval is derived. The average problem-causing interval is ū. Since the diagnosis is made at n-unit intervals, it is more correct to consider the losses from actual recovery actions, or from the time lag, as occurring once every ū + n/2 units. Therefore, ū + n/2 is substituted for ū in Eq. (31). It is easily understood from the previous example that ū is much larger than n/2, so that 1/(ū + n/2) may be approximated by (1/ū)(1 - n/(2ū)). If this approximation is used, the loss becomes

L = B/n + ((n + 1)/2)(A/ū)(1 - n/(2ū)) + (C/ū)(1 - n/(2ū)) + (lA/ū)(1 - n/(2ū))     (33)
Differentiating Eq. (33) with respect to n, setting the derivative equal to zero, and neglecting terms of higher order in 1/ū gives

n = [2(ū + l)B/(A - C/ū)]^(1/2)

which is Eq. (20).
Suppose that the loss incurred when a defective unit is sent on to the following steps is D yen. After the process causes trouble, the probability of detecting the trouble at a diagnosis is p and the probability of failing to detect it is 1 - p. Accordingly, the average number of defectives at the time the trouble is detected is (n + 1)p/2. The probability of detecting a problem at the second diagnosis after missing the detection at the first diagnosis is (1 - p)p; then the average number of defectives the inspection fails to detect is (n + 1)/2, and the number detected is np units. Thus we obtain Table 2.
From Table 2, the average loss caused by defectives when a process is in trouble is

D is normally much larger than A. The amount of loss in Eq. (37) is minimum when p = 1 and becomes larger as p approaches zero. Putting p = 0 in Eq. (37) gives nD, showing that the equations for L and n should be changed from Eqs. (18) and (20) to

and

where (n + 1) ≈ n is approximated.
When the fraction defective during the trouble period is not 100%, it is normal to trace back and find the defectives when a trouble is found. In this case, there are no undetected defectives, so D = A. Equation (37) therefore gives

L = B/n + ((n + 1)/2)(2A/ū) + C/ū + lA/ū

and

n = [2(ū + l)B/(2A - C/ū)]^(1/2)     (43)
7. PREDICTION AND MODIFICATION

1. Determine the optimum measuring interval.
2. Forecast the average quality of products produced before the next measurement.
3. Determine the optimum modifying quantity against the deviation of the forecasted value from the target value.
L = kσ²     (44)

where σ² is the mean square deviation from the target. The optimum modification against a forecast deviation is

optimum modifying quantity = -β(μ̂ - m₀)     (47)

= 7.0     (49)
The daily loss L, including the correcting cost, is

L = (300/5²) × 7.0 × 20,000 + 4 × 2000 = 1,688,000 (yen)     (50)

Written as a function of the measuring interval n, the daily loss is

L = 0.0288n² + 192n + 40,000,000/n     (51)
The optimum n that minimizes Eq. (51) is about 430; substituting this value into Eq. (51) gives the loss due to prediction and correction.
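As a check on Eq. (51), a few lines of Python (ours; a brute-force search rather than calculus) locate the minimizing measuring interval:

    # Daily loss as a function of the measuring interval n, Eq. (51).
    def daily_loss(n):
        return 0.0288 * n**2 + 192 * n + 40_000_000 / n

    n_opt = min(range(1, 2001), key=daily_loss)
    print(n_opt, daily_loss(n_opt))   # n_opt is about 430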
REFERENCE
1. MEASUREMENT WITHIN TOTAL QUALITY MANAGEMENT
[Figure: percentage of management effort devoted to firefighting, control, and future-oriented activity, actual versus ideal.]
necessary adjustments to the processes and gives you time to make them before they turn into unwanted business results. This is what modern measurement of total quality is all about.
This idea is in very good accordance with the official thoughts in Europe. In a recent working document from the European Commission, DG III, the following is said about quality and quality management (European Commission, 1995):

The use of the new methodologies of total quality management is for the leaders of the European companies a leading means to help them in the current economic scenario, which involves not only dealing with changes, but especially anticipating them.

Thus, to the European Commission, quality is primarily a question of changes and early warning.

To create an interrelated system of quality measurement it has been decided to define the measurement system according to Table 1, where measurements are classified according to two criteria: the interested party (the stakeholder) and whether we are talking about processes or results. Other types of measurement systems are given in Kaplan and Norton (1996).
As Table 1 illustrates, we distinguish between measurements related to the process and measurements related to the results. The reason for this is obvious in the light of what has been said above and in the light of the definition of TQM. Furthermore, we distinguish between three "interested parties": the company itself, the customer, and society. The first two should obviously be part of a measurement system according to the definition of TQM, and the third has been included because there is no doubt that
Figure 2 The improvement circle.
Figure 3 Probability of loyalty as a function of customer satisfaction.
The coefficients of the equation are highly significant. Thus the standard deviation of the constant term is 0.33, and that of the slope is 0.09. Furthermore, we cannot reject a hypothesis that the slope is equal to 1.

It appears from this that a unit change in employee satisfaction gives more or less the same change in customer satisfaction. We cannot, from these figures alone, claim that this is a causal relationship, but we believe that, combined with other information, this is strong evidence for the existence of an improvement circle like the one described in Figure 2. To us, therefore, the creation of a measurement system along the lines given in Table 1 is necessary. Only in this way will management be able to lead
the company upstream and thus prevent the disasters that inevitably follow
the firefighting of short-term management.
An example of an actual TQM measurement system is given in Figure
5 for a Danish medical company. It will be seen that the system follows the
methodology given in the Process section of Table 1.
Since optimization and monitoring of the internal quality are dealt with elsewhere in this book, we are going to concentrate on the optimization and monitoring of customers, whether they are internal (employees) or external. First a theoretical, microeconomic model of satisfaction and loyalty is constructed, and then we establish a "control chart" for the managerial control of satisfaction.
The customer satisfaction index is defined as a weighted average of the customer evaluations of the individual quality areas,

CSI = Σᵢ wᵢcᵢ

where cᵢ is the customer evaluation of area i and wᵢ is the importance (weight) of area i, the weights summing to 1.
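A small Python sketch of the index computation (our code; the weights and scores are invented for illustration):

    def csi(weights, evaluations):
        """Customer satisfaction index: importance-weighted evaluation."""
        assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to one"
        return sum(w * c for w, c in zip(weights, evaluations))

    w = [0.30, 0.25, 0.25, 0.20]   # importances of four quality areas
    c = [5.8, 6.1, 4.9, 5.5]       # evaluations on a seven-point scale
    print(round(csi(w, c), 2))     # 5.59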
i.e., when the degree of fulfilment of customer expectations is identical for all areas. This is based on the fact that the first-order condition for maximization of Eq. (2) is equal to

From this it will be seen that if the right-hand side of Eq. (5) is equal to 1, then a very simple rule for optimum customer satisfaction will emerge:
Π = (likelihood of buying) × (quantity bought) - costs
where c′ᵢ is the satisfaction on parameter i for the main competitor. Thus the elements of the loyalty function are related to the competitive position on a given parameter combined with the importance of the parameter. We assume that the quantity bought, given loyalty, is a function of the customer satisfaction index. This means that we will model the income or revenue of the company as
This tells us that you may be very satisfied and still not buy very much, because competition is very tough and hence loyalty is low. On the other hand, when competition is very low, you may be dissatisfied and still buy from the company, even though you try to limit your buying as much as possible.

Combining (10) with the original model in (2), we come to the following model for the company profit:
∂Π/∂cᵢ = Lφ′wᵢ + φL′ᵢwᵢ - 2kᵢcᵢ

and setting this derivative equal to zero leads to the optimality condition

cᵢ/wᵢ = α + βL′ᵢ     (15)
To put it differently, we have shown that if company resources have
been allocated optimally, then the degree to which you live up to customer
expectations should be a linear function of the contribution to loyalty. This
seems to be a very logical conclusion that will improve the interpretation of
the results of customer satisfaction studies.
Practical use of results (4), (6), and (15) will be easy, because in their present form you only need market information to use them. Once you collect information about cᵢ, c′ᵢ, wᵢ, and the customers' buying intentions, the models can be estimated. In the case of a loyalty model you will most likely use a logit specification for L, and then L′ᵢ will be easy to calculate.
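For instance, with a logit loyalty specification, L′ᵢ follows directly from the chain rule. A sketch (our code; the coefficients a and b are invented):

    from math import exp

    def loyalty(c, c_comp, w, a=-2.0, b=1.5):
        """Logit loyalty driven by the weighted competitive position."""
        z = a + b * sum(wi * (ci - cpi) for wi, ci, cpi in zip(w, c, c_comp))
        return 1.0 / (1.0 + exp(-z))

    def loyalty_deriv(i, c, c_comp, w, a=-2.0, b=1.5):
        """dL/dc_i for the logit specification: L * (1 - L) * b * w_i."""
        L = loyalty(c, c_comp, w, a, b)
        return L * (1.0 - L) * b * w[i]

    c      = [5.8, 6.1, 4.9, 5.5]     # own satisfaction scores
    c_comp = [5.5, 6.0, 5.2, 5.0]     # main competitor's scores
    w      = [0.30, 0.25, 0.25, 0.20]
    print([round(loyalty_deriv(i, c, c_comp, w), 4) for i in range(4)])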
Consider the stacked vector

x = (c′, w′)′

where c is an n × 1 vector of evaluations and w is an n × 1 vector of importances. Assume that x is multivariate normal with covariance matrix Σ (17) and expectation μ (18). Assume a sample of N units, and let the estimates of (17) and (18) be S and

x̄ = (c̄′, w̄′)′

Let I be the identity matrix of order n. Then our hypothesis may be written

H₀: μ_c - μ_w = 0

and the corresponding test statistic

F = [(N - n)/((N - 1)n)] T²

follows an F distribution with n and N - n degrees of freedom when the hypothesis is true.
Here T² is Hotelling's statistic computed from the differences d = c - w, whose estimated covariance matrix is

S_d = S_c + S_w - S_cw - S_wc     (25)

A simultaneous confidence interval for any linear combination l′(μ_c - μ_w) is

l′(c̄ - w̄) - [((N - 1)n/(N(N - n))) l′S_d l F_(α;n,N-n)]^(1/2) ≤ l′(μ_c - μ_w) ≤ l′(c̄ - w̄) + [((N - 1)n/(N(N - n))) l′S_d l F_(α;n,N-n)]^(1/2)
Now assume that the hypothesis is true.
Figure 6 Quality map.
If a parameter falls between the dotted lines, we cannot reject the hypothesis that we have an optimal allocation of resources. If, on the other hand, a parameter falls outside the limits, the process needs adjustment.

We should remember that the limits are simultaneous. If we want individual control limits, which, of course, will be much narrower, we may substitute t_(α,N-1) for

[((N - 1)n/(N - n)) F_(α;n,N-n)]^(1/2)
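In practice the test and the simultaneous limits take only a few lines. The sketch below uses numpy and scipy (our code; the data are simulated, not the Danish data of Table 2):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    N, n = 50, 7
    c = rng.normal(5.0, 1.0, size=(N, n))   # evaluations
    w = rng.normal(5.2, 1.0, size=(N, n))   # importances

    d = c - w                         # paired differences
    d_bar = d.mean(axis=0)
    S_d = np.cov(d, rowvar=False)     # equals S_c + S_w - S_cw - S_wc

    # Hotelling T^2 test of H0: mu_c - mu_w = 0
    T2 = N * d_bar @ np.linalg.solve(S_d, d_bar)
    F = (N - n) / ((N - 1) * n) * T2
    p_value = stats.f.sf(F, n, N - n)

    # Simultaneous limits for each single parameter (l = unit vector)
    half = np.sqrt((N - 1) * n / (N * (N - n))
                   * np.diag(S_d) * stats.f.ppf(0.95, n, N - n))
    print(F, p_value)
    print(d_bar - half, d_bar + half)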
2.3. An Example
An actual data set from a Danish company is presented in Table 2. Seven
parameters were measured on a seven-point rating scale.
Now we are ready to set up the control chart for customer satisfaction.
We use formula (30) to get the limits,
±0.181 × √(7.74 × 2.18) = ±0.74
From the control chart (Figure 7) we can see that most of the parameters are in control, but one parameter needs attention: the importance of the environmental parameter is significantly greater than its evaluation.
3. CONCLUSION

The use of the concept of total quality management expands the need for measurement in the company. The measurement of quality will no longer be limited to the production process. Now we need to monitor "processes" such as customer satisfaction and employee satisfaction. In this chapter I have given a managerial model for the control of these processes, and we have considered a practical "control" chart that will help management choose the right parameters for improvement.
REFERENCES

Kristensen K, Dahlgaard JJ. ISS International Service System A/S, Denmark. The European Way to Excellence. Case Study Series. Directorate General III, European Commission, Bruxelles, 1997.
Kristensen K, Martensen A. Linking customer satisfaction to loyalty and performance. ESOMAR Pub. Ser. 204: 159-169, 1996.
Kristensen K, Dahlgaard JJ, Kanji GK. On measurement of customer satisfaction. Total Quality Management 3(2): 123-128, 1992.
3
Quality Improvement Methods and
Statistical Reasoning*
G.K. Kanji
Sheffield Hallam University, Sheffield, England
1. PRINCIPLES OF TOTAL QUALITY MANAGEMENT
Total quality management (TQM) is about continuous performance improvement of individuals, groups, and organizations. What differentiates total quality management from other management processes is the emphasis on continuous improvement. Total quality is not a quick fix; it is about changing the way things are done, forever.

Seen in this way, total quality management is about continuous performance improvement. To improve performance, people need to know what to do and how to do it, have the right tools to do it, be able to measure performance, and receive feedback on current levels of achievement.

Total quality management (Kanji and Asher, 1993) provides this by adhering to a set of general governing principles. They are:

1. Delight the customer
2. Management by fact
3. People-based management
4. Continuous improvement
*For an extended version of this paper, see Kanji GK. Total Quality Management 5: 105. 1994.
1.1. Delight the Customer

The first principle focuses on the external customers and asks "What would delight them?" This implies understanding needs, both of product and service, tangible and intangible, and agreeing with requirements and meeting them. Delighting the customer means being best at what matters most to customers, and this changes over time. Being in touch with these changes and delighting the customer now and in the future form an integral part of total quality management.

The core concepts of total quality that relate to the principle of delighting the customer are "customer satisfaction" and "internal customers are real."
1.3. People-Based Management

Knowing what to do and how to do it and getting feedback on performance form one part of encouraging people to take responsibility for the quality of their own work. Involvement and commitment to customer satisfaction are ways to generate this. The third principle of total quality management recognizes that systems, standards, and technology in themselves do not mean quality. The role of people is vital.

The core concepts that relate to people-based management are "teamwork" and "people make quality."
1.4. Continuous Improvement

Total quality cannot be a quick fix or a short-term goal that will be reached when a target has been met. Total quality is not a program or a project. It is a management process that recognizes that however much we may improve, our competitors will continue to improve and our customers will expect more from us. The link between customer and supplier with process improvement can be seen in Kanji (1990).

Here, continuous improvement, incremental change rather than major breakthroughs, must be the aim of all who wish to move toward total quality.

The core concepts that relate to the company's continuous improvement are "the continuous improvement cycle" and "prevention."

Each concept is now discussed, together with an example of how that concept was used by a company to bring about improvement.
2. CORE CONCEPTS OF TQM

2.1. Internal Customers Are Real

The definition of quality [see Kanji (1990)], "satisfying agreed customer requirements," relates equally to internal and external customers. Many writers refer to the customer-supplier chain and the need to get the internal relationships working in order to satisfy the external customer.

Whether you are supplying information, products, or a service, the people you supply internally depend on their internal suppliers for quality work. Their requirements are as real as those of external customers; they may be speed, accuracy, or measurement.

Internal customers constitute one of the "big ideas" of total quality management. Making the most of this idea can be very time-consuming, and many structured approaches take a long time and can be complicated. However, one successful approach is to take the "cost of quality" and obtain information about the organization's performance and analyze it. Dahlgaard et al. (1993) used statistical methods to discuss the relationship between the total quality cost and the number of employees in an organization.
2.3. Measurement

The third core concept of total quality management is measurement. Having a measure of how we are doing is the first stage in being able to improve. Measures can focus internally, i.e., on internal customer satisfaction (Kristensen et al., 1993), or externally, i.e., on meeting external customer requirements.

Examples of internal quality measurements are

Production
Breach of promise
Reject level
Accidents
Process in control
Yield/scrap (and plus value)

Kristensen et al. (1993), when discussing the measurement of customer satisfaction, used the usual guidelines for questionnaire design, surveys, and statistical analysis to obtain the customer satisfaction index.
2.4. Prevention

The core concept of prevention is central to total quality management and is one way to move toward continuous improvement.

Prevention means not letting problems happen. The continual process of driving possible failure out of the system can, over time, breed a culture of continuous improvement.

There are two distinct ways to approach this. The first is to concentrate on the design of the product itself, whether a hard product or a service.
2.5. Customer Satisfaction

Many companies, when they begin quality improvement processes, become very introspective and concentrate on their own internal problems, almost at the expense of their external customers.

Other companies, particularly in the service sector, have deliberately gone out to their customers, first to survey what is important to the customer and then to measure their own performance against customer targets (Kristensen et al., 1993). The idea of asking one's customers to set customer satisfaction goals is a clear sign of an outward-looking company.

One example is Federal Express, who surveyed their customer base to identify the top 10 causes of aggravation. The points were weighted according to customer views of how important they were. A complete check was made of all occurrences, and a weekly satisfaction index was compiled. This allowed the company to keep a weekly monitor of customer satisfaction as measured by the customer. An understanding of survey and statistical methods is therefore needed for the measurement of customer satisfaction.
2.6. Teamwork

Teamwork can provide an opportunity for people to work together in their pursuit of total quality in ways in which they have not worked together before.

People who work on their own or in small, discrete work groups often have a picture of their organization and the work that it does that is very compartmentalized. They are often unaware of the work that is done even by people who work very close to them. Under these circumstances they are usually unaware of the consequences of poor quality in the work they themselves do.

By bringing people together in teams with a common goal, quality improvement becomes easier to communicate over departmental or functional walls. In this way the slow breaking down of barriers acts as a platform for change.

We defined culture as "the way we do things here," and cultural change as "changing the way we do things here." This change implies significant personal change in the way people react and in their attitudes. A benchmarking approach can also help to change the way they do things.

Teamwork can be improved by benchmarking, a method that is similar to the statistical understanding of outliers.
2.7. People Make Quality

Deming has stated that the majority of quality-related problems within an organization are not within the control of the individual employee. As many as 80% of these problems are caused by the way the company is organized and managed.

Examples where the system gets in the way of people trying to do a good job are easy to find, and in all cases simply telling employees to do better will not solve the problem.

It is important that the organization develop its quality management system, and it should customize the system to suit its own requirements. Each element will likely encompass several programs. As a matter of fact, this is where the role of statistics is most evident.
2.8. The Continuous Improvement Cycle

The continuous cycle of establishing customer requirements, meeting those requirements, measuring success, and continuing to improve can be used both externally and internally to fuel the engine of continuous improvement.

By continually checking against customer requirements, a company can keep finding areas in which improvements can be made. This continual supply of opportunity can be used to keep quality improvement plans up-to-date and to reinforce the idea that the total quality journey is never-ending.

In order to practice a continuous improvement cycle it is necessary to obtain continuous information about customer requirements, i.e., to do market research. However, we know that market research requires a deep statistical understanding for the proper analysis of the market situation.
3. STATISTICAL UNDERSTANDING
4. CONCLUSIONS

In recent years, particularly in Japan and the United States, there has been a strong movement for greater emphasis on total quality management, in which statistical understanding has been seen to be a major contributor to management development.

It is clear that statistical understanding plays a major role in product and service quality, care of customers through statistical process control, customer surveys, process capability, cost of quality, etc. The value of statistical design of experiments, which distinguishes between special cause and common cause variation, is also well established in the area of quality improvement.

If we also accept that "all work is process," that all processes are variable, and that there is a relationship between management action and quality, then statistical understanding is an essential aspect of the quality improvement process.

Further, in the areas of leadership, quality culture, teamwork, etc., development can be seen in various ways by the use of statistical understanding.

In conclusion, I believe that total quality management and statistical understanding go hand in hand. People embarking on the quality journey must therefore venture onto the road of total statistical understanding and follow the lead of total quality statisticians.
REFERENCES

Dahlgaard JJ, Kristensen K, Kanji GK. Quality cost and total quality management. Total Quality Management 3(3): 211-222, 1993.
Kanji GK. Total quality management: The second industrial revolution. Total Quality Management 1(1): 3-12, 1990.
Kanji GK, Asher M. Total Quality Management: A Systemic Approach. Carfax Publishing Company, Oxfordshire, U.K., 1993.
Kristensen K, Kanji GK, Dahlgaard JJ. On measurement of customer satisfaction. Total Quality Management 3(2): 123-128, 1993.
Snee RD. Statistical thinking and its contribution to total quality. Am Stat 44(2): 116-121, 1990.
Leadership Profiles and the
Implementation of Total Quality
Management for Business Excellence
Jens J. Dahlgaard, Su Mi Park Dahlgaard, and Anders Nørgaard
The Aarhus School of Business, Aarhus, Denmark
1. INTRODUCTION
[Figure: the EQA business excellence model. Enablers 50%: Leadership (100 points), People Management (90 points), Policy and Strategy (80 points), Resources (90 points), Processes (140 points). Results 50%: People Satisfaction (90 points), Customer Satisfaction (200 points), Impact on Society (60 points), Business Results (150 points).]
*Success criteria taken from the EQA business excellence model have been supplemented with success criteria from the creative and learning organizations because, although creativity and learning are implicitly included in total quality management, theory on total quality management has to a certain degree neglected these two important disciplines. The aspect that unites all of the chosen success criteria is that they all demand a strong commitment from the senior management of an organization.
To achieve our first aim, an empirical study was carried out that involved more than 200 leaders and managers of European companies and some 1200 of their employees. The format of the study was as follows.

1. Four hundred chief executive officers from France, Germany, Holland, Belgium, the United Kingdom, and Denmark were randomly selected from various European databases. The selection criteria were that they had to be from private companies (100% state-owned companies were excluded) with more than 50 employees.
2. The selected leaders were asked to complete an 86-point questionnaire* composed of two sections:
   a. 49 questions asking leaders to rate the importance of a number of aspects of modern business management†
   b. 37 questions asking leaders to rate the importance of a number of statements or success criteria on business excellence
3. By analyzing the material supplied by the leaders in response to the first 49 questions, it was possible to plot the "leadership profile" of each individual respondent. These leadership profiles are expressed in eight different leadership "styles."
4. The success criteria, which form the focus of the second section (37 questions), indicate the key leadership qualities required to achieve business excellence. The higher the leaders scored on these questions, the more they could be said to possess these qualities.

*The complete Leadership Profile questionnaire in fact consisted of 106 questions. The additional 20 questions covered cultural issues that do not form part of this chapter. The questions were developed by Geert Hofstede in 1994.
†The aspects of management were identified by a Danish focus group participating in a pilot version of this survey in 1995, developed by Anders Nørgaard and Heme Zahll Larsen. The focus group consisted of nine directors representing various areas of business, who were asked to identify the key attributes of a good business leader. The attributes so identified were classified on the basis of an affinity analysis, and as a result 49 variables were established. These variables could then be used to plot any individual leadership profile.
Success Criteria

As described in Section 1, the success criteria for business excellence used in this research comprise three main elements: total quality management, creativity, and learning. However, since the interaction between an organization's leadership and its employees has a major impact on whether these criteria are achieved or not, this interaction becomes, in effect, a fourth success criterion.

As Figure 3 shows, the achievement of these success factors is affected by the leadership profiles of those in charge of the organization. Although
Figure 3 The leadership model.
not included within the scope of this chapter, it is reasonable to assume that
these leadership profiles are in turn influenced by a number of “basic vari-
ables” such as leader’s age, education, and experience and the size of the
company or the sector in which it operates.
The Captain
Key attributes: Commands respect and trust; leads from the front; is professionally competent, communicative, reliable, and fair.

The Captain is in many ways a "natural" leader. He commands the respect and trust of his employees and leads from the front. He has a confidence based on his own professional competence, and when a decision is made it is always carried out. He has an open relationship with his employees.

Leaders must also take into consideration the Ideal Leadership profile outlined by the employees. By using quality function deployment (see Section 2.4) it is possible for managers to work with the demands of the employees.
2.4. Model for Measuring Excellent Leadership

An Excellent Leadership model should integrate the demands that the successful leader must consider when trying to achieve business excellence. The model should clarify what the leader should do to improve his performance as a leader in relation to the success criteria for achieving business excellence.

A product improvement technique called quality function deployment (QFD) is used as a tool for measurement of Excellent Leadership. The essence of this technique consists of combining a set of subjective variables, normally set out by the demands of customers, with a set of objective variables provided by the manufacturers' product developers. As a result of this interactive process a number of focus areas for developing high quality products become apparent, enabling manufacturers to adapt their products more precisely to customer demands.

Treating the leaders as "products" and the employees as "customers," QFD is used as a technique for determining Excellent Leadership. This is the reason for making the parallel between leaders and products. In QFD, the voice of the customer is used to develop the product. A leader has many "customers," such as employees and stakeholders. In this project, the employees are selected as our link to the customer part in QFD. This means that the voice of the employees will serve as an important guideline for leaders today in developing the right leadership qualities.
The information required for the QFD construction consists of:

Employee demands of an ideal leader. The employees' Ideal Leader profile represents the customers' demands of the "product" in QFD.
The relationship between success criteria for achieving business excellence.
The relationship between success criteria and different leadership styles.
The individual leader's score on the success criteria and leadership styles.
Information about the "best in class" leaders within the areas of performance (quality, learning, and creativity).

The QFD technique provides the possibility to work with the following aspects:
Attributes Matrix: Attributes and Leadership Styles

The attributes matrix (far left in Fig. 5) includes the different attributes of leadership. Eight leadership styles have been identified in relation to this study. As explained earlier, the eight leadership styles were created on the basis of rating the importance of 49 aspects of modern business management.
Weights: The Employees' Ideal Leader Profile

The employees participating in the survey also evaluated the importance of the 49 aspects of modern business management under consideration to their concept of an ideal leader. This employees' Ideal Leader profile provides a rating or a weight of importance for each of the eight leadership styles. With this information the leader can identify possible areas of improvement in meeting employee demands for an ideal leader.
Correlation Matrix: Substitution

The roof of the QFD house (Fig. 5) consists of a correlation matrix that illustrates the correlation between the three success criteria. This part of the model is relevant in determining potential substitution opportunities between the criteria. Only three criteria are included in this project, which gives only limited information on substitution. Using the 37 elements of the success criteria might make it possible to come up with a more differentiated view of substitution between the elements.
Assessment

The leader's performance is measured on the basis of the three success criteria. This assessment is carried out by means of a self-evaluation, during which the leader answers 37 questions. The answers to these questions indicate the leader's and/or organization's level of activity on the success criteria (quality, learning, and creativity) for achieving business excellence, illustrated by an individual score. This assessment provides the leaders with a score of their current performance and critical areas in which further allocation of resources is required for the development of business excellence. It is important to have knowledge of one's current level if one is to set relevant objectives for the future. The three success criteria should not be evaluated only individually; a global approach is required, as they are strongly correlated.
Benchmarking

The right-hand side of the model illustrates the profiles for "best in class" within the three success criteria. These profiles can be used as a benchmark against "best in industry," which can generate new ideas for improvement. These profiles serve as a foundation for the Excellent Leadership profile, which takes into account the three success criteria and employees' demands of an ideal leader.
Areas of Improvement

The bottom matrix in Figure 5 illustrates the "result" of the process. Multiplying the weights of the employees with the relationships between the leadership styles and the three success criteria creates this end product. Taking the view of the employees, the areas of improvement for the leader can be identified. In other words, the leader is provided with concrete ideas of ways in which the respective areas of improvement are weighted according to employee demands.
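The multiplication described above is a small matrix product. The sketch below (our illustration; all numbers are invented, though the dimensions, eight styles by three criteria, follow the chapter) shows the mechanics:

    import numpy as np

    styles = ["Captain", "Creative", "Involved", "Strategic",
              "Impulsive", "Specialist", "Task", "Team Builder"]

    # Relationship matrix: strength of each style's contribution to
    # the criteria (quality, learning, creativity), on a 0-9 QFD scale.
    R = np.array([[7, 3, 3], [3, 7, 9], [5, 5, 3], [9, 5, 3],
                  [1, 3, 7], [3, 1, 1], [7, 3, 1], [5, 7, 5]])

    # Employee weights for the eight styles (Ideal Leader profile).
    w = np.array([60, 70, 55, 50, 45, 35, 55, 50]) / 100

    priority = (w[:, None] * R).sum(axis=1)   # weighted end product
    for s, p in sorted(zip(styles, priority), key=lambda t: -t[1]):
        print(f"{s:13s} {p:5.1f}")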
The Excellent Leadership Profile

The matrix at the far right of Figure 5 serves as a benchmark for the leaders. It is the ideal leadership profile if the leader wants to succeed in managing quality, learning, and creativity. In this project the overall objective was to create one profile of an excellent leader working actively with the management disciplines included in the success criteria. From this perspective this matrix, at the far right in Figure 5, is considered the most important one in our use of QFD.
The QFD technique has served as the basis for our research and resulted in the identification of the Excellent Leadership profile. The five crucial drivers (leadership styles) for achieving excellent leadership were identified by a factor analysis. By correlating leadership styles with success criteria for business excellence it was possible to identify the styles most positively correlated to business excellence. Expanding the theoretical foundation, as seen in this chapter, to treat the empirical data on European leaders with QFD, and thereby take into consideration "employees' ideality," has resulted in a more accurate picture of the true drivers in the achievement of business excellence.

The Excellent Leadership profile shown in the rightmost matrix of the QFD model can be benchmarked against any segment or group of leaders, i.e., leaders from different countries or sectors, of different ages, and so on. Two segments have been selected for further analysis:

1. European leaders' leadership profile versus the Excellent Leadership profile.
2. Country-by-country comparison of European leaders' leadership profiles.
2.5. The Excellent Leadership Profile

In order to evaluate whether or not a leader is equipped to lead an organization to business excellence, a benchmark Excellent Leadership profile (ELP) must be developed. This illustrates the leadership profile that is best oriented toward the achievement of all three of the main business excellence success criteria.
1. The eight leadership styles that make up the leadership profiles are measured on a scale of 0 to 100 (vertical axis of Fig. 6).
2. Scores above or below 50 points represent deviations from the average of each leadership style.
3. The closer a leader gets to 100, the more strongly his or her leadership profile is characterized by the elements identified in the description of that particular leadership style.
4. Conversely, the further a score falls below 50, the less applicable those elements are as a description of the leader's profile.
As Figure 6 illustrates, two leadership styles have the predominant influence within the Excellent Leadership profile: the Strategic and the Task.

The Strategic is clearly the most important leadership style when it comes to identifying the characteristics required of a leader seeking business excellence.
[Figures 6 and 7 appear here: bar charts of the eight leadership styles (the Captain, the Creative, the Involved, the Strategic, the Impulsive, the Specialist, the Task, and the Team Builder) scored on the 0-100 scale.]
The Strategic:

1. A score of almost 60 indicates that European leaders do place importance on the skills of the Strategic leader and put them into practice, by taking a long-term view of the company and its direction, setting clear objectives, and being focused on maintaining consistent work processes.
2. They need to develop these competencies even further, however, if they wish to match the ELP.
3. The significant deviation between the leaders' actual performance and the requirements of the ELP is of considerable importance, given that the Strategic leadership style is the most crucial element of the ELP.

The Captain:

1. The European leaders' low score on the Captain style category indicates that they are not "natural" leaders. At best, they learn leadership skills as they grow into their assignment.
2. The below-50 score indicates that these leaders are not strongly characterized by the competencies of this particular leadership style: providing leadership from the front, encouraging open communication, and commanding the respect and trust of employees.
3. Although the Captain is not as crucial to the overall ELP as, for example, the Strategic leadership style, the deviation here is still an important one in terms of providing the balance of leadership styles that is needed to achieve business excellence.
2.7. European Leaders Versus Employees' Ideality

The employees' Ideal Leadership profile embodies the preferences expressed by the 1150 employees who participated in the survey. Direct subordinates of chief executives and managing directors were asked to use their answers to the first 49 questions of the survey to describe their "ideal" leader: someone for whom they would be willing to make an extra effort in their work. Comparing the leaders' profile with the employees' Ideal Leadership profile shows where the employees are in harmony with the leader for achieving business excellence and where they are in conflict.

Figure 8 highlights four main areas of leadership where European employees' expectations differ significantly from the actual performances of the leaders: the Captain, the Creative leader, the Involved leader, and the Specialist leader. (A difference of 10 points or more is significant.) The two styles positively correlated to achieving business excellence are included in the analysis.
The Captain

Figure 8 shows a difference of approximately 18 points between employees' expectations and actual performance in the Captain category.

The European leaders' low score in the Captain style category indicates that they are not "natural" leaders. At best, they learn leadership skills as they grow into their assignment.

The below-50 score indicates that the leaders are not strongly characterized by the competencies of this particular leadership style: providing leadership from the front, encouraging open communication, and commanding the respect and trust of employees.

Employees place a much greater value on the leadership characteristics of the Captain than their leaders do.

The employees' score of 60 indicates that they react positively to a strong "natural" leader who can guide them and to whom they can look with respect, and that they appreciate the continuous flow of information provided by the Captain.
The Creative

Figure 8 indicates a difference of approximately 22 points between actual leadership performance and employee expectations in the Creative style category.

The Creative style is the leadership style showing the most significant difference, with employees rating the Creative attributes very highly, at a score of 70, while their leaders score below 50.

The high score (70) indicates that, in contrast with the Strategic and Task styles, European employees place a high value on leaders who are characterized by the Creative leadership competencies.

The employees show a strong preference for a creative, inspiring, and courageous leader, scoring higher on this leadership style than on the other seven. This translates into a strong demand among European employees for a leader of vision and innovation who is prepared to deal with the increasing complexity of the business environment and who sees creativity and continuous improvement as the keys to success. European employees seek a leader who acts as a source of inspiration, motivating the workforce and taking courageous business decisions. These expectations, however, are significantly above the requirements their leaders need to meet in order to achieve business excellence.

The employees' low score on the Specialist leadership style (below 35) can be seen as the mirror image of the high value they place on the Captain and Creative styles. The solitary nature of the Specialist leader, and his lack of "people" skills and ability to inspire, are the direct antithesis of the Captain's and the Creative leader's attributes. The Specialist style of leadership is clearly not appreciated or regarded by employees as being of great value.

European leaders, whose Specialist score was significantly above the employee rating for that style, place a greater value on this leadership style than their employees do.
2.8. Conclusions

In seeking to achieve business excellence, European leaders may encounter resistance among their employees. Of crucial significance in this regard is the fact that European employees place a markedly lower value on the Team Builder and Strategic competencies than is required for business excellence. By contrast, their "ideal" leader is heavily characterized as being creative, inspiring, and an active problem solver.

The clear findings from this research study were that the five crucial drivers of business excellence are the Team Builder, the Captain, the Strategic, the Creative, and the Impulsive leadership styles (Fig. 4). Leaders trying to achieve business excellence must therefore view the high-level attainment of these sets of leadership competencies as their paramount objective.

It is important to remember, however, that this must not be done at the cost of neglecting other leadership competencies. As the Excellent Leadership profile demonstrates, the other leadership styles may be of less importance to achieving business excellence than the five leadership styles mentioned above, but this does not mean that they should be neglected altogether. The overall balance of the ELP requires the other leadership styles to be maintained at levels within the ELP interval. Maintaining a certain focus on these competencies is therefore still an important aspect of excellent leadership.
3. MONITORING THE IMPLEMENTATION OF THE SUCCESS CRITERIA FOR BUSINESS EXCELLENCE
Plan:
How?    1. Leadership
        2. People Management
        3. Policy and Strategy
        4. Resources
        5. Processes
What?   6. People Satisfaction
        7. Customer Satisfaction
        8. Impact on Society
        9. Business Results
Action: 10. Self-Assessment

Figure 9 The elements of Plan in relation to the yearly strategic planning process (items 1-10).
Plan:   1. Leadership
        2. People Management
        3. Policy and Strategy
        4. Resources
Do:     5. Processes
Check:  6. People Satisfaction
        7. Customer Satisfaction
        8. Impact on Society
        9. Business Results
Action: 10. Self-Assessment
With this raw material the company has strong input for the next PDCA leadership cycle for business excellence.

Let us look more specifically at education and training in the Do phase.
condition for building values into the processes, i.e., value building of intangible processes, which again will improve the tangible results. Figure 12 shows how this process is guided by the principles of the TQM pyramid supplemented by education and training.

If we look at Education and Training (Fig. 12), we see that it forms the foundation of a temple and that its aim, quality of people, is the roof of the temple. The pillars of the temple are the main elements of Education and Training: (1) learning, (2) creativity, and (3) team building. Training in team building is a necessary element to support and complement creativity and learning. The importance of team building was also clearly demonstrated in Section 2 of this chapter (see Figs. 4 and 7).

The main elements of the three pillars are shown in Figures 13-15. It is seen that the elements of each pillar are subdivided into a logic part and a nonlogic part. The logic part of each pillar contains the tools to be used for improvement of tangibles (things, processes, etc.), and the nonlogic part contains the models, principles, and disciplines that are needed to improve intangibles such as the mind-set of people (mental models, etc.). Learning and applying the tools from the logic part of the three pillars may also gradually have an indirect positive effect on intangibles.

Most of the methods presented in this volume are related to the logic part of the three pillars. To build quality into people and to achieve business excellence, logic is not enough. Education and training should also comprise the nonlogic part of the pillars, which is a precondition for effective utilization of the well-known logical tools for continuous improvement. It is a common learning point of world-class companies that managers are the most important teachers and coaches of their employees. That is the main reason why education and training are integrated in the PDCA leadership cycle for business excellence.
4. CONCLUSION
[Figures 12 and 13 appear here. Recoverable elements include tangible and intangible processes and results (value building), recognition, achievement, mental models, PDCA/PDSA, wish to learn, and joy in learning.]

Figure 13 The logic and nonlogic parts of Learning in Education and Training.
Figure 14 The logic and nonlogic parts of Creativity in Education and Training.
Figure 15 The logic and nonlogic parts of Team Building in Education and Training.
1. INTRODUCTION
found to modify the quality characteristic, then an EPC scheme is put into place to compensate for such drifting behavior. However, abrupt, large shifts in the quality characteristic indicate major failures or errors in the process that cannot generally be compensated for by the EPC controller. For this reason, many authors have suggested that an SPC chart be added at the output of an EPC-controlled production process to detect large shifts. There is no clear methodology, however, that models such integration efforts in a formal and general way.
In contrast, interest in SPC-EPC integration in discrete part manufacturing is more recent. In this type of production process, elements that induce autocorrelation are not common. However, drifting behavior of a process that "ages" is common. A typical example of this is a metal machining process in which the performance of the cutting tool deteriorates (in many cases, almost linearly) with time. Many years ago, when market competition was not so intense, specifications were wide enough for a production process to drift without producing a large proportion of nonconforming product. With increasing competition, quality specifications have become more rigorous, and drifting behavior, rather than being tolerated, is actively compensated for by simple EPC schemes.
Academic interest in the area of SPC-EPC integration has occurred as a natural reaction to the requirements of industrial practices. However, most of the approaches suggested during the discussion of this problem argued from the point of view of practical necessities alone. Proponents of either side admit that many control problems in modern manufacturing processes cannot be solved by either SPC or EPC alone. As a consequence, methods from each field are recommended as auxiliary tools in a scheme originally developed either for SPC or for EPC applications alone. None of these approaches has been really successful from a methodological point of view. The models used were originally designed for either proper SPC or EPC applications but not for an integration of the two. The practical necessity of an integrating approach to industrial control problems is obvious, but a rigorous mathematical model to reflect this need is still missing. As a reaction to this methodological gap, the present chapter establishes a simple model that integrates the positions of SPC and EPC.
Although diverse authors have discussed the different aims and strategies of SPC and EPC (e.g., Barnard, 1963; MacGregor, 1988, 1990; Box and Kramer, 1992; Montgomery et al., 1994), few specific models have been proposed for the integration of these fields. Among these models we find algorithmic statistical process control (ASPC) and run-to-run control procedures.
2.1. ASPC
Vander Wiel et al. (1992) (see also Tucker et al., 1993) model the observed quality characteristic ξ_t of a batch polymerization process at time t as the sum of a step disturbance and ARMA noise, where the first term on the right represents a shift of magnitude μ that occurs at time t₀, x_t is the compensatory variable, and the noise term is a stationary ARMA(1,1) stochastic process. In what the authors refer to as algorithmic statistical process control (ASPC), process shifts are monitored by a CUSUM chart, whereas the ARMA noise is actively compensated for by an EPC scheme. Using a similar approach, Montgomery et al. (1994) presented some simulation results. Clearly, ASPC is focused on continuous production processes.
A basic weakness of the ASPC approach is that there is no explicit stochastic model for the time t₀ of shift occurrence.
3.3. Process Changes in SPC Models
Statistical process control is designed for manufacturing processes that exhibit discrete parameter shifts occurring at random time points. Thus in SPC models the most general form of the output process (ξ_t)ℕ₀ is the sum of a marked point process and a white noise component. This approach is expressed by the model
ξ_t = μ_t + ε_t    (3)

In this formula (μ_t)ℕ₀ is a marked point process with a target μ*, with marks Δ₁, Δ₂, ... representing the sizes of shifts 1, 2, ..., and a counting process that gives the number of shifts in the time interval [0; t). (ε_t)ℕ₀ in Eq. (3) is a white noise process independent of (μ_t)ℕ₀. The white noise property is expressed by

E[ε_t] = 0,  V[ε_t] = σ²,  E[ε_t ε_{t+k}] = 0  for all t ∈ ℕ₀, k ∈ ℕ    (5)
where the random variable γ is the sign of the deviation from target.
3.4. Process Changes in EPC Models
Engineering process control is designed for manufacturing processes that
exhibit continuous parameter drifts. Some typical instances of open-loop
output sequences in EPC models are as follows.
ARMA Models
An important family of models used to characterize drifting behavior occurring due to autocorrelated data is the family of ARMA(p, q) models (Box and Jenkins, 1976), where (ε_t)ℕ₀ is a white noise sequence [see Eq. (5)]. By introducing the backshift operator B, with B^k ξ_t = ξ_{t−k}, Eq. (7) can be written compactly in operator form.
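As an illustration of this family of noise models, the following short Python sketch simulates an ARMA(1,1) disturbance of the kind an EPC scheme would be asked to compensate. The parameter values (phi, theta, sigma) are illustrative assumptions, not values taken from this chapter.

import numpy as np

rng = np.random.default_rng(1)

def arma11(n, phi=0.8, theta=0.3, sigma=1.0):
    """Simulate xi_t = phi*xi_{t-1} + eps_t + theta*eps_{t-1} (illustrative values)."""
    eps = rng.normal(0.0, sigma, n)
    xi = np.zeros(n)
    for t in range(1, n):
        xi[t] = phi * xi[t - 1] + eps[t] + theta * eps[t - 1]
    return xi

noise = arma11(500)
print("sample lag-1 autocorrelation:",
      np.corrcoef(noise[:-1], noise[1:])[0, 1])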
Deterministic Drift
If the drifting behavior is caused by aging of a tool (see, e.g., Quesenberry, 1988), a simple regression model of the form η_t = τ + dt can be used. For example, if h_t(B) = 1, then (9) is a random walk with drift d that has behavior similar to that given by (8) but with variance that increases linearly with time.
For the random walk with drift model [see (9) with h_t(B) = 1], the general output model combines the shift, drift, and noise processes:

ξ_t = F_t((μ_s)_{s≤t}, (η_s)_{s≤t}, (ε_s)_{s≤t})    (16)
4.1. Additive Disturbance
In many cases an abrupt shift can be modeled as a translation of the output value ξ_t. To express this situation in the terms of model (16), we choose

ξ_t = μ_t + η_t + G_t((ε_s)_{s≤t})

where (μ_t)ℕ₀ is a shift process of the type introduced by Eq. (4), (η_t)ℕ₀ is a process that represents the effect of continuous drifts [see Eqs. (12), (13), and (15)], and (ε_t)ℕ₀ is a white noise sequence [see Eq. (5)]. In many cases, we simply have G_t((ε_s)_{s≤t}) = ε_t [see Eqs. (11) and (12)]. For examples of functions G_t((ε_s)_{s≤t}) that express a cumulative effect of the white noise variables, see Eq. (14).
4.2. Shift in Drift Parameters
Usually the models for drift processes (η_t)ℕ₀ that are used in EPC depend on parameters. These parameters can be subject to shifts during production. Engineering controllers, however, are designed for fixed and known parameter values and cannot handle such sudden parameter shifts. Even adaptive EPC schemes have the fundamental assumption that the changes in the parameters are slow compared to the rate at which observations are taken (Astrom and Wittenmark, 1989). Thus supplementary SPC schemes are required to detect these abrupt changes (Basseville and Nikiforov, 1993).
Let us consider two simple models that will be investigated in some detail in Section 5: the deterministic trend model η_t = τ + dt, and the random walk with drift model [see (9) with h_t(B) = 1].
5. ENGINEERING PROCESS CONTROLLERS
where T denotes the process target and N is the total number of observations for which the process is going to be run. Minimization of J₁ results in a minimum mean square error (MMSE) controller (Box and Jenkins, 1976), which is also called a minimum variance controller by Astrom (1970) if ξ_t denotes deviation from target, in which case T = 0 in (23). From the principle of optimality of dynamic programming, it can be shown that minimizing criterion (23) is equivalent to minimizing each E[(ξ_t − T)²] separately (Soderstrom, 1994, p. 313).
Other cost indices have been proposed for quality control applications.
The following cost index was proposed by Box and Jenkins (1963) for their
“machine tool” problem:
which implies that the full effect of the compensatory variable is felt immediately on the quality characteristic. Furthermore, we assume as before that the noise variables (ε_t)ℕ₀ form a white noise sequence. These two assumptions guarantee that the closed-loop variables ξ₀, ξ₁, ... are all independent. This makes it easy to see how the MMSE criterion (23) is equivalent to requiring that each squared deviation be minimized separately without recourse to dynamic programming techniques.
It is clear that the control rule has to be designed for the case where the shift components μ_t^{(i)} are on their targets μ_i*. In that case the controller is

x_t = −d(t + 1)    (28)

Hence the MMSE controller as defined by (28) corresponds to a pure "feedforward" controller (i.e., the observation is not "fed back" into the control equation, but rather the anticipated disturbance is used). Controller (28) is equivalent to rule d₁ in Quesenberry (1988) if the sample size k of that paper equals 1.
As in Section 5.1, the control rule has to be designed for the case in which the shift components μ_t^{(i)} are on their targets μ_i*, i.e., for the case

x_t = −dt − Σ_{i=1}^{t} (ξ_i − T)    (32)
The second term on the right-hand side of (32) justifies the name "discrete integral controller" used for this type of control rule. Finally, let us evaluate the effect of control rule (31) on the output quality characteristic under the effect of the shift components (μ_t^{(i)})ℕ₀. From (30) and (32) we obtain

ξ_t = T + ε_t  for all t ∈ ℕ

where μ₁* = 0, μ₂* = d are the target values, Δ₁ > 0 is the absolute shift size, ν₁ is the random time until occurrence of the shift, and γ₁ is the random sign of the shift.
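The closed-loop behavior claimed above can be checked with a small simulation. The sketch below is a minimal illustration with hypothetical target, drift, and noise values; it applies the feedforward rule x_t = −d(t + 1) to a deterministic trend disturbance under the chapter's assumptions of unit process gain and a one-period control delay, so the output mean stays at the target T.

import numpy as np

rng = np.random.default_rng(7)

T, d, n, sigma = 10.0, 0.05, 200, 0.5   # hypothetical target, drift, horizon, noise
x_prev = 0.0                             # control applied at t-1
out = np.empty(n)
for t in range(n):
    # open-loop disturbance: deterministic trend T + d*t plus white noise
    out[t] = T + d * t + rng.normal(0.0, sigma) + x_prev
    # feedforward rule x_t = -d*(t+1) anticipates next period's drift
    x_prev = -d * (t + 1)
print("mean deviation from target:", out.mean() - T)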
Under these assumptions the output equation (29) of the deterministic trend model becomes Eq. (36). For the control variable of the random walk with drift model we obtain, by inserting (35) into (34), the corresponding closed-loop expression.
6.2. Shifts Occurring During Production Time
In the deterministic trend case, the controller defined by (28) has no feedback from the output and is thus not able to compensate for random shifts. As is obvious from (36), an additive shift takes the process mean away from its target T to T + γ₁Δ₁, but the output at least remains stable in its mean. A shift in the drift parameter is even more harmful. After such a shift, the output mean has a trend component (t − ⌊ν₂⌋)γ₂Δ₂.
The effect of this type of parameter shift in the trend and random walk models is exactly the same as in Section 6.2.
x_t = −d(t + 1)  if t ≤ −A/d − 1
x_t = A          if t > −A/d − 1

Under the simple shift components of type (35) the output ξ_t satisfies the right-hand side of (36) for t ≤ −A/d. For t > −A/d we obtain
From (43) the distribution of κ can be found by first determining the conditional distribution given ν₁ and γ₁ and then integrating with respect to the corresponding densities. We shall not investigate this problem here. The relationship between the control variable x_t of (39) and the constrained control variable x̃_t is

x̃_t = x_t  if t < κ,   x̃_t = A  if t ≥ κ    (44)
Whether there are parameter shifts or not, the output exhibits twice as great a variance as in the case of using the correct model. This case occurs in Quesenberry's (1988) d₂ and d₃ rules. If μ_t^{(1)} = 0 and μ_t^{(2)} = d for all t, then ξ_t = T + (1 − B)ε_t, which is an MA(1) process, an always-stationary time series model (Box and Jenkins, 1976).
If parameter shifts occur, we have the following result. Except for the single outlier for ν₁ < t ≤ ν₁ + 1, ξ_t is permanently off target for t > ν₁ with absolute deviation Δ₁. This, and the uncertainty about the correctness of the assumptions of the model, make it advisable to use SPC methods in addition to the simple EPC schemes.
If, on the contrary, the deterministic trend controller (28) is used in a random walk with drift process, the closed-loop equation follows from (30). In the long run, if we let t grow without bound and use the inverse of the difference operator, then with μ_t^{(1)} = 0 and μ_t^{(2)} = d for all t the closed-loop output reduces to a random walk about the target.
In this case, whether there are parameter shifts or not, the output exhibits
variance that increases linearly with time compared with the case of using
the correct model. Thus it is evident that using an EPC controller designed
for a random walk with drift model is “safer” than using an EPC controller
designed for a deterministic trend process in case we selected (by mistake)
the wrong drift model.
Taking the shifts into account we have the following result. There is a shift in the mean for ν₁ < t and a shift that results in a trend for ν₂ < t. Again, given the uncertainty about the correctness of model assumptions, it is obviously advisable to use additional SPC methods.
Under the control rules (S1), (S2), (S3), the run length, i.e., the number q of samples until occurrence of an alarm, is defined accordingly. We assume production speed 1; i.e., one item is produced in one time unit. Then the total time until occurrence of an alarm (time to signal) is qh + n − 1. Define δ_i = Δ_i/σ. We use this standardization to avoid the nuisance parameter σ. The ARL can now be defined as the expected value of q, considered as a function A(z₁, δ₁, z₂, δ₂) of the shift amounts δ_i and the signs z_i of the shifts:
Thus, in particular, in the case of Δ₂ > 0 (shift in the drift parameter), the alarm probabilities vary with the number k of the sample. The distribution of the run length q is determined by the probabilities
7.2. Example
Chemical mechanical planarization (CMP) is an important process in the manufacturing of semiconductors. A key quality characteristic in a CMP process is the removal rate of silicon oxide from the surface of each wafer. Since the polishing pads wear out with use, a negative trend is experienced in this response, in addition to random shocks or shifts. The removal rate has a target of 1800 and is controlled via a deterministic trend EPC scheme. The errors are normally distributed with mean zero and σ = 60, and an estimate of the drift d is used for control purposes. It is desired not to let the process run for more than an average of 10 samples if a bias in the drift estimate of magnitude 0.01σ = 0.6 exists. In the absence of shifts in the mean or trend, an ARL of 370 is desired. In addition, positive shifts of size Δ₁ = 1σ should be detected, on average, after a maximum of 12 samples if the aforementioned biased trend estimate is (incorrectly) used by the EPC.
Table 1 shows numerical computations for this problem using Eqs. (57)-(60) and varying n from 1 to 10. Clearly, the desired ARL of 370 is obtained with c = 3; thus the table shows results for this value of c. From the table, A(0, 0, −1, 0.01) = 10.86 for n = 4, and A(1, 1, −1, 0.01) = 13.51 for n = 5. Therefore, the chart design with the smallest sample size that meets the design specifications calls for using n = 5 and c = 3. The h design parameter should be decided based on practical considerations.
ACKNOWLEDGMENTS
Dr. Del Castillo was funded by NSF grants INT 9513444 and DMI 9623669.
Drs. Gob and von Collani were funded by DAAD grant 315/PPP/fo-ab.
REFERENCES
Astrom KJ. 1970. Introduction to Stochastic Control Theory. Academic Press, San Diego, CA.
Astrom KJ, Wittenmark B. 1989. Adaptive Control. Addison-Wesley, Reading, MA.
Barnard GA. 1959. Control charts and stochastic processes. J Roy Stat Soc Ser B XXI(2): 239-271.
Basseville M, Nikiforov IV. 1993. Detection of Abrupt Changes: Theory and Application. Prentice-Hall, Englewood Cliffs, NJ.
Box GEP, Jenkins G. 1963. Further contributions to adaptive quality control: Simultaneous estimation of dynamics: Nonzero costs. Bull Int Stat Inst 34: 943-974.
Box GEP, Jenkins G. 1976. Time Series Analysis: Forecasting and Control. Rev. ed. Holden-Day, Oakland, CA.
Box GEP, Kramer T. 1992. Statistical process monitoring and feedback adjustment: A discussion. Technometrics 34(3): 251-267.
Butler SW, Stefani JA. 1994. A supervisory run-to-run control of a polysilicon gate etch using in situ ellipsometry. IEEE Trans Semicond Manuf 7(2): 193-201.
Crowder SV. 1992. An SPC model for short production runs: Minimizing expected cost. Technometrics 34: 64-73.
Del Castillo E, Hurwitz A. 1997. Run to run process control: A review and some extensions. J Qual Technol 29(2): 184-196.
Del Castillo E. 1996. Some aspects of process control in semiconductor manufacturing. Proceedings of the 4th Würzburg-Umeå Conference in Statistics, pp. 37-52.
Ingolfsson A, Sachs E. 1993. Stability and sensitivity of an EWMA controller. J Qual Technol 25(4): 271-287.
MacGregor JF. 1988. On-line statistical process control. Chem Eng Prog, October, pp. 21-31.
MacGregor JF. 1990. A different view of the funnel experiment. J Qual Technol 22: 255-259.
Montgomery DC, Keats JB, Runger GC, Messing WS. 1994. Integrating statistical process control and engineering process control. J Qual Technol 26(2): 79-87.
Quesenberry CP. 1988. An SPC approach to compensating a tool-wear process. J Qual Technol 20(4): 220-229.
Sachs E, Hu A, Ingolfsson A. 1995. Run by run process control: Combining SPC and feedback control. IEEE Trans Semicond Manuf 8(1): 26-43.
Soderstrom T. 1994. Discrete Time Stochastic Systems: Estimation and Control. Prentice-Hall, Englewood Cliffs, NJ.
Tucker WT, Faltin FW, Vander Wiel SA. 1993. ASPC: An elaboration. Technometrics 35(4): 363-375.
Vander Wiel SA, Tucker WT, Faltin FW, Doganaksoy N. 1992. Algorithmic statistical process control: Concepts and an application. Technometrics 34(3): 286-297.
Reliability Analysis of Customer Claims
Pasquale Erto
University of Naples Federico II, Naples, Italy
1. INTRODUCTION
1. Vehicle type code
2. Assembly date
3. Component and defect code
4. Mileage to failure
In formal statistical language, the warranty data are failure observations from a sample that is both truncated (at the end of the warranty period) and has items suspended at the number of kilometers effectively covered by the respective customers. Thus, to carry out a reliability analysis, both the number of failures and the number of suspensions for each mileage interval are required. Obviously, the warranty data give no mileage information about those vehicles that are sold and reach the end of each mileage interval without any claim being made. Thus, no direct information is available about the population to which the reported number of failed items must be referred. Therefore, the usual procedures used by the automotive industry [2-4] require the a priori estimation (often arbitrary) of the vehicle distribution versus mileage in order to partition the total number of vehicles under warranty into mileage intervals. Note that this distribution may also be very different from case to case, since it may concern vehicles under
3. A REAL-LIFE RELIABILITY ANALYSIS

3.1. The Available Data Set
Failure data normally refer to about 40 different components (or parts) of some car model. The kilometers to failure are typically grouped into equal-width lifetimes, each of 10,000 km, and all vehicles under consideration are sold during the same year in which repairs are made. In our case from real life [8], 498 cars were sold in the year, and the total number of warranty claims referred to the manufacturer was 70. Furthermore, irrespective of the parts involved, this number of claims was distributed over the lifetimes as shown in Table 1. The characteristic that makes this case peculiar is that no age (from selling date) distribution and no distribution of covered kilometers are given for the fleet under consideration. Hence, it is not possible to allocate the unfailed units in each lifetime. To overcome this difficulty we can use the reliability analysis approach, introducing an estimation procedure that involves at the same time both the failure and kilometer distributions.
[Table 1: number of claims in each 10,000-km lifetime interval.]
the probability that an item fails before t becomes

[a/(a + b)] {1 − exp[−(a + b)t]}

and, similarly, the probability that an item is suspended before t becomes

[b/(a + b)] {1 − exp[−(a + b)t]}
Figure 1 Weibull kilometer distributions at 3, 6, 9, and 12 months and the corresponding compound distribution (heavy line). [Weibull probability plot; abscissa: t, covered kilometers (km/1000).]
                          3 mo    6 mo    9 mo   12 mo   Compound
Weibull shape parameter   1.13    1.17    1.13    1.14     0.97
Weibull scale parameter   5.85   11.04   17.56   23.25    13.56
ĉ = n₁/N    (6a)

and

ĉ = −ln(1 − n₁/n)/T₂    (6b)
3.4. Practical Example and Comments
The maximum likelihood method was applied to the sample of warranty claims from real life given in Table 1, and estimates of the unknown parameters a and b were found. The model chosen appears to fit the experimental data with an extremely high degree of accuracy. In Table 3 the observed and the estimated numbers of failures for each lifetime are reported. The estimated average number of kilometers for the fleet under test is 1/b̂ = 7734 km, which is a very plausible value for a fleet of cars whose ages are distributed over 12 months.
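A minimal sketch of how such estimates can be computed, assuming the competing-risk exponential model described above (failure rate a, suspension rate b) and simulated rather than the chapter's data. The closed-form estimators used are the standard maximum likelihood estimators for this model; the grouped-data procedure actually used in the chapter may differ in detail.

import numpy as np

rng = np.random.default_rng(3)

# Competing exponential risks: failure rate a, suspension rate b (illustrative values).
a_true, b_true, n = 1.0 / 40000.0, 1.0 / 8000.0, 498   # rates per km, fleet size
t_fail = rng.exponential(1.0 / a_true, n)
t_susp = rng.exponential(1.0 / b_true, n)
t_obs = np.minimum(t_fail, t_susp)        # observed kilometers per vehicle
failed = t_fail < t_susp                  # True -> a warranty claim was made

# Closed-form MLEs for the exponential competing-risk model:
total_km = t_obs.sum()
a_hat = failed.sum() / total_km
b_hat = (~failed).sum() / total_km
print(f"a_hat={a_hat:.3e}, b_hat={b_hat:.3e}, 1/b_hat={1.0/b_hat:.0f} km")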
REFERENCES
6. Jones JA, Hayes JA. Use of a field failure database for improvement of product reliability. Reliab Eng Syst Safety 55: 131-134, 1997.
7. Wayne KY, Kapur KC. Customer driven reliability: Integration of QFD and robust design. Proceedings of Annual Reliability and Maintainability Symposium, Philadelphia, 1997, pp. 339-345.
8. Erto P, Guida M. Some maximum likelihood reliability estimates from warranty data of cars in users' operation. Proceedings of European Reliability Conference, Copenhagen, 1986, pp. 55-58.
9. Erto P. Reliability assessments by repair shops via maintenance data. J Appl Stat 16(3): 303-313, 1989.
10. Erto P, Guida M. Estimation of Weibull reliability from few life tests. J Qual Reliab Eng Int 1(3): 161-164, 1985.
11. Erto P, Giorgio M. Modified practical Bayes estimators. IEEE Trans Reliab 45(1): 132-137, 1996.
Some Recent Developments in Control
Charts for Monitoring a Proportion
Marion R. Reynolds, Jr.
Virginia Polytechnic Institute and State University, Blacksburg, Virginia
Zachary G. Stoumbos
Rutgers University, Newark, New Jersey
1. INTRODUCTION
studied before [see, e.g., Gan (1993)]. As in the case of 100% inspection, a disadvantage of the CUSUM chart in this situation has been that designing one for a particular application has been difficult unless the values of n and p₀ in the application happen to correspond to published results. A contribution of the current chapter is to show how to design a CUSUM chart for the binomial distribution using relatively simple and highly accurate approximations.
The third control chart to be considered here is a chart that can be applied when it is not feasible to use 100% inspection but it is feasible to vary the sample size used at each sampling point depending on the data obtained at that sampling point. The sample size is varied by applying a sequential probability ratio test (SPRT) at each sampling point. This SPRT chart for monitoring p is a variable sampling rate control chart, and it is much more efficient than charts that take a fixed-size sample. Methods based on relatively simple and highly accurate approximations are given for designing the SPRT chart.
The remainder of this chapter is organized as follows. Sections 2-5 pertain to the Bernoulli CUSUM chart, Sections 6-8 pertain to the binomial CUSUM chart, and Sections 9-12 pertain to the SPRT chart. For each chart, a description is given, the evaluation of statistical properties is discussed, a design method is explained, and a design example is given. Some general conclusions are given in Section 13.
When all items from the process are inspected, the results of the inspection of the ith item can be represented as a Bernoulli observation X_i, which is 1 if the ith item is defective and 0 otherwise. Then p corresponds to P(X_i = 1). The control chart to be considered for this problem is a CUSUM chart based directly on the individual observations X₁, X₂, ... without any grouping into segments or samples. This Bernoulli CUSUM chart is defined here for the problem of detecting an increase in p. The problem of detecting a decrease in p, as well as additional details about the Bernoulli CUSUM chart, are given in Reynolds and Stoumbos (1999). For detecting an increase in p, the Bernoulli CUSUM control statistic is
Then, from the basic definition of an SPRT (see Section 9), it can be shown that the appropriate choice for γ is given by Eq. (5). For given values of r₁ and r₂ and a desired value for the in-control ANOS, Eq. (6) can be used to find the required value of h_B*, and then (4) and (5) can be used to find the required value of h_B. Finding h_B* using (6) can be accomplished by simple trial and error. In most applications it will be desirable to determine how fast a shift from p₀ to p₁ will be detected. The CD approximation to the ANOS when p = p₁ is
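To make the chart concrete, here is a minimal Python sketch of a one-sided Bernoulli CUSUM. The recursion shown is the standard CUSUM form, with the reference value γ = 1/139 and the limit h_B = 6.187 taken from the design example below; whether the chapter scales the statistic exactly this way is an assumption.

import numpy as np

rng = np.random.default_rng(11)

def bernoulli_cusum(x, gamma, h):
    """One-sided CUSUM s_i = max(0, s_{i-1} + x_i - gamma); signal when s_i >= h."""
    s = 0.0
    for i, xi in enumerate(x, start=1):
        s = max(0.0, s + xi - gamma)
        if s >= h:
            return i          # number of items inspected when the signal occurs
    return None

p0, p1 = 0.005, 0.009947
gamma, h = 1.0 / 139.0, 6.187          # reference value and limit from the example
items = rng.binomial(1, p1, 100000)    # process already shifted to p1
print("items to signal:", bernoulli_cusum(items, gamma, h))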
Consider a production process for which it has been possible to maintain the proportion defective at a low level, p₀ = 0.005, except for occasional periods in which the value of p has increased above this level. All items from this production process are automatically inspected, and a Shewhart p-chart is currently being used to monitor this process. Items are grouped into segments of n = 200 items for purposes of applying the p-chart. If 3σ limits are used with the p-chart, then the upper control limit is 0.01996, and this is equivalent to signaling if T_j ≥ 4, where T_j is the number of defectives in the jth segment. When p = p₀ = 0.005, this results in P(T_j ≥ 4) = 0.01868, and it was decided that this probability of a false alarm was too high. Thus, the upper control limit of the p-chart was adjusted so that a signal is given if T_j ≥ 5, and this gives a probability of 0.00355 for a false alarm. There is no lower control limit because giving a signal for T_j = 0, the lowest possible value of T_j, would result in P(T_j = 0) = 0.3670 when p = p₀, and thus the false alarm rate would be unacceptably high. When p = p₀, the expected number of segments until a signal is 1/0.00355 = 282.05. Each segment consists of 200 items, so this corresponds to an in-control ANOS of 56,410 items.
To design a Bernoulli CUSUM chart for this problem, suppose that process engineers decide that it would be desirable to quickly detect any special cause that increases p from 0.005 to 0.010 and that the in-control ANOS should be roughly 56,410 (the value corresponding to the p-chart in current use). From a previous discussion of the case of p₀ = 0.005 and p₁ = 0.010, it was shown that adjusting p₁ slightly from 0.010 to 0.009947 would give r₁/r₂ = 1/139, and thus m = 139. Using trial and error to solve (6) to give ANOS(p₀) ≈ 56,410 results in a value of h_B* of 6.515 [this value of h_B* will give an in-control ANOS of 56,408 according to the approximation of Eq. (6)]. Then, using (4) and (5) to convert to h_B gives intermediate values of 4.646 and 0.328 and, finally, h_B = 6.187. As a point of interest, the exact in-control ANOS using h_B = 6.187 can be calculated to be 56,541 by using the methods given in Reynolds and Stoumbos (1999). Thus, in this case the CD approximation gives results that are extremely close to the exact value and certainly good enough for practical applications.
After h_B has been determined, Eq. (7) can be used to determine how fast a shift from p₀ to p₁ will be detected. Using h_B* = 6.515 in (7) gives
6. THE BINOMIAL CUSUM CHART
plicated by the fact that the CUSUM statistic may not be at its starting value when the shift in p occurs. If it is assumed that the CUSUM statistic has reached its stationary or steady-state distribution by the time the shift occurs, then the expected time from the shift to the signal is called the steady-state ATS (SSATS). When performing comprehensive comparisons of different control charts, it is appropriate to consider the SSATS as a measure of detection time. However, for the limited comparisons to be given in the design examples in this paper, the ATS will be used.
exact value is 471.3). Neither of these ANSS values is extremely close to the desired value of 282, but suppose that it is decided that 228.7, corresponding to h_Y = 5, is close enough. Using h_Y = 5 will give an in-control ATS of approximately 4 x 228.7 = 914.8. Using h_Y = 5 and h_Y* = 5.33 in (11) gives ANSS(p₁) ≈ 11.6 (the exact value is 11.9). This corresponds to an ATS at p = p₁ = 0.0098 of approximately 4 x 11.6 = 46.4 hr. At p = p₁, the ATS of the p-chart is 80.3 hr. Thus, the binomial CUSUM chart will detect a shift to p₁ faster than the p-chart will. Note that the p-chart is sampling at a higher rate than the CUSUM chart (200 every 4 hr versus 140 every 4 hr), but the CUSUM chart has a slightly higher false alarm rate.
When the p-chart is being used to detect small increases in p above a small value of p₀, it is necessary to use a large sample size to detect this increase in a reasonable amount of time. This may require that the sampling interval d be relatively long in order to keep the sampling cost to a reasonable level. However, for the binomial CUSUM chart it is not necessary to have n large; it is actually better to take smaller samples at shorter intervals. Thus, as an alternative to taking a sample of n = 140 every d = 4 hr, consider the option of taking a sample of n = 70 every d = 2 hr. If the binomial CUSUM chart uses n = 70 and p₁ = 0.009820, then the reference value will be nγ = 70/140 = 0.5, and the possible values for Y_t will be integer multiples of 0.5. Thus, it is sufficient to look at values for h_Y that are integer multiples of 0.5. If the p-chart has an in-control ATS of 1128 and it is desirable to have approximately the same value for the binomial CUSUM with d = 2, then the in-control ANSS should be 1128/2 = 564. Using h_Y = 5.5 in (9) gives h_Y* = 5.83, and using this in (10) gives ANSS(p₀) ≈ 558.5 (the exact value is 557.9). This corresponds to an in-control ATS of approximately 2 x 558.5 = 1117.0. Using (11) gives ANSS(p₁) ≈ 24.6 (the exact value is 25.1). This corresponds to an ATS at p = p₁ = 0.0098 of approximately 2 x 24.6 = 49.2 hr. Compared to the p-chart, this binomial CUSUM chart has almost the same false alarm rate and a lower sampling rate, yet it will detect a shift to p₁ much faster.
As another alternative to taking a sample of n = 140 every d = 4 hr, consider the option of taking a sample of n = 35 every hour. If the binomial CUSUM chart uses n = 35 and p₁ = 0.009820, then the reference value will be nγ = 35/140 = 0.25, and the possible values for Y_t will be integer multiples of 0.25. Thus, it is sufficient to look at values for h_Y that are integer multiples of 0.25. If the p-chart has an in-control ATS of 1128 and it is desirable to have approximately the same value for the binomial CUSUM with d = 1, then the in-control ANSS should be 1128. Using h_Y = 5.75 in (9) gives h_Y* = 8.08, and using this h_Y* in (10) gives ANSS(p₀) ≈ 1228.2 (the exact value is 1226.6). Because d = 1, this corresponds to an in-control
where S_kj is the total number of defective items in the first j items inspected at sampling point k. The SPRT chart requires the specification of two constants a and b, b < a, and uses the following rules for sampling and making decisions (see the sketch following this list).

1. At sampling point k, if b < S_kj < a, then continue sampling.
2. At sampling point k, if S_kj ≥ a, then stop sampling and signal that p has changed.
3. At sampling point k, if S_kj ≤ b, then stop sampling at sampling point k and wait until sampling point k + 1 to begin applying another SPRT.
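The following sketch implements the three rules above, under the assumption that the statistic compared with the constant limits is the defective count centered by a reference value γ per item (a Wald-type form); the boundary values shown are illustrative placeholders, not the g and h derived later from Eqs. (20)-(22).

import numpy as np

rng = np.random.default_rng(5)

def sprt_at_sampling_point(p, gamma, lower, upper, max_items=10**6):
    """One SPRT: track s = sum(x_i - gamma); continue while lower < s < upper."""
    s, j = 0.0, 0
    while lower < s < upper and j < max_items:
        j += 1
        s += rng.binomial(1, p) - gamma
    return ("signal" if s >= upper else "next_point"), j

def sprt_chart_time_to_signal(p, gamma, lower, upper, d=1.0):
    """Run SPRTs at successive sampling points d apart until one signals."""
    k = 0
    while True:
        k += 1
        outcome, _ = sprt_at_sampling_point(p, gamma, lower, upper)
        if outcome == "signal":
            return k * d      # time to signal, ignoring within-sample time

# Illustrative boundaries and reference value, not the chapter's design:
print(sprt_chart_time_to_signal(p=0.01, gamma=1/139, lower=-2.0, upper=6.0, d=1.0))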
When evaluating any hypothesis test, a critical property of the test is determined by either the probability that the test accepts the null hypothesis or the probability that the test rejects the null hypothesis, expressed as functions of the value of the parameter under consideration. Following the convention in sequential analysis, we work with the operating characteristic (OC) function, which is the probability of accepting H₀ as a function of p. For most hypothesis tests the sample size is fixed before the data are taken, but for a sequential test the sample size, say N, depends on the data and is thus a random variable. Therefore, for a sequential test the distribution of N must be considered. Usually, E(N), called the average sample number (ASN), is used to characterize the distribution of N.
ANSS(p) = 1/[1 − OC(p)]

When there is a fixed time interval d between samples and the time required to take a sample is negligible, the ATS is the product of d and the ANSS. Thus, the ATS at p, say ATS(p), is

ATS(p) = d ANSS(p)    (18)
Exact expressions for the OC and ASN functions of the SPRT for p can be obtained by modeling the SPRT as a Markov chain [see Reynolds and Stoumbos (1998)]. These expressions, however, are relatively complicated, and thus it would be convenient to have simpler expressions that could be used in practical applications. The remainder of this section is concerned with presenting some simple approximations to the OC and ASN functions. These approximations to the OC and ASN functions are presented here for the case in which 0 < p₀ < 0.5. The case in which p₀ ≥ 0.5 is discussed in Reynolds and Stoumbos (1998).
When the SPRT is used for hypothesis testing, it is usually desirable to choose the constants g and h such that the test has specified probabilities for type I and type II errors. It is shown in Reynolds and Stoumbos (1998) that choices for g and h* based on the CD approximations are

h* ≈ (1/r₂) log[(1 − β)/α]    (21)

and

g ≈ (1/r₂) log[β/(1 − α)]    (22)
If nominal values are specified for α and β, then g and h* can be determined by using Eqs. (21) and (22), and then the value of h can be obtained from h* by using Eq. (20).
The CD approximation to the ASN at p₀ and p₁ can be expressed simply in terms of α and β [see Reynolds and Stoumbos (1998)]. For p = p₀, this expression is given by Eq. (23). Thus, for given α and β, evaluating the ASN at p₀ and p₁ is relatively easy.
pling rate, and specifying ATS(p₀) will determine the false alarm rate. Once these quantities are specified, the design proceeds as follows. The value of α is determined by using Eq. (18) and the specified values of d and ATS(p₀). Then, using (23), the value of β can be determined from the specified value of ASN(p₀) and the value of α just determined. Expression (23) cannot be solved explicitly for β in terms of α and ASN(p₀), so the solution for β will have to be determined numerically. Once α and β are determined, Eqs. (21), (22), and (20) can be used to determine g and h.
The in-control ASN of this chart should be approximately 200 (the exact value is 198.97), and the in-control ATS should be approximately 1128 hr (the exact value is 1128.48 hr). Using Eq. (24), this chart's ATS at p = p₁ = 0.009947 should be approximately d/(1 − β) = 4/(1 − 0.7231) = 14.45 hr (the exact value is 14.51 hr). Thus, compared to the value of 78.73 hr for the p-chart, the SPRT chart will provide a dramatic reduction in the time required to detect the shift from p₀ to p₁.
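The arithmetic behind these ATS figures follows directly from the ANSS and ATS relations given earlier; the few lines below verify it (β = 0.7231 is the OC value from this example).

# ANSS(p) = 1/(1 - OC(p)) and ATS(p) = d * ANSS(p); at p = p1 the OC value
# plays the role of beta, so ATS(p1) is approximately d/(1 - beta).
d, beta = 4.0, 0.7231
anss_p1 = 1.0 / (1.0 - beta)
print("ATS(p1) ~", d * anss_p1, "hr")   # ~14.45 hr, as in the example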
The value chosen for p₁ is really just a convenient design device for the SPRT chart, so this value of p would usually not be the only value that should be detected quickly. Thus, when designing an SPRT chart in practice, it is desirable to use the CD approximation (or the exact methods) given in Reynolds and Stoumbos (1998) to find the ATS for a range of values of p around p₁. For the evaluation to be given here, exact ATS values for the SPRT chart were computed and are given in column 3 of Table 1. ATS values for the p-chart are given in column 2 of Table 1 to serve as a basis of comparison. Comparing columns 2 and 3 shows that, except for large shifts in p, the SPRT chart is much more efficient than the p-chart. When considering the binomial CUSUM in Section 8, it was argued that it is better to take small samples at more frequent intervals than to take large samples at long intervals. To determine whether this is also true for the SPRT chart, an SPRT chart was designed to have an approximate in-control ASN of 50 and a sampling interval of d = 1 hr. This would give the same sampling rate of 50 observations per hour as in columns 2 and 3. The ATS values of this second SPRT chart are given in column 4 of Table 1. Comparing columns 3 and 4 shows that using a sampling interval of d = 1 with ASN = 50 is better than using a sampling interval of d = 4 with ASN = 200, especially for detecting large shifts.
In some applications, the motivation for using a variable sampling rate control chart is to reduce the sampling cost required to produce a given detection ability [see Baxley (1996), Reynolds (1996b), and Reynolds and Stoumbos (1998)]. Because the SPRT chart is so much more efficient than the p-chart, it follows that the SPRT chart could achieve the detection ability of the p-chart with a much smaller average sampling rate. To illustrate this point, the design method given in Section 11 was used to design some SPRT charts with lower average sampling rates. Columns 5 and 6 of Table 1 contain ATS values of two SPRT charts that have an in-control average sampling rate of approximately half the value for the p-chart (approximately 25 observations per hour). The SPRT chart in column 5 uses d = 2.0 and has ASN(p₀) ≈ 50, and the SPRT chart in column 6 uses d = 1.0 and has ASN(p₀) ≈ 25. Although these two SPRT charts are sam-
pling at half the rate of the p-chart, they are still faster at detecting shifts in p. Columns 7 and 8 of Table 1 contain ATS values of two SPRT charts that have an in-control average sampling rate of approximately one-fourth the value for the p-chart (approximately 12.5 observations per hour). The SPRT chart in column 7 uses d = 2.0 and has ASN(p₀) ≈ 50, and the SPRT chart in column 8 uses d = 1.0 and has ASN(p₀) ≈ 25. Comparing columns 5 and 6 to column 2 shows that the SPRT charts with half the sampling rate of the p-chart offer faster detection than the p-chart. Columns 7 and 8 show that an SPRT chart with about one-fourth the sampling rate of the p-chart will offer roughly the same detection capability as the p-chart.
13. CONCLUSIONS
It has been shown that the Bernoulli CUSUM chart, the binomial CUSUM chart, and the SPRT chart are highly efficient control charts that can be applied in different sampling situations. Each of these charts is much more efficient than the traditional Shewhart p-chart. The design methods based on the highly accurate CD approximations provide a relatively simple way for practitioners to design these charts for practical applications. Although the design possibilities for these charts are limited slightly by the discreteness of the distribution of the inspection data, this discreteness is much less of a problem than for the p-chart. The SPRT chart is a variable sampling rate control chart that is much more efficient than standard fixed sampling rate charts such as the p-chart. The increased efficiency of the SPRT chart can be used to reduce the time required to detect process changes or to reduce the sampling cost required to achieve a given detection capability.
REFERENCES
Process Monitoring with Autocorrelated Data

Douglas C. Montgomery
Arizona State University, Tempe, Arizona

Christina M. Mastrangelo
University of Virginia, Charlottesville, Virginia
1. INTRODUCTION
The standard assumptions when control charts are used to monitor a process are that the data generated by the process when it is in control are normally and independently distributed with mean μ and standard deviation σ. Both μ and σ are considered fixed and unknown. An out-of-control condition is created by an assignable cause that produces a change or shift in μ or σ (or both) to some different value. Therefore, we could say that when the process is in control the quality characteristic at time t, x_t, is represented by the model

x_t = μ + ε_t

where ε_t is normally and independently distributed with mean zero and standard deviation σ. This is often called the Shewhart model of the process. When these assumptions are satisfied, one may apply either Shewhart, CUSUM, or EWMA control charts and draw reliable conclusions about the state of statistical control of the process. Furthermore, the statistical properties of the control chart, such as the false alarm rate with 3σ control limits, or the average run length, can be easily determined and used to provide guidance for chart interpretation. Even in situations where the normality
Figure 1 A tank with volume V and input and output material streams.
x_t = (1 − exp(−Δt/T)) w_t + exp(−Δt/T) x_{t−1} = a w_t + (1 − a) x_{t−1}

where a = 1 − exp(−Δt/T). The properties of the output stream concentration x_t depend on those of the input stream concentration w_t and the sampling interval. Figure 2 illustrates the effect of the mean of w_t on x_t. If we assume that the w_t are uncorrelated random variables, then the correlation between successive values of x_t (or the autocorrelation between x_t and x_{t−1}) is given by

ρ = exp(−Δt/T)

Note that if Δt is much greater than T, then ρ ≈ 0. That is, if the interval between samples Δt in the output stream is long, much longer than the time constant T, then the observations on output concentration will be uncorrelated. However, if Δt ≤ T, this will not be the case. For example, if

Δt/T = 1,    ρ = 0.37
Δt/T = 0.5,  ρ = 0.61
Δt/T = 0.25, ρ = 0.78
Δt/T = 0.10, ρ = 0.90
ρ_k = Cov(x_t, x_{t−k})/V(x_t),  k = 0, 1, 2, ...

where Cov(x_t, x_{t−k}) is the covariance of observations that are k time periods apart, and we have assumed that the observations (called a time series) have constant variance given by V(x_t). We usually estimate the values of ρ_k with the sample autocorrelation function

r_k = Σ_{t=1}^{n−k} (x_t − x̄)(x_{t+k} − x̄) / Σ_{t=1}^{n} (x_t − x̄)²
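A direct implementation of the sample autocorrelation function is straightforward; the sketch below assumes the data are held in an array (the variable name concentration is hypothetical).

import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation r_k, as defined above."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    denom = np.sum(dev ** 2)
    return np.array([np.sum(dev[:len(x) - k] * dev[k:]) / denom
                     for k in range(1, max_lag + 1)])

# e.g., sample_acf(concentration, 25)[0] gives r_1 (about 0.88 for Table 1's data).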
[Figure 3 Control chart of the individual concentration measurements; ordinate 62.000-78.000, abscissa Time, t.]
Table 1 Concentration Data
Time, t X      Time, t X      Time, t X      Time, t X
1  70.204      26 69.270      51 70.263      76 71.371
2  69.982      27 69.738      52 71.257      77 71.387
3  70.558      28 69.794      53 73.019      78 71.819
4  68.993      29 79.400      54 71.871      79 71.162
5  70.064      30 70.935      55 72.793      80 70.647
6  70.291      31 72.224      56 73.090      81 70.566
7  71.401      32 71.930      57 74.323      82 70.311
8  70.048      33 70.534      58 74.539      83 69.762
9  69.028      34 69.836      59 74.444      84 69.552
10 69.892      35 68.808      60 74.247      85 70.884
11 70.152      36 70.559      61 72.979      86 71.593
12 71.006      37 69.288      62 71.824      87 70.242
13 70.196      38 68.740      63 74.612      88 70.863
14 70.477      39 68.322      64 74.368      89 69.895
15 69.510      40 68.713      65 75.109      90 70.244
16 67.744      41 68.973      66 76.569      91 69.716
17 67.607      42 69.580      67 75.959      92 68.914
18 68.168      43 68.808      68 76.005      93 69.216
19 69.979      44 69.931      69 73.206      94 68.431
20 68.227      45 69.763      70 72.692      95 67.516
21 68.497      46 69.541      71 72.251      96 67.542
22 67.113      47 69.889      72 70.386      97 69.136
23 67.993      48 71.243      73 70.519      98 69.905
24 68.113      49 69.701      74 71.005      99 70.515
25 69.142      50 71.135      75 71.542      100 70.234
77 ""
e
761 e
75 j e
e f i e
74 I
66 hX 70 72 7.1 76 70
x,.,
that is, concentration observations that are one period apart are positively correlated with r₁ = 0.88. This level of autocorrelation is sufficiently high to distort greatly the performance of a Shewhart control chart. In particular, because we know that positive correlation greatly increases the frequency of false alarms, we should be very suspicious about the out-of-control signals on the control chart in Figure 3.
Several approaches have been proposed for monitoring processes with autocorrelated data. Just as in traditional applications of SPC techniques to uncorrelated data, our objective is to detect assignable causes so that if the causes are removed, process variability can be reduced. The first approach is to sample from the process less frequently so that the autocorrelation is diminished. For example, note from Figure 5 that if we only took every 20th observation on concentration, there would be very little autocorrelation in the resulting data. However, since the original observations were taken every hour, the new sampling frequency would be one observation every 20 hr. Obviously, the drawback of this approach is that many hours may elapse between the occurrence of an assignable cause and its detection.
The second general approach may be thought of as a model-based approach. One way that this approach is implemented involves building an appropriate model for the process and control charting the residuals. The basis of this approach is that any disturbances from assignable causes that affect the original observations will be transferred to the residuals. Model-based approaches are presented in the following subsection. The model-free approach does not use a specific model for the process; this approach is discussed in Section 3.
2. MODEL-BASED APPROACHES

2.1. ARIMA Models
An approach to process monitoring with autocorrelated data that has been applied widely in the chemical and process industries is to directly model the correlative structure with an appropriate time series model, use that model to remove the autocorrelation from the data, and apply control charts to the residuals. For example, suppose we could model the quality characteristic x_t as

x_t = ξ + φx_{t−1} + ε_t

where ξ and φ (−1 < φ < 1) are unknown constants and ε_t is normally and independently distributed with mean zero and standard deviation σ. Note how intuitive this model is for the concentration data from examining
e_t = x_t − x̂_t

x̂_t = 8.38 + 0.88x_{t−1}

We may think of this as an alternative to the Shewhart model for this process. Figure 6 is an individuals control chart of the residuals from the fitted first-order autoregressive model. Note that now no points are outside the control limits. In contrast to the control chart on the individual measurements in Figure 3, we would conclude that this process is in a reasonable state of statistical control.
[Figure 6 Individuals control chart of the residuals from the fitted first-order autoregressive model; abscissa: Time, t.]
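A minimal sketch of the model-based procedure just described: fit the first-order autoregressive model by least squares, form the residuals, and flag residuals outside 3-sigma limits. Function and variable names are illustrative; this is not necessarily the exact computation used to produce Figure 6.

import numpy as np

def ar1_residual_chart(x):
    """Fit x_t = xi + phi*x_{t-1} by least squares and chart the residuals."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    xi_hat, phi_hat = np.linalg.lstsq(X, x[1:], rcond=None)[0]
    resid = x[1:] - (xi_hat + phi_hat * x[:-1])
    sigma_hat = resid.std(ddof=2)            # two parameters estimated
    out = np.abs(resid) > 3 * sigma_hat      # 3-sigma individuals limits
    return xi_hat, phi_hat, resid, out

# For the concentration data the fit is close to x_t = 8.38 + 0.88*x_{t-1}.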
φ₂x_{t−2}, and so forth. Another possibility is to model the dependence through the random component ε_t. A simple way to do this is the mixed first-order autoregressive-moving average model

x_t = ξ + φx_{t−1} + ε_t − θε_{t−1}    (6)

This model often occurs in the chemical and process industries. The reason is that if the underlying process variable x_t is first-order autoregressive and a random error component is added to x_t, the result is the mixed model in Eq. (6). In the chemical and process industries, first-order autoregressive process behavior is fairly common. Furthermore, the quality characteristic is often measured in a laboratory (or by an on-line instrument) that has measurement error, which we can usually think of as random or uncorrelated. The reported or observed measurement then consists of an autoregressive component plus random variation, so the mixed model in Eq. (6) is required as the process model.
ARL_RES = (1 − P₁ + P)/P

where P₁ is the probability that the run has length 1, that is, the probability that the first residual exceeds ±3,

P₁ = Pr(run length = 1) = 1 − Φ(3 − δ) + Φ(−3 − δ)
Correlation φ   δ/σ = 0   δ/σ = 0.5   δ/σ = 1   δ/σ = 2   δ/σ = 4
0.00            370.38    152.22       43.89      6.30      1.19
0.25            370.38    212.32       80.37     13.59      1.32
0.50            370.38    280.33      152.69     37.93      2.00
0.90            370.38    364.51      345.87    260.48     32.74
0.99            370.38    368.95      362.76    312.00     59.30
Note: ARLs measured in observations.
x̂_{t+1}(1) = z_t

where z_t = λx_t + (1 − λ)z_{t−1} is the EWMA. The sequence of one-step-ahead prediction errors, e_t(1) = x_t − x̂_t(1), is plotted in Figure 7.
[Figure 7 Control chart of the one-step-ahead EWMA prediction errors; abscissa: Time, t.]

This chart is slightly different from the control chart of the exact
autoregressive model residuals shown in Figure 6, but not significantly so.
Both indicate a process that is reasonably stable, with a period around t =
62 where an assignable cause may be present.
Montgomery and Mastrangelo (1991) point out that it is possible to combine information about the state of statistical control and process dynamics on a single control chart. If the EWMA is a suitable one-step-ahead predictor, then one could use z_t as the centerline on a control chart for period t + 1 with upper and lower control limits at

UCL_{t+1} = z_t + 3σ

and

LCL_{t+1} = z_t − 3σ    (12)
[Figure: EWMA moving centerline control chart of the concentration data.]
This chart conveys the same information as the residual or EWMA prediction error control chart in Figure 7, but operating personnel often feel more comfortable with this display.
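A sketch of the moving centerline EWMA chart defined by the control limits above, assuming λ and σ have already been chosen or estimated (the λ value shown is a placeholder, not a recommendation from this chapter).

import numpy as np

def moving_centerline_ewma(x, lam=0.8, sigma=1.0):
    """z_t = lam*x_t + (1-lam)*z_{t-1}; chart x_{t+1} against z_t +/- 3*sigma."""
    x = np.asarray(x, dtype=float)
    z = np.empty(len(x))
    z[0] = x[0]
    for t in range(1, len(x)):
        z[t] = lam * x[t] + (1 - lam) * z[t - 1]
    pred_err = x[1:] - z[:-1]                 # one-step-ahead prediction errors
    signals = np.abs(pred_err) > 3 * sigma    # points outside the moving limits
    return z, pred_err, signals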
ARL Performance. Because the EWMA-based procedures presented above are very similar to the residuals control chart, they will have some of the same problems in detecting process shifts. Also, Tseng and Adams (1994) note that because the EWMA is not an optimal forecasting scheme for most processes [except the IMA(1,1) model], it will not completely account for the autocorrelation, and this can affect the statistical performance of control charts based on EWMA residuals or prediction errors. Montgomery and Mastrangelo (1991) suggest the use of supplementary procedures called tracking signals combined with the control charts for residuals. There is evidence that these supplementary procedures considerably enhance the performance of residuals control charts. Furthermore, Mastrangelo and Montgomery (1995) show that if an appropriately designed tracking signal scheme is combined with the EWMA-based procedure we have described, good in-control performance and adequate shift detection can be achieved.
Estimating and Monitoring σ. The standard deviation of the one-step-ahead errors or model residuals σ can be estimated in several ways. If λ is chosen as suggested above over a record of n observations, then dividing the sum of the squared prediction errors for the optimal λ by n will produce an estimate of σ². This is the method used in many time series analysis computer programs.
Another approach is to compute the estimate of σ as is typically done in forecasting systems. The mean absolute deviation (MAD) could be used in this regard. The MAD is computed by applying an EWMA to the absolute value of the prediction error,
MacGregor and Ross (1993) discuss the use of exponentially weighted mov-
ing variance estimates in monitoring the variability of a process. They show
how to find control limits for these quantities for both correlated and uncor-
related data.
The batch size b can be selected to tune performance against a specified shift δ. The weights w_i must sum to unity for Y_j to be an unbiased estimate of the process mean μ. For AR(p) processes, the optimal weights are identical in the middle of the batch but differ in sign and magnitude for the first and last values in the batch. For the AR(1) model, the weights are
For example, with b = 64 and φ = 0.99, the middle weights are all 0.016, and the first and last weights are −1.57 and 1.59, respectively. Given normal data and any batch size b > 1, the optimal weights produce batch means that are i.i.d. normal with mean μ and variance

Var(Y_j) = 1/[(1 − φ)²(b − 1)]

To adjust the on-target ARL to equal ARL₀, one computes the control limit by solving for the limit multiplier in the corresponding false alarm equation, where b in the numerator accounts for the fact that each batch is b observations long. Then the average run length for the weighted batch means (WBM) chart (measured in individual observations) can be computed as
w_i = 1/b,  i = 1, ..., b
This approximation, which assumes that the batch means are i.i.d. normal with mean μ and standard deviation σ_UBM as given in Table 4, was confirmed by Monte Carlo analysis (Table 5).
Since estimating ARLs with (27) is simpler than extensive Monte Carlo analysis, the approximation is used in Table 6. Table 6 compares this ARL with the ARLs of the other two charts for selected values of the autocorrelation parameter φ. The batch sizes b were chosen by using Table 3 to provide a WBM chart sensitive to a shift δ = 1. The comparison was made with the in-control ARL₀ = 10,000. Table 6 shows that both batch means charts outperform the residuals chart in almost all cases shown, with the UBM chart performing best of all.
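For the UBM chart the computation is particularly simple, as the following sketch shows. The batch size b and the in-control ARL are inputs, and the probability limit is set so that the on-target ARL measured in observations equals ARL₀, an assumption consistent with the adjustment described above; the function name is illustrative.

import numpy as np
from scipy.stats import norm

def ubm_chart(x, b, arl0=10000.0):
    """Unweighted batch means chart: w_i = 1/b, then chart the batch means."""
    x = np.asarray(x, dtype=float)
    means = x[:len(x) // b * b].reshape(-1, b).mean(axis=1)
    center, s = means.mean(), means.std(ddof=1)
    alpha = b / arl0                     # false alarm probability per batch
    L = norm.ppf(1 - alpha / 2)          # two-sided probability limit multiplier
    out = np.abs(means - center) > L * s
    return means, center, L * s, out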
REFERENCES
Multivariate Diagnosis Theory with Two Kinds of Quality

Gongxu Zhang

1. MULTIOPERATION AND MULTI-INDEX SYSTEM

2. PROBLEMS ENCOUNTERED IN IMPLEMENTING QUALITY CONTROL AND DIAGNOSIS IN A MULTIOPERATION, MULTI-INDEX SYSTEM
[Figure: quality control and diagnosis scheme for Operation 2. A Shewhart chart monitors total quality, a cause-selecting chart monitors partial quality, and both feed a diagnosis system.]
where S_ij, i ≠ j, is the covariance, the T² control chart can completely account for the correlation among variables. The multivariate T² control chart was proposed by Hotelling in 1947 and has been widely used in Western countries for multivariate cases. Its merits are that (1) it considers the correlations among variables and (2) it can give us exactly the probability of the first kind of error, α. But its greatest drawback is that it cannot diagnose which variable induced the abnormality when the process is abnormal. On the other hand, the best merit of the diagnosis theory with two kinds of quality is that it can be used to diagnose the cause of abnormality in the process. Hence Zhang proposed a new multivariate diagnosis theory with two kinds of quality to combine the above-stated theories so that we can concentrate their merits and at the same time avoid their drawbacks.
5. HOW TO SIMULTANEOUSLY DIAGNOSE THE PRECEDING INFLUENCE AND THE CORRELATION AMONG INDICES IN A MULTIOPERATION, MULTI-INDEX SYSTEM
From the preceding discussions it is evident that we need to use the diagnosis theory with two kinds of quality in order to diagnose the preceding influence, and we also need to use the multivariate diagnosis theory with two kinds of quality in order to diagnose the correlated indices. In such a complex system, it is not enough to depend on the technology only; we must consider statistical process control and diagnosis (SPCD) too. Besides, the diagnosis theories of Western countries always diagnose all variables simultaneously. Suppose the number of variables is p and the probability of the first kind of error in diagnosing a variable is α; then the probability of no first kind of error in diagnosing p variables is
P₀ = (1 − α)^p ≈ 1 − pα

Thus, the probability of the first kind of error in diagnosing p variables is

P₁ = 1 − P₀ ≈ pα

i.e., it is proportional to the number of variables. In the case of a great number of variables, the value of P₁ may become intolerable. To solve this problem, Zhang and his Ph.D. candidate Dr. Huiyin Zheng (Zheng, 1995) proposed the multivariate stepwise diagnosis theory in 1994.
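The size of this effect is easy to appreciate numerically. With the conventional α = 0.0027 of 3σ limits and p = 27 variables (as in the example below), the overall probability of a false diagnosis is already about 7%:

# Exact and approximate overall type-I error when p variables are diagnosed
# independently, each at level alpha (values illustrative).
p, alpha = 27, 0.0027
P0 = (1 - alpha) ** p        # probability of no false signal
P1 = 1 - P0                  # overall probability of a first-kind error
print(P1, "~", p * alpha)    # about 0.0704 vs. the linear approximation 0.0729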
μ_i, σ_i,  i = 1, 2, ..., 27

But we cannot supervise the correlations among variables, i.e., the covariances among indices, with such a univariate x̄-R_s control chart. There are altogether 351 [= 27(27 − 1)/2] covariance parameters or coefficients of correlation to be supervised. Only by using the multivariate diagnosis theory with two kinds of quality can we supervise all 405 (= 27 + 27 + 351) process parameters and implement the SPCD. Using the DTTQ2000 software we have diagnosed eleven factories in China, and all the diagnostic results have been in fairly good agreement with the actual production results. Using the DTTQ2000 software with a microcomputer, it takes only about 1 min to perform one diagnosis; thus, it saves much time on the spot. Not only is the diagnosis correct, but it also avoids the subjectivity of the working personnel.
6. APPLICATIONS OF THE MULTIVARIATE DIAGNOSIS THEORY WITH TWO KINDS OF QUALITY
Example 1

Operations 4 and 5 of a production line for the drug analgin have five indices, three of which belong to the preceding operation; the other two belong to the succeeding operation. Their data are as follows (see group 51 data in Table 2). Using the DTTQ2000 software, we know that the T² value is 18.693, greater than the upper control limit (UCL) of 13.555 of the T² control chart (Fig. 2), which means that the process is abnormal. Then, by diagnosing with DTTQ2000, we know that index x₅ is abnormal.
Example 2

Using the DTTQ2000 Windows software to diagnose the same desmear/PTH operations of three printed circuit factories, A, B, and C, we obtained Figure 3. Compare and criticize these three factories.
[Figure 2: T² control chart; UCL = 13.555.]
7. CONCLUSION
L"""""""""""""""""""""""""". 0.000
1 3 5 7 9 11 1315 17 21
19 23 25 27 29 31 33 3
21.004
_""""""""""""""""""""""""----- 0.000
1 3 5 7 3 11 13 15 17 19 21 23 25 27 29 3l
, . : . > : , I I . ' . I . , , ! ,
ic)
MultivariateDiagnosisTheory 173
REFERENCES
Chen ZQ, Zhang G. (1996a). Fuzzy control charts. China Qual, October 1996.
Zhang G. (1980). A new type of quality control chart allowing the presence of assignable cause: the cause-selecting control charts. Acta Electron Sin 2: 1-10.
Zhang G. (1982a). Control chart design and a diagnosis theory with two kinds of quality. Second Annual Meeting of CQCA, February 1982, Guilin, P. R. China.
Zhang G. (1982b). Multiple cause-selecting control charts. Acta Electron Sin 3: 31-36, May 1982.
Zhang G. (1983). A universal method of cause selecting: the standard transformation method. Acta Electron Sin 5: 1983.
Zhang G. (1984a). Cause-Selecting Control Chart: Theory and Practice. The Publishing House of Post and Telecommunication, Beijing, 1984.
Zhang G. (1984b). A new type of control charts: cause-selecting control charts and a theory of diagnosis with control charts. Proceedings of World Quality Congress '84, pp. 175-185.
Zhang G. (1985). Cumulative control charts and cumulative cause-selecting control charts. J China Inst Commun 6: 31-38.
Zhang G. (1989). A diagnosis theory with two kinds of quality. Proceedings of the 43rd American Quality Congress Transactions, pp. 594-599. Reprinted in TQM, UK, No. 2, 1990.
Zhang G. (1992a). Cause-Selecting Control Chart and Diagnosis. The Aarhus School of Business, Aarhus, Denmark, 1992.
Zhang G. (1992b). Textbook of Quality Management. The Publisher of High Education, 1992.
Zhang G, Dahlgaard JJ, Kristensen K. (1996b). Diagnosing quality: theory and practice. Research Report, MAPP, Denmark, 1996.
Zhang G. (1997). An introduction to the multivariate diagnosis theory with two kinds of quality. China Qual, February 1997, pp. 36-39.
Zheng H. (1995). Multivariate theory of quality control and diagnosis. Doctoral dissertation, Beijing University of Aeronautics and Astronautics, 1995.
10
Applications of Markov Chains in
Quality-Related Matters
Min-Te Chao
Academia Sinica, Taipei, Taiwan, Republic of China
1. INTRODUCTION
2. BASIC FACTS ABOUT MARKOV CHAINS
for all Borel sets A ⊂ R, B ⊂ R^n. If, in addition to Eq. (1), the X's take values only in a finite set S, which without loss of generality we may assume to be S = {1, 2, ..., s}, then we say that the X's follow a finite Markov chain. For a finite Markov chain, the information contained in Eq. (1) can be summarized into, say,
If, in addition, p_{ij;n} of (2) is independent of n, then the Markov chain is said to have stationary transition probabilities. In this case, let

P = (p_{ij})

and let Π = (π₁, π₂, ..., π_s), π_i = P[X₀ = i]. It can be shown that for a Markov chain with stationary transition probabilities the knowledge of Π and the matrix P is sufficient to determine the joint probability distribution of (X₀, X₁, X₂, ...). We call Π the initial distribution and P the (stationary) transition probability matrix.
In what follows, we shall always assume that the Markov chains under
consideration are finite with a certain initial distribution and a stationary
transition matrix.
A good technical reason to use a matrix P is that we can employ matrix algebra to simplify various calculations. For example, the kth-order transition probability

p_{ij}^{(k)} = P[X_{n+k} = j | X_n = i]

is simply the (i, j)th element of P^k, the kth power of the transition matrix P, i.e.,
The entries of P are probabilities, so the row sums of P are unity and the entries themselves are all nonnegative. It may happen that some of the entries of P are 0. But if we look at the sequence P, P², P³, ..., it may happen that at some k > 0 all entries of P^k are strictly positive. If this is the case, this means that if one starts from any state i, in k steps it is possible to reach state j, and this holds true for all 1 ≤ i, j ≤ s. If p_{ij}^{(k)} > 0 for some k > 0 and for all 1 ≤ i, j ≤ s, then we say that the Markov chain is irreducible.
Let f_j^{(n)} be the probability that a Markov chain starting from state j returns to the jth state for the first time at time n, i.e.,

f_j^{(n)} = P[X₁ ≠ j, X₂ ≠ j, ..., X_{n−1} ≠ j, X_n = j | X₀ = j]

Let

μ_j = Σ_{n=1}^{∞} n f_j^{(n)}
The quantity μ_j is the average time at which a Markov chain starting from state j returns to state j, and it is called the mean recurrence time for state j. If μ_j < ∞, then state j is said to be ergodic. If μ_j < ∞ for all j ∈ S, then we say that the Markov chain is ergodic.
If a Markov chain is irreducible and ergodic, then the limits

u_j = lim_{k→∞} p_{ij}^{(k)}    (6)

exist and are independent of the initial state i. Furthermore, u_j > 0, Σ_{j=1}^{s} u_j = 1, and

uP = u    (7)

The vector u = (u₁, u₂, ..., u_s) is called the absolute stationary probability. If Π = u, then it can be shown that P[X_n = j] = u_j for all j ∈ S and for all n ≥ 0; i.e., the Markov chain is stationary (instead of just having a stationary transition probability).
An interesting feature of Eq. (6) is that its rate of convergence is geometric. Let U be an s × s matrix consisting of identical rows, where each row is u. Then by (7), PU = UP = U, so by induction we have

P^k − U = (P − U)^k

The fact that (P − U)^k → 0 exponentially fast follows from the Perron-Frobenius theorem, and since it is a little bit too technical we shall not pursue it further. This basically explains that for a well-behaved Markov chain, the series in (5) usually converges because it is basically a geometric series. Also, the long-term behavior of an ergodic Markov chain is independent of its initial distribution.
Let A be a subset of S, and let T = inf{n ≥ 1 : X_n ∈ A}. Then T is the first entrance time to the set A. For a control chart modeled by a Markov chain, the set A may consist of the region where an alarm should be triggered when X_n ∈ A occurs for the first time. Thus T is the time when the first out-of-control signal is obtained, and E(T) is closely related to the concept of average run length (ARL). When the control charts become more involved, the exact or approximate calculation of ARLs becomes difficult or impossible with elementary methods. However, most (not all) control charts can be properly modeled by a Markov chain, and essentially all methods developed to calculate the ARLs are more or less based on the possibility that one can embed the control scheme into a Markov chain.
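To make this concrete, E(T) can be computed by elementary matrix algebra: if Q is the submatrix of P restricted to the non-alarm states, the vector of expected first-entrance times from each starting state solves (I − Q)x = 1. The following minimal Python sketch is ours, not the chapter's; the 4-state chain is purely hypothetical.

    import numpy as np

    # Expected first-entrance time E(T) into the alarm set for a finite
    # Markov chain: solve (I - Q) x = 1, where Q collects the transition
    # probabilities among the non-alarm (transient) states.

    # Hypothetical chain: states 1-3 are in control, state 4 is the alarm.
    P = np.array([
        [0.90, 0.07, 0.02, 0.01],
        [0.05, 0.85, 0.07, 0.03],
        [0.02, 0.08, 0.80, 0.10],
        [0.00, 0.00, 0.00, 1.00],   # alarm state, absorbing
    ])

    Q = P[:3, :3]                   # transitions among in-control states
    arl = np.linalg.solve(np.eye(3) - Q, np.ones(3))
    print(arl)                      # E(T) for starting states 1, 2, 3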
3. DISCRETE CASE: EXACT RESULT
In this section we discuss cases for which an exact finite Markov chain can be found to describe the underlying control chart. I first describe a general scenario where such a representation can be arranged and explain why it can be done.
Assume that the basic observations are X₁, X₂, ..., which are i.i.d. and take values in a finite set A of size k. The key point is that the X's are discrete and the set A is finite. This may be the case when either the X's themselves are discrete or the X's can be discretized.
Most control charts are of "finite memory"; i.e., at time n the decision whether to flag an out-of-control signal depends on X_{n−r+1}, ..., X_n only. In other words, we may trace back to consult the recent behavior of the observations to decide whether the chart is out of control, but we do it for at most r steps back, r < ∞. The case for which we have to trace back to the infinite past is excluded.
Let Y_n = (X_{n−r+1}, ..., X_n). The random vector Y_n can take as many as s = k^r < ∞ possible values. It is easy to see that the Y's follow a Markov chain with an s × s transition matrix. Since at time n, Y_n is used to decide whether the process is out of control, we see that, conceptually at least, for the scenario described above, there exists a finite Markov chain for which the behavior of the control chart can be completely determined.
However, s = k^r can be a very large number, so the s × s matrix can be too large to have practical value. Fortunately, this matrix is necessarily sparse (i.e., most entries are 0), and if we take a good look at the rules of the control chart, then the chances are we may find some means to drastically reduce the number of states of the Markov chain. Hence, to implement our general observation, we need case-by-case technical work for various control charts.
Note that the X̃'s are the coded values of the X's. As long as our only concern is whether the process is under control, the behavior of the X̄ chart can be completely determined by the coded X's. The coded X's are still i.i.d., and this is a special kind of Markov chain. Its transition matrix, when the process is under control, consists of three identical rows:

ARL = E(N)

N = inf{n ≥ 1 : S_n ≥ t}

integer-valued random variables Y_n and Z_n are observed. Define S_H(0) = S_L(0) = 0, where

S_H(n) = max(0, Y_n + S_H(n − 1))
S_L(n) = min(0, Z_n + S_L(n − 1))

and

N = inf{n ≥ 1 : S_H(n) ≥ t₁ or S_L(n) ≤ −t₂}
Normally, we would have Y_n = X_n − k₁, Z_n = X_n + k₂ for some known integers k₁, k₂. The X's are the basic sequence of the quality characteristic measured for control. The bivariate process (S_H(n), S_L(n)) takes values in {0, 1, ..., t₁} × {0, 1, ..., t₂}, and it is possible to write a finite Markov chain with s = (t₁ + 1)(t₂ + 1) states (see Lucas and Crosier, 1982). For a two-sided CUSUM, the number of states of the underlying Markov chain can be reduced to about t₁t₂/2 by careful arrangement of states (Woodall, 1984). However, it is not known whether we can always reduce the Markov chain of the two-sided CUSUM to a linear function of t₁ + t₂.
4. GENERAL RESULTS

μ_i^{(r)} = E[N_i(N_i − 1) ⋯ (N_i − r + 1)]

r = 1, 2, ... .
5. APPROXIMATIONS: THE CONTINUOUS CASE
if w, the threshold size for our "roundoff" procedure, is small. Since |X_n − Y_n| ≤ w for all n, we would intuitively expect Y_n ≈ X_n, and ARLs based on the Y's, which we may find exactly via the Markov chain method, can be used to find a reasonable approximation of the ARLs for the original CUSUM based on continuous distributions.
How small should w be in order to induce a reasonable approximation? Very little is known mathematically, although we believe it is workable. However, it is reported (Brook and Evans, 1972) that it is possible to obtain agreement to within 5% of the limiting value when t = 5 and to within 1% when t = 10.
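As a concrete illustration of the Brook and Evans idea, the sketch below builds the discretized transition matrix for a one-sided CUSUM S_n = max(0, S_{n−1} + X_n − k) with decision interval h and N(μ, 1) observations, and reads off the ARL from the fundamental matrix. The parameter values are illustrative choices, not values from the text.

    import numpy as np
    from scipy.stats import norm

    def cusum_arl(k=0.5, h=4.0, mu=0.0, t=30):
        """Brook-Evans Markov chain approximation to the one-sided CUSUM ARL."""
        w = 2.0 * h / (2 * t - 1)        # cell width; state i represents S = i*w
        Q = np.zeros((t, t))
        for i in range(t):
            # transition to state 0: S + X - k falls below w/2
            Q[i, 0] = norm.cdf((0.5 - i) * w + k - mu)
            for j in range(1, t):
                Q[i, j] = (norm.cdf((j - i + 0.5) * w + k - mu)
                           - norm.cdf((j - i - 0.5) * w + k - mu))
        # leftover probability in each row is absorption, i.e., an alarm
        arl = np.linalg.solve(np.eye(t) - Q, np.ones(t))
        return arl[0]                    # chart starts at S_0 = 0

    print(cusum_arl())   # in-control ARL, roughly 168 for k = 0.5, h = 4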
The basic idea of Brook and Evans can be applied to various CUSUMs. Since the basic concept is the same, we shall only list these cases. Successful attempts have been reported for the two-sided CUSUM (Woodall, 1984) and the multivariate CUSUM (Woodall and Ncube, 1985). In these cases, however, the sizes of the transition probability matrices increase exponentially with the dimension of the problem, and so far no efficient way to drastically reduce the matrix size is known. The Brook and Evans technique also applies to weighted CUSUMs (Yashchin, 1989), CUSUMs with variable sampling intervals (Reynolds et al., 1990), and exponentially weighted moving average schemes (Saccucci and Lucas, 1990). In all these examples, the control scheme can be described in the form

S_{i,n} = g_i(X_n, S_{i,n−1}),  n ≥ 1;  i = 1, 2, ..., m

where the g_i are fixed functions and the X's are i.i.d. continuous or discrete. For example, for the two-sided CUSUM, we have m = 2, with g₁ and g₂ the max and min recursions given in Section 3. If the S_i's are discretized to t different values, then the control scheme can be approximately described by an s-state Markov chain, s = t^m.
6. OTHER APPLICATIONS
we see that S_n follows a Markov chain if X_n follows a Markov chain. Hence the general idea described in Section 4 still applies. However, studies in this respect, although workable, are rare in the literature. The only related work seems to be Chao (1989).
The Markov chain method also finds its application in various linearly connected reliability systems. A general treatment can be found in Chao and Fu (1991). Readers are referred to the review article by Chao et al. (1995).
7. CONCLUSION
ACKNOWLEDGEMENT
REFERENCES
Bakir ST, Reynolds MR Jr. (1979). A nonparametric procedure for process control based on within-group ranking. Technometrics 21:175-183.
Blackwell MTR. (1977). The effect of short production runs on CSP-1. Technometrics 19:259-263.
Brook D, Evans DA. (1972). An approach to the probability distribution of CUSUM run length. Biometrika 59:539-549.
Brugger RM. (1975). A simplification of skip-lot procedure formulation. J Qual Technol 7:165-167.
Brugger RM. (1989). A simplified Markov chain analysis of ANSI/ASQC Z1.4 used without limit numbers. J Qual Technol 21:97-102.
Champ CW, Woodall WH. (1987). Exact results for Shewhart control charts with supplementary runs rules. Technometrics 29:393-399.
Champ CW, Woodall WH. (1990). A program to evaluate the run length distribution of a Shewhart control chart with supplementary runs rules. J Qual Technol 22:68-73.
Chao MT. (1989). The finite time behavior of CSP when defects are dependent. Proceedings of the National Science Council, ROC, Part A 13:18-22.
Chao MT, Fu JC. (1991). The reliability of large series systems under Markov structure. Adv Appl Prob 23:894-908.
Chao MT, Fu JC, Koutras MV. (1995). Survey of reliability studies of consecutive-k-out-of-n:F and related systems. IEEE Trans Reliab 44:120-127.
Crosier RB. (1986). A new two-sided cumulative sum quality control scheme. Technometrics 28:187-194.
Grinde R, McDowell ED, Randhawa SU. (1987). ANSI/ASQC Z1.4 performance without limit numbers. J Qual Technol 19:204-215.
Hahn GJ, Gage JB. (1983). Evaluation of a start-up demonstration test. J Qual Technol 15:103-106.
Lucas JM, Crosier RB. (1982). Fast initial response for CUSUM control schemes. Technometrics 24:199-205.
Neuman CP, Bonhomme NM. (1975). Evaluation of maintenance policies using Markov chains and fault tree analysis. IEEE Trans Reliab 24:37-45.
Reynolds MR Jr, Amin RW, Arnold JC. (1990). CUSUM charts with variable sampling intervals. Technometrics 32:371-384.
Saccucci MS, Lucas JM. (1990). Average run lengths for exponentially weighted moving average control schemes using the Markov chain approach. J Qual Technol 22:154-162.
Salvia AA. (1987). Performance of pre-control sampling plans. J Qual Technol 19:85-89.
Stephens KS, Dodge HF. (1976). Two-stage chain sampling inspection plans with different sample sizes in the two stages. J Qual Technol 8:209-224.
Van Dobben de Bruyn CS. (1968). Cumulative Sum Tests: Theory and Practice. Statistical Monographs and Courses No. 24, Griffin.
Western Electric Company. (1965). Statistical Quality Control Handbook. Western Electric Co., Indianapolis.
Woodall WH. (1984). On the Markov chain approach to the two-sided CUSUM procedure run length distribution. Technometrics 26:41-46.
Woodall WH, Ncube MM. (1985). Multivariate CUSUM quality-control procedures. Technometrics 27:285-292.
Yashchin E. (1989). Weighted cumulative sum technique. Technometrics 31:321-338.
11
Joint Monitoring of Process Mean and
Variance Based on the Exponentially
Weighted Moving Averages
Fah Fatt Gan
National University of Singapore, Singapore, Republic of Singapore
1. INTRODUCTION
short, the problem of monitoring the mean and variance is a bivariate one, and both the mean and variance charts need to be looked at jointly in order to make meaningful inferences.
The use of combined schemes involving simultaneous mean and variance charts based on the EWMAs of the sample mean and variance is discussed in Section 2. The average run length (ARL) performance of the various schemes is assessed in Section 3. A simple design procedure for a combined EWMA scheme with an elliptical "acceptance" region is given in Section 4. A real data set from the semiconductor industry is used to illustrate the design and implementation in Section 5.
Consider the simulated data set given in Gan (1995). The data set comprises 80 samples, each of sample size n = 5. The first 40 samples were generated from the normal distribution N(μ₀, σ₀²), where μ₀ = 1 and σ₀² = 1, and the rest were from N(μ₀ + 0.4σ₀/√n, (0.9σ₀)²). Thus, the process was simulated to be in control for the first 40 samples, and between the 40th and 41st samples the mean shifted upward to μ₀ + 0.4σ₀/√n and the variance decreased to (0.9σ₀)². An EWMA chart for monitoring the mean is obtained by plotting Q₀ = μ₀ and Q_t = (1 − λ_M)Q_{t−1} + λ_M X̄_t against the sample number t, where X̄_t is the sample mean at sample number t. A signal is issued if Q_t > h_M or Q_t < −h_M. Similarly, an EWMA chart for monitoring the variance is obtained by plotting q₀ = E[log(S_t²)] = −0.270 (when σ² = σ₀²) and q_t = (1 − λ_V)q_{t−1} + λ_V log(S_t²), where S_t² is the sample variance at sample number t. A signal is issued if q_t > h_V or q_t < −h_V. More details on the EWMA charts can be found in Crowder (1987, 1989), Crowder and Hamilton (1992), Lucas and Saccucci (1990), and Chang (1993). The EWMA mean and variance charts based on the parameters given in Gan (1995, Table 2, p. 448, scheme EE) are constructed for the data and displayed in Figure 1.
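A minimal sketch of the two recursions just described follows; the smoothing constants and chart limits below are placeholders rather than the scheme-EE values of Gan (1995, Table 2), and the limits are assumed symmetric about the in-control centerlines μ₀ and E[log(S²)].

    import numpy as np

    def ewma_charts(xbar, logs2, mu0=1.0, q0=-0.270,
                    lam_m=0.15, lam_v=0.15, h_m=0.3, h_v=0.6):
        """EWMA charts for the sample mean and the log sample variance."""
        Q, q = mu0, q0
        signals = []
        for t, (m, v) in enumerate(zip(xbar, logs2), start=1):
            Q = (1 - lam_m) * Q + lam_m * m     # EWMA of the sample mean
            q = (1 - lam_v) * q + lam_v * v     # EWMA of log(S^2)
            if abs(Q - mu0) > h_m or abs(q - q0) > h_v:
                signals.append(t)               # out-of-control sample number
        return signals

    rng = np.random.default_rng(1)
    samples = [rng.normal(1.0, 1.0, 5) for _ in range(80)]   # in-control data
    xbar = [s.mean() for s in samples]
    logs2 = [np.log(s.var(ddof=1)) for s in samples]
    print(ewma_charts(xbar, logs2))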
A quality control engineer has to constantly combine the information in the two charts (which might not be easily done in practice) to make meaningful inferences. To ensure that the charts are interpreted correctly, the two charts could be combined into one, and this can be done by plotting the EWMA of log(S²) against the EWMA of X̄, as shown in Figure 2. The chart limits of the two charts form the four sides of a rectangular "acceptance" region. Any point that falls within the region is considered an in-control point (for example, points A and B), and any point that falls outside the region is considered an out-of-control point (for example, point C). The thick bar on the plot is not an out-of-control region but represents the most desirable state, where the mean is on target and the variance has decreased substantially.
Figure 1 EWMA charts based on X̄ and log(S²) for a simulated data set where the first 40 samples were generated from the normal distribution N(μ₀, σ₀²), where μ₀ = 1 and σ₀² = 1, and the rest were from N(μ₀ + 0.4σ₀/√n, (0.9σ₀)²).
The advantage of this charting procedure is immediate: Any inference made can be based on both the EWMAs jointly. The interpretation of an out-of-control signal is easier because the position of the point gives an indication of both the magnitude and direction of the process shift. However, the order of the points is lost if they are plotted on the same plot. To get around this problem, each point can be plotted on a new plot in a sequence, as shown later in Figure 13. The disadvantage is that it is not as compact as the traditional procedure illustrated in Figure 1.
Figure 2 A combined EWMA scheme with a rectangular acceptance region.
The traditional way of plotting the mean and variance charts separately [see, for example, Gan (1995)] amounts to plotting the EWMA of log(S²) against the EWMA of X̄ and using a rectangular acceptance region for making decisions. The main problem with a rectangular acceptance region is that both points A and B (see Fig. 2) are considered in control, although it is fairly obvious that point B represents a far more undesirable state than that of point A. A more reasonable acceptance region would be an elliptical region, as shown in Figure 3. Takahashi (1989) investigated an elliptical type of acceptance region for a combined Shewhart scheme based on X̄ and S or R. An economic statistical design for the X̄ and R charts was given by Saniga (1989). A point is considered out of control if it is outside the elliptical acceptance region. For example, point B is an out-of-control point, but point A is an in-control point, for the elliptical region given in Figure 3. This chart is called a bull's-eye chart, as any hit on the bull's-eye will provide evidence of the process being on target.
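In code, the bull's-eye decision rule is a one-line test. The sketch below assumes the ellipse is centered at the in-control values (μ₀, E[log(S²)]) with half-axes a and b playing the role of the chart limits; all numbers are illustrative.

    def outside_ellipse(Q, q, mu0=1.0, q0=-0.270, a=0.3, b=0.6):
        # Signal when the point (Q_t, q_t) falls outside the elliptical region
        return ((Q - mu0) / a) ** 2 + ((q - q0) / b) ** 2 > 1.0

    print(outside_ellipse(1.25, -0.20))   # illustrative point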
For the same smoothing constants λ_M and λ_V, in order for an EWMA scheme with an elliptical region to have the same ARL as the EWMA scheme with a rectangular region, the chart limits of the mean and variance charts have to be slightly larger, as shown in Figure 4. The idea of an elliptical region comes from the Hotelling's statistic to be discussed later. Point A is an in-control point for the rectangular region, but it is an out-of-control point for the elliptical region. Similarly, point B is an out-of-control point for the rectangular region but an in-control point for the elliptical region. Thus, an elliptical region would be expected to be more sensitive in detecting large changes in both the mean and variance and less sensitive in
Figure 3 A combined EWMA scheme with an elliptical acceptance region.
Figure 4 A combined EWMA scheme with both rectangular and elliptical accep-
tance regions.
Figure 5 Shewhart bull's-eye chart and an EWMA bull's-eye chart based on 10,000 random points (X̄, log(S²)) from an in-control normal distribution.
for a point (Q_t, q_t) located below the horizontal line. For a point above the horizontal line,
Figure 6 A multivariate EWMA T² chart for a simulated data set where the first 40 samples were generated from the normal distribution N(μ₀, σ₀²), where μ₀ = 1 and σ₀² = 1, and the rest were from N(μ₀ + 0.4σ₀/√n, (0.9σ₀)²).
For a comparison of schemes based on the ARL, the in-control mean and variance are assumed to be μ₀ = 0 and σ₀² = 1, respectively. Each sample comprises n = 5 normally distributed observations. The means and variances investigated are given by μ = μ₀ + Δσ₀/√n and σ = δσ₀, where Δ = 0.0, 0.2, 0.4, 1.0, and 3.0 and δ = 0.50, 0.75, 0.95, 1.00, 1.05, 1.25, and 3.00. Combined schemes with rectangular and elliptical acceptance regions are compared in this section. All the schemes have an approximate in-control ARL of 250. The ARL values of the schemes EE_r and SS_r (combined EWMA and Shewhart schemes with rectangular acceptance regions) were computed exactly using the integral equation approach given in Gan (1995). The rest were simulated. Alternatively, the ARL of the EWMA bull's-eye chart EE_e could be computed by using the Markov chain approach of Brook and Evans (1972) or the integral equation approach. Waldmann's method (Waldmann, 1986a, 1986b) could be used here for approximating the run length distribution of a bull's-eye chart. Let the starting values of the EWMA mean and variance charts be u and v, respectively; then the ARL function L_e(u, v) of the combined scheme with an elliptical acceptance region B can be derived as

L_e(u, v) = 1 + (1/(λ_M λ_V)) ∫∫_B L_e(s, t) f_X̄((s − (1 − λ_M)u)/λ_M) f_{log(S²)}((t − (1 − λ_V)v)/λ_V) ds dt
where f_X̄ and f_{log(S²)} are the probability density functions of X̄ and log(S²), respectively.
The schemes CC (combined CUSUM scheme with a rectangular acceptance region) and EE_r are the same as those given in Gan (1995). The combined scheme CC consists of a two-sided CUSUM mean chart and a two-sided CUSUM variance chart. This scheme is obtained by plotting S₀ = T₀ = 0.0, S_t = max[0, S_{t−1} + X̄_t − k_M], and T_t = min[0, T_{t−1} + X̄_t + k_M] against the sample number t for the mean chart and by plotting S₀ = T₀ = 0.0, S_t = max[0, S_{t−1} + log(S_t²) − k_V], and T_t = min[0, T_{t−1} + log(S_t²) + k_V] against t for the variance chart. The chart parameters of the various schemes are given in Table 1. More details on the CUSUM charts can be found in Gan (1991) and Chang (1993). The ARL comparisons are summarized in Table 2.
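A compact sketch of the CC recursions as rendered above follows; the reference values k_M, k_V and the decision intervals h_M, h_V are illustrative placeholders, not the Table 1 parameters.

    import numpy as np

    def cc_scheme(xbar, logs2, kM=0.2, kV=0.3, hM=1.0, hV=1.5):
        S_m = T_m = 0.0      # upper/lower CUSUMs for the mean chart
        S_v = T_v = 0.0      # upper/lower CUSUMs for the variance chart
        signals = []
        for t, (m, v) in enumerate(zip(xbar, logs2), start=1):
            S_m = max(0.0, S_m + m - kM)
            T_m = min(0.0, T_m + m + kM)
            S_v = max(0.0, S_v + v - kV)
            T_v = min(0.0, T_v + v + kV)
            if S_m > hM or T_m < -hM or S_v > hV or T_v < -hV:
                signals.append(t)
        return signals

    rng = np.random.default_rng(4)
    xbar = rng.normal(0.0, 1.0 / np.sqrt(5), 80)     # in-control sample means
    logs2 = np.log(rng.chisquare(4, 80) / 4)         # log(S^2) for n = 5
    print(cc_scheme(xbar, logs2))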
The ARL values of the combined schemes CC, EE_e, and SS_e were simulated such that an ARL that is less than 10 has a standard error of 0.01; an ARL that is at least 10 but less than 50 has a standard error of 0.1; an ARL that is at least 50 but less than 100 has a standard error of about 0.2; and an ARL that is at least 100 has a standard error of about 1.0.
EE_e versus CC. The performances of these two schemes are similar except that when there is a small shift in the mean and a small decrease in the variance, the EE_e scheme is much more sensitive. When there is a large increase in the variance, the EE_e scheme is marginally less sensitive.
EE_e versus EE_r. The performances of these two schemes are similar. The EE_e scheme is generally more sensitive than the EE_r scheme in detecting increases in the variance and less sensitive in detecting decreases in the variance for the various means investigated.
[Table 3 Chart parameters of the combined EWMA schemes indexed by the smoothing constants λ_M and λ_V; the numerical entries are illegible in this reproduction.]
Similar tables covering other in-control ARLs and sample sizes are available from the author. These are obtained by using simulation such that the simulated in-control ARL has an error of 1.0. The starting value of the mean chart is given by the in-control mean μ₀, and the starting value of the variance chart is given by q₀ = E[log(S_t²)].
Suppose a combined scheme with λ_M = 0.14 and λ_V = 0.16 is desired. Then the chart parameters of the combined scheme can be obtained from Table 3 easily as follows:
Mean chart:
Variance chart:
for the elliptical curve above the horizontal line q₀ = E[log(S²)] and using
5. A REAL EXAMPLE
Figure 8 Shewhart charts based on X̄ and log(S²) for the ball shear strength data.
Figure 9 EWMA charts based on X̄ and log(S²) for the ball shear strength data.
Figure 10 A multivariate Shewhart T² chart for the ball shear strength data.
Figure 11 A multivariate EWMA T² chart for the ball shear strength data.
6. CONCLUSIONS
Three ways of charting X̄ and log(S²) for the purpose of joint monitoring of both mean and variance were discussed with respect to ease of implementation and ease of interpretation. The traditional way of plotting the mean and variance charts separately amounts to plotting log(S²) against X̄ based on a rectangular "acceptance" region. Using the justification of a Hotelling-type statistic, it was shown that an elliptical acceptance region is more natural and appropriate. This led to the EWMA bull's-eye chart and the multivariate EWMA chart based on a Hotelling-type T² statistic. An EWMA bull's-eye chart provides valuable information on both the magnitude and direction of a shift in the process characteristics. The multivariate EWMA chart provides only the magnitude and not the direction of a shift. It is recommended that an EWMA bull's-eye chart be plotted beside a multivariate T² chart to help quality control engineers gain a better understanding of the process characteristics. Average run length comparisons show that the performances of schemes CC and EE_e are similar except that when there is a small shift in the mean and a small decrease in the variance, the EE_e scheme is much more sensitive. When there is a large increase in the variance, the EE_e scheme is marginally less sensitive. The performances of the EE_e and EE_r schemes are also found to be similar. The EE_e scheme is generally more sensitive than the EE_r scheme in detecting increases in the variance and less sensitive in detecting decreases in the variance. The difference between SS_e and SS_r is more substantial. The SS_e scheme is more sensitive than the SS_r scheme in detecting increases in the variance but substantially less sensitive in detecting decreases in the variance, especially when there is little or no change in the mean. The EWMA schemes were found to be substantially more sensitive than the Shewhart schemes except for the case when there is a big change in the variance. Finally, a simple design procedure for an EWMA bull's-eye chart was provided.
REFERENCES
1. INTRODUCTION
In many quality control settings the product under examination may have two or more related quality characteristics, and the objective of the supervision is to investigate whether all of these characteristics are simultaneously behaving appropriately. In particular, a standard multivariate quality control problem is to consider whether an observed vector of measurements x = (x₁, ..., x_k)′ from a particular sample exhibits any evidence of a location shift from a set of "satisfactory" or "standard" mean values μ⁰ = (μ₁⁰, ..., μ_k⁰)′. The individual measurements will usually be correlated due to the nature of the problem, so that their covariance matrix Σ will not be diagonal. In practice, the mean vector μ⁰ and covariance matrix Σ may be estimated from an initial large pool of observations x¹, ..., x^p.
other hand, if individual error rates of α/k are used, then the Bonferroni inequality ensures that the overall error rate is less than the nominal level α. However, this procedure is not sensitive enough, since the actual overall error rate tends to be much smaller than α because of the correlation between the variables.
A multivariate quality control procedure that can be successfully implemented in manufacturing processes should meet the goals of:
1. Controlling the error rate of false alarms
2. Providing a straightforward identification of the aberrant variables
3. Indicating the amount of deviation of the aberrant variables from their required values
In addition, for certain problems it is desirable that the multivariate quality control procedure:
4. Be valid without requiring any distributional assumptions.
An overview of the multivariate quality control problem can be found in Alt (1985). In this chapter some more recent work on the problem is discussed. Specifically, Section 2 considers the situation where the normality assumption is made, and the Hayter and Tsui (1994) paper is discussed together with work by Kuriki (1997). Section 3 considers the work on nonparametric multivariate quality control procedures by Liu (1995) and Bush (1996).
2.1. Confidence Intervals Procedure
Hayter and Tsui (1994) proposed a procedure that provides a solution to this identification problem and to the related problem of estimating the magnitudes of any differences in the location parameters from their standard values μ_i⁰. The procedure operates by calculating a set of simultaneous confidence intervals for the variable means μ_i with an exact simultaneous coverage probability of 1 − α. The process is deemed to be out of control whenever any of these confidence intervals does not contain its respective control value μ_i⁰, and the identification of the errant variable or variables is immediate. Furthermore, this procedure continually provides confidence intervals for the "current" mean values μ_i regardless of whether the process is in control or not or whether a particular variable is in control or not.
Let X ~ N_k(0, R), where R is a general correlation matrix with diagonal elements equal to 1 and off-diagonal elements given by ρ_ij, say, and define the critical point C_{R,α} by

P(|X_i| ≤ C_{R,α}, 1 ≤ i ≤ k) = 1 − α
In the more general case when X ~ N_k(μ, Σ) for any general covariance matrix Σ, let the diagonal elements of Σ be given by σ_i², 1 ≤ i ≤ k, and the off-diagonal elements by σ_ij. Then if R is the correlation matrix generated from Σ, so that ρ_ij = σ_ij/(σ_i σ_j), it follows that

P(|X_i − μ_i|/σ_i ≤ C_{R,α}, 1 ≤ i ≤ k) = 1 − α

However, this equation can be inverted to produce the following exact 1 − α confidence level simultaneous confidence intervals for the μ_i, 1 ≤ i ≤ k:

μ_i ∈ (X_i − C_{R,α}σ_i, X_i + C_{R,α}σ_i)
The variables whose confidence intervals do not contain μ_i⁰ are identified as those responsible for the aberrant behavior.
This simple procedure clearly meets the goals set in the introduction for a good solution to the multivariate quality control problem. An overall error rate of α is achieved, since when μ = μ⁰ there is an overall probability of 1 − α that each of the confidence intervals contains the respective value μ_i⁰. Also, the identification of the errant variables is immediate and simple, and furthermore, the confidence intervals allow the experimenter to assess the new mean values of the out-of-control variables. This is particularly useful when the experimenter can judge the process to be still "good enough" and hence allow it to continue.
2.2. Example
Consider first the basic multivariate quality control problem with k = 2, so that there are just two variables under consideration. In this case, the required critical point C_{R,α} depends only on the error size α and the one correlation term ρ₁₂ = ρ, say. In Tables B.1-B.4 of Bechhofer and Dunnett (1988), values of the critical point are given for α = 0.20, 0.10, 0.05, and 0.01 and for ρ = 0(0.1)0.9 (the required values for C_{R,α} correspond to the entries for p = 2 and ν = ∞). More complete tables are given by Odeh (1982), who tabulates the required critical points for additional values of α and ρ (the values C_{R,α} at k = 2 correspond to the entries at n = 2). Interpolation within these tables can be used to provide critical values for other cases not given. An alternative method is to use a computer program to evaluate the bivariate normal cumulative distribution function.
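Such a program is easy to sketch: assuming SciPy, the rectangle probability P(|X₁| ≤ c, |X₂| ≤ c) can be evaluated by inclusion-exclusion on the bivariate normal CDF and inverted numerically. For ρ = 0.6 and α = 0.05 this reproduces the critical point 2.199 quoted in the example below.

    import numpy as np
    from scipy.stats import multivariate_normal
    from scipy.optimize import brentq

    def critical_point(rho=0.6, alpha=0.05):
        mvn = multivariate_normal(mean=[0.0, 0.0],
                                  cov=[[1.0, rho], [rho, 1.0]])
        def coverage(c):
            F = mvn.cdf
            # P(-c <= X1 <= c, -c <= X2 <= c) by inclusion-exclusion
            return F([c, c]) - F([c, -c]) - F([-c, c]) + F([-c, -c])
        return brentq(lambda c: coverage(c) - (1 - alpha), 0.5, 5.0)

    print(critical_point())   # about 2.199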
As an example of the implementation of the procedure with k = 2, consider the problem outlined in Alt (1985) of a lumber manufacturing plant that obtains measurements on both the stiffness and the bending strength of a particular grade of lumber. Samples of size 10 are averaged to produce an observation x = (x₁, x₂)′, and standard values for these averaged observations are taken to be μ⁰ = (265, 470)′ with a covariance matrix of
In this case the correlation is ρ = 0.6, so that with an error rate of α = 0.05, the tables referenced above give the critical point as C_{R,α} = 2.199.
Following an observation x = (x₁, x₂)′, the simultaneous confidence intervals for the current mean values μ = (μ₁, μ₂)′ are given by
2.3. Independence Assumption
A general assumption of the multivariate quality control procedures is that observations obtained from the process under consideration can be taken to be independent of each other. Specifically, if a control chart based on Hotelling's T² statistic is employed, then it is assumed that the two statistics

T₁² = (x¹ − μ⁰)′S⁻¹(x¹ − μ⁰)

and

T₂² = (x² − μ⁰)′S⁻¹(x² − μ⁰)

obtained from two observations x¹ and x² of the process are independent of each other. Individually, these two statistics each have a scaled F-distribution, but any lack of independence between them may seriously affect the interpretation of the control chart.
Kuriki (1997) shows how the effect of a dependence between the variables can be investigated. In general, the joint cumulative distribution function of the statistics T₁² and T₂² is

P(T₁² ≤ z₁, T₂² ≤ z₂) = P(y₁′S⁻¹y₁ ≤ z₁, y₂′S⁻¹y₂ ≤ z₂)

where y_i = x^i − μ⁰.
3. NONPARAMETRIC PROCEDURES
[Figure 1: decision flowchart contrasting traditional testing techniques with distribution-free testing procedures.]
3.1. Nonparametric Multivariate Control Charts
Liu (1995) provides some nonparametric multivariate quality control procedures that follow the right-hand dotted line of Figure 1 in that they compare current observations with an initial pool of "in-control" observations. The main idea is to reduce the current multivariate observation to a univariate index that can be plotted on a control chart. Three types of control charts are suggested that are truly nonparametric in nature and can be used to detect simultaneously any location change or variability change in the process. Liu's procedures are motivated by the "depth" of current measurements within the initial pool of observations and are conceptually equivalent to the procedures described in Bush (1996) employing functional algorithms to calculate the scores that are described in detail in the following sections.
H₀: The new observation and the initial pool can be considered to be p + 1 observations from the same unknown distribution.
3.3. Variable Transformation
It is convenient to define testing procedures in terms of a set of transformed observations. If the initial pool and the new observation are combined to form a set of p + 1 observations, then let the sample average vector be X̄ = (X̄₁, ..., X̄_k) and the sample covariance matrix be S_c. The quality control methods require calculating a distance measure between various points, and a sensible way to do this is with the Mahalanobis distance, where the distance from x to x′ is defined to be

(x − x′)′S_c⁻¹(x − x′)
p-value = (p + 2 − R₀)/(p + 1)

This p-value reflects the proportion of the p + 1 observations that have scores S_i no smaller than S₀.
3.5. Decision Rules
The decision rules under which a process is declared to be out of control can be chosen by the engineers implementing the procedure. Notice that the p-value is limited by the number of observations in the pool. For example, if there are p + 1 = 100 observations and R₀ = 100, then the p-value for the procedure is 0.01, and the process can be declared to be out of control if the specified probability of a false alarm, α, is greater than or equal to 0.01. Traditionally, the specified error rate for a quality control procedure is often taken to be smaller than α = 0.01, which implies that for this nonparametric procedure a larger initial pool would be needed.
In addition to the consideration of individual p-values, "runs rules" may also be employed. In univariate control charts, several successive points on the same side of the centerline are often allowed to trigger a stopping rule
suggesting that there has been a change in the mean of the distribution. Similar runs rules may be adopted for these nonparametric procedures. For example, suppose that the p-values for a series of successive observations are each less than 0.20 but that none of the individual p-values is less than the specified α level. One might declare the process to be out of control on the basis that these new observations are all near the fringes of the initial pool of observations.
Runs rules can be designed to locate changes in either the mean or the variance of the distribution. Any appearance that a set of new observations is not "well mixed" within the initial pool suggests that the distribution may have changed. Changes in the mean imply changes in the location of the distribution and may be identified by a locational shift in the new observations. Changes in the covariance structure Σ should be indicated by changes in the shape of the distribution. Specifically, increases in the variance of a variable should be characterized by frequent observations outside or on the fringes of the distribution.
In conclusion, the consideration of the individual p-values of new observations together with an awareness of the location of the new observations relative to the initial pool of observations should allow an effective determination of out-of-control signals.
Functional Algorithms
With functional algorithms the scores are calculated from a series of comparisons of the observations y^i with each other. Specifically, the score S_i is a function of y^i = (y₁^i, ..., y_k^i) and every other point in the pool and can be written as

S_i = f(y^i; y⁰, ..., y^p)

The function is defined so that observations that are far from the center of the set of observations receive high scores while observations near the center receive low scores. Three possible choices for the function are described below.
1. The easiest method to consider is a score built from counts, using the indicator function over the k coordinates, of how many of the other points lie on either side of a particular observation; it is similar in spirit to a multivariate sign test. The score S_i will be close to zero for points in the center of the distribution, because at the center there are roughly an equal number of observations in every direction. At the perimeter, other observations tend to be to one side, and thus the score will be large. For these scores the magnitude of the difference between two points y^i and y^j is ignored, and only the direction of the difference is important. Note that there is a large potential for ties in the scores to occur with this method.
2. A second procedure is similar to the first except that the actual distances between points are used to calculate the scores. The score S_i is calculated as the sum of the Euclidean distances from y^i to every other point y^j, 0 ≤ j ≤ p, so that

S_i = Σ_{j=0}^{p} ||y^i − y^j||

Thus S_i is the sum of the p distances from y^i to all points in the combined pool. It is clear that the scores of the observations at the center of the group will tend to be lower than the scores for perimeter observations.
3. The scores obtained from the third method are calculated by comparing an observation y^i with a statistic based on the combined pool. This statistic, M = (M₁, ..., M_k), is chosen to be a "middle value" of the combined pool of observations, such as the mean vector or the median vector. Typically the scores can be calculated as the distances of the observations from this middle value, so that

S_i = (y^i − M)′S_c⁻¹(y^i − M)

Again, note that observations near the center of the pool will have small scores while observations on the perimeter will have larger scores. Note also that this method requires far fewer calculations than the first two (see the sketch below).
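A minimal sketch of the third score function together with the p-value defined above, assuming M is taken to be the pool mean vector and S_c the sample covariance of the combined pool (the function and variable names are ours).

    import numpy as np

    def score_pvalue(pool, new_obs):
        combined = np.vstack([pool, new_obs])          # the p + 1 observations
        M = combined.mean(axis=0)                      # "middle value"
        Sc_inv = np.linalg.inv(np.cov(combined, rowvar=False))
        d = combined - M
        S = np.einsum('ij,jk,ik->i', d, Sc_inv, d)     # (y - M)' Sc^{-1} (y - M)
        R0 = 1 + np.sum(S < S[-1])                     # ascending rank of new score
        n = combined.shape[0]                          # n = p + 1
        return (n + 1 - R0) / n                        # (p + 2 - R0)/(p + 1)

    rng = np.random.default_rng(0)
    pool = rng.normal(size=(99, 3))                    # initial pool, p = 99
    print(score_pvalue(pool, rng.normal(size=3) + 2.0))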
Linkage Algorithms
Linkage algorithms resemble a linking clustering algorithm in that the p + 1 observations are linked together one point at a time. The cluster begins at the center of the distribution and branches to all of the observations in the combined pool. Points are added to the cluster in succession until all p + 1 points are part of the cluster. The criterion for choosing the next point to be added to the cluster is that it should be the "closest" observation to the cluster. The distance to the cluster can be measured in several different ways, which are discussed below. The score S_i is defined to be equal to j when y^i is the jth point added to the cluster (note that in this case R_i = S_i). The first point to be added to the cluster can generally be taken to be the point closest to X̄. Observations closest to the center will tend to be added first, and those on the perimeter will be added last. Also, observations in heavily concentrated areas will tend to be added to the cluster before observations in sparsely concentrated areas, since in dense regions observations are closer together, and therefore observations will tend to be linked in succession once the first observation in that region has been added to the cluster.
When these linkage algorithms are applied it can be useful to construct a "center value" M, which is considered to be the first point in the cluster (although it may be removed from the cluster later). Three possible ways to decide the order in which observations are added to the cluster are described below.
1. If observation y^i is not already in the cluster, then it is added to the cluster if it is the closest (among all observations not already in the cluster) observation to any observation already in the cluster. In other words, for each observation y^i not already in the cluster, the minimum distance

D_ij = (y^i − y^j)′(y^i − y^j)
4. SUMMARY
REFERENCES
Autocorrelation in Multivariate Processes
John C. Young
McNeese State University, Lake Charles, Louisiana
1. INTRODUCTION
2. DETECTION OF AUTOCORRELATION IN MULTIVARIATE PROCESSES
[Figure 1: a process variable plotted against time.]
present. However, the noted trend is due to a "lurking" variable that has a seasonal component. Since the effects of such "lurking" variables, when they are known to exist, can be accounted for by making adjustments to the associated observable variable, the detection of these situations can be a great aid in the development of a proper control procedure for the process.
Detecting autocorrelation in univariate processes is accomplished by plotting the process variable against time. Depending on the nature of the autocorrelation, the plotted points will either move up or down or oscillate back and forth over time. Subsequent data analysis can be used to verify the time trend, determine lag times, and fit appropriate autoregressive models. The simple and straightforward method of graphing individual components against time can be inefficient when there are a large number of variables, and the interpretations can become confounded when these components are correlated. Despite these disadvantages, we have found that graphing each individual variable over time is still useful in multivariate processes. In addition to studying autocorrelation, it can lead to the discovery of other influential variables.
To augment the above graphical method and reduce the number of individual graphs that need to be produced, we additionally suggest that a time-sequence variable be added to the data set. If any of the other variables correlates with the time-sequence variable, it is highly probable that it correlates with itself over time. Using this approach, one can locate potential variables that are autocorrelated. Detailed analysis, including the graphing of the individual variable over time, will either confirm or deny the assertion for individual variables. Other techniques, such as that given in Tracy et al. (1993), also should be explored.
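A minimal sketch of this screening device on simulated data; the variable names and the drift mechanism are ours, purely for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200
    x1 = 0.1 * np.cumsum(rng.normal(size=n)) + rng.normal(size=n)  # drifts over time
    x2 = rng.normal(size=n)                                        # no time trend
    t = np.arange(1, n + 1)                                        # time-sequence variable

    corr = np.corrcoef(np.column_stack([x1, x2, t]), rowvar=False)
    print(corr[-1, :-1])   # correlation of each process variable with time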
3. VARIOUS FORMS OF AUTOCORRELATION
Figure 2 Life cycles over time.
T² = (X − X̄)′S⁻¹(X − X̄) = T₁² + T²_{2·1} + ⋯ + T²_{p·12⋯(p−1)}

where X̄ and S are the usual estimates of the population mean vector and covariance matrix obtained by using an in-control historical data set. In this procedure [see Mason et al. (1995) for a complete description], the first component of a particular decomposition, termed the unconditional term, is used to determine whether the observation on the jth variable of a signaling data vector is within the operational range of the process. The general form of the jth unconditional T² is given by

T_j² = (x_j − x̄_j)²/s_j²

The conditional terms are the squares of the jth variable adjusted by the estimates of the mean and variance of the conditional distribution of x_j given x₁, x₂, ..., x_{j−1}.
The ordering of the components in the data vector determines the representation of each term of the decomposition. As pointed out by Mason et al. (1995), there are p! different arrangements of the p components of a data vector, and these lead to p! decompositions, each consisting of p terms. Mason and Young (1997) show that the unique terms of all such decompositions will contain all possible regressions of an individual variable on all possible subgroups of the remaining p − 1 variables. For example, the first component, x₁, of a three-dimensional data vector would be regressed against all possible subgroups of the other two variables. These regressions and the corresponding conditional T² terms are presented in Table 1. Using the tabulated results, a control procedure based on the T² statistic can be developed for a set of process variables that exhibit uniform time decay in the observations and, at the same time, are correlated with other process variables. As an example, consider a bivariate vector (x, y) where the variable y exhibits a first-order autoregressive relationship [i.e., AR(1)]. Note that the observations are actually of the form (x_t, y_t, y_{t−1}), where t represents the time sequence of the data. The AR(1) relationship for y can be represented in model form as

y_t = b₀ + b₁y_{t−1} + ε_t    (3)
where b₀ and b₁ are unknown constants and ε_t is a random error. If y were being monitored while its relationship with x was ignored, a signal would be produced when the observed value of y was not where it should be as predicted by the estimate of the model in Eq. (3). However, if one chooses to examine the value of y adjusted for the effect of x and the time dependency, a model of the form

y_t = b₀ + b₁y_{t−1} + b₂x_t + ε_t

can be used. The reconstructed data vector would be of the form (x_t, y_t, y_{t−1}). The use of such time-dependent models requires process knowledge and an extensive investigation of the historical data.
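A hedged sketch of the comparison implied here: T² computed on (x_t, y_t) alone versus on the reconstructed vector (x_t, y_t, y_{t−1}), using a simulated in-control history (all names and parameter values are ours).

    import numpy as np

    def t2(obs, hist):
        xbar = hist.mean(axis=0)
        S_inv = np.linalg.inv(np.cov(hist, rowvar=False))
        d = obs - xbar
        return d @ S_inv @ d

    rng = np.random.default_rng(3)
    y = [0.0]
    for _ in range(300):                     # AR(1): y_t = 0.8 y_{t-1} + e_t
        y.append(0.8 * y[-1] + rng.normal())
    y = np.array(y)
    x = 0.5 * y[1:] + rng.normal(size=300)   # x correlated with y

    hist2 = np.column_stack([x, y[1:]])              # (x_t, y_t)
    hist3 = np.column_stack([x, y[1:], y[:-1]])      # (x_t, y_t, y_{t-1})
    print(t2(hist2[-1], hist2[:-1]), t2(hist3[-1], hist3[:-1]))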
proposed control chart procedure. These include three process variables, labeled TEMP, L3, and L1, and a measure of feed rate, labeled RP1. All, with the exception of feed rate, show some type of time dependency.
Temperature measurements are available from many different locations on a reactor, and together these play an important role in the performance and control of the reactor. To demonstrate the time decay in all of the measured temperatures, we present in Figure 4 a graph of their average over a good production run. The plot indicates that the average temperature of the reactor gradually increases over the life cycle of the unit.
Figures 5 and 6 contain graphs of the other two process variables, L3 and L1, over time. The decay effect for L3 in Figure 5 has the appearance of an AR(1) relationship, while that for L1 in Figure 6 has the appearance of some type of quadratic (perhaps a second-order quadratic) or an exponential autoregressive relationship.
Feed flow (RP1) to a reactor consists of three components: the flows of O₂, HCl gas, and C₂H₄. However, since these components must be fed in at a constant ratio, one graph is sufficient to illustrate the feed. During a run
Figure 5 L3 versus time.
Figure 6 L1 versus time.
cycle, the feed to the reactor is somewhat consistent and does not systematically vary with time. This is illustrated in Figure 7.
The correlation matrix for the four variables RP1, L1, L3, and TEMP, including the first-order lag variables for L1, L3, and temperature (LL1, LL3, and LTEMP), is presented in Table 3. Note the very strong lag correlation for the three process variables. For example, L1 has a correlation of 0.93 with its lag value, indicating that over 80% of the variation in this variable can be explained by the relationship with its lag value. This strong correlation implies that an AR(1) model is a good approximation to the true time dependency. Also, note the strong relationship between L1 and the lag of the temperature. The correlation of 0.80 implies that over 64% of the variation in the present value of L1 can be explained by the temperature of the unit during the last sampling period.
To see the effect of these time-lag variables on a T² control procedure, we will compare the T² values obtained with and without the lag variables. For comparison purposes, we denote the T² based on the chosen four variables RP1, L1, L3, and TEMP by T₄² and the T² based on all seven variables, including the three lag variables LL1, LL3, and LTEMP, by T₇². Assume that each observation vector is represented
Figure 7 RP1 versus time.
Observation No.    T₄² (critical value = 39.19)    T₇² (critical value = 28.73)
1                  4.75                            16.98
3                  24.27                           37.28
5                  27.82                           39.10
7                  25.27                           37.18
8                  13.51                           31.74
11                 19.49                           20.76
[entries for the remaining observations, 1-14, are illegible in this reproduction]
where Σ_ii represents the covariance structure of the observations for the ith stage, i = 1, 2, 3, and Σ_ij, i ≠ j, denotes the covariance structure of the observations between stages. Using a historical data set, standard estimates (X̄, S) of the unknown population parameters (μ, Σ) can be obtained, and a control procedure based on an overall T² can be developed. Note that the estimates are partitioned in the same fashion as the parameters.
As an example of the proposed control procedure, suppose a new observation, X, is taken on a given unit in its third stage. The overall T² for this observation is given by
[Table: components of the overall T² and the interpretation of each component.]
7. SUMMARY
REFERENCES
Alt FB, Deutch SJ, Walker JW. (1977). Control charts for multivariate, correlated observations. ASQC Technical Conference Transactions. Milwaukee, WI: American Society for Quality Control, pp 360-369.
Mason RL, Young JC. (1999). Improving the sensitivity of the T² statistic in multivariate process control. J Qual Technol 31. In press.
Mason RL, Tracy ND, Young JC. (1995). Decomposition of T² for multivariate control chart interpretation. J Qual Technol 27:99-108.
Mason RL, Tracy ND, Young JC. (1996). Monitoring a multivariate step process. J Qual Technol 28:39-50.
Mason RL, Tracy ND, Young JC. (1997). A practical approach for interpreting multivariate T² control chart signals. J Qual Technol 29:396-406.
Montgomery DC. (1991). Introduction to Statistical Quality Control. New York: Wiley.
Montgomery DC, Mastrangelo CM. (1991). Some statistical process control methods for autocorrelated data. J Qual Technol 23:179-193.
Tracy ND, Mason RL, Young JC. (1993). Use of the covariance matrix to explore autocorrelation in process data. In: Proceedings of the ASA Section on Quality and Productivity. Boston, MA: American Statistical Association, pp 133-135.
Capability Indices for Multiresponse
Processes
Alan Veevers
Commonwealth Scientific and Industrial Research Organization,
Clayton, Victoria, Australia
1. INTRODUCTION
2. CAPABILITY STUDIES, PROCESS MONITORING, AND CONTROL
the process mean is not optimally targeted. The information provided by the C_p value tells us that the process is potentially capable without further need to reduce variation. Process performance will be improved, monitored by C_pk, by suitably adjusting the target for the process mean.
Similar considerations apply to multiresponse capability indices. Specifically, there is a clear justification for developing analogs of C_p for the multivariate case that, of course, take no account of targeting. Such an index will measure the potential of the process to meet specifications (addressing question 1) but will not, by intent, measure actual performance. Different measures must be devised for the latter purpose.
Another source of confusion arises when process capability and process control issues are not separated. An illustration of the point being made here is based on the following example. During the 1997 Australian Open Tennis tournament, some of the top players complained about the quality of the balls being used. International regulations specify that they shall weigh not less than 56.7 g and not more than 58.5 g and must be between 6.35 cm and 6.67 cm in diameter. The tennis ball production process must be set to achieve both these specifications simultaneously. This defines a rectangular specification region for the bivariate quality measure consisting of the weight and diameter of a tennis ball. A small sample of measurements on ordinary club tennis balls was obtained that showed a correlation of 0.7 between weight and diameter. This information was used to contrive the situation shown in Figure 1 to illustrate the difference between capability and control considerations. Suppose that a period of stable production produced data approximately following a bivariate normal distribution with a correlation coefficient of 0.7. A 99% probability ellipse for such a distribution is shown in Figure 1. Now suppose that the next two measured balls are represented by the + signs in the figure. Two conclusions can be drawn: first, that the process has gone out of statistical control, and second, that the two new balls are perfectly capable of being used in a tournament. In fact, the two new balls are arguably better, in the sense of being nearer to the center of the specification region, than any of the balls produced in the earlier stable phase.
From the process control point of view, the out-of-control signals must be acted upon and steps taken to bring the process back into stable production. Multivariate process control techniques, such as that introduced by Sparks et al. [5] or those discussed in a previous chapter of this book, are available for this purpose. Based on multivariate normal theory, ellipsoidal control regions form the natural boundaries for in-control observations. Points falling outside the control region are usually interpreted as meaning that something has gone wrong with the process. From the process capability point of view, it is whether or not production will consistently meet specifications that is of primary importance. In this case, the fact that the region bounding the swarm of data points may be ellipsoidal is of minor importance. The main concern is whether or not it fits into the specification region. Capability indices are not tools for process control and should not be thought of as measures by which out-of-control situations are detected. They are simply measures of the extent to which a process could (potential) or does (performance) meet specifications. Issues of control and capability need to be kept separate; otherwise unnecessary confusion can occur. For example, although correlation is of critical importance in control methodology, it is largely irrelevant for many capability considerations.
3. MULTIVARIATE CAPABILITY INDICES
region. Within those groups there are indices that measure capability poten-
tial and some that measure capability performance.
Let X_q = (X₁, X₂, ..., X_q)′ represent the vector of q quality characteristics, and suppose that an adequate model for X_q under stable process conditions is multivariate normal with mean vector μ and variance-covariance matrix Σ. Taking the widely accepted value of 0.27% to be the largest acceptable proportion of nonconforming items produced, a process ellipsoid

(X − μ)′Σ⁻¹(X − μ) = c²
any type, and exploration of the posterior predictive distribution of C_p given D is limited only by available computing power.
Most of the above indices are not easy to use in practice and present difficult problems in the exploration of their sampling distributions. Two approaches that do not suffer from this are given by Boyles [14] and Veevers [15]. Boyles moves away from capability assessment and promotes capability improvement by using exploratory capability analysis. Further developments in this area are described by Boyles (in the present volume). Veevers' approach is based on the concept of process viability, which is discussed in the next section.
4. PROCESS VIABILITY
Veevers [15, 16] realized the difficulties associated with extensions of C_p and C_pk to multiresponse processes and concluded that the reasons lay in the logic underlying the structure of C_p and C_pk. This led to the notion of process viability as a better way of thinking about process potential than the logic underlying C_p. He introduced the viability index first for a single-response process and then for a multiresponse process.
Basically, viability is an alternative to capability potential, leaving the word "capability" to refer to capability performance. For a single-response process it is easy to envisage a window of opportunity for targeting the process mean. Consider the process distribution, which need not be normal, and, conventionally, identify the lower 0.00135 quantile and the upper 0.99865 quantile. Place this distribution on a scale of measurement that has the lower and upper specification limits (LSL and USL, respectively) marked on it, with the lower quantile coincident with the LSL. If the USL is to the right of the upper quantile, slide the distribution along the line until the upper quantile coincides with the USL. The line segment traced out by the mean of the distribution is the window of opportunity for targeting the mean. The interpretation of the window is that if the mean is successfully targeted anywhere in it, then the proportion of nonconforming items will be no greater than 0.27%. A process for which a window of opportunity such as this exists is said to be viable; i.e., all that needs to be done is to target the mean in the allowable window. If, however, the USL is to the left of the upper quantile (after the first positioning), then there is clearly more variation in the response than is allowed for by the specifications, and the process is not viable. Sliding the distribution to the left until the upper quantile and the USL coincide causes the mean to trace out a line segment that, this time, can be thought of as a "negative" window of opportunity for targeting the mean. Referring to the length of the window, in both cases, as w, the viability index is

V_r = w / (USL - LSL)
Figure 2 The window of opportunity (dotted rectangle) for targeting the mean for
a viable bivariate process. The solid rectangle is the specification region.
For a q-response process the viability index is the product of the marginal viabilities,

V_rq = Π_{i=1}^{q} V_r(X_i)
where V_r(X_i) is the viability index for the ith quality characteristic X_i. For nonviable processes, Veevers [15] defines negative windows of opportunity in such a way as to ensure that the viability value obtained for a (q - 1)-dimensional process is the same as would be obtained from the q-dimensional process by setting the marginal variance of the qth characteristic equal to zero. Hence, V_rq is defined in all nonviable cases to be

V_rq = 1 - Π_{i=1}^{q} (1 - V_r(X_i))^{δ_i}

where

δ_i = 0 if V_r(X_i) ≥ 0
δ_i = 1 if V_r(X_i) < 0
As with any index for multiresponse processes, the viability index is best used in a comparative fashion. In a process improvement campaign the viabilities can be compared after each improvement cycle, thus providing a simple measure of the progress being made. V_rq depends only on the marginal viabilities and is therefore independent of the correlation structure of X_q. The correlation coefficients do, however, affect the proportion of nonconforming items that would occur if the process was in production. If an upper bound of 0.27% is required, then a conservative choice of quantiles to use for the calculation of the marginal viabilities is 0.00135/q and 1 - 0.00135/q. The specific choice in an improvement campaign is unimportant, since the emphasis is on changes in V_rq rather than the proportion nonconforming.
Having had some experience with multiresponse viability calculations, the following modification to the V_rq index is proposed. First, note that a viable process with, say, q = 6 and marginal viabilities of 0.25 each (corresponding to C_p values of 1.33) has V_rq = 0.00024. It is difficult to relate this small number to the reasonable level of viability it represents. Further, it depends on q, and for larger values of q the viability index would be very small. These difficulties can be overcome by defining a modified index

V*_rq = (V_rq)^(1/q)
for viable processes. This has the benefit of being interpretable on the scale of V_r, independently of q. For nonviable processes, V_rq is negative, so V*_rq must be defined as

V*_rq = sign(V_rq) |V_rq|^(1/q)

which is also valid for viable processes and provides a general definition of V*_rq. A plot of V*_rq for a two-response process is shown in Figure 3. If desired, V*_rq can be converted to a capability potential index, C*_pq, by C*_pq = 1/(1 - V*_rq).
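A minimal Python sketch of these quantities (my illustration, assuming the marginal viabilities have already been computed and, for v_rq, that the process is viable so that the simple product applies; recall that for a normal response V_r = 1 - 1/C_p):

import numpy as np

def v_rq(marginals):
    """Multiresponse viability for a viable process: product of marginals."""
    return float(np.prod(marginals))

def v_star(marginals):
    """Modified index V*_rq = sign(V_rq) |V_rq|^(1/q)."""
    q = len(marginals)
    v = v_rq(marginals)
    return float(np.sign(v) * abs(v) ** (1.0 / q))

def c_star(marginals):
    """Capability potential C*_pq = 1/(1 - V*_rq)."""
    return 1.0 / (1.0 - v_star(marginals))

print(v_rq([0.25] * 6))                      # 0.000244..., the 0.00024 cited above
print(v_star([0.147, 0.185, 0.111, 0.137]))  # ~0.143, as in the example below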
Viability calculations are illustrated in the following example used by Sparks et al. [5] to demonstrate the dynamic biplot for multivariate process monitoring. A flat rolled rectangular metal plate is supposed to be of uniform thickness (gauge) after its final roll. Measurements are made at four positions on the plate, giving a four-dimensional response for the process. The positions can be conveniently referred to as FL (front left), FR (front right), BL (back left), and BR (back right). The original data are subject to a confidentiality agreement, so they have been transformed before being plotted as pairwise scatter diagrams in Figure 4. Typical specification limits are superimposed, but it must be remembered that this is being done to visualize process dispersion relative to specifications and does not represent actual process performance with respect to targeting. The two-, three-, and four-dimensional specification regions are squares, cubes, and a hypercube, as appropriate.
The individual viabilities for FL, FR, BL, and BR are calculated as 0.147, 0.185, 0.111, and 0.137, respectively. This implies the existence of a positive window of opportunity for targeting the mean and gives V_r4 = 0.000415 and V*_r4 = 0.143. Using the relationship between viability and capability potential given above, this corresponds to C*_p4 = 1/(1 - 0.143) ≈ 1.17.
Figure 4 Pairwise scatter diagrams of thickness data at four locations (FL, FR, BL, BR) on 100 metal sheets. Specification rectangles are superimposed.
5. PRINCIPAL COMPONENT CAPABILITY
each of which could be used in its own right as a capability index. However, it seems a sensible compromise to take the average of these two as a measure of capability potential. Hence, a principal component capability index is defined as

with entries rounded to four decimal places. From this, λ̂ = 0.1105, giving C_pc = 1.21. As an absolute value this should be interpreted with caution, but for process improvement purposes it is useful as a comparative value. A sample of 25 items from a batch produced under slightly different conditions gave λ̂ = 0.086 and C_pc = 1.37, showing a marked improvement. The manufacturer's aim is to keep the process at these conditions, which show it to be potentially capable, and then concentrate on targeting at the nominal values to ensure a capable performance.
6. CONCLUSION
REFERENCES
R. Gnanadesikan
Rutgers University, New Brunswick, New Jersey
J. R. Kettenring
Telcordia Technologies, Morristown, New Jersey
1. INTRODUCTION
observations that are more similar within groups than across them. This
setting is also known as unsupervised learning. There are, of course, many
real-world situations that fall between the two scenarios, and often one needs a combination of the two approaches to find useful solutions to the problem at hand. For instance, while the early development of so-called neural networks, which basically are automatic classifiers implemented in either software or hardware, focused on supervised learning methods, the current uses of these encompass both supervised and unsupervised learning algorithms.
This chapter has three objectives. First, taking a broad view of business and industry, it seeks to identify a variety of aspects of such enterprises, as well as examples of specific problems arising in such facets, wherein classification and clustering techniques are used to find appropriate solutions. Second, using the theme of quality and productivity as a focus, it describes a sample of applications (drawn from both the literature and our experience) in which this theme is a clear objective of using such techniques. Third, it is aimed at discussing some methodological issues that cut across applications and need to be addressed by practitioners to ensure effective use of the methods as well as by researchers to improve the options available to practitioners.
More specifically, Section 2 identifies areas of business and industry, as well as some specific examples of problems in such areas, where classification and clustering techniques have been used. It also describes in a bit more detail a subset of the examples where assessment and improvement of quality, efficiency, and/or productivity are explicitly involved as a goal of the analysis. Section 3 discusses some general methodological issues that need to be considered. Section 4 consists of concluding remarks.
Perhaps the better known industrial applications of pattern recognition, including some that were mentioned in the introduction, are in manufacturing. However, one can identify a number of facets that are integral parts of business and industry as a whole and give rise to problems that are amenable to the meaningful use of pattern recognition methods. Table 1 contains a partial list of different facets of a business enterprise and some specific examples of applications of classification and clustering methods in each category. A subset of the examples (identified by asterisks) in Table 1, in which quality, efficiency, or productivity is an explicit goal, is discussed in more detail in the subsections that follow.
Marketing
Use of cluster analysis for market segmentation on the basis of geodemographic similarity [see, e.g., Chapter 12 of Curry (1993)] and the recent development of database marketing
*Use of cluster analysis for identifying "lead users" and for product development in light of the needs of such lead users (Urban and Von Hippel, 1988)

Resource allocation
Utilization of robotics (entailing the recognition of "shapes" and "sizes" of objects to be assembled into a product) in assembly line manufacturing, with gains in quality and productivity arising from decreased variability and speed as well as lower costs in the long run [see, e.g., Dagli et al. (1991)]
Niche applications of neural networks for such things as speech and writing recognition (e.g., voice-activated dialing of telephones; automatic verification of payments of bills paid by customers via checks)
Use of cluster analysis for grouping similar jobs prior to the development of regression models for aiding assessment and improvement of utilization of computing resources (Benjamin and Igbaria, 1991)
*Use of cluster analysis in the development of a curriculum that better meets job needs and is likely to enhance worker productivity (Kettenring et al., 1976)

Software engineering
Use of fuzzy clustering to improve the efficiency of a database querying system (Kamel et al., 1990)
Use of discriminant analysis for predicting which software modules are error-prone (Conte et al., 1986)
*Use of neural networks for "clone" recognition in large software systems (Carter et al., 1993; Barson et al., 1995)

Strategic planning
*Use of cluster analysis for identifying efficient system-level technologies (Mathieu, 1992; Mathieu and Gibson, 1993)
2.1. Finance
As noted in Table 1, classification and clustering are used to establish categories of comparable risk so as to determine appropriate rates of return.
Historically, and particularly during the 1970s, one role of governmental regulatory bodies in the United States was to set allowed rates of return on equity for the companies they regulated. The regulated companies argued that in order to attract investors they needed higher rates of return, while the regulators felt pressured to keep them low. An accepted tenet for resolving the two conflicting aims was that the rate of return should be commensurate with the "risk" associated with the firm. For implementing this principle, one formal approach employs the capital assets pricing model espoused by Lintner (1965), Markowitz (1959), and Sharpe (1964). Chen et al. (1973) took a different and more empirical approach by using data concerning several variables that are acknowledged to be risk-related (e.g., debt ratio, price/earnings ratio, stock price variability) and finding companies with similar risk characteristics that could then be compared in terms of their rates of return. Standard & Poor's COMPUSTAT database pertaining to over 100 utilities and over 500 industrials was the source, and a particular interest of the analysis was to compare AT&T's rate of return within the group of firms that shared its risk characteristics.
At an initial, general level of analysis, Chen et al. (1973) addressed the question of AT&T's classification as belonging to either the utility group or the industrial group through the use of discriminant analysis. They found strong evidence that AT&T belonged with the industrials. To provide a different look, one could use cluster analysis to find groups of firms with similar risk features and further investigate the particular cluster to which AT&T belongs. Since the primary interest of the authors was in the latter, and also partly because the number of firms was large, an attempt was made to find a "local" cluster near AT&T in terms of the risk measures rather than clustering all the firms [see Cohen et al. (1977) for details of the algorithm involved]. This analysis led to detecting a cluster of 100 industrial firms with risk comparable to AT&T's. In terms of the performance measure of rate of return, AT&T's value was found to lie below the median of the rates of return of this cluster, thus providing a quantitative basis for arguing for a higher rate of return.
2.2. Marketing
In market research, classification and clustering can serve as aids in product
development in light of the needs of lead users.
Urban and Von Hippel (1988) describe an innovative approach to product development in situations where the technology may be changing very rapidly. Efficiency in developing a product with an eye to capturing a significant share of the market is the desired goal. The efficiency arises from studying a carefully chosen subset of the potential market and yet ending up having a product that is likely to satisfy the needs of and be adopted by a much larger group of customers. The main steps of the approach proposed by Urban and Von Hippel are to use cluster analysis for identifying a set of "lead" users of the product, then seek information from such users about what features and capabilities they would like the product to have, and finally apply this information not only to develop the product but also to test its appeal and utility for a wider group of users. The specific product used to illustrate the approach is software for computer-aided design of printed circuit boards (PC-CAD). Careful choice of variables that are likely to indicate "lead" users is a key part of and reason for the success of the initial cluster analysis. Variables used included measures of in-house building of PC-CAD systems, willingness to adopt systems at early stages of development, and degree of satisfaction with commercially available systems. A total of 136 firms were clustered on the basis of such variables. Both two- and three-cluster solutions were studied, and the former was chosen as satisfactory, with one of the two clusters being predominantly "lead" users. Treating the two clusters as if they were prespecified (i.e., in the discriminant analysis framework, for instance), the authors report that the fraction correctly classified in the two clusters was almost 96%. More interesting, when information gathered from the lead users was used to design a new PC-CAD system and this new design was presented to the participants in the study, about 92% of the lead user group and 80% of the non-lead group rated it as their first choice! Urban and Von Hippel (1988) also discuss the advantages and disadvantages of their lead-user methodology in general contexts.
jobs and training needs from a sample of n workers engaged in the job at which the training is directed, and (3) cluster analysis of the resulting (p × n) matrices in various ways. In one analysis, the p = 169 rows of a matrix indicating elements performed on the job by the sample of n = 452 workers yielded insights into clusters of elements of the job that fit together and might potentially be taught together as a module. These helped identify gaps in the existing curriculum where new resources were needed. In another analysis, the n = 452 workers were clustered into groups with common training needs. The range of needs across the clusters suggested that a training program with flexible options would be an efficient way to train the workers.
2.4. Software Engineering
Carter et al. (1993) (see also Barson et al., 1995) tackle the problem of clone detection in large telecommunications software systems. A clone is a unit of software source code that is very similar to some other unit of code in the same system. In large systems with a long history, it may happen that there are several clones of the same piece of software. These can unnecessarily inflate the size of the overall system and make it less efficient to maintain. For example, should there be a fault in one of the clones, it would probably be present and need to be corrected in the others as well.
The two papers mentioned above discuss different neural network approaches to software clone detection. In Carter et al. (1993), an unsupervised neural net is used to form clusters of software units based on a set of features or variables. The variables characterize different aspects of a unit of source code, such as its physical layout. New units of code can be compared against existing clusters to see if they fall within one of these clusters. The overall approach is attractive, even though it does not yet appear to have been widely applied.
2.5. Strategic Planning
Mathieu (1992) (see also Mathieu and Gibson, 1993) discusses an interesting use of cluster analysis for prioritizing critical technologies in national policy making and guiding the choice of an efficient system-level technology. One of the prime difficulties in such situations is the interdependencies among the technologies. This work claims to be the first in the literature to provide a systematic quantitative method for explicitly identifying "high performance" technologies for aiding national policy making. As stated by Mathieu, "the purpose of using cluster analysis in technology planning is to determine natural groupings of system level technologies based upon the
scientific interdependencies that link these technologies." The particular application involved space-related technologies, as the figure below illustrates.
[Figure: clusters of space technologies (inboard satellite communications equipment, scientific satellites, detection and tracking, remote sensing, transmission equipment) plotted against average sales growth.]
3. SOME STATISTICAL METHODOLOGICAL ISSUES
The discussion in the previous section was designed to leave the impression that methods of pattern recognition are used in many facets of business and are having considerable impact on matters of quality and productivity. Indeed, if one takes a reasonably holistic view of quality management, it is not a stretch to conclude that these methods are a potent part of the arsenal of tools for quality improvement.
At the same time, practitioners of these methods need to be aware of the care that is necessary for their successful use. The applications literature, unfortunately, is not reassuring in this regard; subtle details are seldom discussed, and canned programs appear to be heavily, even totally, relied upon.
The difficulties start at the earliest part of the analysis when a commitment is made to what data and which variables to use. The temptation is to include every variable of possible value to avoid missing out on an important one. The price one pays for this ranges from a needlessly watered down analysis to full-blown distortion of the results. In cluster analysis, the risk is particularly severe: Clear-cut clusters confined to a subspace of the variables can be completely overlooked.
The traditional methods of discriminant analysis have the nice mathematical property of being invariant under nonsingular linear transformation of the data. However, in most cluster analysis procedures, this is not the case. There is explicit or implicit commitment to a metric that at one extreme may be invariant but otherwise without rationale (as when one uses the total covariance matrix of the entire data set to form a weighting matrix for the metric) and at the other may involve no reweighting of the variables and therefore no such invariance (as in the case of Euclidean distance). An intermediate, and far too popular, example is autoscaling, or weighting to equalize the total sample variances of all the variables. This works against detecting clusters by all methods that take autoscaled data, or distances derived from them, as input. Rather than putting the variables on an equal footing according to their within-cluster variation (which is what effective clustering requires), autoscaling downweights precisely those variables whose total variance is inflated by real cluster separation.
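A tiny numerical illustration (mine, not from the chapter) of the autoscaling problem:

import numpy as np

rng = np.random.default_rng(1)
# variable x carries the cluster structure; variable y is pure noise
x = np.concatenate([rng.normal(-3, 1, 100), rng.normal(3, 1, 100)])
y = rng.normal(0, 1, 200)

gap_raw = 6 / 1.0         # cluster separation in within-cluster SD units
gap_scaled = 6 / x.std()  # separation after autoscaling x to unit total variance
print(round(x.std(), 2), round(y.std(), 2), round(gap_scaled, 2))
# x's total SD is ~3.2 because of the clusters themselves, so autoscaling
# shrinks the between-cluster gap roughly threefold relative to the noise
# variable y, which keeps its unit scale.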
4. CONCLUDING REMARKS
REFERENCES
1. GENESIS
2. PROCESS CAPABILITY INDICES
Process capability indices are used to assess a process's ability to meet a set of requirements. When used correctly these indices provide a measure of process performance that in turn can be used in the ongoing assessment of process improvement. Indices allow statistically based inferences to be used in the assessment of process capability as well as in the identification of changes in the ability of the process to meet requirements.
It is generally acknowledged that Japanese companies initiated the use of process capability indices when they began relating process variation to customer requirements in the form of a ratio. The ratio, now referred to as the process capability index, is defined to be
C_p = (USL - LSL) / (6σ)

where the difference between the upper specification limit (USL) and the lower specification limit (LSL) provides a measure of allowable process spread (i.e., customer requirements) and 6σ, σ² being the process variance, a measure of the actual process spread (see Fig. 1).
C_p uses only the customer's USL and LSL in its assessment of process capability and fails to consider a target value. The five processes depicted by
[Figure 1: allowable process spread (USL - LSL) versus actual process spread (6σ).]
C_pm = (USL - LSL) / (6[σ² + (μ - T)²]^(1/2))

C_pu = (USL - μ) / (3σ)

C_pl = (μ - LSL) / (3σ)

C_pk = min(C_pu, C_pl)

and

C'_pk = (1 - k)C_p
For a target value T not necessarily at the midpoint of the specification limits, the generalized analogs are

C_pm = min[USL - T, T - LSL] / (3[σ² + (μ - T)²]^(1/2))

C_pu = ((USL - T)/(3σ))(1 - |μ - T|/(USL - T))

C_pl = ((T - LSL)/(3σ))(1 - |μ - T|/(T - LSL))

and

C_pk = min(C_pu, C_pl)

Note that the original definitions of C_pm, C_pu, C_pl, and C_pk are special cases of the generalized analogs with T = (USL + LSL)/2.
The process capability indices C_p, C_pu, C_pl, C_pk, and C_pm and their generalized analogs belong to the family of indices that relate customer requirements to process performance as a ratio. As process performance improves, through either reductions in variation and/or moving closer to the target, these indices increase in magnitude for fixed customer requirements. In each case larger index values indicate a more capable process.
Many modifications to the common indices, as well as several newly developed indices, have been proposed but are not widely used in practice. With remarkably few exceptions these recent developments can be represented using the generic process capability index

C_pw = min[USL - T, T - LSL] / (3[σ² + w(μ - T)²]^(1/2))

For example, the weight function

w = k(2 - k) / [(1 - k)²ρ²]

where ρ = (μ - T)/σ, recovers C_pk. Another recent index of this type is

C_pmk = min[USL - μ, μ - LSL] / (3[σ² + (μ - T)²]^(1/2))
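For reference, here is a direct Python transcription of these indices (a sketch; μ and σ are treated as known, whereas in practice they are estimated from stable-process data):

import numpy as np

def c_p(usl, lsl, sigma):
    return (usl - lsl) / (6 * sigma)

def c_pk(usl, lsl, mu, sigma):
    return min(usl - mu, mu - lsl) / (3 * sigma)

def c_pm(usl, lsl, t, mu, sigma):
    return min(usl - t, t - lsl) / (3 * np.sqrt(sigma**2 + (mu - t)**2))

def c_pw(usl, lsl, t, mu, sigma, w):
    """Generic index: w = 1 recovers C_pm; w = 0 gives a C_p-type index."""
    return min(usl - t, t - lsl) / (3 * np.sqrt(sigma**2 + w * (mu - t)**2))

# with T at the midpoint, C_pw with w = 1 reduces to C_pm:
print(c_pm(1.2, 0.8, 1.0, 1.05, 0.05), c_pw(1.2, 0.8, 1.0, 1.05, 0.05, 1.0))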
3. INTERPRETING PROCESS CAPABILITY INDICES
Figure 5 Five processes with equivalent nonconforming but different values of C_p.
(not proximity to the target) and are therefore consistent with the square-well loss function.
Taguchi uses the quadratic loss function (see Fig. 7) to motivate the idea that a product imparts "no loss" only if that product is produced at its target. He maintains that small deviations from the target result in a loss of quality and that as the product increasingly deviates from its target there are larger and larger losses in quality. This approach to quality and quality assessment is different from the traditional approach, where no loss in quality is assumed until the product deviates beyond its upper or lower specification limit (i.e., square-well loss function). Taguchi's philosophy highlights the need to have small variability around the target. Clearly in this context the most capable process will be one that produces all of its product at the target, with the next best being the process with the smallest variability around the target.
The motivation for C_pm does not arise from examining the number of nonconforming product in a process but from looking at the ability of the process to be in the neighborhood of the target. This motivation has little to do with the number of nonconforming, although upper bounds on the number of nonconforming can be determined for numerical values of C_pm. The relationship between C_pm and the quadratic loss function and its affinity with the philosophies that support a loss in quality for any departure from the target set C_pm apart from the other indices.
C_pk and C_pm are often called second generation measures of process capability whose motivations arise directly from the inability of C_p to consider the target value. The differences in their associated loss functions demarcate the two measures, while the magnitudinal relationships between
C_p and C_pk and C_pm are also different. C_pk and C_pm are functions of C_p that penalize the process for not being centered at the target. Expressing C_pm and C_pk as

C_pm = C_p / [1 + ((μ - T)/σ)²]^(1/2)

and

C_pk = (1 - 2|μ - T| / (USL - LSL)) C_p
4. ANALYZING PROCESS CAPABILITY STUDIES
or

Ĉ'_pk = (1 - k̂)Ĉ_p
4.1. Confidence Intervals
Several inferential techniques have recently been developed, most of which have had little impact on the practice of judging a process capable. In defense of the practitioners, several notable texts promote the use of estimates as parameters with the proviso that large sample sizes (i.e., n > 50) are required. A general confidence interval approach for the common indices can be developed using C_pw and its associated estimator Ĉ_pw. The general form of the estimator for C_pw is

Ĉ_pw = (USL - LSL) / (6[σ̂² + w(x̄ - T)²]^(1/2))

where σ̂² = Σ_{i=1}^{n} (x_i - x̄)²/n and x̄ = Σ_{i=1}^{n} x_i/n. Assuming that the process measurements are normally distributed, it follows that (1) nσ̂²/σ² ~ χ²_(n-1), (2) x̄ ~ N[μ, σ²/n], and (3) x̄ and σ̂² are independent.
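Before developing the exact distribution below, the sampling behavior of Ĉ_pw can also be explored by straightforward Monte Carlo simulation (a sketch of my own, not the chapter's method; all numerical settings are assumptions):

import numpy as np

rng = np.random.default_rng(0)
usl, lsl, t, w = 1.2, 0.8, 1.0, 1.0
mu, sigma, n = 1.05, 0.06, 50

est = []
for _ in range(20000):
    x = rng.normal(mu, sigma, n)
    s2 = np.mean((x - x.mean())**2)  # sigma-hat squared (divisor n)
    est.append((usl - lsl) / (6 * np.sqrt(s2 + w * (x.mean() - t)**2)))

print(np.percentile(est, [2.5, 50, 97.5]))  # simulated percentile limits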
degrees of freedom (n - 1 and 1), the noncentrality parameter (λ), and the weight function (w) of the linear combination of chi-square distributions. The functional form of the d_i's for the general Q_{λ,w}(x) is
In[1]:=
(* To determine the d_i's for the number of specified i's, enter the values of l and w *)
l = ; w = ;
Do[Print[Sum[Sum[Exp[-(l)/2] (((l)/2)^(b - k)) (((b - k)!)^(-1))*
      (w^(-.5 - b + k)) ((1 - w^(-1))^(k + g - b)) Gamma[(.5 + g - b)]*
      Binomial[b - 1, k]/(Gamma[(g - b + 1)] Gamma[.5]),
    {k, 0, b}], {b, 0, g}]], {g, 1, i}]
In[2]:=
(* Approximate the value of the distribution by replacing an infinite sum with a finite sum of i+1 terms, using values of n, a, l, and w *)
<< Statistics`ContinuousDistributions`
l = ; w = ; n = ; a = ;
Sum[Quantile[ChiSquareDistribution[n + 2 g], a]*
    Sum[Sum[Exp[-(l)/2] (((l)/2)^(b - k)) (((b - k)!)^(-1))*
        (w^(-.5 - b + k)) ((1 - w^(-1))^(k + g - b)) Gamma[(.5 + g - b)]*
        Binomial[b - 1, k]/(Gamma[(g - b + 1)] Gamma[.5]),
      {k, 0, b}], {b, 0, g}], {g, 1, i}] +
  (Exp[-l/2] (w^(-0.5)))*Quantile[ChiSquareDistribution[n], a]
where Q_{λ,w}(B) represents the value of the Q_{λ,w}(x) variate for n, λ, and probability B. It follows that
which implies
C_pw = (USL - LSL) / (6σ)

for

Ĉ_pw = (USL - LSL) / (6{s² + [n/(n - 1)](x̄ - T)²}^(1/2))
The weight function

w = k(2 - k) / [(1 - k)²ρ²]
Assuming that ρ (0 < ρ), d, and k are known (i.e., nonstochastic), setting Ĉ_pw = Ĉ_pk results in the confidence interval
Ĉ_pm = min[USL - T, T - LSL] / (3[s_t² + n(x̄_t - T)²/(n - 1)]^(1/2))
where s_t² is the subgroup sample variance and x̄_t the average of the observations in subgroup t. If an X̄&R chart is used, consider

Ĉ_pm = min[USL - T, T - LSL] / (3[(R_t/d₂)² + n(x̄_t - T)²/(n - 1)]^(1/2))
where R_t denotes the range for subgroup t and d₂ the usual control chart constant. Each subgroup in the process provides a measure of location, x̄_t, and a measure of variability (either R_t or s_t). Hence an estimate of C_pm can be determined for each subgroup, which results in a series of estimates for C_pm over the life of the process.
A mean line as well as upper and lower limits can be created for a capability chart using information gathered from the control chart. Similar to Shewhart control charts, the upper and lower limits for C_pm will represent the interval expected to contain 99.73% of the estimates if the process has not been changed or altered. The mean line, denoted C̄_pm, will be

C̄_pm = min[USL - T, T - LSL] / (3{(s̄/c₄)² + [n/(n - 1)](x̿ - T)²}^(1/2))
when using an X̄&s chart. Assuming equal subgroup sizes, x̿ denotes the average of the subgroup averages x̄_t and s̄ the average of the subgroup standard deviations. When an X̄&R chart is used, the corresponding mean line is

C̄_pm = min[USL - T, T - LSL] / (3{(R̄/d₂)² + [n/(n - 1)](x̿ - T)²}^(1/2))
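The subgroup-by-subgroup calculations are easy to automate. The following Python sketch (mine; the names are illustrative) returns one Ĉ_pm per subgroup together with the mean line for an X̄&s chart:

import numpy as np

def cpm_chart(data, usl, lsl, t, c4):
    """data: an (m x n) array of m subgroups of size n."""
    m, n = data.shape
    xbar = data.mean(axis=1)              # subgroup means
    s = data.std(axis=1, ddof=1)          # subgroup standard deviations
    half = min(usl - t, t - lsl)
    cpm_t = half / (3 * np.sqrt(s**2 + n * (xbar - t)**2 / (n - 1)))
    sbar, grand = s.mean(), xbar.mean()   # sbar/c4 estimates sigma
    center = half / (3 * np.sqrt((sbar / c4)**2 + n * (grand - t)**2 / (n - 1)))
    return cpm_t, center

data = np.random.default_rng(2).normal(1.12, 0.11, size=(20, 10))
cpm_t, center = cpm_chart(data, usl=1.2, lsl=0.8, t=1.0, c4=0.9727)
print(cpm_t.round(3), round(center, 3))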
5. EXAMPLE
5.1. The Process
In this example 20 subgroups of size 10 were gathered from a process for which the customer had indicated that USL = 1.2, T = 1, and LSL = 0.8. In this case T is the midpoint of the specification limits; however, all calculations use the general definitions in determining Ĉ_pm and the associated limits. From the 20 subgroups we found x̿ = 1.1206 and s̄ = 0.11, which resulted in an upper control limit of 1.230 and a lower control limit of 1.014 for x̄ and an upper control limit of 0.189 and a lower limit of 0.031 for s. Looking first at the s chart, the process variability does not appear unusual (i.e., no out-of-control signals), which also seems to be the case with the x̄ chart. The control limits and centerlines for the X̄&s charts are included in Figure 9.
Since the process appears to be in control, we proceed to determine Ĉ_pm for each subgroup. In the case of subgroup 1, x̄ and s were found to be 1.15 and 0.136, respectively, resulting in

λ̂ = n((x̿ - T)/(s̄/c₄))² = 10((1.1206 - 1)/(0.11/0.9727))² = 11.4
and

Ĉ_pm = min[USL - T, T - LSL] / (3[s₁² + n(x̄₁ - T)²/(n - 1)]^(1/2)) = 0.3918
[Figure 9: X̄ and s charts for the 20 subgroups, with center lines and upper and lower control limits.]
5.2. Observations and Insights
Several things are evident from Figure 9. Clearly, the estimates of the process's capability vary from subgroup to subgroup. Except for subgroup 19, the fluctuations in Ĉ_pm appear to be due to random causes. In period 19 the process capability appears to have increased significantly and warrants investigation. Practitioners would likely attempt to determine what caused the capability to rise significantly and recreate that situation in the future.
If the estimated process capability had dropped below the lower limit, this would signal a change in the process, and if the process capability was not at the level required by the customer, changes in the process would be required. In a continuous improvement program the process capability should be under constant pressure to increase. The capability chart used in conjunction with the traditional Shewhart variables charts will provide evidence of improvement. It may also assist in ending the unfortunate practice of including specification limits on the x̄ chart, as the additional chart will incorporate the limits and target into the calculation of process capability.
Much like the effect of first-time control charts, practitioners will see that process capability will vary over the life of the process, illustrating the idea that the estimates are not parameter values and should not be treated as such. The procedures provide evidence of the level of process capability attained over the lifetime of the process rather than at snapshots taken, for example, at the beginning of the process and not until some change in the process has been implemented. They will also provide evidence of the
6. COMMENTS
Several ideas have been presented that address some concerns of two distinguished quality practitioners in the area of process capability, Vic Kane (Kane, 1986) and Bert Gunter (Gunter, 1991). Unfortunately, as noted by Nelson (1992), much of the current interest in process capability indices is focused on determining competing estimators and their associated distributions, and little work has dealt with the more pressing problems associated with the practical shortcomings of the indices. Continuous monitoring of process capability represents a step toward more meaningful capability assessments. However, much work is needed in this area. In particular, as practitioners move to measures of process capability that assess clustering around the target, the effect of non-normality may be less problematic. Currently, however, meaningful process capability assessment in the presence of non-normal distributions remains a research problem.
REFERENCES

Boyles RA. (1991). The Taguchi capability index (and corrigenda). J Qual Technol 23:17-26.
Chan LK, Cheng SW, Spiring FA. (1988). A new measure of process capability: C_pm. J Qual Technol 20:162-175.
Chou YM, Owen DB, Borrego SA. (1990). Lower confidence limits on process capability indices. J Qual Technol 22:223-229.
Gunter BH. (1991). Statistics corner (a five-part series on process capability studies). Qual Prog.
Johnson T. (1992). The relationship of C_pm to squared error loss. J Qual Technol 24:211-215.
Juran JM. (1979). Quality Control Handbook. New York: McGraw-Hill.
Kane VE. (1986). Process capability indices (and corrigenda). J Qual Technol 18:41-52, 265.
Kotz S, Johnson NL. (1993). Process Capability Indices. London: Chapman & Hall.
Nelson PR. (1992). Editorial. J Qual Technol 24:175.
Rodriguez RN. (1992). Recent developments in process capability analysis. J Qual Technol 24:176-187.
Spiring FA. (1995). Process capability: A total quality management tool. Total Qual Manage 6(1):21-33.
Spiring FA. (1996). A unifying approach to process capability indices. J Qual Technol 29:49-58.
Vannman K. (1995). A unified approach to capability indices. Stat Sin 5(2):805-820.
17
Experimental Strategies for Estimating
Mean and Variance Function
G. Geoffrey Vining
Virginia Polytechnic Institute and State University, Blacksburg, Virginia
Diane A. Schaub
University of Florida, Gainesville, Florida
Carl Modigh
Arkwright Enterprises Ltd., Paris, France
1. INTRODUCTION
overall number of design runs. Vining and Schaub (1996) note that often the process variance follows a lower order model than the response. They suggest replicating only a first-order portion of standard response surface designs, which significantly reduces the overall design size. This chapter extends the work of Vining and Schaub by exploring alternative ways for choosing the portion of the design to replicate.
The process variance is modeled through τ = Zγ, with τ_i = log σ_i². The expected information matrix for the joint estimation of the mean and variance models is

J = [ X'W11X      0
         0      Z'W22Z ]

where W11 and W22 are diagonal matrices with nonzero elements 1/σ_i² and 1/2, respectively. Vining and Schaub (1996) prefer to use M, the expected information matrix expressed on a per-unit basis, where

M = (1/n) J

which is a block diagonal matrix with separate moment matrices for each model on the diagonals.
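A rough Python sketch of these matrices (my illustration; it assumes the log-linear variance model above with weights 1/σ_i² and 1/2, and uses the determinant of M as a D-type value):

import numpy as np

def expected_info(X, Z, tau):
    """J = diag(X'W11X, Z'W22Z), with W11 = diag(1/sigma_i^2),
    sigma_i^2 = exp(tau_i), and W22 = diag(1/2)."""
    W11 = np.diag(1.0 / np.exp(tau))
    W22 = np.diag(np.full(len(tau), 0.5))
    p, q = X.shape[1], Z.shape[1]
    return np.block([[X.T @ W11 @ X, np.zeros((p, q))],
                     [np.zeros((q, p)), Z.T @ W22 @ Z]])

def d_value(X, Z, tau):
    """Determinant of the per-unit information M = J/n."""
    n = X.shape[0]
    return np.linalg.det(expected_info(X, Z, tau) / n)

Under the working assumption τ_i = τ₀ discussed below, the weights are constant and the criterion separates into the product of the two moment-matrix determinants.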
One definition of an "optimal" design is that it is one that maximizes the information present in the data. Much of optimal design theory uses appropriate matrix norms to measure the size of the information matrix. The determinant is the most commonly used matrix norm in practice, which leads to D-optimal designs. In this particular case, we must note that M depends on the weights, which in turn depend on γ through the τ_i's. However, we cannot know γ prior to the experiment; hence, we encounter a problem in determining the optimal design. One approach proposed by Vining and Schaub (1996) assumes that τ_i = τ₀ for i = 1, 2, ..., n. Essentially, this approach assumes that in the absence of any prior information about the process variance function, the function could assume any direction over the region of interest. By initially assuming that the process variance function is constant, the analyst does not bias the weights in any particular direction. With this assumption, we can establish that an appropriate D-optimality-based criterion for evaluating designs is
3. COMPUTER-GENERATED DESIGNS
Figure 1 The three-factor D-optimal design for 14 runs over a cuboidal region.
Figure 2 The three-factor D-optimal design for 15 runs over a cuboidal region.
Figure 3 The three-factor D-optimal design for 18 runs over a cuboidal region.
Figure 4 The three-factor D-optimal design for 22 runs over a cuboidal region.
Figure 5 The three-factor D-optimal design for 26 runs over a cuboidal region.
Figure 6 The three-factor D-optimal design for 32 runs over a cuboidal region.
Figure 7 The three-factor D-optimal design for 59 runs over a cuboidal region.
Figure 8 Plot of the value for D for the three-factor computer-generated design
over a cuboidal region.
approaches some asymptote or that D may peak at some sample size larger
than the ones studied.
4. COMPARISONS OF DIFFERENT REPLICATION STRATEGIES
Figures 11-13 use the D criterion to compare the following design strategies for three, four, and five factors over a cuboidal region:

A fully replicated central composite design (CCD)
A fully replicated Notz (1982) design
A replicated axial design (a CCD with only the axial points replicated)
A replicated factorial design (a CCD with only a resolution III fraction replicated)
A replicated 3/4 design (a CCD with only a 3/4 fraction replicated)
A replicated full factorial (a CCD with the entire factorial portion replicated)
The fully replicated CCD should always be a "near-optimal" design for each situation. In some sense, it provides a "gold standard" for comparisons. However, replicating a full CCD is rather expensive in terms of overall design size. The Notz design is a minimum-run D-optimal design for the second-order model over a cuboidal region. Replicating a minimal point design is one logical way to reduce the overall design size. Vining and Schaub (1996) note that the replicated Notz design performs surprisingly well in the joint estimation of the two models. Vining and Schaub proposed the replicated axial and the replicated factorial as alternative designs for reducing the total number of runs. The replicated 3/4 design is another possible alternative. The optimal designs generated in the previous section strongly suggest the replicated full factorial strategy.
Figure 11 summarizes the three-factor results. In this figure, m refers to the number of runs at each replicated setting. We evaluated each design using 4, 8, and 12 replicates. As expected, the replicated CCD appears to be the best overall design. Interestingly, the replicated full factorial actually was better for m = 4. The designs that replicated only a portion of their runs all became less efficient as the replication increased. We believe that this is due to an increase in the imbalance in these designs. The replicated full factorial performed slightly better than the other partially replicated designs. The replicated 3/4 and the replicated factorial performed very similarly. The replicated axial performed quite poorly. The replicated Notz performed almost as well as the replicated CCD.
[Figure 11: D criterion versus m for the three-factor designs: replicated factorial, replicated axial, CCD, Notz, replicated 3/4, and replicated full factorial.]
Table 1 Total number of runs for the three-factor designs

Design                      m = 4   m = 8   m = 12
Replicated factorial          26      42      58
Replicated axial              32      56      80
Replicated 3/4                32      56      80
Replicated full factorial     38      70     102
Notz                          40      80     120
CCD                           56     112     168
[Figure 12: D criterion versus m for the four-factor designs.]
Once again, the efficiency of the designs that replicate only a portion of their runs decreases with greater replication. The replicated full factorial, replicated 3/4 factorial, and replicated factorial designs all perform similarly, with the replicated full factorial performing slightly better than the others and the replicated factorial performing slightly worse. The replicated axial performs very poorly. Once again, the replicated Notz performs similarly to the replicated CCD.
Table 2 summarizes the total number of runs for each design. In this case, the replicated factorial and the replicated axial require exactly the same number of runs. They in turn require fewer runs than any other design. Once again, taking into account Figure 12 and Table 2, the replicated factorial appears to be a reasonable design strategy in many situations.
[Figure 13: D criterion versus m for the five-factor designs.]
Table 3 Total number of runs for the five-factor designs

Design                      m = 4   m = 8   m = 12
Replicated factorial          66      98     130
Replicated axial              56      96     136
Replicated 3/4               114     210     306
Replicated full factorial     74     138     202
Notz                          84     168     252
CCD                          104     208     312
The real message of Table 3 is that all of the design strategies require a large number of runs. In many situations, the total is prohibitive.
5. CONCLUSIONS
Our research suggests the following conclusions. First, the proposed D criterion suggests that if we fit a second-order model to the response and a first-order model to the process variance, then we need to replicate only a subset of the base second-order design. Second, this criterion appears to prefer replicating the full factorial as the sample size permits. Third, the replicated factorial and the replicated 3/4 factorial designs tend to perform well for small to moderate amounts of replication. Finally, for large amounts of replication, we may want to consider replicating at least a resolution V fraction (the replicated full factorial).
REFERENCES
Copeland KAF, Nelson PR. (1996). Dual response optimization via direct function minimization. J Qual Technol 28:331-336.
Del Castillo E, Montgomery DC. (1993). A nonlinear programming solution to the dual response problem. J Qual Technol 25:199-204.
Lin DKJ, Tu W. (1995). Dual response surface optimization. J Qual Technol 27:34-39.
Mitchell TJ. (1974). An algorithm for the construction of D-optimal experimental designs. Technometrics 16:211-220.
Myers RH, Carter WH Jr. (1973). Response surface techniques for dual response systems. Technometrics 15:301-317.
Notz W. (1982). Minimal point second order designs. J Stat Planning Inf 6:47-58.
Vining GG, Myers RH. (1990). Combining Taguchi and response surface philosophies: A dual response approach. J Qual Technol 22:38-45.
Vining GG, Schaub D. (1996). Experimental designs for estimating both mean and variance functions. J Qual Technol 28:135-147.
18
Recent Developments in
Supersaturated Designs
Dennis K. J. Lin
The Pennsylvania State University, University Park, Pennsylvania
1. AGRICULTURAL AND INDUSTRIAL EXPERIMENTS
Industrial problems tend to contain a much larger number of factors under investigation and usually involve a much smaller total number of runs.
Industrial results are more reproducible; that is, industrial problems contain a much smaller replicated variation (pure error) than that of agricultural problems.
Industrial experimenters are obliged to run their experimental points in sequence and naturally plan their follow-up experiments guided by previous results; in contrast, agricultural problems harvest all results at one time. Doubts and complications can be resolved in industry by immediate follow-up experiments. Confirmatory experimentation is readily available for industrial problems and becomes a routine procedure to resolve assumptions.
2. INTRODUCTION
3. SUPERSATURATED DESIGNS USING HADAMARD MATRICES
Run  Row    1  2  3  4  5  6  7  8  9 10 11
      1     +  +  -  +  +  +  -  -  -  +  -
 1    2     +  -  +  +  +  -  -  -  +  -  +
 2    3     -  +  +  +  -  -  -  +  -  +  +
      4     +  +  +  -  -  -  +  -  +  +  -
 3    5     +  +  -  -  -  +  -  +  +  -  +
 4    6     +  -  -  -  +  -  +  +  -  +  +
 5    7     -  -  -  +  -  +  +  -  +  +  +
      8     -  -  +  -  +  +  -  +  +  +  -
      9     -  +  -  +  +  -  +  +  +  -  -
     10     +  -  +  +  -  +  +  +  -  -  -
 6   11     -  +  +  -  +  +  +  -  -  -  +
     12     -  -  -  -  -  -  -  -  -  -  -
and 11) and group II with the sign of -1 in column 11 (rows 1, 4, 8, 9, 10, and 12). Deleting column 11 from group I causes columns 1-10 to form a supersaturated design to examine N - 2 = 10 factors in N/2 = 6 runs (runs 1-6, as indicated in Table 2). It can be shown that if group II is used, the resulting supersaturated design is an equivalent one. In general, a Plackett and Burman (1946) design matrix can be split into two half-fractions according to a specific branching column whose signs equal +1 or -1. Specifically, take only the rows that have +1 in the branching column. Then, the N - 2 columns other than the branching column will form a supersaturated design for N - 2 factors in N/2 runs. Judged by a criterion proposed by Booth and Cox (1962), these designs have been shown to be superior to other existing supersaturated designs.
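The branching-column construction is easy to reproduce. The following Python sketch (mine) builds the 12-run Plackett-Burman design of Table 2, keeps the half with +1 in column 11, and checks the E(s²) value discussed in the next paragraph:

import numpy as np

# first row of the Plackett-Burman 12-run design; rows 2-11 are cyclic
# shifts, and row 12 is all -1's (matching Table 2)
first = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])
pb12 = np.array([np.roll(first, -i) for i in range(11)] +
                [-np.ones(11, dtype=int)])

branch = 10                               # 0-based index of column 11
half = pb12[pb12[:, branch] == 1]         # the 6 rows forming group I
design = np.delete(half, branch, axis=1)  # 6 runs x 10 factors

def e_s2(X):
    """E(s^2): average squared off-diagonal element of X'X."""
    G = X.T @ X
    return float(np.mean(G[~np.eye(G.shape[0], dtype=bool)]**2))

n = design.shape[0]
print(e_s2(design), n**2 / (2 * n - 3))  # both 4.0 for n = 6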
The construction methods here are simple. However, knowing in advance that Hadamard matrices entertain many "good" mathematical properties, the optimality properties of these supersaturated designs deserve further investigation. For example, the half-fraction Hadamard matrix of order n = N/2 = 4t is closely related to a balanced incomplete block design with (v, b, r, k) = (2t - 1, 4t - 2, 2t - 2, t - 1) and λ = t - 1. Consequently, the E(s²) value (see Section 4) for a supersaturated design from a half-fraction Hadamard matrix is n²/(2n - 3), which can be shown to be the minimum within the class of designs of the same size. Potentially promising theoretical results seem possible for the construction of a half-fraction Hadamard matrix. Theoretical implications deserve detailed scrutiny and are discussed below. For more details regarding this issue, please consult Cheng (1997) and Nguyen (1996).
4. CAPACITY CONSIDERATIONS
(a) Even n
10   —    12
12   11   66
14   13   113
16   15   42
18   17   111
20   —    34
22   20   92
24   33   276

(b) Odd n
 3    3
 5    4
 7    7   15
 9    7   12
11   11   14
13   12   14
15   37   15   15
17   50   15   17
19   33   19   19
21   19   19   34   92
23   94   33   23   23
25   76   32   23   23
5. OPTIMALITY CRITERIA

6. DATA ANALYSIS METHODS
Several methods have been proposed to analyze the k effects, given only the n (< k) observations, from the random balance design context (see, e.g., Satterthwaite, 1959). These methods can also be applied here. Quick methods such as these provide an appealing, straightforward comparison among
factors, but it is questionable how much available information can be extracted using them; combining several of these methods provides a more satisfying result. In addition, three data analysis methods for data resulting from a supersaturated design are discussed in Lin (1995): (1) normal plotting, (2) stepwise selection, and (3) ridge regression.
To study so many columns in only a few runs, the probability of a false positive reading (type I error) is a major risk here. An alternative to the forward selection procedure to control these false positive rates is as follows. Let N = {i_1, i_2, ..., i_p} and A = {i_{p+1}, ..., i_k} denote indexes of inert and active factors, respectively, so that N ∪ A = {1, ..., k} = S. If X denotes the n × k design matrix, our model is Y = μ1 + Xβ + ε, where Y is the n × 1 observable data vector, μ is the intercept term, 1 is an n-vector of 1's, β is a k × 1 fixed and unknown vector of factor effects, and ε is the noise vector. In the multiple hypothesis testing framework, we have null and alternative pairs H_j⁰: β_j = 0 and H_jᴬ: β_j ≠ 0, with H_j⁰ true for j ∈ N and H_jᴬ true for j ∈ A.
Forward selection proceeds by identifying the maximum F statistics at successive stages. Let F_j^(s) denote the F statistic for testing H_j at stage s. Consequently, define
where
with simulated p values, but the errors can be made very small, particularly with control variates. The analysis of data from supersaturated designs along this direction can be found in Westfall et al. (1998).
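As an illustration of this idea (my sketch; the helper names are invented), the null distribution of the stage-1 maximum F statistic can be simulated directly, yielding a critical value that controls the familywise false positive rate at the first step:

import numpy as np

def max_f_stage1(X, y):
    """Largest single-factor F statistic over all columns of X."""
    n = len(y)
    sst = np.sum((y - y.mean())**2)
    fs = []
    for j in range(X.shape[1]):
        Z = np.column_stack([np.ones(n), X[:, j]])
        b = np.linalg.lstsq(Z, y, rcond=None)[0]
        sse = np.sum((y - Z @ b)**2)
        fs.append((sst - sse) / (sse / (n - 2)))
    return max(fs)

def null_quantile(X, alpha=0.05, reps=2000, seed=0):
    """Simulated (1 - alpha) quantile of max-F when all factors are inert."""
    rng = np.random.default_rng(seed)
    sims = [max_f_stage1(X, rng.standard_normal(X.shape[0]))
            for _ in range(reps)]
    return float(np.quantile(sims, 1 - alpha))

A factor enters at stage 1 only if its F statistic exceeds this simulated quantile rather than the usual single-test critical value.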
7. EXAMPLES
Now, if we generate all interaction columns, AB, AC, ..., FG, together with all main-effect columns, A, B, ..., G, we have 7 + 21 = 28 columns. Treat all of those 28 columns in 12 runs as a supersaturated design (Lin, 1993), as shown in Table 4. The largest correlation between any pair of the design columns is ±1/3. The results from a regular stepwise regression analysis (with α = 5% for entering variables) yield the model
[Table 4: the 12-run design with all 28 main-effect and two-factor interaction columns (A, B, ..., G, AB, ..., FG); entries are ±1.]
a significantly better fit to the data than Eq. (1). An application of the adjusted p-value method (Westfall et al., 1998) reaches the same conclusion in this example.
Note that the AE interaction, in general, would never be chosen under the effect heredity assumption. Of course, most practitioners may consider adding main effects A, E, and G to the final model because of the significance of interactions FG and AE. The goal here is only to identify potential interaction effects. In general, for most main-effect designs, such as Plackett and Burman type designs (except for 2^(k-p) fractional factorials), one can apply the following procedure [see Lin (1998) for the limitations]; a sketch of Step 1 appears after the next paragraph:

Step 1. Generate all interaction columns and combine them with the main-effect columns. We now have k(k + 1)/2 design columns.
Step 2. Analyze these k(k + 1)/2 columns with n experimental runs as a supersaturated design. Data analysis methods for such a supersaturated design are available.
Note that if the interactions are indeed inert, the procedure will work well, and if the effect heredity assumption is indeed true, the procedure will end up with the same conclusion as that of Hamada and Wu (1992). The proposed procedure will always result in better (or equal) performance than that of Hamada and Wu's procedure.
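The sketch promised above for Step 1 (my code; with_interactions is a hypothetical helper, not from Lin's papers):

import numpy as np
from itertools import combinations

def with_interactions(X):
    """X: an n x k matrix of +/-1 main-effect columns; returns the
    n x k(k+1)/2 matrix with all two-factor interaction columns appended."""
    inter = [X[:, i] * X[:, j] for i, j in combinations(range(X.shape[1]), 2)]
    return np.column_stack([X] + inter)

# e.g., k = 7 main effects in 12 runs -> 7 + 21 = 28 columns, as in Table 4

The resulting matrix is then analyzed with the supersaturated-design methods of Section 6.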
8. THEORETICAL CONSTRUCTION METHODS
Theorem
Let H be a Hadamard matrix of order n and B = (b_1, ..., b_r) be an n × r matrix with all entries ±1, and let V = H'B = (v_ij). Then:

1. For any fixed 1 ≤ j ≤ r, n² = Σ_{i=1}^{n} v_ij².
2. In particular, let B = RH and W = H'RH = (w_ij). We have:
   a. (1/n)W is an n × n orthogonal matrix.
   b. n² = Σ_{i=1}^{n} w_ij² = Σ_{j=1}^{n} w_ij².
   c. w_ij is always a multiple of 4.
   d. If H is column-balanced, then ±n = Σ_{i=1}^{n} w_ij = Σ_{j=1}^{n} w_ij.
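Item 1 is easy to check numerically (a quick verification of my own, using SciPy's Sylvester-type Hadamard matrices):

import numpy as np
from scipy.linalg import hadamard

n = 8
H = hadamard(n)                          # Hadamard matrix of order 8
B = np.random.default_rng(3).choice([-1, 1], size=(n, 3))
V = H.T @ B
print((V**2).sum(axis=0))                # [64 64 64], i.e., n^2 per column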
Corollary
For any R and C such that (1) R'R = I and (2) rank(C) = n - c, all such RHC-type design matrices have an identical E(s²) value.
This implies that the popular E(s²) criterion used in supersaturated designs is invariant for any choice of R and C. Therefore, it is not effective for comparing supersaturated designs. In fact, following the argument in Tang and Wu (1993), the designs given here will always have the minimum E(s²) values within the class of designs of the same size. One important feature of the goodness of a supersaturated screening design is its projection property (see Lin, 1993b). We thus consider the r-rank property as defined below.
Definition
Let X be a column-balanced design matrix. The resolution rank (or r rank, for short) of X is defined as r = d - 1, where d is the minimum number of linearly dependent columns of X.
These results are only the first step. Extension of these results to a more general class of supersaturated designs in the form S_K = (R_1HC_1, ..., R_K HC_K) is promising.
9. COMPUTER ALGORITHMIC CONSTRUCTION METHODS
More and more researchers are benefiting from using computer power to construct designs for specific needs. Unlike some cases from the optimal design perspective (such as D-optimal design), computer construction of supersaturated designs is not well developed yet. Lin (1991) introduced the first computer algorithm to construct supersaturated designs. Denote the largest correlation in absolute value among all design columns by r, as a simple measure of the degree of nonorthogonality that can willingly be given up. Lin (1995) examines the maximal number of factors that can be accommodated in such a design when r and n are given.
Al Church at GenCorp Company used the projection properties in Lin and Draper (1992, 1993) to develop a software package named DOEO to generate designs for mixed-level discrete variables. Such a program has been used at several sites in GenCorp. A program named DOESS is one of the results and is currently in a test stage. Dr. Nam-Ky Nguyen (CSIRO, Australia) also independently works on this subject. He uses an exchange procedure to construct supersaturated designs and near-orthogonal arrays. A commercial product called Gendex is available for sale to the public as a result. Algorithmic approaches to constructing supersaturated designs seem to have been a hot topic in recent years. For example, Li and Wu (1997) developed a so-called columnwise-pairwise exchange algorithm. Such an algorithm seems to perform well for constructing supersaturated designs by various criteria.
10. CONCLUSION
REFERENCES
Booth KHV, Cox DR. (1962). Some systematic supersaturated designs. Technometrics 4:489-495.
Chen JH, Lin DKJ. (1998). On the identifiability of supersaturated designs. J Stat Planning Inference 72:99-107.
Cheng CS. (1997). E(s²)-optimal supersaturated designs. Stat Sin 7:929-939.
Deng LY, Lin DKJ. (1994). Criteria for supersaturated designs. Proceedings of the Section on Physical and Engineering Sciences, American Statistical Association, pp 124-128.
Deng LY, Lin DKJ, Wang JN. (1994). Supersaturated Design Using Hadamard Matrix. IBM Res Rep RC19470, IBM Watson Research Center.
Deng LY, Lin DKJ, Wang JN. (1996a). Marginally oversaturated designs. Commun Stat 25(11):2557-2573.
Deng LY, Lin DKJ, Wang JN. (1996b). A measurement of multifactor orthogonality. Stat Probab Lett 28:203-209.
Hamada M, Wu CFJ. (1992). Analysis of designed experiments with complex aliasing. J Qual Technol 24:130-137.
Hunter GB, Hodi FS, Eager TW. (1982). High-cycle fatigue of weld repaired cast Ti-6Al-4V. Metall Trans 13A:1589-1594.
Li WW, Wu CFJ. (1997). Columnwise-pairwise algorithms with applications to the construction of supersaturated designs. Technometrics 39:171-179.
Lin DKJ. (1991). Systematic supersaturated designs. Working Paper No. 264, College of Business Administration, University of Tennessee.
1. INTRODUCTION
eliminate potential quality problems without the large costs and delays that are usually incurred when problems are discovered in the later phases of the design-to-production cycle.
The common paradigm for prototype testing is to build and evaluate a single model at each stage. This approach is implicit in the excellent account by Wheelwright and Clark [3, Chapter 10] of the role of prototypes in product development.
It is our experience that great gains can be made by using factorial experiments to study and improve product design at the prototype stage. Several alternatives can be made, varying important design factors according to a factorial plan. The results of such experiments can substantially accelerate the path from concept development to finished product and can significantly lower the risk of discovering serious quality problems late in the development cycle.
A striking example of the importance of rapid feedback at early stages in the design process is presented by Clark and Fujimoto [9, Chapter 7] in their comprehensive study of auto manufacturers. They found that the lead time for developing a new car was about 25% less in Japan than in the United States. One major reason for this difference was that the Japanese companies were much more successful than their American counterparts at rapidly reducing the number of design problems early in the development process. Clark and Fujimoto credited this difference to the prototyping strategies that were prevalent in the two countries. The U.S. companies built few prototypes and treated them as master models; the Japanese companies built many prototypes and used them to provide information for finding and solving design problems. Our approach couples the power of statistical experiments with the Japanese strategy.
Prototype experiments have two interesting statistical features. First, it is typically much more expensive to build a prototype than to test it. Thus there is good reason, once a prototype is built, to test it extensively. The relevant test conditions, which can often be laid out in a factorial plan, will then be nested within the prototype configurations, in what is known as a split-plot structure. Second, interest often focuses on a performance curve rather than on a single number output. In motor testing, for example, the test might examine fuel consumption as a function of load or rpm, torque as a function of rpm, compression ratio as a function of a single 360° stroke, or the curve trace of the torque or power delivered through a gear shift cycle from forward through neutral to reverse and back again. Other examples include the hysteresis curve in the testing of transformers, the spectrum of the emitted light in the testing of light bulbs, the hardness as a function of depth in ion implantation of steel, the pressure versus time curve in a pyrotechnic chain, and the characteristic curve in the testing of transistors.
In our contacts with design engineers, we regularly see experiments used to test new concepts, compare designs, evaluate new materials, optimize performance, improve quality and reliability, and verify performance specifications. Efficient experimentation can be a crucial tool in the quest to bring high quality products to market ahead of the competition. Carefully planned factorial experiments can provide invaluable knowledge throughout the development cycle. See Bisgaard and Ellekjaer [7] for a broad conceptual account.
The prototype stage is especially well suited to experimental work. Typically prototypes are built fairly early in the development of a new product, when it is easiest to make design changes. Factorial experiments on prototypes can be an ideal method for comparing design alternatives and shaping the direction of future development. Once that direction is set and large amounts of time and money have been invested, it becomes much more difficult to change course.
3. PROTOTYPE EXPERIMENTS: SOME EXAMPLES
3.1. Airplane Wing
Initial prototype development often takes the form of CAD drawings rather than actual physical mock-ups. Software that simulates the proposed operating environment can then be used to study the performance of the design on the computer. The experiment in question here was carried out by a team of engineers at the "concept design" stage. The two main goals were to improve the performance of the wing, as measured by thrust per unit weight, and to minimize the cost per unit performance. Five different aspects of the wing were studied: the sizes of three physical dimensions, the number of strength supports on the wing, and the type of material used in construction. Two possible values were considered for each of these factors, and eight prototypes were then defined, in accord with a standard 2^{5-2} fractional factorial experiment. Each prototype was carefully drawn by the design team using CAD software. The weight and cost of each prototype wing were then calculated, and finite element analysis was used to compute the thrusts.
3.2. Engine Throttle Handle
Bisgaard [14] described an experiment to improve the performance of the throttle handle for an outboard motor. The goal of the experiment was to derive appropriate tolerances for seven physical dimensions by studying their effects on friction in the handle. The throttle handle is assembled from three parts: a knob, a handle, and a tube. Of the dimensions studied, three were related to the knob, three to the handle, and one to the tube. An interesting feature of this experiment is that separate experimental plans were set up for making prototypes of each of the three components (a 2^3 plan for the knobs, a 2^{3-1} plan for the handles, and a 2^1 plan for the tubes). All possible matchings of the prototype components were then assembled and tested for friction.
3.3. Engine Exhaust
Taguchi [10, p. 131] described an experiment to reduce the CO content of engine exhaust. Seven different characteristics of the engine design were studied using a saturated two-level design that specified eight prototype engines. Each engine was then run at three different driving modes, which constituted the test conditions for this study. Bisgaard and Steinberg [8] analyzed the results from this experiment and found that one of the factors had an interesting, and statistically significant, effect on the shape of the response curve, as shown in Figure 2. With this factor at its low level, the response curve was lower at the middle driving mode but higher at the high mode. The engineering significance of this effect depends on which driving modes will be encountered most often. The lower driving modes likely correspond to the sort of stop-and-start traffic common in large cities, and it might then be desirable to choose the factor at its low level to reduce the CO content at these modes.
3.4. Kitchen Mixer
Ott [15] described an experiment to improve a kitchen mixer. Each mixer was assembled from three components: a top unit, a bottom unit, and gears. An experiment was run to determine which of these three components was the cause of inefficient operation. Forty-eight mixers were used in the study, half of them efficient and half inefficient. Each mixer was disassembled, and
Figure 2 The estimated response curves for CO exhaust versus driving mode at the two levels of factor A for the engine exhaust experiment. The response curve with A at its high level (solid line) is lower than the curve with A at its low level (dashed line) across most of the driving modes but shows a sharp increase at high driving modes.
then 48 new mixers were assembled, swapping parts from the original mixers to form a 2^3 factorial design whose factors were the three components. The
two levels for each factor were determined by the source of the component
in an efficient (or inefficient) mixer. The experiment clearly pointed to the
tops as the source of the problem.
3.5. Pyrotechnic Device
Milman et al. [16] reported on an experiment to improve the safety of a pyrotechnic device. It was known that the safety improvements could be achieved by using a new type of initiator, but there was concern that this change would adversely affect the performance of the device. An experiment was run to test 24 prototype devices, mating each of three safe initiators with four types of main charge and two types of secondary charge. The observed response for each prototype device was a trace of pressure against time.
3.6. Fluid Flow Controller
Bisgaard and Steinberg [8] described an experiment to study how prototype fluid flow control devices respond to changes in electrical input and flow rate. The controller was assembled from two components. Two experimental factors described dimensions of the first component, and a third factor described a dimension of the second component. As in the engine throttle experiment, the eight prototype controllers were formed by making four versions of the first component (following a 2^2 plan) and two versions of the second component and then mating all possible pairs of components. Each prototype was subjected to six test conditions formed by crossing three levels of the electrical input with two flow rates.
3.7. Hearing Aid
A remote control unit developed to permit easy control of a new, miniaturized hearing aid suffered from poor reception. A factorial experiment was carried out to test several conjectures as to the source of the problem, in particular that the difficult-to-control variation in the receptor coil was causing variations in the transmission frequency and that the type of cover used was affecting reception. The experiment showed that coil variation was the major problem and that it could be easily remedied by exploiting a large interaction between the coil and the transmission program (another factor in the experiment). The choice of cover was found to have no effect at all.
3.8. Bearing Manufacture
Although we have emphasized throughout this chapter the use of factorial experiments for prototype products, the same ideas can be applied to prototype process development. Hellstrand [17] described an experiment conducted at SKF, one of the world's largest manufacturers of ball bearings, to improve a production process. The goal of the experiment was to improve bearing life, and three factors were studied in a 2^3 plan: heat treatment, osculation, and cage design. The experiment uncovered a large interaction effect between heat treatment and osculation that led to a fivefold increase in bearing life.
4. ANALYSIS OF PROTOTYPE EXPERIMENTS
4.1. Standard Experimental Plans
Some prototype experiments are standard factorials or fractional factorials (e.g., the airplane wing and throttle handle experiments). No special methods are needed for the analysis of these experiments.
Each within-prototype contrast is scaled so that its squared elements sum to 1, where the sum runs over all the test settings. The scaling guarantees that all the coefficients (except the constant) will have the same variance, a property that is important at the second stage of the analysis.
The use of orthogonal polynomials with our scaling convention leads to simple coefficient estimates. If we denote by y_i' = (y_{i1}, ..., y_{is}) the observations on the ith prototype at each of the s test conditions, the least squares estimates of the coefficients are given by
β̂_{i0} = (1/s) Σ_j y_{ij}  and  β̂_{id} = Σ_j c_{dj} y_{ij}
where c_d denotes the scaled contrast of degree d. The constant term is the average of the s observations, and the polynomial coefficients are simple linear contrasts.
At the second stage of the analysis, each of the polynomial coefficients found above is treated as a response variable, and a separate analysis is carried out for the coefficients of each degree. The analysis of the constant terms reveals which factors affect the mean level of the performance curve, the analysis of the linear coefficients shows which factors affect slope, etc. Important effects that stand out from error can be identified with standard tools such as normal probability plots and analysis of variance (ANOVA). Note that the effects on the mean level include "whole plot" error, but effects on other aspects of the performance curve, including average coefficients, involve only "subplot" error. ANOVA can account for this situation by doing a split-plot analysis. For the graphical analysis, separate plots must be prepared for the two sets of effects. Our scaling convention from stage 1 implies that all the performance curve coefficients have the same variance.
We take similar care at the second stage to ensure that the effects have the same variance and can thus be combined on a single probability plot. We recommend computing the average value of each coefficient (for ease of interpretation) and then scaling all the design factor contrasts to have the same variance as the average. This property can be checked by setting up the regression matrix Z for the design factor effects with all elements in the first column equal to 1 and then verifying that Z'Z = nI, where I is the identity matrix. Each row of the matrix (Z'Z)^{-1}Z' = (1/n)Z' then gives one of the factor effects.
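The two stages can be summarized in a few lines of matrix arithmetic. The sketch below is our own schematic of the computations just described (names and shapes are assumptions, not the authors' code): Y holds one row of test results per prototype, C holds the scaled within-prototype contrasts, and Z is the design factor contrast matrix built as above:

```python
import numpy as np

def two_stage_effects(Y, C, Z):
    """Y: n x s responses (n prototypes, s test conditions).
    C: s x q within-prototype contrasts, each scaled to unit sum of squares.
    Z: n x q2 design-factor contrasts, first column all 1's, with Z'Z = n I."""
    n, s = Y.shape
    const = Y.mean(axis=1)               # stage 1: mean level of each curve
    coefs = Y @ C                        # stage 1: performance-curve coefficients
    assert np.allclose(Z.T @ Z, n * np.eye(Z.shape[1])), "need Z'Z = nI"
    mean_effects = Z.T @ const / n       # stage 2: effects on the mean level
    curve_effects = Z.T @ coefs / n      # stage 2: effects on each coefficient
    return mean_effects, curve_effects   # plot the two sets on separate normal plots
```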
Orthogonal polynomials are a convenient choice to describe a performance curve, but other sets of orthogonal functions could also be used. For some of the engine testing applications described in Section 1, we would naturally expect periodic behavior. In that case, trigonometric functions could be used to generate orthogonal contrasts in the test conditions.
Some experiments involve more than one test factor. Examples above are the fluid flow controller and the engine starting system studies. For these experiments, the natural approach is to estimate the effect of each test factor for each of the prototypes. Interactions among the test factors can also be included if the test array permits their estimation. The analysis will then reveal which product characteristics can be used to affect the dependence of the response on the various test factors. For example, in the fluid flow controller experiment, one important goal was to obtain accurate predictions of the relationship between the response and the test conditions so that controllers could be designed to meet any desired response pattern.
The two-stage analysis has an appealing simplicity. It can also be justified more formally using theory developed for growth curve models in our performance curve context. Bisgaard and Steinberg [8] showed that, for these models, the two-stage analysis actually computes generalized least squares estimates of the parameters (maximum likelihood estimates if the data are normally distributed). We refer interested readers to that article for details on the statistical model and its analysis.
Our analysis approach shares some common ground with that recommended by Taguchi [10] for robust design experiments, but there are some important differences that we would like to point out. The approach taken by Taguchi is to compute, for each prototype, a single summary measure across all the test conditions. This summary measure, which he calls a signal-to-noise ratio, is then taken as a response variable much as in our stage 2 analysis. The major difference between Taguchi's approach and ours is that we compute a complete, multicoefficient summary at our first stage, as opposed to Taguchi's use of a univariate summary. This difference may appear small but is in fact substantial. The single-number summary can throw away much valuable information that is captured by the complete set of performance curve coefficients.
5. EXAMPLE: THE ENGINE STARTING SYSTEM EXPERIMENT
In this section we show how our two-stage analysis method can be applied to an experiment on engine starting systems that was described by Grove and Davis [12, p. 329]. For additional examples, we refer the interested reader to Bisgaard and Steinberg [8].
The goal of the engine starting system experiment was to reduce the sensitivity of the system to variations in ambient temperature. The performance of the system was evaluated via the relationship between the air-to-fuel (AF) ratio at the tip of the spark plug and the fuel mass pulse, which is controlled by the electronic engine management system. This measure was adopted because the automotive engineers knew that it was a key indicator of ignition success. The experiment studied seven components of the starting system: injector type, distance from injector tip to valve head, injection
timing, valve timing, spark plug reach, spark timing, and fuel rail pressure. Six different injector types were used; three levels were used for each of the remaining factors. The L18 orthogonal array was used to define the experimental plan for the prototype starting systems. Each of the 18 systems was then tested at six conditions, formed by crossing three fuel mass pulses (30, 45, and 60 msec) with two temperatures (−15°C and +15°C). Two tests were run at each condition, so there are 12 results for each prototype.
The full data set, additional details on the experiment, and a number of alternative analyses can be found in Grove and Davis [12]. We proceed here only with our approach.
Increasing the fuel mass pulse (FMP) injects more fuel into the engine, and initial plots of the data for each prototype show, as expected, a negative correlation between the AF ratio and the FMP. They also show that the AF ratio is typically higher at −15°C than at +15°C. A number of possible models might be considered linking the AF ratio to the FMP, and there is not clear evidence in the experiment to prefer one model over another. For some prototypes, the AF ratio is almost a linear function of the FMP; for others the inverse of the AF ratio is nearly linear, and for others the log of the ratio is most nearly linear. We elected to work with the relationship between the logarithm of the AF ratio and the logarithm of the FMP, which seemed to be most appropriate for the full set of prototypes, both for achieving linearity and for reducing the dependence of residual variation on the mean level of response. But we caution that other metrics could also be used and might lead to somewhat different conclusions.
The first stage of our analysis is to estimate for each prototype the effects of log FMP and temperature, including their interactions, on log AF ratio. The levels of FMP were equally spaced (30, 45, and 60 msec), and if we had kept FMP on its original scale we could have used standard polynomial contrasts to compute its linear and quadratic effects. For example, the linear effect would be proportional to the average of the results at 60 msec minus the average of the results at 30 msec. The logarithms of the FMP levels are 3.40, 3.81, and 4.09, and the resulting scaled contrasts are (−0.372, 0.040, 0.332) (linear) and (0.169, −0.406, 0.237) (quadratic). The main effect contrast for temperature is (−0.289, 0.289). The interaction contrasts are similar to the FMP contrasts, but multiplied by 1 or −1, according to whether the temperature is high or low, respectively. Each of the contrasts, when squared and summed over the 12 test points, gives a sum of 1, in accord with our scaling convention.
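These scaled contrasts can be reproduced mechanically by Gram-Schmidt orthogonalization of the columns 1, x, x² over the 12 test points, followed by scaling to unit sum of squares. The short sketch below is our own check (function and variable names are assumptions) and recovers the values quoted above:

```python
import numpy as np

logfmp = np.log([30.0, 45.0, 60.0])   # log fuel mass pulse levels
x = np.repeat(logfmp, 4)              # each level occurs at 4 of the 12 test points

def scaled_poly_contrasts(x):
    # Gram-Schmidt on [1, x, x^2]; each contrast scaled to unit sum of squares
    Q = []
    for c in (np.ones_like(x), x, x ** 2):
        for q in Q:
            c = c - (c @ q) * q
        Q.append(c / np.linalg.norm(c))
    return Q[1], Q[2]                 # linear and quadratic contrasts

lin, quad = scaled_poly_contrasts(x)
print(lin[::4])    # approximately (-0.372, 0.040, 0.332), as in the text
print(quad[::4])   # approximately (0.169, -0.406, 0.237)
```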
The second stage of our analysis estimates the effects of the design factors on each of the first-stage coefficients. Since there are 18 prototypes, the "average" contrast in the effects computation has each element equal to 1/18. All the remaining factor effect contrasts are scaled to have the same sum of squares. The linear contrast for each three-level factor is (−0.068, 0, 0.068), and the quadratic contrast is (0.0393, −0.0786, 0.0393). Injector type, the six-level factor, is represented by five orthogonal contrasts. These contrasts are formed by taking the main effects and interactions of the 2- and 3-level columns that were used at the design stage to assign the levels of this factor.
Figure 3 shows a normal probability plot of the effects on mean level (i.e., on the constant terms from the within-prototype regressions). None of the contrasts sharply deviates from a straight line through the origin. Only the two lowest values hint at statistical significance. The strongest contrast is one that corresponds to injector type and indicates that types 4, 5, and 6 have lower average AF ratios than do types 1, 2, and 3. The other large contrast is for the linear effect of fuel rail pressure and indicates lower average AF ratios with higher pressure.
Figure 3 A normal probability plot of the effects on the mean level from our stage 2 analysis of the engine starting system experiment.
Figure 4 shows a normal probability plot for the effects related to the performance curve. The contrasts for the linear effect of log FMP and for the effect of temperature are clearly significant and dominate all the others. Figure 5 shows a normal probability plot without the two very large contrasts and helps to clarify which contrasts stand out from noise. The only contrasts that appear to be statistically significant are the three largest and the two smallest, all of which correspond to interaction effects with temperature. The factors that interact with temperature are the injector type (two significant contrasts), the distance from the injector tip to the valve head (both the linear and quadratic components), and the valve timing (the linear component). The next largest negative contrast is the interaction between temperature and the quadratic component of the valve timing, so it seems prudent to also take account of this effect in developing a model for the system.
Figure 4 A normal probability plot of the factor effects on the performance curve
from our stage 2 analysis of the engine starting system experiment.
Figure 5 A normal probability plot of the factor effects on the performance curve
from our stage 2 analysis of the engine starting system experiment, after deleting the
two large effects due to the linear contrasts for fuel mass pulse and temperature.
To quantify these interactions, we computed the average temperature effect at each level of the relevant factors, which are listed in Table 1. The average temperature effect was −1.173. Since the goal of the experiment was to reduce sensitivity to temperature variation, we seek levels of the three factors that make the temperature effect closer to 0. The best choice is to take an injector of type 6 and use the middle tip-to-head distance and the low level of valve timing (the middle level is almost equally good). If we assume that the design factors have additive effects on the temperature effect, the estimated increases in that effect from each of these choices are 0.296 (from injector type), 0.225 (from the tip-to-head distance), and 0.153 (from the valve timing). The estimated temperature effect is then −0.499, about 60% closer to 0 than its average value. Thus the experiment has identified factor settings that substantially reduce the sensitivity to temperature, resulting in less variation in product response and more uniform starting performance.
It is worth noting that if we place the mean level effects and the performance curve effects on the same probability plot (after appropriate scaling of the mean level effects), many of the mean level effects stand out from the line through the origin, contrary to our earlier conclusion that at most two contrasts are significant, and then just barely. This finding suggests that the within-prototype error, on which we base the statistical significance of the performance curve contrasts, is too small for judging the mean level contrasts. That, in turn, implies that a substantial amount of the variability in the data may be at the interprototype level. This information could be valuable for future efforts to make the performance curves still more uniform.
6. CONCLUSIONS
ACKNOWLEDGMENTS
The research of D. M. Steinberg was carried out in part while he was visiting the Center for Quality and Productivity Improvement, University of Wisconsin-Madison. He is grateful to the Center for providing excellent research facilities. The research of S. Bisgaard was carried out in part under grant number DMI 950014 from the U.S. National Science Foundation.
REFERENCES
Spline Regression with Multiple Knots
N. Gaffke and B. Heiligers
1. INTRODUCTION
where
{s ∈ [a, b] : B_i(s) > 0} = [a, t_{d+2}) if i = 1, (t_i, t_{i+d+1}) if i = 2, ..., k − 1, and (t_k, b] if i = k  (5e)
We note that the small support property (5e) is a particular feature of the basic splines B_i. Figure 1 shows the B-splines for a special case.
A further favorable property of the B-spline basis, Eq. (4), is its equivariance under affine-linear transformations of the knot vector κ. That is, if the interval [a, b] (and its knots κ_i, i = 0, ..., ℓ) is transformed to another interval, then the B-splines of the transformed knot vector are the affine transforms of the original B-splines. Hence Eq. (6) allows us to standardize the interval [a, b], e.g., to [0, 1].
The spline regression model states that a regression function y is a member of the space S_d(κ, s), i.e.,
y(x) = Σ_{i=1}^{k} θ_i B_i(x)
for some coefficient vector θ = (θ_1, ..., θ_k)', which has to be estimated from the data (the prime denotes transposition). Under the standard statistical assumptions that the observations of the regression function at any x values are uncorrelated and have equal (but possibly unknown) variance σ², the ordinary least squares estimator of θ will be used. So for designing the experiment, i.e., for choosing the x values at which the observations of y(x) are to be taken, the concepts of optimal linear regression design apply. For mathematical and computational tractability we restrict ourselves to the approximate theory. An approximate design ξ consists of a finite set of distinct support points x_1, ..., x_r ∈ [a, b] (where the support size r ≥ 1 may depend on ξ) and corresponding weights ξ(x_1), ..., ξ(x_r) > 0 with Σ_i ξ(x_i) = 1. The design ξ calls for ξ(x_i) × 100% of all observations of the regression function at x_i, for all i = 1, ..., r. The moment matrix (or information matrix) of ξ is given by
Φ_0(M) = [det(M)]^{−1/k},  Φ_{−∞}(M) = 1/λ_min(M)
2. COMPUTING NUMERICALLY OPTIMAL DESIGNS
The basic algorithm we used is that of Gaffke and Heiligers [3], with necessary adaptations to the present situation of polynomial spline regression as described in detail in Ref. 4. So we only briefly outline the method.
A sequence of moment matrices M_n, n = 1, 2, ..., is computed, corresponding to some approximate designs ξ_n, n = 1, 2, .... The current design ξ_n, however, is not computed (except for the final iteration when the algorithm terminates). Thus an increasing set of support points calling for some clustering or elimination rules is avoided. For twice continuously differentiable optimality criteria Φ having compact level sets (as, e.g., the Φ_p criteria with −∞ < p < 1), the generated sequence of moment matrices M_n has been shown to converge to an optimal solution to
Minimize Φ(M)  (8a)
Subject to M ∈ Conv{B(s)B(s)' : s ∈ [a, b]} ∩ PD(k)  (8b)
Here we have denoted by lowercase letters m_n, m̄, and m(x_i) the moment vectors obtained from M_n, M̄, and M(x_i) = B(x_i)B(x_i)', respectively, by a usual vector operation turning matrices into column vectors. Owing to the symmetry and the band structure of the moment matrices, it suffices to apply the vector operation to the main diagonal and the d diagonals above the main diagonal. So the vector operator considered here selects that part of a symmetric matrix A and arranges the entries in some fixed order, resulting in a vector vec(A) ∈ R^K, where K = (d + 1)(k − d/2).
In Eqs. (10) we have g_n = V vec(G_n), where x_1, ..., x_r are certain points from [a, b] to be described next (note that these points, including their total number r, depend on n, but this dependence is dropped here to simplify the notation), and G_n denotes the gradient of Φ at M_n in the space of symmetric k × k matrices endowed with the scalar product ⟨A, B⟩ = tr(AB). The matrix V occurring when vectorizing the gradient is a fixed K × K diagonal matrix with diagonal entries equal to 1 or 2, such that those components of vec(G_n) coming from the diagonal of G_n receive weight 1 while the off-diagonal elements are weighted by 2. This is to ensure that g_n is the gradient at m_n of the function φ(m) = Φ(M), where vec(M) = m.
The points x_i, i = 1, ..., r, in (10b) are most crucial for obtaining a good search direction by solving the quadratic problem. Their choice is guided by the equivalence theorem, i.e., the first-order optimality conditions for problem (8). A feasible moment matrix M* is an optimal solution if and only if
B(s)'(−G*)B(s) ≤ tr(−G*M*)  for all s ∈ [a, b]  (12)
where G* denotes the gradient of Φ at M*. From this it appears reasonable to choose in (10b) the local maximum points x_1, ..., x_r of the function (13), which is plotted in Figure 2 for a cubic spline example.
Figure 2 The function (13) for iterate n = 10 (dotted line) and for the final iterate n = 43 (solid line). Under consideration is the cubic spline model as in Figure 1, and the optimality criterion is the A criterion (p = −1).
but this is used only in the final step (see below). Let M̄_n = vec^{−1}(m̄_n). Now, a search along the line segment joining M_n and M̄_n (with some fixed step factor α < 1, usually close to 1) is performed to obtain the next iterate M_{n+1}.
To summarize, the method for solving (8) is a modified quasi-Newton method. The search direction is based on a local second-order approximation of the objective function Φ. The constraint set in (10b) over which the quadratic approximation is minimized may be viewed as a polyhedral neighborhood of the current vector iterate m_n. It may appear more natural to minimize that quadratic approximation over the set of all moment vectors.
In the final step, the final vector iterate m_n is removed from the generator set in (10b). This has proved to be favorable, since otherwise a positive w_0 may occur in (14) that could prevent the identification of a corresponding design. We thus obtain an optimal solution m*, say, to that quadratic problem, a nonempty subset I of indices from {1, ..., r}, and positive weights w_i, i ∈ I, summing to 1 and such that
In all our numerical experiments we observed that m* is very close to the final vector iterate m_n and shares numerically the same value of Φ. Hence, a numerically optimal design is given by ξ* supported by x_i, i ∈ I, with weights ξ*(x_i) = w_i.
The algorithm shows good convergence behavior, in particular a good local convergence rate, as is usually observed for a quasi-Newton method. For instance, the D-optimal designs for spline degree d = 2 and one single interior knot (i.e., ℓ = 2, s_1 = 1) derived theoretically in Ref. 5, page 43, and in Ref. 6, Theorem 2, are found very accurately by the algorithm. For degrees d = 3, 4, 5 and one single interior knot, D-optimal designs within the class of designs with minimum support size k were found numerically by Lim [6]. The present algorithm computed precisely these designs as the numerically D-optimal ones in the class of all designs (up to two printing errors in the tables on page 176 of Ref. 6).
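For readers who want to experiment, a crude grid-based alternative to the quasi-Newton method is easy to code: fix a fine candidate grid and iterate the classical multiplicative D-optimality update on the weights. The sketch below is our own illustration (the knot choice, grid, and tolerances are assumptions, and the method is far less refined than the algorithm described above):

```python
import numpy as np
from scipy.interpolate import BSpline

d = 3                                            # cubic splines
a, b = 0.0, 1.0
t = np.r_[[a] * (d + 1), [0.5], [b] * (d + 1)]   # one interior knot (a hypothetical setup)
k = len(t) - d - 1                               # dimension of the spline space

basis = [BSpline(t, np.eye(k)[i], d) for i in range(k)]
grid = np.linspace(a, b, 201)
B = np.array([[f(x) for f in basis] for x in grid])  # candidate vectors B(x)

w = np.full(len(grid), 1.0 / len(grid))          # start from the uniform design
for _ in range(1000):
    M = B.T @ (w[:, None] * B)                   # moment matrix M(xi)
    var = np.einsum('ij,jk,ik->i', B, np.linalg.inv(M), B)
    w *= var / k                                 # multiplicative D-optimality step

print(grid[w > 1e-3], w[w > 1e-3])               # weight clusters approximate the support
```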
Table 1 Numerically optimal designs under the D, A, and E criteria (support points and weights): the first support point 0.00000 receives weight 0.14286 under D, 0.08848 under A, and 0.07361 under E.
Table 1 shows a few numerical results for the D and A criteria and the approximate E criterion in the cubic spline model as in Figures 1 and 2. The designs addressed in Table 1 by italics are the D-, A-, and E-optimal designs within the subclass of those designs concentrated on the Chebyshev points, i.e., supported by the k extremal points of the equioscillating spline in S_d(κ, s) (cf. Ref. 7, Section 2). For the D and A criteria these are computed by a simplified variant of the above algorithm, fixing x_1, ..., x_r (r = k) to those Chebyshev points, while the E-optimal design is from Ref. 7, Theorem 4. By that theorem the E-optimal design (among all designs) is supported by the Chebyshev points. We see from Table 1 that the numerically optimal design coincides with the E-optimal design. For the D and A criteria the Chebyshev-restricted designs do not differ much from the numerically optimal designs. The D efficiency of the former with respect to the latter is 0.99335, and the A efficiency is 0.99476. Similar results hold true for other spline setups.
In all the cases we considered, the numerically optimal design has minimum support size, and the boundary points a and b are support points. For D optimality, the minimum support size property has been conjectured in Ref. 5, page 45, Conjecture 1. In our final section we present some first results toward a theoretical foundation of the observed phenomena.
The B-spline basis B_1, ..., B_k of S_d(κ, s) defined by (4) enjoys the fundamental property of total positivity; i.e., for any points x_1, ..., x_k such that a ≤ x_1 < ⋯ < x_k ≤ b the collocation matrix (B_i(x_j))_{i,j=1,...,k} is totally positive.
Lemma 1
For any design ξ, the moment matrix of ξ from Eq. (7) is nonsingular (and hence positive definite) if and only if there are support points z_1 < ⋯ < z_k of ξ such that B_i(z_i) > 0 for all i = 1, ..., k.
Proof. Arrange the support points of ξ in increasing order, a ≤ x_1 < ⋯ < x_r ≤ b, say. We may write
M(ξ) = N(ξ) W(ξ) N(ξ)'  (17)
where N(ξ) = (B_i(x_j))_{i=1,...,k; j=1,...,r} and W(ξ) = diag(ξ(x_1), ..., ξ(x_r)).
Note that admissibility of a design does not depend on the particular choice of the basis of the spline space S_d(κ, s). For, if we choose another basis f = (f_1, ..., f_k)' (e.g., the truncated power basis as in Ref. 8), then this is related to our B-spline basis B = (B_1, ..., B_k)' by a linear transformation, i.e., f = TB, for some nonsingular k × k matrix T. Hence the resulting moment matrices of designs under basis f (x_1, ..., x_r being the support points of ξ) are related to the moment matrices M(ξ) under the B-spline basis by M_f(ξ) = T M(ξ) T', and for nonnegative definite matrices,
A ≤ B  ⟺  TAT' ≤ TBT'
where supp(ξ) denotes the support of ξ and ⌊s⌋ is the largest integer ≤ s. For the case that s_i ∈ {0, 1} for all i = 1, ..., ℓ − 1, the observed minimum support size property of Φ_p-optimal designs (where p < ∞) is explained by the following result (cf. also Ref. 8, pp. 1558-1559).
Lemma 2
Let s_i ∈ {0, 1} for all i = 1, ..., ℓ − 1. If ξ is admissible for S_d(κ, s) and the moment matrix M(ξ) is nonsingular, then the support size of ξ is equal to k [the dimension of S_d(κ, s)], and the boundary knots κ_0, κ_ℓ and all the interior knots κ_i with smoothness s_i = 0 are in the support of ξ.
Proof: By (1), k = ℓd + 1 − a, where a denotes the number of interior knots κ_i with s_i = 1. Consider the B = ℓ + 1 − a knots which are the end knots of the interval and the interior knots with smoothness zero. By Eq. (20), for all u = 1, ..., B − 1,
Lemma 3
The D-optimal design for S_d(κ, s) (with arbitrary degree, knots, and associated multiplicities) has both boundary points κ_0 = a and κ_ℓ = b among its support points.
Proof. Let ξ be any design with nonsingular moment matrix M(ξ), and let x_1 < ⋯ < x_r be the support points of ξ. Consider the representation (17) of M(ξ). In the following we denote by N_ξ(i_1, ..., i_p; j_1, ..., j_p) the submatrix of N(ξ) with respective row and column indices 1 ≤ i_1 < ⋯ < i_p ≤ k and 1 ≤ j_1 < ⋯ < j_p ≤ r (where 1 ≤ p ≤ k), i.e., the matrix (B_{i_u}(x_{j_v}))_{u,v=1,...,p}. By the Cauchy-Binet formula applied to (17),
det M(ξ) = Σ_{1≤j_1<⋯<j_k≤r} [det N_ξ(1, ..., k; j_1, ..., j_k)]² ξ(x_{j_1}) ⋯ ξ(x_{j_k})  (21)
Moreover, by (5b), the last inequality in (24) is strict whenever the matrix N_ξ(2, ..., k; j_2, ..., j_k) is nonsingular. In fact, such indices 2 ≤ j_2 < ⋯ < j_k ≤ r exist. For, by (21), since M(ξ) is nonsingular, there exist indices 1 ≤ j_1 < j_2 < ⋯ < j_k ≤ r such that N_ξ(1, ..., k; j_1, ..., j_k) is nonsingular, and again by applying the Hadamard-type inequality to the latter totally positive matrix we obtain
where x* = (x_1*, ..., x_k*)' is an optimal solution to the problem
Maximize det N(x)  (25)
Subject to x_1 = a < x_2 < ⋯ < x_{k−1} < x_k = b
Lemma 4
Let i_0 ∈ {1, ..., ℓ − 1} be such that the interior knot κ_{i_0} has smoothness s_{i_0} = 0. Then the support of the D-optimal minimum support design for S_d(κ, s) is the union of the supports of the D-optimal minimum support designs for S_d(κ^{(1)}, s^{(1)}) and for S_d(κ^{(2)}, s^{(2)}), respectively, where
κ^{(1)} = (κ_0, κ_1, ..., κ_{i_0}),  s^{(1)} = (s_1, ..., s_{i_0−1})
and
κ^{(2)} = (κ_{i_0}, κ_{i_0+1}, ..., κ_ℓ),  s^{(2)} = (s_{i_0+1}, ..., s_{ℓ−1})
Let x = (x_1, ..., x_k) satisfy (25) and be such that the collocation matrix N(x) is nonsingular, i.e., B_i(x_i) > 0 for all i = 1, ..., k. By (27a)-(27c), x_{k_0−1} < κ_{i_0} and x_{k_0+1} > κ_{i_0}. Hence the Hadamard-type inequality for totally positive matrices entails, using notations for submatrices as in the proof of Lemma 3,
det N(x) ≤ det N_x(1, ..., k_0−1; 1, ..., k_0−1) · B_{k_0}(x_{k_0}) · det N_x(k_0+1, ..., k; k_0+1, ..., k)  (29)
Equation (29) ensures that an optimal solution x* of (25) must be such that x*^{(1)} = (x_1*, ..., x_{k_0}*) is an optimal solution to the analogous determinant maximization problem over [a, κ_{i_0}], and similarly for x*^{(2)}.
Now the assertion follows by observing that the matrices N_1(x^{(1)}) and N_2(x^{(2)}) are the collocation matrices of x^{(1)} and x^{(2)} under special bases of the spline spaces S_d(κ^{(1)}, s^{(1)}) and S_d(κ^{(2)}, s^{(2)}), respectively. For, note that by (26) and (1), k_0 is the dimension of the space S_d(κ^{(1)}, s^{(1)}). The B-splines B_1, ..., B_{k_0} restricted to the interval [a, κ_{i_0}] are clearly members of S_d(κ^{(1)}, s^{(1)}); they are linearly independent [since by (29) there is a nonsingular collocation matrix in these splines], and hence they are a basis of the space S_d(κ^{(1)}, s^{(1)}). Similarly, it can be seen that the dimension of S_d(κ^{(2)}, s^{(2)}) is equal to k − k_0 + 1, and the B-splines B_{k_0}, ..., B_k restricted to the interval [κ_{i_0}, b] form a basis of the space S_d(κ^{(2)}, s^{(2)}). □
Repeated application of Lemma 4 for merely continuous polynomial spline regression yields
Corollary 5
The support of the D-optimal minimum support design for S_d(κ, 0) is the union of the supports of the D-optimal designs for ordinary dth-degree polynomial regressions over the subintervals [κ_i, κ_{i+1}], i = 0, ..., ℓ − 1.
REFERENCES
Dispersion Effects and Their Identification
Bo Bergman
Linköping University, Linköping, Sweden
Anders Hynén
ABB Corporate Research, Västerås, Sweden
1. INTRODUCTION
where f(x) is the expectation of y, and f(x) + g(z) + h(x, z) is the conditional expectation of y given z; here h(x, z) corresponds to the interaction between
The two terms in the variance of y, Var[y] = E[Var[y|z]] + Var[E[y|z]], can be interpreted as follows. The first term on the right, E[Var[y|z]], portrays how the variance of y, given z, is affected by dispersion effects, i.e., factors affecting the spread of the data. The second term on the right, Var[E[y|z]], portrays how the variance of y is affected by parameters in the location model, including fixed effects of z such as design-by-environment interaction effects. The approach is motivated by the incorporation of dispersion effects, since direct location modeling of both design factors and environmental factors is allowed; thus this standpoint reduces the risk of confounding location effects and dispersion effects. Theoretical justification for the approach is also provided by Shoemaker et al. (1991), Box and Jones (1992), and Myers et al. (1992).
In this chapter we discuss how to identify control factors, i.e., product or process parameters, having dispersion effects; in particular, we discuss how dispersion effects can be identified using unreplicated experimental designs in the 2^{k-p} series of fractional factorial designs (see Bergman and Hynén, 1997). For some extensions to more general designs, see Blomkvist et al. (1997) and Hynén and Sandvik Wiklund (1996).
modeling, but many of these are merely anecdotes or aimed at making the estimation of location effects as efficient as possible. During the past decade this problem area experienced a rapid growth of interest, as shown by the number of applications and published papers. In general, there are two approaches to dispersion effects modeling: Either the experiment is replicated, or it is not. Major emphasis in this chapter is placed on the latter case; however, for the sake of completeness both approaches are considered.
In a replicated experiment, identification of dispersion effects is fairly straightforward. Depending on the error structure of the experiment, e.g., on whether or not the replicates are carried out fully randomized, the identified dispersion effects are effects measuring variability either between or within trials. Some may use the terms replicates and duplicates, or genuine and false replicates, respectively. If we compute sample variances, under each treatment combination, on which new effects can be computed, the analysis is rather uncomplicated, as sketched below. Taking the logarithm prior to computing the effects improves estimation (see Bartlett and Kendall, 1946). The new effects, which can be seen as dispersion effect estimates, can be plotted on normal probability paper to discriminate between large and small effects or analyzed with other techniques such as analysis of variance. For more background on this topic, see Nair and Pregibon (1988) and Bisgaard and Fuller (1995).
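For the replicated case, the computation really is this short. The sketch below is our own illustration (the array layout and names are assumptions): it forms Bartlett-Kendall log variances and the usual two-level effect estimates:

```python
import numpy as np

def log_variance_effects(X, reps):
    """X: n x m matrix of +/-1 contrast columns; reps: n x r replicate responses."""
    log_s2 = np.log(reps.var(axis=1, ddof=1))   # log sample variance per cell
    n = X.shape[0]
    return X.T @ log_s2 / (n / 2)               # effect = mean(+) - mean(-) per column
```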
If the problem of dispersion effect modeling is a fresh arrival, identification of dispersion effects from unreplicated experiments is of even more recent date. Rather pioneering, Box and Meyer (1986b) published a paper addressing dispersion effect identification from unreplicated two-level fractional factorial experiments in the 2^{k-p} series. Their contribution was not entirely unique; Daniel (1976), Glejser (1969), and many others had touched upon the subject earlier, but Box and Meyer were the first to propose dispersion effect identification from unreplicated experiments as an important aspect of parameter design. In a paper by Bergman and Hynén (1997), the problem area is surveyed and a new method is introduced; dispersion effects from unreplicated designs in the 2^{k-p} series can now be identified with well-known statistical significance testing techniques (see Section 3). It is still too early to judge the significance of the new method, but compared to existing methods the new proposal does not rely on distributional approximations or model discrimination procedures that are entirely ad hoc. There is, however, an assumption of normality that is rather critical (see Hynén, 1996). Moreover, the method proposed in Bergman and Hynén (1997) is generalized to experimental designs other than the two-level designs from the 2^{k-p} series by Blomkvist et al. (1997) and to the inner and outer array setup by Hynén and Sandvik Wiklund (1996). The use of normal probability plotting and transformations in
3. DISPERSION EFFECTS IN TWO-LEVEL FRACTIONAL FACTORIAL DESIGNS
3.1. Location Effects
Let y be the (n × 1) response vector from a complete or fractional factorial experiment with an (n × n) design matrix X with column vectors x_0, ..., x_{n−1}. Column x_0 is a column of 1's, and the remaining columns represent contrasts for estimating the main and interaction effects.
As noted by Box and Meyer (1993), the "vital few and trivial many" principle suggested by Juran (the Pareto principle) ensures that in most cases only a few β's are nonnegligible. Therefore, we can use the normal plotting technique suggested by Daniel (1976) to find these β's (see also Daniel, 1959). Of course, there may be problems due to confoundings when highly fractionated designs are used, but this issue is not discussed further here. See, for example, Box and Meyer (1986a, 1993), who give an interesting approach to these problems using Bayesian techniques.
Under the Pareto principle, only a few degrees of freedom are used to estimate nonnegligible β values. Therefore, the remainder of the contrasts can be used to estimate the error variance. For a factor column x_i, define the column combinations
x_{j|i+} = ½(x_j + x_{j·i})  and  x_{j|i−} = ½(x_j − x_{j·i})  (1)
where x_{j·i} denotes the elementwise product of columns x_j and x_i.
Let Γ_i denote the set of pairs {x_j, x_{j·i}} for which the location effects are judged negligible, i.e., for which
E[x_j'y] = E[x_{j·i}'y] = 0  (2)
Note that there are (n − 1)/2 members of Γ_i if all location effects are judged to be negligible, i.e., if we have E[x_j'y] = 0 for all j. It is straightforward to show that the contrasts corresponding to (1), z_{j|i+} = x_{j|i+}'y and z_{j|i−} = x_{j|i−}'y, have variances
Var[z_{j|i+}] = (n/2)σ²_{i+}  and  Var[z_{j|i−}] = (n/2)σ²_{i−},  respectively  (3)
Now, let x_i be associated with a studied factor, i.e., let x_i'y estimate one of the main effects. If σ²_{i+} and σ²_{i−} are different, this factor has a dispersion effect. Therefore, the difference between σ̂²_{i+} and σ̂²_{i−} gives information about the magnitude of this dispersion effect. If we can find many indices j such that {x_j, x_{j·i}} belongs to Γ_i, then all the corresponding z_{j|i+} and z_{j|i−} can be used to estimate the difference between σ²_{i+} and σ²_{i−}. Moreover, since the column vectors x_{j|i+} are orthogonal, the contrasts z_{j|i+} are independent. Therefore, we can use an F test for testing H_0^i: σ²_{i+} = σ²_{i−} against H_1^i: σ²_{i+} ≠ σ²_{i−} with the test statistic
D_i = Σ_{j∈Γ_i} z²_{j|i+} / Σ_{j∈Γ_i} z²_{j|i−}
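In code, the test amounts to a few inner products. The following sketch is our own reading of the construction above (the names and the convention that the caller supplies correctly signed product columns are assumptions, not code from Bergman and Hynén):

```python
import numpy as np

def bh_dispersion_F(X, y, gamma_pairs):
    """X: n x m matrix of +/-1 contrast columns; y: n-vector of responses.
    gamma_pairs: list of column-index pairs (j, ji), where column ji must equal
    the elementwise product x_j * x_i (watch aliasing signs, e.g., -AE vs AE)."""
    num = den = 0.0
    for j, ji in gamma_pairs:
        z_plus = 0.5 * (X[:, j] + X[:, ji]) @ y    # z_{j|i+}
        z_minus = 0.5 * (X[:, j] - X[:, ji]) @ y   # z_{j|i-}
        num += z_plus ** 2
        den += z_minus ** 2
    q = len(gamma_pairs)
    return num / den, (q, q)   # F statistic and its degrees of freedom under H0
```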
3.3. Alternative Expressions
The intuitive understanding of the above expressions might be somewhat vague. However, more intuitive expressions exist. Compute new "residuals," ê_u, u = 1, ..., n, based on a location model including the active location effects expanded with the effects associated with column i and all interaction terms between i and the active location effects. Then the statistic D_i may be computed as
D_i = Σ_{u: x_{iu}=+1} ê²_u / Σ_{u: x_{iu}=−1} ê²_u
4. AN ILLUSTRATION FROM DAVIES (1956)
The example is a 2^{5−1} fractional factorial experiment without replicates involving five factors, labeled A-E. The defining relation was chosen as I = −ABCDE. The quality of the dyestuff was measured by a photoelectric spectrometer, which gave a quality characteristic of "the smaller the better" type; i.e., the lower the value recorded, the better the quality. Responses and all 15 orthogonal columns concerning location main and interaction effects are given in Table 1.
Since no independent error estimate is available, the normal probability plot of contrasts suggested by Daniel (1976) is a convenient tool for analysis (see Fig. 1). From this plot it appears that factor D is the only location effect present in the data; hence columns other than D can be used for estimating dispersion effects. In Davies (1956) only location effects were considered, but Wiklander (1994) detected and showed evidence of a dispersion effect from factor E. Further investigations will be conducted using our method.
An estimate of the dispersion effect from factor E becomes available on combining certain columns according to Eq. (1). The pairs of columns included must be judged to belong to the set Γ_E, i.e., judged not to correspond to active location effects. These new contrasts and their calculated values appear in Table 2.
For illustration, the contrast z_{A|E+} is derived by combining columns A and AE, i.e., z_{A|E+} = ½(x_A + x_{A·E})'y.
Furthermore, testing H_0: σ²_{i+} = σ²_{i−} against H_1: σ²_{i+} ≠ σ²_{i−} for the other factors will require calculations analogous to those in Table 2 but based on other contrasts. The results from such a procedure are presented in Table 3. Note, however, that the five F tests are not independent.
From Table 3, we see that factor E has a dispersion effect that is difficult to disregard. Wiklander (1994) detected this dispersion effect and found it significant. However, she used only (3, 3) degrees of freedom in a similar test. Furthermore, even factor D might have a dispersion effect that was not detected by Wiklander (1994). However, a complete analysis of data should always involve residual analysis, which here reveals a possible abnormality in observation 11. Treating y_11 as a missing observation and recalculating it by setting some negligible contrast to zero (see Draper and Stoneman, 1964) shows that the dispersion effect from D becomes insignificant. Furthermore, the dispersion effect from E is fairly insensitive to changes in y_11, and it is therefore reasonable to consider E as the only active dispersion effect in the dyestuff data. Of course, there is also always the risk of overestimating the significance due to the multiple test effect.
Table 1 Design Matrix, Responses, and Confounding Structure up to Two-Factor Interactions for the Dyestuff Data

u   A  B  C  D  AB AC AD BC BD CD −DE −CE −BE −AE −E   y_u
1   −  −  −  −  +  +  +  +  +  +  −   −   −   −   +   201.5
2   +  −  −  −  −  −  −  +  +  +  +   +   +   −   −   178.0
3   −  +  −  −  −  +  +  −  −  +  +   +   −   +   −   183.5
4   +  +  −  −  +  −  −  −  −  +  −   −   +   +   +   176.0
5   −  −  +  −  +  −  +  −  +  −  +   −   +   +   −   188.5
6   +  −  +  −  −  +  −  −  +  −  −   +   −   +   +   178.5
7   −  +  +  −  −  −  +  +  −  −  −   +   +   −   +   174.5
8   +  +  +  −  +  +  −  +  −  −  +   −   −   −   −   196.5
9   −  −  −  +  +  +  −  +  −  −  −   +   +   +   −   255.5
10  +  −  −  +  −  −  +  +  −  −  +   −   −   +   +   240.5
11  −  +  −  +  −  +  −  −  +  −  +   −   +   −   +   208.5
12  +  +  −  +  +  −  +  −  +  −  −   +   −   −   −   244.0
13  −  −  +  +  +  −  −  −  −  +  +   +   −   −   +   274.0
14  +  −  +  +  −  +  +  −  −  +  −   −   +   −   −   257.5
15  −  +  +  +  −  −  −  +  +  +  −   −   −   +   −   256.0
16  +  +  +  +  +  +  +  +  +  +  +   +   +   +   +   274.5
Figure 1 Normal probability plot of the location contrasts for the dyestuff data (x axis: contrast values; y axis: normal probability scale).
Table 2 Contrasts of Use for Estimating the Dispersion Effect from Factor E

E = "+"    E = "−"
 −7.5        11.0
  0.5       −61.0
 37.5        75.0
  9.5       124.0
 26.5        −2.0
 35.0         6.5
Table 3 F Ratios for the Five Factors from the Dyestuff Data

Factor   F ratio
A        0.36
B        2.83
C        0.37
D        4.47
E        0.14
5. GENUINE REPLICATES AND SPLIT-PLOT DESIGNS
ACKNOWLEDGMENTS
REFERENCES
Anbari FT, Lucas JM. (1994). Super-efficient designs: How to run your experiment for higher efficiency and lower cost. In: 1994 ASQC 48th Annual Quality Congress Proceedings, May 24-26, 1994, Las Vegas, Nevada.
Bartlett MS, Kendall DG. (1946). The statistical analysis of variance—Heterogeneity and the logarithmic transformation. J Roy Stat Soc Ser B 8:128-138.
Bergman B, Hynén A. (1997). Dispersion effects from unreplicated designs in the 2^{k-p} series. Technometrics 39(2).
Bergman B, Holmqvist L. (1988). A Swedish programme on robust design and Taguchi methods. In: Bendell T, ed. Taguchi Methods: Proceedings of the 1988 European Conference, London, 13-14 July 1988. Amsterdam: Elsevier Applied Science.
Bisgaard S, Fuller HT. (1995). Quality quandaries—Reducing variation with two-level factorial experiments. Qual Eng 8(2):373-377.
Blomkvist O, Hynén A, Bergman B. (1997). A method to identify dispersion effects from unreplicated multilevel experiments. Qual Reliab Eng Int 13(2).
Box GEP, Jones S. (1992). Split-plot designs for robust product experimentation. J Appl Stat 19(1):3-26.
Box GEP, Meyer RD. (1986a). An analysis for unreplicated fractional factorials. Technometrics 28(1):11-18.
Box GEP, Meyer RD. (1986b). Dispersion effects from fractional designs. Technometrics 28(1):19-27.
Box GEP, Meyer RD. (1993). Finding the active factors in fractionated screening experiments. J Qual Technol 25(2):94-105.
Nelder JA, Lee Y. (1991). Generalized linear models for the analysis of Taguchi type experiments. Appl Stochastic Models Data Anal 7:107-120.
Phadke MS. (1989). Quality Engineering Using Robust Design. Englewood Cliffs, NJ: Prentice-Hall.
Shewhart WA. (1931). Economic Control of Quality of Manufactured Product. New York: Van Nostrand. (A 1981 reprint is available from the American Society for Quality Control.)
Shoemaker AC, Tsui K-L, Wu CFJ. (1991). Economical experimentation methods for robust design. Technometrics 33(4):415-427.
Taguchi G. (1981). On-Line Quality Control During Production. Tokyo: Japanese Standards Association.
Taguchi G. (1986). Introduction to Quality Engineering. Tokyo: Asian Productivity Organisation.
Taguchi G, Wu Y. (1980). Introduction to Off-Line Quality Control. Nagoya, Japan: Central Japan Quality Control Association.
Wang PC. (1989). Tests for dispersion effects from orthogonal arrays. Comput Stat Data Anal 8:109-117.
Wiklander K. (1994). Models for dispersion effects in unreplicated two-level factorial experiments. Thesis No. 1994:1, ISSN 1100-2255, The University of Gothenburg, Sweden.
22
A Graphical Method for Model Fitting in
Parameter Design with Dynamic
Characteristics
Sung H. Park
Seoul National University, Seoul, Korea
Je H. Choi
Samsung Display Devices Co., Ltd., Suwon, Korea
ABSTRACT
Detecting the relationship between the mean and variance of the response and finding the control factors with dispersion effects are important in parameter design and analysis for dynamic characteristics. In this paper, a graphical method, called the multiple mean-variance plot, is proposed to detect the relationship between the mean and variance of the response. Also, to find the control factors with dispersion effects, the analysis of covariance method is proposed, and its properties are studied in comparison with the dynamic signal-to-noise ratio. A case study is presented to illustrate the proposed methods.
1. INTRODUCTION
Figure 1 Dynamic system of parameter design.
3. UNKNOWN VARIANCE FUNCTION AND DETECTION
Let y_{ijk} denote the response corresponding to the ith setting of the control factors, jth level of the signal factor, and kth noise factor or repetition, for i = 1, ..., l; j = 1, ..., m; and k = 1, ..., n. Then the data structure of the response in the dynamic system is assumed to be expressed as
y_{ijk} = β_i M_j + ε_{ijk},  Var(ε_{ijk}) = σ_i² V(β_i M_j)  (1)
where M_j is the jth signal level, β_i the sensitivity at control setting i, and V(·) an unknown variance function.
3.1. Detecting the Relationship Between Mean and Variance by Using a Multiple Mean-Variance Plot
To detect the relationship V(·) between the variance and the mean of the response, a multiple mean-variance plot (MMVP) is suggested. Nair and Pregibon (1986) proposed the mean-variance plot, and Lunani et al. (1995) proposed the sensitivity-standard deviation (SS) plot for dynamic characteristic problems. Lunani et al. considered the model where the variance structure satisfies the relationship
log(s_i) = log(σ_i) + (θ/2) log(β̂_i)  (2)
where s_i and β̂_i are obtained from the regression fitting for each control factor setting i. Lunani et al. plotted [log(β̂_i), log(s_i)] for each control factor and visually examined the plots to check the nature of the relationship. They noticed that when some control factors have dispersion effects, the intercepts log(σ_i) can vary from one control factor setting to another, making it possible to have several parallel lines with a common slope θ/2 in the SS plot under model (2).
The MMVP is proposed for model (1). It is the combination of the mean-variance plot and the multiple SS plot. Under model (1), there is a logarithmic relationship between ȳ_ij and s²_ij, where s²_ij = Σ_k (y_{ijk} − ȳ_ij)²/(n − 1).
Engel (1992) noticed that the parameter log(σ) is a nonconstant term in the Logothetis model and replaced it by the term log(σ_i), which is a linear function of the control factors.
When the logarithm is taken on the variance term in model (1), the following equation is obtained:
log Var(y_{ijk}) = log(σ_i²) + log V(β_i M_j),  with log(σ_i²) = x_i γ
Here x_i is the row vector of the control factors, and γ is a parameter vector. When this model is applied in practice, the fitting model (7) is used in the form of an ANCOVA: the sample variance s²_ij on the log scale is the dependent variable, the control factors are given in the vector x_i, and the sample mean ȳ_ij or a function h(ȳ_ij) of it is the covariate, where h(·) corresponds to V(·) on the log scale. The WLS method is applied for each control factor setting i to adjust the sensitivity of the response to the signal factor M.
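In practice the ANCOVA fit is a one-line regression once ȳ_ij and s²_ij have been tabulated. The sketch below uses statsmodels on synthetic data purely for illustration (the data frame layout, factor names, and random numbers are all assumptions, not the paper's data):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "A": list("01") * 9,                   # two-level control factor, as strings
    "B": list("012") * 6,                  # three-level control factor
    "ybar": rng.lognormal(3.0, 0.5, 18),   # synthetic cell means
    "s2": rng.lognormal(1.0, 0.7, 18),     # synthetic cell variances
})
# log sample variance regressed on the control factors, with log mean as covariate
fit = smf.ols("np.log(s2) ~ C(A) + C(B) + np.log(ybar)", data=df).fit()
print(fit.summary())  # significant factor terms point to dispersion effects
```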
4. AN EXAMPLE: CHEMICAL CLEANING EXPERIMENT
In this section, the data set from the chemical cleaning process for Kovar metal components (American Supplier Institute, 1991) is reanalyzed to show how the ANCOVA method and the multiple mean-variance plot proposed in Section 3 can be used to find the control factors with dispersion effects and the functional relationship between the mean and the variance.
The response y is the amount of the material removed as a result of the chemical cleaning process. The inner array is L18, including a two-level factor A and three-level factors B, C, D, E, F, and G. The outer array consists of a three-level signal factor M crossed with L4 for a compound array of three two-level noise factors X, Y, Z. The signal factor M is the acid exposure time, which is known to have a linear impact on the expected value of the response. By imposing the linearity of the signal factor, the process becomes predictable and more controllable from the engineering knowledge. The information about the experimental factors and the raw data are given in Tables 1 and 2.
Table 2 Design matrix (L18 inner array, columns A-G) and raw data for the chemical cleaning experiment: each row gives the 12 responses at the combinations of the three signal levels M1, M2, M3 with the four noise-factor (X, Y, Z) settings of the outer array.
Figure 2 Plots of [log(ȳi), log(si)] and [ȳi, log(s²i)] for the chemical cleaning experiment.
either of those may be selected. In the other frames of Figure 3, the points are not divided into separate lines according to the levels of factors D, E, F, and G.

The results from the analysis of the dynamic S/N ratio are presented in Table 4. These results show that A, B, C, and D are the important factors with respect to the S/N ratio. The best levels selected are A0, B0, C, and D1, which is similar to the levels selected in the results from the ANCOVA method except for factor D. But in the ANCOVA method, factors C and D are not very significant (their p values are 0.053 and 0.057, respectively), and other levels of these factors may be selected.
Figure 3 Multiple plots of [log(ȳi), log(s²i)] for the chemical cleaning experiment, one frame per control factor.
Table 3 ANOVA Table for the ANCOVA Method with Covariate log(ȳi)

Source         DF   Adjusted SS     F      p Value
Covariate       1     4.81013    100.12    0.000
A               1     0.32313      5.66    0.022
B               2     0.91742      7.01    0.002
C               2     0.35064      3.15    0.053
D               2     0.30866      3.06    0.057
E               2     0.09540
F               2     0.14475
G               2     0.05400
A × B           2     0.18098
(e)            37     2.59806
Pooled error   45     3.09112
T              53    17.92377
REFERENCES
American Supplier Institute. (1991). Taguchi Symposium: Case Studies and Tutorials. Dearborn, MI: ASI.
Box GEP. (1988). Signal-to-noise ratios, performance criteria, and transformations (with discussion). Technometrics 30: 1-40.
Davidian M, Carroll RJ. (1987). Variance function estimation. J Am Stat Assoc 82: 1079-1091.
Engel J. (1992). Modelling variation in industrial experiments. Appl Stat 41(3): 579-593.
John A. Nelder
Imperial College, London, England
1. INTRODUCTION
established statistical approach as response function modeling (RFM). They, of course, recommend RFM. However, what they actually do seems to be closer to the PMM approach. The major difference is that they consider statistical models for responses before choosing PMs. Because of the initial data reduction to PMs, their primary tool for analysis is restricted to graphical tools such as the normal probability plot. Interpretation of such plots can be subjective. Because information on the adequacy of the model is in the residuals, analysis using PMs makes testing for lack of fit difficult or impossible.
In 1991, we (Nelder and Lee, 1991) published a paper giving a general method that allows analysis of data from Taguchi experiments in a statistically natural way, exploiting the merits of standard statistical methods. In this chapter, we provide a detailed exposition of our method and indicate how to extend the analysis to Taguchi experimental data for dynamic systems.
2. THE MODEL

Taguchi robust parameter design aims to find the optimal setting of control (i.e., controllable) factors that minimizes the deviation from the target value caused by uncontrollable noise factors. Robustness means that the resulting products are then less sensitive to the noise factors. Suppose a response variable y_i can be modeled by a GLM with E(y_i) = μ_i and var(y_i) = φ_i V(μ_i), where the φ_i are dispersion parameters and V(·) is the variance function. The variance of y_i is thus the product of two components: V(μ_i) expresses the intrinsic variability due to the functional dependence of the variance on the mean μ_i, while φ_i expresses the extrinsic variability, which is independent of the range of means involved. Suppose we have control factors C_1, ..., C_p and noise factors N_1, ..., N_q. In our 1991 paper we considered the following joint models for the mean and the dispersion

where g(·) is the link function for the mean, and the f_i(C_1, ..., C_p, N_1, ..., N_q) are linear models for experimental designs, e.g., the main-effects model is C_1 + ... + C_p + N_1 + ... + N_q. The log link is assumed for the dispersion as a default; there are often insufficient data to discriminate between different
link functions. We need to choose for each model a variance function, a link function, and terms in the linear predictor. By choosing an appropriate variance function for the mean, we aim to eliminate unnecessary complications in the model due to functional dependence between the mean and variance [the separation of Box (1988)]. It is useful if the final mean and dispersion models have as few common factors as possible. The link function for the mean should give the simplest additive model [the parsimony of Box (1988)].

Control factors occurring in the dispersion model only, or in both the mean and the dispersion models, are used to minimize the extrinsic variance, and control factors occurring in the mean model only are then used to adjust the mean to a target without affecting the extrinsic variability.
If we analyze PMs such as SNRs, calculated over the noise factors for each combination of the control factors, it is then impossible to make inferences about the noise factors in the model for the mean. This reduction of data leads to the number of responses for the dispersion analysis being only a fraction of those available for the mean. We do not have such problems, since we analyze the entire set of data; see Lee and Nelder (1998).
3. THE ALGORITHM

where d_i = 2 ∫_{μ_i}^{y_i} (y_i − u)/V(u) du denotes the GLM deviance component. For given φ_i, the EQL is, apart from a constant, the quasi-likelihood (QL) of Wedderburn (1974) for a GLM with variance function V(μ_i). Thus maximizing Q+ with respect to β will give us the QL estimators with prior weights 1/φ_i, satisfying
In summary, our model consists of two interlinked GLMs, one for the mean and one for the dispersion, as follows. The two connections, one in each direction, are marked. The deviance component from the model for the mean becomes the response for the dispersion model, and the inverse of the fitted values for the dispersion model gives prior weights for the mean model. (See Table 1.) In consequence, we can use all the methods for GLMs for inferences from the joint models, including various model-checking procedures.
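To make the interlinked structure concrete, here is a minimal sketch of the alternating algorithm in Python using statsmodels. The gamma mean model with log link, the simulated data, and the single-factor dispersion model are illustrative assumptions, not the chapter's example.

```python
# A minimal sketch of the two interlinked GLMs (illustrative assumptions):
# alternate between a weighted GLM for the mean and a gamma GLM (log link)
# for the dispersion, with deviance components as the dispersion response.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 40
x1 = rng.choice([-1.0, 1.0], size=n)              # a control factor
z1 = rng.choice([-1.0, 1.0], size=n)              # a noise factor
X_mean = sm.add_constant(np.column_stack([x1, z1, x1 * z1]))
X_disp = sm.add_constant(x1)                      # dispersion: control only
y = rng.gamma(shape=5.0, scale=np.exp(1.0 + 0.4 * x1 + 0.2 * z1) / 5.0)

phi = np.ones(n)
for _ in range(8):
    # (i) mean GLM with prior weights 1/phi
    mean_fit = sm.GLM(y, X_mean,
                      family=sm.families.Gamma(link=sm.families.links.Log()),
                      var_weights=1.0 / phi).fit()
    mu = mean_fit.fittedvalues
    # (ii) gamma deviance components: the response for the dispersion GLM
    d = 2.0 * ((y - mu) / mu - np.log(y / mu))
    # (iii) dispersion GLM (gamma family, log link); E(d) estimates phi
    disp_fit = sm.GLM(d, X_disp,
                      family=sm.families.Gamma(link=sm.families.links.Log())).fit()
    phi = disp_fit.fittedvalues               # feeds back as prior weights

print("mean model:", mean_fit.params)
print("dispersion model:", disp_fit.params)
```

The loop makes the two connections explicit: the deviance components flow from the mean fit to the dispersion fit, and the inverse fitted dispersions flow back as prior weights.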
4. DYNAMIC SYSTEMS

Recently, there has been an emphasis on making the system robust over a range of input conditions, so the relationship between the input (signal factor) and output (response) is of interest. Following Lunani et al. (1997), we refer to this as a dynamic system. Miller and Wu (1996) and Lunani et al. (1997) have studied Taguchi's method for dynamic systems. Suppose we have a continuous signal factor M, measured at m values. These researchers consider models analogous to the mean and dispersion models

and

where g(·) is the link function for the mean and l(·) is the function describing the relationship between the input (signal factor) and output (response).
Table 1 The Two Interlinked GLMs

                      Mean model    Dispersion model
Response              y             d
Mean                  μ             φ
Variance              φV(μ)         2φ²
Deviance component    d             gamma deviance
Prior weight          1/φ̂           1
5. ADVANTAGES OF THE GLM APPROACH
6. CONCLUSION
REFERENCES
1. INTRODUCTION
the usual omnibus F test for interaction is not very useful, and row-wise and/or columnwise multiple comparison procedures have been proposed (Hirotsu, 1973, 1983a, 1991a). Those procedures are also useful for modeling and analyzing contingency tables and multinomial distributions not restricted narrowly to the analysis of variance (Hirotsu, 1983b, 1993).

Another interesting problem is detecting a two-way changepoint for the departure from a simple additive or multiplicative model when there are intrinsic natural orderings among the levels of the two-way factors. Detecting a change in a sequence of events is an old problem in statistical process control, and there is a large body of literature dealing with it. These works, however, are mostly for univariate series of independent random variables such as normal, gamma, Poisson, or binomial [e.g., see Hawkins (1977), Worsley (1986), and Siegmund (1986)]. Therefore in this chapter I discuss an approach to detecting a two-way changepoint.
with (αβ)_i· = 0, (αβ)_·j = 0, and (αβ)_ij = (αβ)_i′j′ if i, i′ ∈ G_a and j, j′ ∈ J_b, where G_a, a = 1, ..., A, and J_b, b = 1, ..., B, denote the homogeneous subgroups of rows and columns, respectively. We use the usual dot-bar notation throughout the paper. Model (1) may be called the block interaction model with df (A − 1)(B − 1) for interaction. The row-wise and/or columnwise multiple comparisons seem particularly useful for dealing with indicative or variational factors; see Hirotsu (1991a, 1991b, 1992) for details.
3. THE GENERALIZED INTERACTION
                          Rank
Dose       1  2  3  4  5  6  7  8  9  10
 25        1  1  1  1  0  0  0  0  0   1
200        0  0  0  0  1  1  1  1  1   0

AF 3 mg    7   4  33  21  10   1     Total 76
AF 6 mg    6   5  21  16  23   6     Total 77
Table 4 Outcome of Bernoulli Trials with Probability Change at the 11th Trial (the run-by-run outcomes are not reproduced here)
p_21/p_11 ≤ p_22/p_12 ≤ ⋯ ≤ p_2k/p_1k     (3)

taking into account the natural ordering in columns. In (3) we assume that at least one inequality is strict. It then includes as its important special case a changepoint model,
4. A SAMPLE PROBLEM
Given half-life data (1.21, 1.63, 1.37, 1.50, 1.81) at a dose level of 50 mg/(kg · day) in addition to Table 1, we obtain Table 6. We also have placebo data in the dose-response experiment, with which we obtain Table 7.

Next suppose that the products from an industrial process are classified into three classes (1st, 2nd, 3rd) and their probabilities of occurrence are changed from (1/3, 1/3, 1/3) to (2/3, 1/6, 1/6) at the 11th trial. An example of the outcome is shown in Table 8. This is regarded as an independent sequence of trinomials.
It should be noted that in all three examples the row-wise and/or columnwise multiple comparisons are essential. Noting the existence of the natural orderings in both rows and columns, we are particularly interested in testing the null hypothesis
[Table 8: the trinomial outcome sequence. Rows 1st, 2nd, 3rd record, for each of 20 trials, a 1 in exactly one class; the Total row is 1 for every trial. The individual 0/1 entries are not reliably recoverable.]
with at least one inequality strict. Again the alternative hypothesis includes as its special case a two-way changepoint model such that the inequality holds only when i ≤ I, i′ ≥ I + 1 and j ≤ J, j′ ≥ J + 1, where (I, J) is the unknown changepoint. This is a natural extension of the one-way changepoint model (4).
The analyses of interaction in the analysis of variance model and in the log-linear model are parallel to some extent, at least for two-way tables [see Hirotsu (1983a, 1983b)], and here we give only the procedure for the latter for brevity.
5.1. Comparing Treatments

The most popular procedure for comparing treatments is Wilcoxon's rank sum test. In that procedure the jth category is given the score of the midrank,

w_j = y_·1 + ⋯ + y_·(j−1) + (y_·j + 1)/2

and the rank sum of each treatment is defined by

W_i = Σ_j w_j y_ij,   i = 1, 2
where y_ij is the observed frequency in the (i, j)th cell. The standardized difference of the rank sums then gives the standardized Wilcoxon statistic. A second statistic is the cumulative chi-square,

χ*² = χ_1² + ⋯ + χ_{k−1}²

where χ_j² is the goodness-of-fit chi-square of the 2 × 2 table obtained by pooling the ordered categories into (1, ..., j) and (j + 1, ..., k). Its null distribution is well approximated by d χ²_ν, with constants d and

ν = (k − 1)/d
When the y_·j are all equal, as in Table 4, χ*² is well characterized by an expansion in which χ²_(1), χ²_(2), ... are the linear, quadratic, etc., chi-square components, each with one degree of freedom (df), and are asymptotically mutually independent; see Hirotsu (1986) for details. More specifically, χ²_(1) is just the square of the standardized Wilcoxon statistic. Thus the statistic is used to test mainly, but not exclusively, the linear trend in p_2j/p_1j with respect to j. For the data of Table 4, χ*² = 30.579, and the constants are obtained as d = 6.102 and ν = 3.114. The approximate two-sided p-value is then obtained as 0.183.
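A short sketch of the midrank computation follows. It is not from the chapter; the counts are patterned after the AF dose data recovered above and serve only as an example.

```python
# A minimal sketch of midrank scores and rank sums for a 2 x k table of
# ordered categories (illustrative counts).
import numpy as np

y = np.array([[7, 4, 33, 21, 10, 1],
              [6, 5, 21, 16, 23, 6]], dtype=float)   # y[i, j]

col = y.sum(axis=0)                        # category totals y_.j
cum = np.concatenate(([0.0], np.cumsum(col)[:-1]))
w = cum + (col + 1.0) / 2.0                # midrank scores w_j
W = y @ w                                  # rank sums W_1, W_2
print("midranks:", w)
print("rank sums:", W)
```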
5.2. Changepoint Analysis

The maximal component of the cumulative chi-square statistic,

χ²_max = max_j χ_j²

is known as the likelihood ratio test statistic for changepoint analysis and has been widely applied to the analysis of multinomials with ordered categorical responses, since it is a very easy statistic to interpret. Some exact and efficient algorithms have been obtained for calculating its p-value, based on the Markov property of the sequence of the chi-square components χ_1², ..., χ²_{k−1} [see Worsley (1986) and Hirotsu et al. (1992)]. Applying those algorithms to Table 4, we obtain the two-sided p-value 0.135 for χ²_max = 5.488, which gives moderate evidence for the change in the probability of occurrence.
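The components χ_j² and their maximum can be computed directly; the following sketch (illustrative data, not the chapter's exact-p-value algorithm) pools the ordered categories at each cut point j.

```python
# A minimal sketch of the changepoint components chi2_j: the 2 x 2
# goodness-of-fit chi-square after pooling ordered categories 1..j
# against j+1..k, their sum (the cumulative chi-square) and their
# maximum (the changepoint statistic).
import numpy as np

def chi2_2x2(t):
    """Pearson goodness-of-fit chi-square for a 2 x 2 table t."""
    row, col, n = t.sum(axis=1), t.sum(axis=0), t.sum()
    e = np.outer(row, col) / n          # expected counts under independence
    return ((t - e) ** 2 / e).sum()

def cut_components(y):
    """chi2_j for each cut point j = 1, ..., k-1 of a 2 x k table y."""
    k = y.shape[1]
    return np.array([
        chi2_2x2(np.column_stack([y[:, :j].sum(axis=1), y[:, j:].sum(axis=1)]))
        for j in range(1, k)
    ])

y = np.array([[7, 4, 33, 21, 10, 1],
              [6, 5, 21, 16, 23, 6]], dtype=float)
comps = cut_components(y)
print("components:", np.round(comps, 3))
print("cumulative chi-square:", comps.sum().round(3), " max:", comps.max().round(3))
```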
In comparing the three statistics introduced above for testing the ordered alternatives (3), the Wilcoxon statistic tests exclusively a linear trend, max χ² is appropriate for testing the changepoint model (4), and χ*² keeps high power over a wide range of the ordered alternatives. As an example of comparing two multinomials with ordered categorical responses, the three methods are applied to the data of Table 3, and the results are summarized in Table 10. For reference, the usual goodness-of-fit chi-square value is shown at the bottom of the table; it does not take the natural ordering into account and as a consequence is not as efficient as the other three methods for these data.
so that y_·· is the grand total of observations. The (i, j)th component is the goodness-of-fit chi-square value for the 2 × 2 table obtained in the same way as Table 9 by partitioning and pooling the original a × k data at the ith row and the jth column.

The statistic χ**² is again well approximated by d χ²_ν, with
d = d_1 × d_2 = 1.5125 × 1.2431 = 1.8802,   ν = (3 − 1)(6 − 1)/1.8802 = 5.319

Then the p-value of χ**² = 0.00773 + ⋯ + 1.41212 = 31.36087 is evaluated as 0.0065 by the distribution 1.8802 χ²_{5.319}. This is highly significant, suggesting the dose dependence of responses.
W(1, ..., i; i + 1, ..., k). The statistic can also be based on the cumulative chi-square statistic, which we denote by χ*²(1, ..., i; i + 1, ..., k). These are calculated as two-sample test statistics between the two subgroups of rows (1, ..., i) and (i + 1, ..., k). The formula to obtain the asymptotic p-value of max W(1, ..., i; i + 1, ..., k) is given in Hirotsu et al. (1992), and the one for max χ*²(1, ..., i; i + 1, ..., k) in Hirotsu and Makita (1992), where the maximum is taken over i = 1, ..., k − 1. The multiple comparison approaches applied to the data of Table 7 are summarized in Table 12.
7. SOME EXTENSIONS
7.1. General Isotonic Inference
A monotonicity hypothesis in a dose-response relationship can be naturally extended to the convexity hypothesis (Hirotsu, 1986) and the downturn hypothesis (Simpson and Margolin, 1986), which are stated in the one-way analysis of variance setting as

H_c: μ_2 − μ_1 ≤ μ_3 − μ_2 ≤ ⋯ ≤ μ_a − μ_{a−1}

and

H_d: μ_1 ≤ ⋯ ≤ μ_{τ+1} ≥ μ_{τ+2} ≥ ⋯ ≥ μ_a,   τ = 1, ..., a − 1
respectively. In Hirotsu (1986) a statistic is introduced for testing those hypotheses, and an application of its maximal component is also discussed in Hirotsu and Marumo (1995). These ideas can be extended to two-way tables, and a row-wise multiple comparisons procedure was introduced in Hirotsu et al. (1996) for classifying subjects based on the 24 h profiles of their blood pressure, which returns to approximately its starting level after 24 h and for which the cumulative chi-square and linear trend statistics are obviously inappropriate. For a more general discussion of isotonic inference, one should refer to Hirotsu (1998).
7.2. Higher Way Layout

The ideas of the present chapter can be naturally extended to higher way layouts. As one example, a three-way contingency table with age at four levels, existence of metastasis into a lymph node at two levels, and grade at three levels is analyzed in Hirotsu (1992). An example of highly fractional factorial experiments with ordered categorical responses is given in Hamada and Wu (1990); see also the discussion following that article.
8. CONCLUSION
The analysis of interaction seems to have received much less attention than it deserves. First, the character of the two-way factors should be taken into account in making statistical inferences to answer actual problems most appropriately. Row-wise and/or columnwise multiple comparisons are particularly useful when one of the factors is indicative or variational. Second, analysis of the generalized interaction is required even in the one-way analysis of variance framework if the responses are ordered categorical, which includes rank data as an important special case. Then testing the ordered alternatives for interaction is of particular interest, and the cumulative chi-square statistic and its maximal component are introduced in addition to the well-known rank sum statistic. Based on these statistics, a method of multiple comparisons of ordered treatments is introduced as well as an overall homogeneity test. Third, an independent sequence of multinomials can be dealt with similarly to multinomial data with ordered categories. For example, a sequence of Bernoulli trials can be dealt with as two multinomials with cell frequencies all zero or unity. In this context we are interested in changepoint analysis, for which the maximal component of the cumulative chi-square statistic is useful. When there are natural orderings in both rows and columns, the maximal component of the doubly cumulative chi-square statistic is introduced for detecting a two-way changepoint. Finally, those row-wise and/or columnwise multiple comparisons are useful not only for comparing treatments but also for defining the block interaction model.
REFERENCES
Bradley RA, Katti SK, Coon TJ. (1962). Optimal scaling for ordered categories. Psychometrika 27: 355-374.
Hamada M, Wu CFJ. (1990). A critical look at accumulation analysis and related methods (with discussion). Technometrics 32: 119-130.
Hawkins DM. (1977). Testing a sequence of observations for a shift in location. J Am Stat Assoc 72: 180-186.
Hirotsu C. (1973). Multiple comparisons in a two-way layout. Rep Stat Appl Res JUSE: 1-10.
Hirotsu C. (1982). Use of cumulative efficient scores for testing ordered alternatives in discrete models. Biometrika 69: 567-577.
Hirotsu C. (1983a). An approach to defining the pattern of interaction effects in a two-way layout. Ann Inst Stat Math A 35: 77-90.
Hirotsu C. (1983b). Defining the pattern of association in two-way contingency tables. Biometrika 70: 579-589.
Hirotsu C. (1986). Cumulative chi-squared statistic as a tool for testing goodness of fit. Biometrika 73: 165-173.
Hirotsu C. (1990). Discussion on Hamada and Wu's paper. Technometrics 32: 133-136.
Hirotsu C. (1991a). Statistical methods for quality control: Beyond the analysis of variance. Proc 2nd IIASA Workshop, St. Kirik, pp 213-227.
Hirotsu C. (1991b). An approach to comparing treatments based on repeated measures. Biometrika 75: 583-594.
Hirotsu C. (1992). Analysis of Experimental Data, Beyond Analysis of Variance (in Japanese). Tokyo: Kyoritsu-Shuppan.
Elsie S. Valeroso
Montana State University, Bozeman, Montana
1. INTRODUCTION
2. METHODS OF MULTIRESPONSE OPTIMIZATION
2.2. Analytical Techniques

Analytical techniques apply mainly to linear multiresponse models. Let r denote the number of response variables, and let x = (x_1, x_2, ..., x_k)′ be a vector of k related control variables. The model for the ith response is of the form given in (1); the n corresponding design settings of x are denoted by x_1, x_2, ..., x_n. From (1) we have

y_ui = f_i′(x_u)β_i + ε_ui,   i = 1, 2, ..., r;  u = 1, 2, ..., n     (2)
β̂ = [X′(Σ⁻¹ ⊗ I_n)X]⁻¹ X′(Σ⁻¹ ⊗ I_n)y

In general, β̂ depends on the variance-covariance matrix Σ, which is unknown and must therefore be estimated. Zellner (1962) proposed the estimate Σ̂ = (σ̂_ij), where

Srivastava and Giles (1987, p. 16) showed that Σ̂ is singular if r > n. They demonstrated that r ≤ n is a necessary, but not sufficient, condition for the nonsingularity of Σ̂. Using Σ̂ in place of Σ in Eq. (5) produces the following estimate of β:
In this case, the BLUE of β_i coincides with its ordinary least squares (OLS) estimate, which does not depend on Σ, that is,

This special case occurs when the response models in (1) are of the same degree and form and are fitted using the same design.

From Eqs. (1) and (7), the ith predicted response, ŷ_i(x), at a point x in a region R is given by

where β̂_i is the portion of β̂ in Eq. (7) that corresponds to β_i.

Now by a multiresponse optimization of the responses we mean finding an x in R at which ŷ_i(x), i = 1, 2, ..., r, attain certain optimal values. The term "optimal" is defined according to some criterion. In the next two sections, two optimality criteria are defined and discussed.
where f(x) is the common form of the f_i(x), i = 1, 2, ..., r, and σ_ij is the (i, j)th element of Σ, the variance-covariance matrix of the responses. Hence, if ŷ(x) = [ŷ_1(x), ŷ_2(x), ..., ŷ_r(x)]′ is the vector of predicted responses, then its variance-covariance matrix is given by

where σ̂_ii is the ith diagonal element of Σ̂_0 (i = 1, 2, ..., r). The metric ρ_1 is appropriate whenever the responses are statistically independent. A second metric measures the total relative deviation of ŷ(x) from φ. It can be used when Σ̂_0 is ill-conditioned.
where the elements of Σ̂ are given in (6). The metric ρ defined in (16) is now replaced by
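The following sketch illustrates the distance-minimization idea behind the GDA with two hypothetical prediction functions and unit weights; the exact metric and weights of Eqs. (16)-(18), which depend on Σ̂_0, are not reproduced here.

```python
# A minimal sketch of the generalized distance approach: minimize a
# weighted distance between the predicted responses and the vector of
# individual optima over the region R.  Prediction functions, weights,
# and individual optima are all hypothetical.
import numpy as np
from scipy.optimize import minimize

def y1(x):  # hypothetical fitted model for response 1
    return 0.44 + 0.24 * x[0] + 0.11 * x[1] - 0.08 * x[0] * x[1]

def y2(x):  # hypothetical fitted model for response 2
    return 0.27 - 0.25 * x[0] + 0.12 * x[0] ** 2

phi = np.array([0.88, 0.13])     # individual optima (hypothetical)
w = np.array([1.0, 1.0])         # weights; the text's metric would use Sigma_0

def rho(x):
    dev = np.array([y1(x), y2(x)]) - phi
    return float(np.sqrt(np.sum(w * dev ** 2)))

cons = {"type": "ineq", "fun": lambda x: 2.0 - np.dot(x, x)}   # region R
res = minimize(rho, x0=np.zeros(2), constraints=cons)
print("x* =", res.x, " rho =", res.fun)
```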
2.3. Other Optimization Procedures

There are other optimization procedures that involve more than one response. Some of these procedures, however, are not truly multivariate in nature, since they do not seek simultaneous optima in the same fashion as in Section 2.2.

Lin and Tu (1995) proposed that an estimated mean squared error (MSE) be minimized. This MSE is the sum of the estimated process variance and the square of the difference between the estimated process mean and some target value. Copeland and Nelson (1996) proposed using direct function minimization based on Nelder and Mead's (1965) simplex method. Lin and Tu (1995, p. 39) made an interesting comment by stating that the use of the DRA for solving the mean-variance problem can work well only when the mean and variance are independent.
The values of y_i^min and y_i^max can be chosen as the individual optima of ŷ_i(x) over a region R. We note that the definition of this function is similar to that of the desirability function. Simultaneous optimization of the responses is then achieved by optimizing this function over R.
of the desirability function. Simultaneous optimization of the responses is
422 KhuriandValeroso
3. EXAMPLES
Inthissection, we illustratetheapplicationoftheextendedgeneralized
distance approach (GDA) and the desirabilityfunction approach (DFA)
ofSection 2.2 andthedual response approach(DRA) usingthe G R G
algorithm of Section 2.3. We present two examples, one from the semicon-
ductor industry and the other from the food industry.
3.1. A Semiconductor Example

An experiment was conducted to determine the performance of a tool used to polish computer wafers. Three control variables were studied: x_1 = down force, x_2 = table speed, and x_3 = slurry concentration. The measured responses were removal rate of metal (RR), oxide removal rate (OXRATE), and within-wafer standard deviation (WIWSD). The objective of the experiment was to maximize y_1 = selectivity and minimize y_2 = nonuniformity, where

y_1 = RR/OXRATE   and   y_2 = WIWSD/RR

A Box-Behnken design with eight replications at the center and two replications at each noncentral point was used. Each treatment run required two wafers. The first wafer was used to measure RR and WIWSD. The second wafer was used to measure OXRATE. The design points and corresponding values of y_1 and y_2 are given in Table 1.
Before determining the optima associated with y_1 and y_2, we need to select models that provide good fits to these responses. Since the models are fitted using Zellner's (1962) seemingly unrelated regression (SUR) parameter estimation [see formula (7)], measures of the goodness of fit for SUR models should be utilized. These include Sparks' (1987) PRESS statistic and McElroy's (1977) R² statistic. The latter is interpreted the same way as the univariate R² in that it represents the proportion of the total variation explained by the SUR multiresponse model. These measures provide the user with multivariate variable selection techniques, which, in general, require screening a large number of subset models. To reduce the number of models considered, Sparks (1987) recommends using the univariate R², adjusted R², and Mallows' C_p statistics to identify "good" subset models. For each combination of such models, Sparks' PRESS and McElroy's R² statistics are computed. The "best" multiresponse model is the one with the smallest PRESS statistic value and a value of McElroy's R² close to 1. On this basis, the following models were selected for y_1 and y_2:
The SUR parameter estimates, their estimated standard errors, the values of the univariate R², adjusted R², and C_p statistics, and the values of McElroy's R² and Sparks' PRESS statistics are given in Table 2. Note that the SUR parameter estimates were obtained using PROC SYSLIN in SAS (1990a), and the univariate R², adjusted R², and C_p statistics were computed using PROC REG in SAS (1989). From Table 2 it can be seen that models (19) and (20) provide good fits to the two responses.

On the basis of models (19) and (20), the individual optima of ŷ_e1(x) and ŷ_e2(x) over the region R = {(x_1, x_2, x_3) | Σ_{i=1}^{3} x_i² ≤ 2} are given in Table 3. These values were computed using a Fortran program written by Conlon (1992), which is based on Price's (1977) optimization procedure. The simultaneous optima of ŷ_e1(x) and ŷ_e2(x) over R were determined by using the extension of the GDA (see Section 2.2).
Table 2 SUR Parameter Estimates and Values of C_p, R², and Adjusted R² (Semiconductor Example)

                           Responses^a
Parameter           ŷ_e1                 ŷ_e2
                 0.4410 (0.0190)      0.2727 (0.0135)
                 0.2382 (0.0155)     −0.2546 (0.0135)
                 0.1109 (0.0155)      0.0014 (0.0135)
                −0.0131 (0.0155)     −0.0114 (0.0135)
                −0.0773 (0.0219)
                 0.1216 (0.0191)
                 0.0429 (0.0219)
                 0.0912 (0.0219)
C_p                  6.17                 4.39
R²                   0.91                 0.93
Adjusted R²          0.89                 0.91

^a The number in parentheses is the standard error.
Note: McElroy's R² = 0.9212; Sparks' PRESS statistic = 103.9.
The minimization of ρ_e in (18) was carried out using a program written in PROC IML of SAS (1990b). The results are shown in Table 3.

To apply the DFA, we use formulas (12) and (13) for d_1(x) and d_2(x), respectively, with the cutoff values 0.95 for ŷ_e1 and 0.20 and 1.0 for ŷ_e2. Note that the values 0.95 and 0.20 in d_1 and d_2, respectively, are of the same order of magnitude as the individual maxima and minima of ŷ_e1 and ŷ_e2, respectively. Note also that, on the basis of a recommendation by Del Castillo et al. (1996, p. 338), we have used the SUR predicted responses, ŷ_e1(x) and ŷ_e2(x), instead of ŷ_1(x) and ŷ_2(x). The latter two are the ones normally used in the DFA and are obtained by fitting the models individually [see formula (11)]. The overall desirability function d(x) = [d_1(x)d_2(x)]^{1/2} was maximized over R using the Fortran program written by Conlon (1992). Alternatively, Design-Expert (Stat-Ease, 1993) software can also be used to maximize d(x). The DFA results are given in Table 4.
The results for the DRA are given in Table 5. In applying this procedure to the present example, each of the two responses was considered as the primary response in turn.

Table 3 Individual and GDA Simultaneous Optima for the Semiconductor Example

Response      Optimum          Location
Individual optima
  ŷ_e1(x)     Max = 0.8776     (0.7888, 0.9031, −0.7479)
  ŷ_e2(x)     Min = 0.1302     (0.9443, −0.0468, 0.9689)
Simultaneous optima (GDA)
  ŷ_e1(x)     Max = 0.8641     (0.9976, 0.9127, −0.3961)
  ŷ_e2(x)     Min = 0.1463     (0.9976, 0.9127, −0.3961)

Note: Minimum value of ρ_e in Eq. (18) is 0.8610.
The optimum value of the primary response was then obtained over R using the constraint that the other response is equal to its individual optimum from Table 3. Values of the DRA optima in Table 5 were computed on the basis of the GRG algorithm using the "solver" tool, which is available in the Microsoft Excel (Microsoft, 1993) spreadsheet program. For more details on how to use this tool, see Dodge et al. (1995).
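In place of the Excel solver, any GRG-like constrained optimizer can be used. The sketch below states the DRA step as an equality-constrained problem; the prediction functions and the secondary-response target of 0.15 are hypothetical.

```python
# A sketch of the DRA step: maximize the primary response subject to the
# secondary response held at a target and to the region R (hypothetical
# prediction functions and target).
import numpy as np
from scipy.optimize import minimize

def y1(x):
    return 0.44 + 0.24 * x[0] + 0.11 * x[1] - 0.08 * x[0] * x[1]

def y2(x):
    return 0.27 - 0.25 * x[0] + 0.12 * x[0] ** 2

cons = [
    {"type": "ineq", "fun": lambda x: 2.0 - np.dot(x, x)},   # region R
    {"type": "eq",   "fun": lambda x: y2(x) - 0.15},         # secondary held fixed
]
res = minimize(lambda x: -y1(x), x0=np.array([0.7, 0.0]), constraints=cons)
print("x* =", res.x, " primary optimum =", -res.fun)
```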
The results of applying GDA, DFA, and DRA are summarized in Table 6. We note that the results are similar to one another. The maxima of ŷ_e1(x) under GDA and DFA are close, and both are higher than the maximum under DRA. Their overall desirability values are also higher.

Table 6 Comparison of GDA, DFA, and DRA Results for the Semiconductor Example

                            GDA                   DFA                    DRA
Optimal response value   (0.86, 0.15)          (0.88, 0.16)           (0.76, 0.15)
Optimal settings         (1.0, 0.91, −0.40)    (0.84, 0.80, −0.82)    See Table 5
Minimum metric (ρ_e)     0.8610                1.1768                 Not applicable
Overall desirability     0.9537                0.9609                 0.8967
3.2. A Food Industry Example

[Table 7 (design settings in the original and coded control variables, together with the measured responses) is not reproduced here.]
The estimated standard errors for the parameter estimates, the values of the univariate R², adjusted R², and C_p statistics, and the values of McElroy's R² and Sparks' PRESS statistics are given in Table 8. We can see that the fits of the three models are quite good.

The individual optima and the GDA simultaneous optima over the region R = {(x_1, x_2, x_3) | Σ_{i=1}^{3} x_i² ≤ 3} are given in Table 9.
The results of the DFA are presented in Table 10. Here, the desirability values were computed using the functions

d_1(x) = [ŷ_e1(x) − 1.3]/(2.5 − 1.3)   if 1.3 < ŷ_e1(x) < 2.5

(with d_1 = 0 at or below 1.3 and d_1 = 1 at or above 2.5), and

d_2(x) = [ŷ_e2(x) − 51]/(17 − 51)   if 17 < ŷ_e2(x) < 51

(with d_2 = 1 at or below 17 and d_2 = 0 at or above 51).
Table 8 SUR Parameter Estimates and Values of C_p, R², and Adjusted R² (Food Industry Example) [the entries are not reproduced here]
Table 9 Individual and GDA Simultaneous Optima for the Food Industry Example

Response      Optimum           Location
Individual optima
  ŷ_e1(x)     Max = 1.9263      (−0.4661, −0.3418, 1.6276)
  ŷ_e2(x)     Min = 18.8897     (−0.5347, 1.1871, 1.1415)
  ŷ_e3(x)     Min = 17.4398     (−0.2869, 0.2365, 0.4970)
Simultaneous optima (GDA)
  ŷ_e1(x)     Max = 1.9136      (−0.5379, 1.0435, 0.8622)
  ŷ_e2(x)     Min = 19.3361     (−0.5379, 1.0435, 0.8622)
  ŷ_e3(x)     Min = 17.9834     (−0.5379, 1.0435, 0.8622)

Note: Minimum value of ρ_e in Eq. (18) is 0.9517.
d_3(x) = [ŷ_e3(x) − 30]/(14 − 30)   if 14 < ŷ_e3(x) < 30

(with d_3 = 1 at or below 14 and d_3 = 0 at or above 30).

Table 10 DFA Optima for the Food Industry Example

Response      Optimum           Location
  ŷ_e1(x)     Max = 1.9127      (−0.4504, 0.6176, 0.8081)
  ŷ_e2(x)     Min = 19.8768     (−0.4504, 0.6176, 0.8081)
  ŷ_e3(x)     Min = 17.6386     (−0.4504, 0.6176, 0.8081)

Note: The maximum of d(x) over R is 0.7121.
Table 11 DRA Results for the Food Industry Example (optimum locations; the response labels and optimum values were not recoverable)

(−0.5617, 1.1228, 0.9415)
(−0.3514, 0.2824, 0.5605)
(−0.5077, 1.0716, 1.2625)
Table 12 Comparison of GDA, DFA, and DRA Results for the Food Industry Example

                             GDA                    DFA                    DRA
Optimal response values   (1.91, 19.34, 17.98)   (1.91, 19.88, 17.64)   (1.91, 20.55, 18.61)
Optimal settings          (−0.54, 1.04, 0.86)    (−0.45, 0.62, 0.81)    See Table 11
Minimum metric (ρ_e)      0.9517                 1.2832                 Not applicable
Overall desirability      0.7098                 0.7121                 0.6885
ACKNOWLEDGEMENT
REFERENCES
Biles WE. (1975). A response surface method for experimental optimization of multi-response processes. Ind Eng Chem Process Des Dev 14: 152-158.
Box GEP. (1954). The exploration and exploitation of response surfaces: Some general considerations and examples. Biometrics 10: 16-60.
Box GEP, Youle PV. (1955). The exploration and exploitation of response surfaces: An example of the link between the fitted surface and the basic mechanism of the system. Biometrics 11: 287-323.
Box GEP, Hunter WG, MacGregor JF, Erjavec J. (1973). Some problems associated with the analysis of multiresponse data. Technometrics 15: 33-51.
Chitra SP. (1990). Multi-response optimization for designed experiments. Am Stat Assoc Proc Stat Comput Sect, pp 107-112.
Conlon M. (1988). MR: Multiple response optimization. Tech Rep No. 322, Department of Statistics, University of Florida, Gainesville, FL.
Conlon M. (1992). The controlled random search procedure for function optimization. Commun Stat Simul Comput B21: 919-923.
Copeland KAF, Nelson PR. (1996). Dual response optimization via direct function minimization. J Qual Technol 28: 331-336.
Del Castillo E. (1996). Multiresponse process optimization via constrained confidence regions. J Qual Technol 28: 61-70.
Del Castillo E, Montgomery DC. (1993). A nonlinear programming solution to the dual response problem. J Qual Technol 25: 199-204.
Del Castillo E, Montgomery DC, McCarville DR. (1996). Modified desirability functions for multiple response optimization. J Qual Technol 28: 337-345.
Derringer GC. (1994). A balancing act: Optimizing a product's properties. Qual Prog 27: 51-58.
Derringer GC, Suich R. (1980). Simultaneous optimization of several response variables. J Qual Technol 12: 214-219.
Dodge M, Kinata C, Stinson C. (1995). Running Microsoft Excel for Windows 95. Washington, DC: Microsoft Press.
Draper NR. (1963). "Ridge analysis" of response surfaces. Technometrics 5: 469-479.
Fichtali J, Van de Voort FR, Khuri AI. (1990). Multiresponse optimization of acid casein production. J Food Process Eng 12: 247-258.
Floros JD. (1992). Optimization methods in food processing and engineering. In: Hui YH, ed. Encyclopedia of Food Science and Technology, Vol. 3. New York: Wiley, pp 1952-1965.
Floros JD, Chinnan MS. (1988a). Seven factor response surface optimization of a double-stage lye (NaOH) peeling process for pimiento peppers. J Food Sci 53: 631-638.
Floros JD, Chinnan MS. (1988b). Computer graphics-assisted optimization for product and process development. Food Technol 42: 72-78.
Guillou AA, Floros JD. (1993). Multiresponse optimization minimizes salt in natural cucumber fermentation and storage. J Food Sci 58: 1381-1389.
Harrington EC. (1965). The desirability function. Ind Qual Control 21: 494-498.
Hill WJ, Hunter WG. (1966). A review of response surface methodology: A literature survey. Technometrics 8: 571-590.
Khuri AI. (1996). Multiresponse surface methodology. In: Ghosh S, Rao CR, eds. Handbook of Statistics, Vol. 13. Amsterdam: Elsevier Science, pp 377-406.
Khuri AI, Conlon M. (1981). Simultaneous optimization of multiple responses represented by polynomial regression functions. Technometrics 23: 363-375.
Khuri AI, Cornell JA. (1996). Response Surfaces. 2nd ed. New York: Marcel Dekker.
Khuri AI, Myers RH. (1979). Modified ridge analysis. Technometrics 21: 467-473.
Kim KJ, Lin DKJ. (1998). Dual response surface optimization: A fuzzy modeling approach. J Qual Technol 30: 1-10.
Lin DKJ, Tu W. (1995). Dual response surface optimization. J Qual Technol 27: 34-39.
Lind EE, Goldin J, Hickman JB. (1960). Fitting yield and cost response surfaces. Chem Eng Prog 56: 62-68.
McElroy MB. (1977). Goodness of fit for seemingly unrelated regressions: Glahn's R² and Hooper's r̄². J Econometrics 6: 381-387.
Microsoft. (1993). Microsoft Excel User's Guide, Version 4.0. Redmond, WA: Microsoft Corporation.
Mouquet C, Dumas JC, Guilbert S. (1992). Texturization of sweetened mango pulp: Optimization using response surface methodology. J Food Sci 57: 1395-1400.
Myers RH, Carter WH. (1973). Response surface techniques for dual response systems. Technometrics 15: 301-317.
Myers RH, Khuri AI, Carter WH. (1989). Response surface methodology: 1966-1988. Technometrics 31: 137-157.
Myers RH, Khuri AI, Vining G. (1992). Response surface alternatives to the Taguchi robust parameter design approach. Am Stat 46: 131-139.
Nelder JA, Mead R. (1965). A simplex method for function minimization. Comput J 7: 308-313.
Price WL. (1977). A controlled random search procedure for global optimization. Comput J 20: 367-370.
SAS. (1989). SAS/STAT User's Guide, Vol. 2, Version 6, 4th ed. Cary, NC: SAS Institute, Inc.
SAS. (1990a). SAS/ETS, Version 6. Cary, NC: SAS Institute, Inc.
SAS. (1990b). SAS/IML Software, Version 6. Cary, NC: SAS Institute, Inc.
Sparks RS. (1987). Selecting estimators and variables in the seemingly unrelated regression model. Commun Stat Simul Comput B16: 99-127.
Srivastava VK, Giles DEA. (1987). Seemingly Unrelated Regression Equations Models. New York: Marcel Dekker.
Stat-Ease. (1993). Design-Expert User's Guide, Version 4.0. Minneapolis, MN: Stat-Ease, Inc.
Tseo CL, Deng JC, Cornell JA, Khuri AI, Schmidt RH. (1983). Effect of washing treatment on quality of minced mullet flesh. J Food Sci 48: 163-167.
Valeroso ES. (1996). Topics in multiresponse analysis and optimization. Unpublished PhD thesis, Department of Statistics, University of Florida, Gainesville, FL.
Vining GG, Myers RH. (1990). Combining Taguchi and response surface philosophies: A dual response approach. J Qual Technol 22: 38-45.
Zellner A. (1962). An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. J Am Stat Assoc 57: 348-368.
26
Stochastic Modeling for Quality
Improvement in Processes
M. F. Ramalhoto
Technical University of Lisbon, Lisbon, Portugal
1. INTRODUCTION
and that of product supply have to be looked for and considered equally important.

Quality improvement of the product supply is linked to stochastic maintenance, reliability, quality control, and experimental design techniques. Furthermore, an important problem is how to achieve a high-quality product supply without increasing cost. In many situations the study of interactions among maintenance, reliability, and control charts, through a total quality management (TQM) approach, might help to reach that goal. However, that is not the concern of this chapter, which deals only with the product service.

It has been recognized by several authors, including Deming (see, e.g., Ref. 1), that people who work in queuing systems are usually not aware that they too have a product to sell and that this product is the service they are providing. The product service is frequently invisible to the operators. They have difficulties in seeing the impact of their performance on the success or failure of the organization that employs them, on the security of their jobs, and on their wages. Perhaps it would make sense to propose a quality index (based on some of the quality dimensions to be defined next) for most of the relevant queuing systems of common citizens' everyday life (that would also help their operators to understand better the importance of their mission). Just imagine all the queuing systems relevant to our everyday life operating under the customer satisfaction criterion efficiently, adequately, and at controlled costs.
with the same equipment and for the same required service, because of the mood of the operator (if the operator is a human), the product service might be of poor quality today even if usually it is not. Queuing and waiting in general are at the same time personal and emotional. Qualitative and quantitative aspects of human behavior toward waiting have to be addressed. In most cases, if customers are pleasantly occupied while waiting (entertainment, socially relevant information, opportunity to make interesting contacts, job opportunities, extra information about the queuing system itself, etc.), their perception of the length of the waiting time and of whether it is "reasonable" may differ substantially. Unlike the product supply, which can usually be sampled and tested for quality, the product service cannot, at least not easily. The record of an inspection of the product service cannot be assumed to be a "true" reflection of its quality. For instance, during inspection the operator (if a human) might be quicker, more courteous, and more responsive to customers than if left alone. (However, if the operator feels pleasure in providing a high quality product service and is proud of contributing to the higher standards of the queuing system, he or she works well even without any kind of inspection.) Moreover, unlike the control of quality in the product supply [1], the quality of the product service depends both on the operator and on the customer. Also, product service can be classified as poor by some and good by others. Indeed, its qualification, good or faulty, need not be consistent.
On assessing the effectiveness of a product service, quantitative and qualitative factors have to be taken into consideration. It is also expected that different individuals will have different judgments and different opinions about many factual issues. Nevertheless, if the process continues long enough, the observers are expected to arrive independently at very similar interpretations. That obviously encourages the development of mechanisms of communication between the system's management and their customers. Moreover, product service is delivered at the moment it is produced. Any quantification or measurement taken is thus too late to avoid a failure or defect with that particular customer. However, that situation might be alleviated if a communication mechanism is already in operation (for instance, at the exit the customer could be asked, or given a short and clear questionnaire, to quantify the product service just received according to the quality dimensions to be defined in the next section and to state briefly what he or she would like to see improved in it; means of contacting the customer for mutually relevant communication in the future should also be recorded if the customer is interested). The success of the communication mechanism depends heavily on showing customers that they have been heard by the system managers and that their relevant opinions really make a difference.
3. QUALITY DIMENSIONS

3.2. Internal Quality Dimensions

We need "measures" that will help us to deliver what the customer expects or to improve the queuing system beyond customers' expectations at reasonable prices. For that, the quality dimensions timeliness, integrity, predictability, and "customer satisfaction," called here internal quality dimensions, are adopted. The quality dimension timeliness has been referred to by several authors as one of the most influential components in the quality of a product service, because the product service has to be produced on demand.
Timeliness is formed by the access time, which is the time taken to gain attention from the system; the queuing time, which is the time spent waiting for service (and which can be influenced by the length of the queue and/or its integrity); and the action time, which is the time taken to provide the required product service.

Integrity deals with the completeness of service and must set out what elements are to be included in order for the customer to regard the service as satisfactory. This quality dimension will set out precisely what features are essential to the product service.

Predictability refers to the consistency of the service and also the persistence or frequency of the demand. Standards for predictability identify the proper processes and procedures that need to be followed. They may include standards for the availability of people, materials, and equipment and schedules of operation.

Customer satisfaction is defined here as the way to provide the targets of success, which may be based on relative market position for the provision of a specific queuing system.
Figure 1 Manager tetrahedron (vertices: customers, managers, operators (production process), and the market/competitors).
Some product service failures or defects are very often linked to "unacceptable access time," "unacceptable queuing time," "unacceptable action time," and "unacceptable sojourn time in the system." All are clearly measured in queuing theory terms. Those failures or defects, as already mentioned, might ruin the ranking of most of the other quality dimensions. The way to prevent those failures or defects rests in the quality of the design of the process delivery of the queuing system. Often, if nothing is done to spread out the arrival pattern, to change the service rate, or to modify the service discipline, the queuing system experiences very uneven traffic flows and serious failures or defects occur in the product service. All of those possible failures or defects have costs. Very often the cost of delay is to lose customers.
Designing the queuing system specially to meet the peak demands is not always the best action to take, because it can be costly and the excess capacity can have negative psychological effects on the customer.

On the other hand, a poor rank in flexibility might lead to poor ranks in almost all the other quality dimensions. Most traditional queuing models are unable to respond quickly to changes in their environment. (The basic queuing parameter, namely the number of operators, is usually assumed to be unchanged no matter what is happening in the queuing system.) The result is unacceptable queue sizes and waiting times. Long queues are, with few exceptions (e.g., the restaurant with excellent food, product supply at a good price), always considered an indication of poor product service.
Ramalhoto and Syski [8] show how quality management concepts of satisfying the customer can be incorporated into the design of queuing models. They propose and study a queuing model that aims to provide managers with a way of dealing with some temporary peak situations, that is to say, to have a high ranking in the flexibility quality dimension. The model is essentially a G/G/c/FCFS (or a G/G/c/c+d/FCFS, i.e., first come first served queuing model with c operators and d waiting positions; d is omitted when equal to zero or infinite) queuing model under the following additional decision rule, called here rule 1.

Rule 1. If the queue size exceeds b (the action line), introduce another server (or k servers, k ≥ 1); when it falls below a (the prevention or "alarm" line), withdraw one server (or k servers, k ≥ 1), b > a.
For the M/M/c queuing model (i.e., first come first served queuing model with Poisson arrival process, exponential service time distribution, c operators, and infinite waiting positions) under rule 1, the equilibrium distribution of the state of the two-dimensional Markov process that characterizes the queuing model is derived. Some first-passage-time problems useful in the quality design of the queuing system are solved. Several extensions of these analytical results to more general settings, including nonhomogeneous Poisson arrivals, are discussed.
For the M/M/c queue under rule 1, where the arrival rate is denoted by λ and the service rate by μ, let ρ = λ/[(c + k)μ] and z = λ/(cμ), with ρ < z and z < 1, and let π_{i,n}, for i = 0, 1, 2, ...; n = c, c + k, denote the steady-state probability of having i customers in the queuing system and n operators serving. Ramalhoto and Syski prove, among other results, that [Ref. 8, p. 163, Eqs. (9) and (10)]
A measure of preference to use c + k operators for a short period of time [Ref. 8, p. 164, Eqs. (18) and (19)] is given by D(b, c+k), the entrance probability to the set of states (i, c) for i = 0, ..., a − 1 before entering the set of states (i, c + k) for i = b + 1, b + 2, ..., when starting from the boundary state (b, c + k). The value of D(b, c+k) gives an indication of the tendency toward c operators when starting with c + k operators, with p = c/(c + k).

Other rules could be considered as alternatives to rule 1; for instance:

Rule 2. When the queue size exceeds b (the action line), shorten the service time (for instance, by deferring some tasks to be worked out later, by dividing and scheduling when the service can be provided in multiple separate segments, or by reducing the quality of service).

Rule 3. Identify classes of service needed by customers (each class requiring a different service time and being of different "value"), and treat the customers in separate queues when the total queue length exceeds b (the action line).

Which rule is preferable? Section 5 addresses this question.
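As a sketch of how rule 1 behaves, the following simulation (all parameters hypothetical, not taken from Ref. 8) toggles between c and c + k servers at the action and alarm lines.

```python
# A minimal sketch of rule 1: simulate an M/M/c queue in continuous time,
# adding k servers when the queue exceeds b (action line) and removing
# them when it falls below a (alarm line).  All parameters hypothetical.
import random

def simulate_rule1(lam=1.5, mu=0.4, c=3, k=1, a=2, b=8, horizon=10_000.0):
    random.seed(42)
    t, n, servers = 0.0, 0, c          # time, customers in system, active servers
    area = 0.0                          # integral of n(t), for the mean occupancy
    while t < horizon:
        busy = min(n, servers)
        rate = lam + busy * mu          # total event rate of the Markov chain
        dt = random.expovariate(rate)
        area += n * dt
        t += dt
        if random.random() < lam / rate:
            n += 1                      # arrival
        else:
            n -= 1                      # service completion
        queue = max(n - servers, 0)
        if queue > b:
            servers = c + k             # action line crossed: add k servers
        elif queue < a:
            servers = c                 # alarm line: back to c servers
    return area / t

print("mean number in system:", round(simulate_rule1(), 2))
```

Varying a, b, and k in such a simulation gives a quick feel for the trade-off between customer delay and operator idleness that the analytical results in Ref. 8 treat exactly.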
Affinity Operators

There are several important examples of queuing systems in the service industry where it is "more efficient" to have a customer serviced by one particular operator than by any other. Thus the system schedules customers in the queue of their affinity operator. To address the inevitable imbalance in the number of customers assigned to each operator, there are several policies that can be considered. Any conventional queuing model under rule 3 might also be seen as a related model. Nelson and Squillante [9] consider a general threshold policy that allows overloaded operators to transfer some of their customers to underloaded operators. They vary four policy control parameters. Decomposition and matrix-geometric techniques yield closed-form solutions. They illustrate the potential sojourn time benefits even when the costs of violating affinities are large, and they experimentally determine optimal threshold values. One of the important applications of those models is in maintenance after sales, which has become a significant portion of manufacturing quality.
tackle the problem of obtaining approximations and bounds for the M_t/G/r/r+d queuing model. A lot of research work is still needed on this queuing model. Its great importance in the service industry has already been shown, for instance, in Ref. 15.
Figure 2 Mean value of the waiting time in the M/M/1/1 queuing model with constant retrial rate and λ = 1.5.

Figure 3 Variance of the waiting time in the M/M/1/1 queuing model with constant retrial rate and λ = 1.5.

Figure 4 Mean value of the waiting time in the M/M/1/1 + 1 queuing model with constant retrial rate and λ = 1.5.
Figure 6 Mean value of the waiting time in the M/M/2/2 queuing model with constant retrial rate and λ = 1.5.
variance of the waiting time, as functions of α and ν, for the M/M/1/1 (one server and no waiting position), M/M/1/1 + 1 (one server and one waiting position), and M/M/2/2 retrial queuing models with constant retrial rate α and for different ergodicity intensities. Results of this kind help to evaluate the range of arrival, retrial, and service rates that provide consistently high product service quality in an increasingly successful queuing system. (Also, for example, providing k extra servers, as in Section 4.1, when λ and/or α increase beyond a certain threshold might be an adequate short-term policy for maintaining the high quality of the product service in an increasingly successful queuing system.)
Figure 7 Variance of the waiting time in the M/M/2/2 queuing model with constant retrial rate and λ = 1.5.
Usually more than one queuing model is capable of responding to the need to improve or redesign a particular service delivery process. Each queuing model option might lead to different levels of reduction of delay and discomfort, impact on customer satisfaction, and costs. The aim, in most cases, is to find the "optimal" solution that balances the customer delay and discomfort against operator idleness at the same cost.

Ramalhoto [18] formulated a practical simulation decision framework that considers and evaluates alternative queuing model options and makes the necessary decisions by selecting those particular options that provide the best projected performance scores, in terms of specified scoring criteria, based on measures linked to the quality dimensions selected. The queuing model options are defined as "control parameters" in this framework. For instance, the queuing models corresponding to rules 1, 2, and 3, respectively, defined in Section 4.1, can be represented quantitatively by the following basic control parameters: X_1, the regular size of the service staff; X_2, the percentage by which the service times for each customer are to be reduced or expedited (as a function of queue length or any other relevant quantity); X_3, the amount by which the regular service staff is augmented by other personnel (such as secretarial or clerical staff) to meet periods of heavy demand; X_4, the number of different classes of service needed by customers; and X_5, the percentage of the regular service staff to allocate to each of those different classes of service. This framework is called total quality queue management.
Studies have shown that indicators often distort a program from the beginning by forcing a focus on the indicators rather than on the true underlying goals. The result is generally a lack of sustained success. And in many cases there is no success at all save in the artificial indicators, which can often be manipulated with little effect on the underlying process. Unfortunately, in several situations the harm caused by those artificial indicators is very painful. That is indeed a serious risk to be avoided. Therefore, an effective process of judging the costs and consequences of the choices necessarily incorporates a learning process. An important result of such learning is a shared vision with the managers, operators (many operators know a lot about their jobs and about the queuing system they are working with and also have the capability of taking direct action), and customers about how the process works and how it should work in order to confront the challenges it faces.

In this chapter, product service is treated as "manufacturing in the field." It is advocated that it should be carefully planned, audited for quality control, and regularly reviewed for performance improvement and customer reaction. The methodology presented is an attempt to construct a learning queuing system that is able to assess (internally and externally) its own actions and to judge and adjust the process through which it acts. It relies on teamwork among customers, operators, and managers to unify some goals, on a scientific approach, and on decision making based on reliable data. In fact, it is based on analysis, simulation, data, policies, and options. The idea is also to question policies whenever appropriate. Adequate data have to be collected and studied statistically, and options have to be analyzed, including the option to change policies.

In real life, changes are very often costly in terms of money, time, psychological tensions, and so on, for many reasons (e.g., the new changes ...
ACKNOWLEDGEMENT
The author thanks Professor Gomez-Corral for the elaboration of Figures 2 to 7.

The author has also benefited from discussions with colleagues from Princeton University, Maryland University, and Rutgers University during her half-year sabbatical in 1998 at Princeton University.

This work was carried out with support from the "Fundação para a Ciência e a Tecnologia" under contract number 134-94 of the Marine Technology and Engineering Research Unit-Research Group on Queueing Systems and Quality Management, and the project INTAS 96-0828, 1997-2000, on "Advances in Retrial Queueing Theory".
REFERENCES
1. INTRODUCTION
methods, statistical graphics, robust fitting, new design ideas, Bayesian statistics, optimal design theory, generalized linear models, and many other advances. Researchers in all fields are able to focus on applications of RSM because of the substantial improvement in the software that is used for RSM. There is no doubt that high quality software is one of the better communication links between the statistics researcher and the user.

In this chapter we discuss and review some of the recent developments in RSM and how they are having, and will continue to have, an impact on applications in industry.
Independent of the approach taken, however, the ability to incorporate robustness to noise factors into a process design depends on the existence and detection of at least one control × noise interaction. It is the structure of these interactions that determines the nature of the nonhomogeneity of process variance that characterizes the parameter design problem. For illustration, consider a problem involving one control factor, x, and one noise factor, z. Figure 1 shows two potential outcomes of the relationship between factors x and z and their effects on the response, y. In Figure 1a, it can be seen that the response y is robust to variability in the noise factor z when the variable x is controlled at its low level. When x is at its high level, however, the change in z has an effect of 15 units on the response. In other words, the presence of the xz interaction indicates that there is an opportunity to reduce the response variability through proper choice of the level of the control factor. In contrast, Figure 1b shows that when there is no control × noise interaction, the variability in y induced by the noise factor cannot be "designed out" of the system, since the variability is the same (i.e., homogeneous) at both levels of the control factor.
While the estimation of control × noise interactions is important for understanding how best to control process variance, the control factor main effects, as well as interactions among control factors, are equally important for understanding how to drive the response mean to its target. The dual response surface approach, which addresses both process mean and variance, begins with the response model

y(x, z) = b_0 + x′b + x′Bx + f′z + x′Δz + ε     (1)

In the response model, x and z represent the r_x × 1 and r_z × 1 vectors of control and noise factors, respectively. The r_x × r_x matrix B contains coefficients of the second-order terms in the control factors, and the r_x × r_z matrix Δ contains the coefficients of the control × noise interactions.
Figure 1 (a) Control by noise interaction. (b) No control by noise interaction. (Each panel plots the response y against the noise factor z at the control factor levels x = -1 and x = +1.)
The previously defined response model will accommodate many real-life applications. It is assumed that ε ~ N(0, σ²I), implying that any nonconstancy of variance of the process is due to an inability to control the noise variables.
The assumption on the noise variables is such that the experimental levels of each z_i are centered at some mean μ_{z_i}, with the ±1 coded levels set at μ_{z_i} ± cσ_{z_i}, where c is a constant. As a result, it is assumed that

    E(z) = 0,  Var(z) = σ_z²I_{r_z}
thus implying that noise variables are uncorrelated with known variance.
Taking expectation and variance operators on the response model in (1), we can obtain estimates of the mean and variance response surfaces as

    E_z[y(x)] = b₀ + x′b + x′Bx

and

    Var_z[y(x)] = (f + Δ′x)′V(f + Δ′x) + σ̂_ε²

An equivalent form of the variance model, under the assumption that V = σ_z²I, is given by

    Var_z[y(x)] = σ_z² l′(x)l(x) + σ̂_ε²
where l(x) = f + Δ′x, which is the vector of partial derivatives of y(x, z) with respect to z. In these equations, b, f, B, and Δ contain regression coefficients from the fitted model of Eq. (1), with σ̂_ε² representing the error mean square from this model fit. Notice the role that Δ plays in the variance model, recalling that it contains the coefficients of the important control × noise interactions. Running the process at the levels of x that minimize ‖l(x)‖ will in turn minimize the process variance. If, however, Δ = 0, the process variance does not depend on x, and hence one cannot create a robust process by choice of settings of the control factors (illustrated previously with the simple example in Figure 1).
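The mean and variance surfaces above are simple quadratic forms in the fitted coefficients, so they are easy to evaluate directly. The following minimal Python sketch (not from the chapter; all coefficient values are hypothetical placeholders) shows how E_z[y(x)] and Var_z[y(x)] might be computed for a model with two control factors and one noise factor.

```python
import numpy as np

# Hypothetical fitted coefficients for a model with two control factors (x)
# and one noise factor (z); the numbers are illustrative only.
b0 = 70.0                          # intercept
b = np.array([4.9, 7.3])           # linear control-factor coefficients
B = np.array([[0.0, -0.28],        # second-order matrix for the control factors
              [-0.28, 0.0]])
f = np.array([10.8])               # noise-factor coefficients
Delta = np.array([[-9.1],          # control-by-noise interaction coefficients
                  [8.3]])          # (rows index x's, columns index z's)
sig2_z = 1.0                       # assumed noise-variable variance
sig2_eps = 4.6 ** 2                # error mean square from the fit

def mean_surface(x):
    """E_z[y(x)] = b0 + x'b + x'Bx."""
    x = np.asarray(x, float)
    return b0 + x @ b + x @ B @ x

def var_surface(x):
    """Var_z[y(x)] = sig2_z * l(x)'l(x) + sig2_eps, with l(x) = f + Delta'x."""
    x = np.asarray(x, float)
    l = f + Delta.T @ x
    return sig2_z * (l @ l) + sig2_eps

for x in ([-1, -1], [-1, 1], [1, 0.2]):
    print(x, round(mean_surface(x), 2), round(var_surface(x), 2))
```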
Various analytical techniques have been developed for the purpose of process understanding and optimization based on the dual response surface models. Vining and Myers (1990) proposed finding conditions in x that minimize Var_z[y(x)] subject to E_z[y(x)] being held at some acceptable level. Lin and Tu (1996) consider a mean squared error approach for the "target is best" case. Other methods, given in Myers et al. (1997), focus on the distribution of response values in the process. These include the development of prediction intervals for future response values as well as the development of tolerance intervals to include at least a specified proportion of the process distribution.
In an illustrative filtration-rate experiment with temperature as the noise factor, a response model of the form (1) was fitted with R² = 0.9668 and σ̂_ε = 4.5954. Note that there are two control × noise interactions present in the model, indicating that the variability transmitted from temperature fluctuations can be reduced through proper choice of formaldehyde concentration (x₂) and stirring rate (x₃). Pressure (x₁) was found to have no significant effect on filtration rate (y). The estimated mean and variance models are therefore given by

    E_z[y(x₂, x₃)] = 70.02 + 4.9375x₂ + 7.3125x₃ - 0.5625x₂x₃

and

    Var_z[y(x₂, x₃)] = (10.8125 - 9.0625x₂ + 8.3125x₃)² + (4.5954)²
Figure 2 shows the overlaid contour plots for the response surface models of the process mean and standard deviation. The trade-off between maximizing filtration rate while attempting to minimize variance is evident. Figure 3 contains a contour plot of mean filtration rate along with the locus of points l(x₂, x₃) = 0, defining a line of minimum estimated process variance. The shaded region represents a 95% confidence interval around this line of minimum variance. From Figure 3, the mean-variance trade-off becomes even more clear, since we can achieve barely more than 73 gal/hr for the estimated process mean while minimizing the process variance (with coordinates x₂ = 1, x₃ = -0.2).
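As a rough check of this trade-off, one can trace the minimum-variance locus l(x₂, x₃) = 0 numerically and evaluate the estimated mean along it. The sketch below uses the coefficients as printed above; it is an illustration, not the authors' computation, and it reproduces a maximum mean of roughly 73.5 gal/hr at x₂ = 1, x₃ ≈ -0.21.

```python
import numpy as np

# Fitted mean model and slope-to-noise function from the filtration example.
mean = lambda x2, x3: 70.02 + 4.9375*x2 + 7.3125*x3 - 0.5625*x2*x3
# l(x2, x3) = 10.8125 - 9.0625*x2 + 8.3125*x3; l = 0 gives minimum variance.

best = None
for x2 in np.linspace(-1, 1, 201):
    x3 = (9.0625*x2 - 10.8125) / 8.3125    # solve l(x2, x3) = 0 for x3
    if -1 <= x3 <= 1:                      # stay inside the design region
        m = mean(x2, x3)
        if best is None or m > best[0]:
            best = (m, x2, x3)

print("max mean on the zero-slope locus: "
      f"{best[0]:.2f} gal/hr at x2 = {best[1]:.2f}, x3 = {best[2]:.2f}")
```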
In Figure 4 we see lower 95% one-sided prediction limits, while Figure 5 depicts lower 95% tolerance limits on filtration rate with probability 0.95. Both of these illustrations indicate the region in which the process should be operated.
Figure 2 Contour plot of both the mean filtration rate and the process standard deviation (axes x₂ and x₃).
Figure 3 Contour plot of mean filtration rate with the locus of minimum estimated process variance, l(x₂, x₃) = 0 (axes x₂ and x₃).
Figure 4 Contour plot of lower 95% one-sided prediction limits on filtration rate (axes x₂ and x₃).
Figure 5 Contour plot of 0.95 content lower 95% one-sided tolerance limits.
Software packages such as Design-Expert and Minitab (version 12) have built-in features for generating these overlaid plots.
There are also graphical techniques that are extremely useful for evaluating the prediction capability of experimental designs. Two such graphical methods that are discussed here are variance dispersion graphs and prediction variance contour plots. Both of these graphical techniques enable the user to visualize the stability of prediction variance throughout the design space, thus providing a mechanism for comparing competing designs.
The graphical technique referred to as the variance dispersion graph (VDG) was developed by Giovannitti-Jensen and Myers (1989) and Myers et al. (1992b). A variance dispersion graph for an RSM design displays a "snapshot" of the stability of the scaled prediction variance, v(x) = N Var[ŷ(x)]/σ², and how the design compares to an "ideal." For a spherical design [see Rozum (1990) and Rozum and Myers (1991) for extensions to cuboidal designs], the VDG contains four graphical components:

1. A plot of the spherical variance V^r against the radius r. The spherical variance is essentially v(x) averaged (via integration) over the surface of a sphere of radius r.
2. A plot of the maximum v(x) on a radius r against r.
3. A plot of the minimum v(x) on a radius r against r.
4. A horizontal line at v(x) = p, to represent the "ideal" case.
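These components are straightforward to approximate numerically: build the model matrix for a candidate design, then scan spheres of increasing radius, recording the minimum, mean, and maximum of v(x). The sketch below is an illustration, not the authors' software: the CCD uses an assumed axial distance α = √3 with three center runs, and the spherical average is approximated by Monte Carlo sampling rather than exact integration.

```python
import itertools
import numpy as np

def quad_terms(x):
    """Full second-order model vector for k = 3."""
    x1, x2, x3 = x
    return np.array([1, x1, x2, x3, x1*x1, x2*x2, x3*x3,
                     x1*x2, x1*x3, x2*x3])

# Central composite design, k = 3: factorial + axial + center points.
alpha, n_center = np.sqrt(3), 3
factorial = list(itertools.product([-1, 1], repeat=3))
axial = [tuple(a * np.eye(3)[i]) for i in range(3) for a in (-alpha, alpha)]
design = np.array(factorial + axial + [(0, 0, 0)] * n_center, float)

N = len(design)
X = np.array([quad_terms(d) for d in design])
XtX_inv = np.linalg.inv(X.T @ X)

rng = np.random.default_rng(1)
for r in np.linspace(0.2, np.sqrt(3), 8):
    u = rng.normal(size=(4000, 3))
    pts = r * u / np.linalg.norm(u, axis=1, keepdims=True)  # points on sphere r
    v = np.array([N * quad_terms(p) @ XtX_inv @ quad_terms(p) for p in pts])
    print(f"r = {r:4.2f}  min v = {v.min():6.2f}  "
          f"mean v = {v.mean():6.2f}  max v = {v.max():6.2f}")
```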
Figure 6 Variance dispersion graphs for CCD and Box-Behnken designs for k = 3 design variables.
Figure 7 Contours of standard error of prediction for (a) hybrid 311A and (b) D-optimal design (Design-Expert plots; actual factors X = A, Y = B; actual constant C = 0.00).
where y_i ∈ {0, 1} indicates whether the ith subject responded to dose x_i of a given drug. It is therefore assumed that ε_i is approximately Bernoulli, with mean 0 and variance p_i(1 - p_i), where

    p_i = 1 / (1 + exp[-(β₀ + β₁x_i)])

Note that the information matrix is a function of the unknown p's. This makes it impossible to directly use traditional design optimality criteria for generating an efficient design, since they depend on being able to optimize some norm on the Fisher information matrix. For example, construction of the D-optimal design for the above model would require that the doses x₁, x₂, ..., x_n be chosen such that Det[N⁻¹I(β)] is maximized. In order to do this,
the scientist would be forced to make his or her best guess at the values of β₀ and β₁. The resulting design will be D-optimal for the specified values, which unfortunately may be very different from the truth, thus resulting in an inefficient design.
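The parameter dependence of the criterion is easy to see numerically. In the sketch below (illustrative only; the dose levels and parameter guesses are made up), the same four-point design yields different D-criterion values under different guesses of (β₀, β₁).

```python
import numpy as np

def d_criterion(doses, b0, b1):
    """Det of the scaled Fisher information for a simple logistic model."""
    x = np.asarray(doses, float)
    p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
    w = p * (1 - p)                      # Bernoulli variance weights
    F = np.column_stack([np.ones_like(x), x])
    I = (F * w[:, None]).T @ F           # sum of w_i f(x_i) f(x_i)'
    return np.linalg.det(I / len(x))

# The criterion depends on the guessed (b0, b1): the same design can look
# efficient for one guess and poor for another.
design = [-1.0, -0.4, 0.4, 1.0]
for guess in [(0.0, 2.0), (1.0, 4.0)]:
    print(guess, d_criterion(design, *guess))
```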
Chaloner and Verdinelli (1995) review a Bayesian approach to design optimality that incorporates prior information about the unknown parameters in the form of a probability distribution. This provides a mechanism for building in robustness to parameter misspecification, since a distribution of the parameter is specified, not merely a point estimate. The resulting Bayesian design optimality criterion is a function of the Fisher information matrix, integrated over the prior distribution on the parameters. For example, the Bayesian D-optimal design for the previously defined logistic model is found by choosing the levels of x that will maximize the expression
    max_D Det[N⁻¹I(β)] |_{β=b₁}

where D is now the set of all possible designs of size N₂ and I₁(β) is fixed after the first stage.
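One common form of the Bayesian D criterion reviewed by Chaloner and Verdinelli (1995) is the expected log-determinant of the information matrix under the prior. The sketch below is an assumption-laden illustration, not the chapter's example: the prior, candidate designs, and Monte Carlo approximation of the integral are all made up.

```python
import numpy as np

rng = np.random.default_rng(7)
prior = rng.normal([0.0, 2.0], [0.5, 0.8], size=(500, 2))  # draws of (b0, b1)

def logdet_info(doses, b0, b1):
    """log det of the logistic-model information matrix at (b0, b1)."""
    x = np.asarray(doses, float)
    p = 1 / (1 + np.exp(-(b0 + b1 * x)))
    F = np.column_stack([np.ones_like(x), x])
    I = (F * (p * (1 - p))[:, None]).T @ F
    sign, ld = np.linalg.slogdet(I)
    return ld if sign > 0 else -np.inf

def bayes_d(doses):
    """E_prior[log det I(beta)], approximated by averaging over prior draws."""
    return np.mean([logdet_info(doses, b0, b1) for b0, b1 in prior])

candidates = {"2-point": [-0.8, 0.8],
              "4-point": [-1.0, -0.3, 0.3, 1.0],
              "6-point": list(np.linspace(-1, 1, 6))}
for name, d in candidates.items():
    print(name, round(bayes_d(d), 3))
```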
Letsinger (1995) and Myers et al. (1996) evaluated the efficiency of two-stage procedures relative to their single-stage competitors. In doing so, they showed that the best performance of the two-stage designs was achieved when the first-stage design contained only 30% of the combined design size, thus reserving 70% of the observations for the second stage, when more parameter information is present.
Even for the normal linear model, successful implementation of design optimality criteria is often difficult in practice. This is due to the fact that the model content must be known a priori. In other words, the experimenter must be able to specify which regressors are needed to model the response, in order to generate the most efficient design for constructing the specified model. If too many regressors are specified, some design points (and consequently valuable resources) may be wasted on estimation of unimportant terms. If too few regressors are specified, then some terms that are needed in the model may not even be estimable.
Suppose an experimenter identifies a set of regressors, x, containing all p + q regressors he or she believes might be needed in modeling the behavior of a response y. The linear model is written as y = Xβ + ε, with y denoting the n observations to be collected in an experiment, under the assumption that y|β, σ² ~ N(Xβ, σ²I). The model matrix, X, has dimensions n × (p + q), with the p + q columns defined by the set of regressors, x. Quite often, the experimenter has knowledge of the process or system that allows him or her to identify p of the regressors as primary terms. These are the terms that the experimenter strongly believes are needed in modeling the response. The remaining terms are the potential terms, i.e., those terms about which the experimenter has uncertainty. For example, the experimenter may know from past experience that certain process variables must be included in the model as main effects (i.e., linear terms) but is uncertain if higher-order terms are required.
where c_jj is the jth diagonal element of (1/σ²)V⁻¹. Since the estimated effect of any regressor x_j is proportional to its standardized estimated coefficient, the relative importance of the various model terms can be estimated by the relative sizes of these standardized coefficients (in absolute value). Normalizing them produces a set of discrete scores or "weights of evidence" that quantify the relative importance of each model term. In other words, a new set of τ's, {τ₁, τ₂, ..., τ_{p+q}}, is produced based on this updated prior information. Going into the second stage, beliefs about the relative importance of the p + q model terms are expressed as β|σ, τ ~ N(0, σ²T), where T is a diagonal matrix constructed from the τ's.
In many quality applications, however, the response follows a distribution other than the normal. Consider, for example, a quality improvement program at a plastics manufacturer focused on reducing the number of surface defects on injection-molded parts. The response in this case is the defect count per part, which most naturally follows a Poisson distribution, where the variance is not constant but is instead equal to the mean. Consider also applications in the field of reliability, in which the equipment's time to failure is the quality response under study. Again, the most natural error distribution is not the normal, but instead the exponential or gamma, both of which have nonconstant variance structures. These types of problems nicely parallel similar problems that exist in the biomedical field, particularly in the area of dose-response studies and survival analysis.
Regression models based on distributions such as the Poisson, gamma, exponential, and binomial fall into a family of distributions and models known as generalized linear models (GLM). See McCullagh and Nelder (1989) for an excellent text on the subject. In addition, the reader is referred to Myers and Montgomery (1997) for a tutorial on GLM. In fact, all distributions belonging to the exponential family are accommodated by GLM. These models have already been used a great deal in biomedical fields but are just now drawing interest in manufacturing areas. In the past, the approach has been to normalize the response through transformation, so that OLS model parameter estimates could be calculated. Hamada and Nelder (1997) show several examples in which the appropriate transformation either did not exist or produced unsatisfactory results compared to the appropriate GLM model. They also point out that with the progress that has been made in computing in this area, the GLM models are just as easily fit as the OLS model to the transformed data. A few example software packages with GLM capability are GLIM, SAS PROC GENMOD, S-Plus, and ECHIP.
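For instance, a Poisson regression of defect counts on coded factor settings is a one-line GLM fit in modern software. The sketch below uses Python's statsmodels, one of many packages with GLM capability; the replicated factorial layout and the data are simulated, not taken from the chapter.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Hypothetical replicated 2^2 factorial on injection-molding settings;
# the response is a defect count per part, simulated from a Poisson model.
x1 = np.repeat([-1, -1, 1, 1], 5)
x2 = np.repeat([-1, 1, -1, 1], 5)
y = rng.poisson(np.exp(1.2 - 0.5 * x1 + 0.3 * x2))

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()  # canonical log link
print(fit.summary())
```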
It is interesting that some work has been done that provides a connective tissue between generalized linear models and robust parameter design. This relationship between the two fields is extremely important, as it allows the response surface approach to Taguchi's parameter design to be generalized to the clearly non-normal applications that were previously discussed in this section. Engel and Huele (1996) build a foundation for this important area, and there will certainly be other developments.
The difficulty comes in designing experiments for GLM models. Design optimality criteria become complex, and designs are not simple to construct even in the case of only two design variables. See, for example, Sitter and Torsney (1992) and Atkinson and Haines (1996). One must constantly be aware that even if an optimal design is found, it requires parameter knowledge.
3.5. Nonparametric and Semiparametric Response Surface Methods
Consider a response surface problem in which the quality characteristic (response) of interest is expected to behave in a highly nonlinear fashion as a function of a set of process variables. Although the model form is unknown, the model structure is of less importance than the ability to locate the process conditions that result in the optimum response value. The primary interest is in prediction of the response and understanding the general nature of the response surface. Additionally, in many of these kinds of problems the ranges in the design problems are wider than in traditional RSM, in which local approximations are sought.
In the problem above, greater model flexibility is required than can be achieved with a low-order polynomial model. Nonparametric and semiparametric regression models can be combined with standard experimental design tools to provide a more flexible approach to the optimization of complex problems. Some of the nonparametric modeling methods that may be considered are thin-plate spline models, Gaussian stochastic process models, neural networks, generalized additive models (GAMs), and multiple adaptive regression splines (MARS). The reader is referred to Haaland et al. (1994) for a brief description of each model type. Vining and Bohn (1996) introduced a semiparametric as well as a nonparametric approach to mean and variance modeling. The semiparametric strategy involved the use of a nonparametric method to obtain variance estimates, which then became inputs to modeling the response mean via weighted least squares. As an alternative approach they suggested utilizing a nonparametric method for modeling the response mean as well as the variance.
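As a small illustration of one of the model types listed above, the sketch below fits a Gaussian stochastic process model to data from a wide experimental region and then searches a grid for the apparent optimum. It uses scikit-learn's Gaussian process regressor on a synthetic response; this is an assumption-laden toy, not a method taken from the chapter.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(3)
X = rng.uniform(-2, 2, size=(40, 2))               # wide experimental region
y = np.sin(X[:, 0]) * np.cos(X[:, 1]) + rng.normal(0, 0.1, 40)  # toy response

# RBF kernel for the smooth surface plus a white-noise term for error.
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X, y)

# Predict over a grid and report the apparent optimum operating condition.
g = np.linspace(-2, 2, 61)
grid = np.array([(a, b) for a in g for b in g])
pred = gp.predict(grid)
print("predicted optimum near", grid[pred.argmax()], "value", pred.max().round(3))
```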
Haaland et al. (1996) point out that the experimental designs used for nonparametric response surface methods can include some of the traditional designs. For example, one may execute a series of fractional factorials followed by a central composite design, then develop a global model using a nonparametric method. An alternative to this design approach is to execute a single space-filling design, which covers the entire region of operability in one large experiment. This type of design is not based on any model form but instead contains points that are spread out uniformly (in some sense) over the experimental region. The intent is that no point in the experimental region will be very far from a design point. Space-filling designs have primarily been used in computer experiments but have also been applied in physical experiments in the pharmaceutical and biotechnology industries.
See Haaland et al. (1994) for references. Among the space-filling designs is a class of distance-based design criteria that focus on selection of a set of design points that have adequate coverage and spread over the experimental (or operability) region. Two software packages that will construct distance-based designs are SAS PROC OPTEX and Design-Expert.
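A simple way to appreciate distance-based criteria is a greedy maximin construction: repeatedly add the candidate point farthest from the current design. The sketch below is a heuristic illustration only; it is not the algorithm implemented by PROC OPTEX or Design-Expert.

```python
import numpy as np

def maximin_design(candidates, n_points, seed=0):
    """Greedy maximin: each new point maximizes its distance to the design."""
    rng = np.random.default_rng(seed)
    cand = np.asarray(candidates, float)
    design = [cand[rng.integers(len(cand))]]          # arbitrary starting point
    while len(design) < n_points:
        d = np.min([np.linalg.norm(cand - p, axis=1) for p in design], axis=0)
        design.append(cand[np.argmax(d)])             # farthest-from-design point
    return np.array(design)

# Candidate set: a fine grid over the operability region [-1, 1]^2.
g = np.linspace(-1, 1, 21)
cands = np.array([(a, b) for a in g for b in g])
print(maximin_design(cands, 8))
```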
In a split-plot (birandomization) experiment the standard analysis no longer applies, since the error assumptions associated with the basic response surface model [i.e., all ε_i ~ N(0, σ²)] are no longer valid. Let σ_δ² be the whole-plot error variance and σ_ε² the subplot error variance resulting from the first and second randomizations, respectively. The model and error assumptions then become

    y = Xβ + δ + ε

where
    δ + ε ~ N(0, V)

and

    V = σ_δ²J + σ_ε²I

Assuming that there are j whole plots, J is a block-diagonal matrix with nonzero blocks of the form 1_{b_i}1′_{b_i}, where b_i is the number of observations in the ith whole plot, i = 1, 2, ..., j. Note that while observations belonging to different whole-plot EUs are independent, the b_i observations within a given whole plot are correlated.
Practitioners may be tempted to ignore the birandomization error structure, analyzing the data as if they came from a completely randomized design (CRD). The analysis of a split-plot design as a CRD, however, can lead to erroneously concluding that whole-plot factors are significant when in fact they are not, while at the same time erroneously eliminating from the model significant subplot terms, including whole-plot × subplot interactions. Unlike model estimation for the CRD, the error variances play a major role in the estimation of coefficients in the birandomization model. Under the assumption of normal errors, the maximum likelihood estimate (MLE) of the model is now obtained through the generalized least squares (GLS) estimation equations

    β̂ = (X′V⁻¹X)⁻¹X′V⁻¹y

and

    Var(β̂) = (X′V⁻¹X)⁻¹

Note that both estimating equations depend on σ_δ² and σ_ε² through the matrix V; therefore proper estimation of these error variances becomes a priority.
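Given values of σ_δ² and σ_ε², the GLS equations above can be applied directly by assembling V = σ_δ²J + σ_ε²I from the whole-plot labels. The following sketch (simulated toy data; in practice the variance components must themselves be estimated rather than assumed known) illustrates the computation.

```python
import numpy as np

def gls(X, y, whole_plot, s2_delta, s2_eps):
    """GLS for y = X beta + delta + eps with V = s2_delta*J + s2_eps*I."""
    n = len(y)
    same_wp = (whole_plot[:, None] == whole_plot[None, :])  # block pattern J
    V = s2_delta * same_wp + s2_eps * np.eye(n)
    Vinv = np.linalg.inv(V)
    cov = np.linalg.inv(X.T @ Vinv @ X)      # Var(beta-hat) = (X'V^-1 X)^-1
    beta = cov @ X.T @ Vinv @ y              # beta-hat = (X'V^-1 X)^-1 X'V^-1 y
    return beta, cov

# Toy split-plot data: 4 whole plots x 3 subplot runs, variances assumed known.
rng = np.random.default_rng(5)
wp = np.repeat(np.arange(4), 3)
x = rng.uniform(-1, 1, size=(12, 2))
X = np.column_stack([np.ones(12), x])
y = (X @ np.array([10.0, 2.0, -1.0])
     + rng.normal(0, 1.0, 4)[wp]            # whole-plot errors (delta)
     + rng.normal(0, 0.5, 12))              # subplot errors (eps)

beta, cov = gls(X, y, wp, s2_delta=1.0, s2_eps=0.25)
print(beta, np.sqrt(np.diag(cov)))
```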
Appropriateness of various model and error estimation methods is dependent on the structure of the birandomization design (BRD). The general class of BRDs is divided into two subclasses: the crossed and the noncrossed. The distinguishing characteristic is that in the case of the crossed BRD, subplot conditions (i.e., factor level combinations) are identical across whole plots. This is the familiar split-plot design, which may result from restricted randomization of a 2^k, 3^k, or mixed-level factorial design. In the case of the noncrossed BRD, each whole plot may have a different number of subplot EUs as well as different factor combinations. Such a design could result from restricted randomization of a 2^{k-p} fractional factorial design or a second-order design such as the central composite design (CCD) or Box-Behnken design.
For the crossed BRD, Letsinger et al. (1996) show that GLS = OLS under certain model conditions, and therefore error variance knowledge is not essential for model estimation. Model editing, however, does depend on the availability of estimates of σ_δ² and σ_ε². One approach to estimating these variances makes use of whole-plot and subplot lack of fit. See Letsinger et al. (1996) for details.
In general, model estimation and editing are more complex for the noncrossed BRD. It is interesting to point out, however, that when the model is first-order, parameter estimation can be accomplished using the equivalency of GLS = OLS (as in the crossed case). Once again, model lack of fit can be used to develop estimators for the error variances, although the procedure is more complex than that for the crossed BRD. Both estimation and editing of a second-order model, however, depend on estimates of σ_δ² and σ_ε² through the matrix V. Three competing methods are mentioned here: OLS, iterated reweighted least squares (IRLS), and restricted maximum likelihood (REML).
One can argue that in some cases OLS is an acceptable method, even though it ignores the dependence among observations within each whole plot of the BRD. In fact, OLS provides an unbiased estimator of β. Also, for designs that provide little or no lack-of-fit information (for estimation of σ_ε² and σ_δ²), the researcher may be better served by not trying to estimate V than by introducing more variability into the analysis. The IRLS method begins with an initial OLS estimate of β, then uses an iterative procedure for estimating σ_δ², σ_ε², and β until convergence is reached in β̂. The REML method, first developed by Anderson and Bancroft (1952) and Russell and Bradley (1958), is similar to MLE except that it uses the likelihood of a transformation of the response, y. Refer to Searle et al. (1992) for a discussion of REML and its relationship to MLE. The PROC MIXED procedure in SAS (1992) can be adapted to calculate REML estimators. Letsinger et al. (1996) give details on the use of PROC MIXED for the analysis of a BRD.
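In open-source settings, a random-intercept mixed model plays the same role as PROC MIXED here: a random effect per whole plot induces exactly the V = σ_δ²J + σ_ε²I covariance structure, and fitting by REML returns estimates of both variance components. A minimal sketch with Python's statsmodels MixedLM (simulated toy data; not the authors' analysis) follows.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)

# Toy birandomization data: whole-plot random effect plus subplot error.
wp = np.repeat(np.arange(6), 4)                  # 6 whole plots, 4 subplots each
x = rng.uniform(-1, 1, size=(24, 2))
X = sm.add_constant(x)
y = (X @ [10.0, 2.0, -1.0]
     + rng.normal(0, 1.0, 6)[wp]                 # whole-plot errors (delta)
     + rng.normal(0, 0.5, 24))                   # subplot errors (eps)

# A random intercept per whole plot mirrors V = s2_delta*J + s2_eps*I;
# fit(reml=True) produces REML estimates of both variance components.
model = sm.MixedLM(y, X, groups=wp)
res = model.fit(reml=True)
print(res.summary())
```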
The recent reminder that many RSM problems are accompanied by designs that are not completely randomized will hopefully produce new and useful tools for the practitioner. In that regard it is of great interest to note the similarity between the split-plot RSM problem (as far as analysis is concerned) and the approach taken with generalized estimating equations that find applications in the biostatistical and biomedical fields. The analysis is very similar, though in the longitudinal data applications there generally is no designed experiment. Liang and Zeger (1986) and others extend this work to generalized linear models and indeed assume various correlation structures rather than the exchangeable correlation structure induced by the birandomization design.
4. CONCLUSION
REFERENCES
Abdelbasit KM, Plackett RL. (1983). Experimental design for binary data. J Am Stat Assoc 78:90-98.
Anderson RL, Bancroft TL. (1952). Statistical Theory in Research. New York: McGraw-Hill.
Atkinson AC, Haines LM. (1996). Designs for nonlinear and generalized linear models. In: Ghosh S, Rao CR, eds. Handbook of Statistics, vol. 13. Amsterdam: Elsevier, pp. 437-475.
Box GEP, Jones S. (1992). Split-plot designs for robust product experimentation. J Appl Stat 19:3-26.
Chaloner K, Verdinelli I. (1995). Bayesian experimental design: A review. Stat Sci 10:273-304.
Dumouchel W, Jones B. (1994). A simple Bayesian modification of D-optimal designs to reduce dependence on an assumed model. Technometrics 36:37-47.
Engel J, Huele AF. (1996). A generalized linear modeling approach to robust design. Technometrics 38:365-373.
Giovannitti-Jensen A, Myers RH. (1989). Graphical assessment of the prediction capability of response surface designs. Technometrics 31:159-171.
Haaland PD, McMillan N, Nychka D, Welch W. (1994). Analysis of space-filling designs. Comput Sci Stat 26:111-120.
Haaland PD, Clarke RA, O'Connell MA, Nychka DW. (1996). Nonparametric response surface methods. Paper presented at the 1996 ASA Meeting, Chicago, IL.
Hamada M, Nelder JA. (1997). Generalized linear models for quality improvement experiments. J Qual Technol 29:292-308.
Khattree R. (1996). Robust parameter design: A response surface approach. J Qual Technol 28:187-198.
Kiefer J. (1959). Optimum experimental designs (with discussion). J Roy Stat Soc Ser B 21:272-319.
Index

Effect sparsity, 306
Efficient design strategy, 309
Elliptical, 196
Employee satisfaction indices (ESI), 24
Employees' ideality, 64
Environmental factors, 360
Environmental interaction effect: design by, 360
Environmental variables, 359
Improvement: circle, 22; continuous, 38; process, 36
Information matrix, 342; expected, 293
Interaction, 395
Isotonic inference, 407
Iterated reweighted least squares (IRLS), 478